[AMBER] Error Message while using GPU

From: Sivanandam Magudeeswaran <sivanandamphy.gmail.com>
Date: Sun, 28 Jun 2020 11:23:34 +0530

Dear Amber Users,

We recently installed Amber 20 on a Fujitsu HPC machine. When running Amber
jobs with pmemd.cuda_DPFP.MPI on 4 nodes (24 processors per node), we get
the following error. Please give your valuable suggestions to rectify it.

default, for Open MPI 4.0 and later, infiniband ports on a device are not
used by default. The intent is to use UCX for these devices. You can
override this policy by setting the btl_openib_allow_ib MCA parameter to
true.
  Local host: comp-gpu0
  Local adapter: mlx5_0
  Local port:
There was an error initializing an OpenFabrics device.
  Local host: comp-gpu0
  Local device:
exceeded for step 52; vmax = 5669.1427
vlimit exceeded for step 52; vmax = 89983.7545
vlimit exceeded for step 52; vmax = 394.1534
vlimit exceeded for step 52; vmax =
was invoked on rank 89 in communicator MPI_COMM_WORLD
with errorcode 1.
NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
You may or may not see output from other processes, depending on
exactly when Open MPI
95 more processes have sent help message help-mpi-btl-openib.txt / ib port not selected
[comp-gpu0.local:226093] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
[comp-gpu0.local:226093] 95 more processes have sent help message help-mpi-btl-openib.txt / error in device init
[comp-gpu0.local:226093] 1 more process has sent help message help-mpi-api.txt / mpi-abort
*Guest Lecturer *

*Department of Physics*
*School of Physical Sciences*
*Periyar University*
*Salem-636 011*
*Mobile- 9965582730.*
AMBER mailing list
Received on Sat Jun 27 2020 - 23:00:03 PDT