Dear Experts,
I am trying to run constant pH REMD with sander.MPI or pmemd.MPI, but I get
the following error message:
--------------------------------------------------------------------------
The library attempted to open the following supporting CUDA libraries,
but each of them failed. CUDA-aware support is disabled.
libcuda.so.1: cannot open shared object file: No such file or directory
libcuda.dylib: cannot open shared object file: No such file or directory
/usr/lib64/libcuda.so.1: cannot open shared object file: No such file or directory
/usr/lib64/libcuda.dylib: cannot open shared object file: No such file or directory
If you are not interested in CUDA-aware support, then run with
--mca mpi_cuda_support 0 to suppress this message. If you are interested
in CUDA-aware support, then try setting LD_LIBRARY_PATH to the location
of libcuda.so.1 to get passed this issue.
--------------------------------------------------------------------------
[hm017][[20768,1],0][btl_openib_component.c:1670:init_one_device] error obtaining device attributes for mlx5_0 errno says No space left on device
[hm017][[20768,1],7][btl_openib_component.c:1670:init_one_device] error obtaining device attributes for mlx5_0 errno says No space left on device
[hm017][[20768,1],9][btl_openib_component.c:1670:init_one_device] error obtaining device attributes for mlx5_0 errno says No space left on device
[hm017][[20768,1],10][btl_openib_component.c:1670:init_one_device] error obtaining device attributes for mlx5_0 errno says No space left on device
[hm017][[20768,1],22][btl_openib_component.c:1670:init_one_device] error obtaining device attributes for mlx5_0 errno says No space left on device
[hm017][[20768,1],24][btl_openib_component.c:1670:init_one_device] error obtaining device attributes for mlx5_0 errno says No space left on device
[hm017][[20768,1],19][btl_openib_component.c:1670:init_one_device] error obtaining device attributes for mlx5_0 errno says No space left on device
[hm017][[20768,1],27][btl_openib_component.c:1670:init_one_device] error obtaining device attributes for mlx5_0 errno says No space left on device
[hm017][[20768,1],2][btl_openib_component.c:1670:init_one_device] error obtaining device attributes for mlx5_0 errno says No space left on device
[hm017][[20768,1],34][btl_openib_component.c:1670:init_one_device] error obtaining device attributes for mlx5_0 errno says No space left on device
[hm017][[20768,1],8][btl_openib_component.c:1670:init_one_device] error obtaining device attributes for mlx5_0 errno says No space left on device
[hm017][[20768,1],15][btl_openib_component.c:1670:init_one_device] error obtaining device attributes for mlx5_0 errno says No space left on device
[hm017][[20768,1],3][btl_openib_component.c:1670:init_one_device] error obtaining device attributes for mlx5_0 errno says No space left on device
[hm017][[20768,1],26][btl_openib_component.c:1670:init_one_device] error obtaining device attributes for mlx5_0 errno says No space left on device
[hm017][[20768,1],23][btl_openib_component.c:1670:init_one_device] error obtaining device attributes for mlx5_0 errno says No space left on device
[hm017][[20768,1],35][btl_openib_component.c:1670:init_one_device] error obtaining device attributes for mlx5_0 errno says No space left on device
[hm017][[20768,1],1][btl_openib_component.c:1670:init_one_device] error obtaining device attributes for mlx5_0 errno says No space left on device
WARNING: There was an error initializing an OpenFabrics device.
Local host: hm017
Local device: mlx5_0
--------------------------------------------------------------------------
Running multipmemd version of pmemd Amber18
Total processors = 40
Number of groups = 10
At line 167 of file constantph.F90 (unit = 23, file = '003/cpin.rep.033')
Fortran runtime error: Index 1 out of range for namelist variable stateinf
At line 167 of file constantph.F90 (unit = 23, file = '002/cpin.rep.033')
Fortran runtime error: Index 1 out of range for namelist variable stateinf
At line 167 of file constantph.F90 (unit = 23, file = '001/cpin.rep.033')
Fortran runtime error: Index 1 out of range for namelist variable stateinf
At line 167 of file constantph.F90 (unit = 23, file = '010/cpin.rep.033')
Fortran runtime error: Index 1 out of range for namelist variable stateinf
At line 167 of file constantph.F90 (unit = 23, file = '004/cpin.rep.033')
Fortran runtime error: Index 1 out of range for namelist variable stateinf
At line 167 of file constantph.F90 (unit = 23, file = '005/cpin.rep.033')
Fortran runtime error: Index 1 out of range for namelist variable stateinf
At line 167 of file constantph.F90 (unit = 23, file = '008/cpin.rep.033')
Fortran runtime error: Index 1 out of range for namelist variable stateinf
At line 167 of file constantph.F90 (unit = 23, file = '007/cpin.rep.033')
Fortran runtime error: Index 1 out of range for namelist variable stateinf
-------------------------------------------------------
Primary job terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.
-------------------------------------------------------
--------------------------------------------------------------------------
mpirun detected that one or more processes exited with non-zero status,
thus causing the job to be terminated. The first process to do so was:
Process name: [[20768,1],4]
Exit code: 2
--------------------------------------------------------------------------
[hm017:427747] 39 more processes have sent help message help-mpi-common-cuda.txt / dlopen failed
[hm017:427747] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
[hm017:427747] 39 more processes have sent help message help-mpi-btl-openib.txt / error in device init
How can I resolve this error?
--
*With regards,*
*Dulal Mondal,*
*Research Scholar,*
*Department of Chemistry,*
*IIT Kharagpur, Kharagpur 721302.*
_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber
Received on Sun Aug 25 2024 - 23:30:01 PDT