[AMBER] Error in tests for pmemd.cuda.MPI

From: Freddy Bernal via AMBER <amber.ambermd.org>
Date: Mon, 06 Feb 2023 10:36:26 +0100

Dear AMBER community,

I have installed Amber22 with MPI and CUDA support on a workstation (Ubuntu), but after running the tests as described in the manual (chapter 22.6), I got a large number of errors, as shown below. What can I do to fix this? I would be very grateful for any help.

Best regards,

Freddy

=====
#Begin GTI Tests
------------------------------------
Running CUDA GTI Lambda Replica exchange Scheduling.
cd gti/lambda_remd/multi-window/ && NREP=4 ./Run DPFP
./Run: 22: Title: not found
./Run: 29: CleanFiles: not found
lambda values: 0,00000000000000000000 0,33333333333333333334 0,66666666666666666668 1,00000000000000000000
Invalid MIT-MAGIC-COOKIE-1 key
 Running multipmemd version of pmemd Amber22
    Total processors = 4
    Number of groups = 4

At line 975 of file /home/tga_user/AmberInstallation/amber22_src/src/pmemd/src/mdin_ctrl_dat.F90 (unit = 5, file = '4-windows/mdin.0,00000000000000000000')
Fortran runtime error: Cannot match namelist object name 00000000000000000000

Error termination. Backtrace:
At line 975 of file /home/tga_user/AmberInstallation/amber22_src/src/pmemd/src/mdin_ctrl_dat.F90 (unit = 5, file = '4-windows/mdin.0,33333333333333333334')
Fortran runtime error: Cannot match namelist object name 33333333333333333334

Error termination. Backtrace:
At line 975 of file /home/tga_user/AmberInstallation/amber22_src/src/pmemd/src/mdin_ctrl_dat.F90 (unit = 5, file = '4-windows/mdin.0,66666666666666666668')
Fortran runtime error: Cannot match namelist object name 66666666666666666668

Error termination. Backtrace:
At line 975 of file /home/tga_user/AmberInstallation/amber22_src/src/pmemd/src/mdin_ctrl_dat.F90 (unit = 5, file = '4-windows/mdin.1,00000000000000000000')
Fortran runtime error: Cannot match namelist object name 00000000000000000000

Error termination. Backtrace:
#0 0x7f9ce0e36d21 in ???
#1 0x7f9ce0e37869 in ???
#2 0x7f9ce0e3854f in ???
#3 0x7f9ce107665d in ???
#4 0x7f9ce107fae7 in ???
#5 0x7f9ce107fd0c in ???
#6 0x7f9ce107fe67 in ???
#7 0x558e7b6e82e0 in ???
#8 0x558e7b7ed444 in ???
#9 0x558e7b7cb947 in ???
#10 0x558e7b6b349e in ???
#11 0x7f9ce0c2b082 in __libc_start_main
        at ../csu/libc-start.c:308
#12 0x558e7b6cf89d in ???
#13 0xffffffffffffffff in ???
#0 0x7f1816685d21 in ???
#1 0x7f1816686869 in ???
#2 0x7f181668754f in ???
#3 0x7f18168c565d in ???
#4 0x7f18168ceae7 in ???
#5 0x7f18168ced0c in ???
#6 0x7f18168cee67 in ???
#7 0x55e8cef5f2e0 in ???
#8 0x55e8cf064444 in ???
#9 0x55e8cf042947 in ???
#10 0x55e8cef2a49e in ???
#11 0x7f181647a082 in __libc_start_main
        at ../csu/libc-start.c:308
#12 0x55e8cef4689d in ???
#13 0xffffffffffffffff in ???
--------------------------------------------------------------------------
Primary job terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.
--------------------------------------------------------------------------
#0 0x7f486575fd21 in ???
#1 0x7f4865760869 in ???
#2 0x7f486576154f in ???
#3 0x7f486599f65d in ???
#4 0x7f48659a8ae7 in ???
#5 0x7f48659a8d0c in ???
#6 0x7f48659a8e67 in ???
#7 0x55dd693522e0 in ???
#8 0x55dd69457444 in ???
#9 0x55dd69435947 in ???
#10 0x55dd6931d49e in ???
#11 0x7f4865554082 in __libc_start_main
        at ../csu/libc-start.c:308
#12 0x55dd6933989d in ???
#13 0xffffffffffffffff in ???
#0 0x7fa59161fd21 in ???
#1 0x7fa591620869 in ???
#2 0x7fa59162154f in ???
#3 0x7fa59185f65d in ???
#4 0x7fa591868ae7 in ???
#5 0x7fa591868d0c in ???
#6 0x7fa591868e67 in ???
#7 0x55755f7212e0 in ???
#8 0x55755f826444 in ???
#9 0x55755f804947 in ???
#10 0x55755f6ec49e in ???
#11 0x7fa591414082 in __libc_start_main
        at ../csu/libc-start.c:308
#12 0x55755f70889d in ???
#13 0xffffffffffffffff in ???
--------------------------------------------------------------------------
mpiexec detected that one or more processes exited with non-zero status, thus causing
the job to be terminated. The first process to do so was:

  Process name: [[17085,1],0]
  Exit code: 2

==============================================================
cd gti/lambda_remd/multi-window/ && NREP=5 ./Run DPFP
./Run: 22: Title: not found
./Run: 29: CleanFiles: not found
lambda values: 0,00000000000000000000 0,25000000000000000000 0,50000000000000000000 0,75000000000000000000 1,00000000000000000000
Invalid MIT-MAGIC-COOKIE-1 key
 Running multipmemd version of pmemd Amber22
    Total processors = 5
    Number of groups = 5

At line 975 of file /home/tga_user/AmberInstallation/amber22_src/src/pmemd/src/mdin_ctrl_dat.F90 (unit = 5, file = '5-windows/mdin.0,25000000000000000000')
Fortran runtime error: Cannot match namelist object name 25000000000000000000

Error termination. Backtrace:
At line 975 of file /home/tga_user/AmberInstallation/amber22_src/src/pmemd/src/mdin_ctrl_dat.F90 (unit = 5, file = '5-windows/mdin.0,75000000000000000000')
Fortran runtime error: Cannot match namelist object name 75000000000000000000

Error termination. Backtrace:
At line 975 of file /home/tga_user/AmberInstallation/amber22_src/src/pmemd/src/mdin_ctrl_dat.F90 (unit = 5, file = '5-windows/mdin.0,00000000000000000000')
Fortran runtime error: Cannot match namelist object name 00000000000000000000

Error termination. Backtrace:
At line 975 of file /home/tga_user/AmberInstallation/amber22_src/src/pmemd/src/mdin_ctrl_dat.F90 (unit = 5, file = '5-windows/mdin.1,00000000000000000000')
Fortran runtime error: Cannot match namelist object name 00000000000000000000

Error termination. Backtrace:
At line 975 of file /home/tga_user/AmberInstallation/amber22_src/src/pmemd/src/mdin_ctrl_dat.F90 (unit = 5, file = '5-windows/mdin.0,50000000000000000000')
Fortran runtime error: Cannot match namelist object name 50000000000000000000

Error termination. Backtrace:
#0 0x7fea4ae6ad21 in ???
#1 0x7fea4ae6b869 in ???
#2 0x7fea4ae6c54f in ???
#3 0x7fea4b0aa65d in ???
#4 0x7fea4b0b3ae7 in ???
#5 0x7fea4b0b3d0c in ???
#6 0x7fea4b0b3e67 in ???
#7 0x5607396a92e0 in ???
#8 0x5607397ae444 in ???
#0 0x7f36ac404d21 in ???
#1 0x7f36ac405869 in ???
#2 0x7f36ac40654f in ???
#3 0x7f36ac64465d in ???
#4 0x7f36ac64dae7 in ???
#5 0x7f36ac64dd0c in ???
#6 0x7f36ac64de67 in ???
#7 0x5643a1c992e0 in ???
#8 0x5643a1d9e444 in ???
#9 0x5643a1d7c947 in ???
#10 0x5643a1c6449e in ???
#0 0x7fc33f427d21 in ???
#1 0x7fc33f428869 in ???
#2 0x7fc33f42954f in ???
#3 0x7fc33f66765d in ???
#4 0x7fc33f670ae7 in ???
#5 0x7fc33f670d0c in ???
#6 0x7fc33f670e67 in ???
#7 0x5558ebeae2e0 in ???
#8 0x5558ebfb3444 in ???
#9 0x5558ebf91947 in ???
#10 0x5558ebe7949e in ???
#9 0x56073978c947 in ???
#10 0x56073967449e in ???
#11 0x7f36ac1f9082 in __libc_start_main
        at ../csu/libc-start.c:308
#12 0x5643a1c8089d in ???
#13 0xffffffffffffffff in ???
#11 0x7fea4ac5f082 in __libc_start_main
        at ../csu/libc-start.c:308
#12 0x56073969089d in ???
#13 0xffffffffffffffff in ???
#11 0x7fc33f21c082 in __libc_start_main
        at ../csu/libc-start.c:308
#12 0x5558ebe9589d in ???
#13 0xffffffffffffffff in ???
--------------------------------------------------------------------------
Primary job terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.
--------------------------------------------------------------------------
#0 0x7fb28a581d21 in ???
#1 0x7fb28a582869 in ???
#2 0x7fb28a58354f in ???
#3 0x7fb28a7c165d in ???
#4 0x7fb28a7caae7 in ???
#5 0x7fb28a7cad0c in ???
#6 0x7fb28a7cae67 in ???
#7 0x5651db2372e0 in ???
#8 0x5651db33c444 in ???
#9 0x5651db31a947 in ???
#10 0x5651db20249e in ???
#11 0x7fb28a376082 in __libc_start_main
        at ../csu/libc-start.c:308
#12 0x5651db21e89d in ???
#13 0xffffffffffffffff in ???
--------------------------------------------------------------------------
mpiexec detected that one or more processes exited with non-zero status, thus causing
the job to be terminated. The first process to do so was:

  Process name: [[17189,1],0]
  Exit code: 2
--------------------------------------------------------------------------
./Run: 47: error: not found
possible FAILURE: file 5-windows/rem.log does not exist.

_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber
Received on Mon Feb 06 2023 - 02:00:02 PST