[AMBER] quick.out problem when running TI sander.quick.cuda.MPI

From: Montalvillo, Fernando via AMBER <amber.ambermd.org>
Date: Tue, 2 Apr 2024 18:01:45 +0000

Hi,

I am trying to run QM/MM TI with sander.quick.cuda.MPI.

I have been able to run MM TI with sander.MPI on the same setup, and it runs fine. QM/MM with sander.quick.cuda.MPI also works when I run either state independently, but not when I try to run TI on both of them together.

I assume this is because both groups are writing to the same quick.out file. I have tried using the outprefix keyword to write separate output files, but even though the file names are different, both quick.out files still contain the same information.
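
When I tried the outprefix keyword, I added it to each group's &quick namelist along these lines (the prefix values here are just illustrative placeholders, not my actual names; my example input below is the version without it):

 &quick
   method = 'B3LYP',
   basis = 'A-PVDZ',
   outprefix = 'quick_v0',   ! the second group's input uses outprefix = 'quick_v1'
   debug = 1,
 /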

Thanks in advance for your help.


I get this error:

 Running multisander version of sander Amber22
    Total processors = 2
    Number of groups = 2

[compute-2-01-03:1576725:0:1576725] Caught signal 11 (Segmentation fault: address not mapped to object at address 0x7735c610)
==== backtrace (tid:1576725) ====
 0 /opt/ohpc/pub/mpi/ucx-ohpc/1.11.2/lib/libucs.so.0(ucs_handle_error+0x254) [0x7f25fe2a9b94]
 1 /opt/ohpc/pub/mpi/ucx-ohpc/1.11.2/lib/libucs.so.0(+0x27d4c) [0x7f25fe2a9d4c]
 2 /opt/ohpc/pub/mpi/ucx-ohpc/1.11.2/lib/libucs.so.0(+0x27ff8) [0x7f25fe2a9ff8]
 3 /lib64/libpthread.so.0(+0x12cf0) [0x7f26419b3cf0]
 4 /mfs/io/groups/torabifard/Amber20-mpi-Fernando/amber22/lib/libquick_mpi_cuda.so(__quick_sad_guess_module_MOD_get_sad_density_matrix+0x1b9) [0x7f2643051db9]
 5 /mfs/io/groups/torabifard/Amber20-mpi-Fernando/amber22/lib/libquick_mpi_cuda.so(initialguess_+0x67) [0x7f264315b567]
 6 /mfs/io/groups/torabifard/Amber20-mpi-Fernando/amber22/lib/libquick_mpi_cuda.so(getmol_+0x12c) [0x7f264315b9ac]
 7 /mfs/io/groups/torabifard/Amber20-mpi-Fernando/amber22/lib/libquick_mpi_cuda.so(+0xb33b1) [0x7f26430393b1]
 8 /mfs/io/groups/torabifard/Amber20-mpi-Fernando/amber22/lib/libquick_mpi_cuda.so(__quick_api_module_MOD_get_quick_energy_gradients+0x453) [0x7f2643039f73]
 9 sander.quick.cuda.MPI(__quick_module_MOD_get_quick_qmmm_forces+0x717) [0x8325fc]
10 sander.quick.cuda.MPI(qm_mm_+0x2479) [0x7c8089]
11 sander.quick.cuda.MPI(force_+0x67c) [0x61bafc]
12 sander.quick.cuda.MPI(runmin_+0x7de) [0x693ace]
13 sander.quick.cuda.MPI(sander_+0x90cb) [0x67e625]
14 sander.quick.cuda.MPI() [0x6747cf]
15 sander.quick.cuda.MPI(main+0x34) [0x6748a7]
16 /lib64/libc.so.6(__libc_start_main+0xe5) [0x7f263f4e3d85]
17 sander.quick.cuda.MPI(_start+0x2e) [0x46143e]
=================================

Program received signal SIGSEGV: Segmentation fault - invalid memory reference.

Backtrace for this error:
#0 0x7f263fcef0c0 in ???
#1 0x7f263fcee2f5 in ???
#2 0x7f26419b3cef in ???
#3 0x7f2643051db9 in ???
#4 0x7f264315b566 in ???
#5 0x7f264315b9ab in ???
#6 0x7f26430393b0 in ???
#7 0x7f2643039f72 in ???
#8 0x8325fb in ???
#9 0x7c8088 in ???
#10 0x61bafb in ???
#11 0x693acd in ???
#12 0x67e624 in ???
#13 0x6747ce in ???
#14 0x6748a6 in ???
#15 0x7f263f4e3d84 in ???
#16 0x46143d in ???
#17 0xffffffffffffffff in ???
--------------------------------------------------------------------------
Primary job terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpirun noticed that process rank 1 with PID 1576725 on node compute-2-01-03 exited on signal 11 (Segmentation fault).
--------------------------------------------------------------------------


Also, here is an example of my input file:
Minimization of everything
 &cntrl
  imin=1,
  maxcyc=2000,
  ntmin=2,
  ntb=1,
  ntp=0,
  ntf=1,
  ntc=1,
  ntwe=1,
  ntpr=1,
  ntwr=1,
  cut=12.0,
  icfe = 1, ifsc = 1, clambda = 0.5, scalpha = 0.5, scbeta = 12.0,
  logdvdl = 0,
  scmask = ':657.CU',
  ifqnt = 1, ! Activate QM/MM
/
&qmmm
  iqmatoms=1220,1221,1222,1223,1224,1225,1226,1227,1228,1229,1230,1285,1286,1287,1288,1289,1290,1291,1292,1293,1294,1295,1296,4657,4658,4659,4660,4661,4682,4683,4684,4685,4686,10116 ! Selection of QM region
  qmcharge=1, ! Total charge of the QM region
  qm_ewald=0,
  qm_theory='quick', ! Specifies the QUICK API
  qmmm_int=1, ! Specifies electrostatic embedding
  qmshake=1,
  verbosity=1
/
&quick
  method = 'B3LYP', ! Method for QM calculation
  basis = 'A-PVDZ', ! Basis set for QM calculation
  debug=1,
/


And here is my slurm file:
#!/bin/bash

#SBATCH --job-name=QM-Bound # Job name
#SBATCH --output=error.out
#SBATCH --nodes=1 # Total number of nodes requested
#SBATCH --ntasks=2 # Number of mpi tasks requested
#SBATCH --cpus-per-task=1
#SBATCH --mem=128G # Memory requested
#SBATCH --gres=gpu:2 # Number of gpus requested
#SBATCH -t 48:00:00 # Run time (hh:mm:ss) - 48 hours
#SBATCH --partition=torabifard

module load openmpi4
module load cuda
source /mfs/io/groups/torabifard/Amber20-mpi-Fernando/amber22/amber.sh
export PATH=/mfs/io/groups/torabifard/Amber20-mpi-Fernando/amber22_src/bin:$PATH
export LD_LIBRARY_PATH=/mfs/io/groups/torabifard/Amber20-mpi-Fernando/amber22_src/lib:$LD_LIBRARY_PATH

#mpirun -np 2 sander.MPI -ng 2 -groupfile min-lipids.group

mpirun -np 2 sander.quick.cuda.MPI -ng 2 -groupfile min-qm-mm.group
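
For reference, min-qm-mm.group has one line of sander command-line options per group, roughly like this (file names are illustrative placeholders, not my actual names):

-O -i min_v0.in -o min_v0.out -p v0.prmtop -c v0.rst7 -r min_v0.rst7
-O -i min_v1.in -o min_v1.out -p v1.prmtop -c v1.rst7 -r min_v1.rst7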

_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber