Re: [AMBER] quick.out problem when running TI sander.quick.cuda.MPI

From: Goetz, Andreas via AMBER <amber.ambermd.org>
Date: Wed, 3 Apr 2024 03:14:59 +0000

Hi Fernando,

We have not tested QM/MM TI with sander.quick.cuda.MPI. I am not sure if it would work.

Can you please send me your input files so we can test this? We will report back to the mailing list after testing. Thanks.

All the best,
Andy


Dr. Andreas W. Goetz
Associate Research Scientist
San Diego Supercomputer Center
Tel: +1-858-822-4771
Email: agoetz.sdsc.edu
Web: www.awgoetz.de

On Apr 2, 2024, at 11:01 AM, Montalvillo, Fernando via AMBER <amber.ambermd.org> wrote:

Hi,

I am trying to run QM/MM TI with sander.quick.cuda.MPI.

I have been able to run MM TI with sander.MPI on the same setup and it runs fine. QM/MM with sander.quick.cuda.MPI also works when I run either end state on its own, but not when I try to run TI on both of them together.

I assume this is because both groups are writing to the same quick.out file. I have tried using the outprefix keyword to give each group its own output file, but both files still end up containing the same information despite having different names.
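
[For illustration, a minimal sketch of how a per-group output prefix could be set in each group's &quick section, using the outprefix keyword as named above (the exact keyword spelling should be verified against the AMBER manual); the input file names and prefixes are hypothetical:]

! In the first group's input (e.g. min_v0.in, hypothetical name)
&quick
 method = 'B3LYP',
 basis = 'A-PVDZ',
 outprefix = 'quick_v0', ! intended to send this group's QUICK output to its own file
/

! In the second group's input (e.g. min_v1.in, hypothetical name)
&quick
 method = 'B3LYP',
 basis = 'A-PVDZ',
 outprefix = 'quick_v1', ! different prefix for the second group
/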

Thanks in advance for your help.


I get this error:

Running multisander version of sander Amber22
   Total processors = 2
   Number of groups = 2

[compute-2-01-03:1576725:0:1576725] Caught signal 11 (Segmentation fault: address not mapped to object at address 0x7735c610)
==== backtrace (tid:1576725) ====
0 /opt/ohpc/pub/mpi/ucx-ohpc/1.11.2/lib/libucs.so.0(ucs_handle_error+0x254) [0x7f25fe2a9b94]
1 /opt/ohpc/pub/mpi/ucx-ohpc/1.11.2/lib/libucs.so.0(+0x27d4c) [0x7f25fe2a9d4c]
2 /opt/ohpc/pub/mpi/ucx-ohpc/1.11.2/lib/libucs.so.0(+0x27ff8) [0x7f25fe2a9ff8]
3 /lib64/libpthread.so.0(+0x12cf0) [0x7f26419b3cf0]
4 /mfs/io/groups/torabifard/Amber20-mpi-Fernando/amber22/lib/libquick_mpi_cuda.so(__quick_sad_guess_module_MOD_get_sad_density_matrix+0x1b9) [0x7f2643051db9]
5 /mfs/io/groups/torabifard/Amber20-mpi-Fernando/amber22/lib/libquick_mpi_cuda.so(initialguess_+0x67) [0x7f264315b567]
6 /mfs/io/groups/torabifard/Amber20-mpi-Fernando/amber22/lib/libquick_mpi_cuda.so(getmol_+0x12c) [0x7f264315b9ac]
7 /mfs/io/groups/torabifard/Amber20-mpi-Fernando/amber22/lib/libquick_mpi_cuda.so(+0xb33b1) [0x7f26430393b1]
8 /mfs/io/groups/torabifard/Amber20-mpi-Fernando/amber22/lib/libquick_mpi_cuda.so(__quick_api_module_MOD_get_quick_energy_gradients+0x453) [0x7f2643039f73]
9 sander.quick.cuda.MPI(__quick_module_MOD_get_quick_qmmm_forces+0x717) [0x8325fc]
10 sander.quick.cuda.MPI(qm_mm_+0x2479) [0x7c8089]
11 sander.quick.cuda.MPI(force_+0x67c) [0x61bafc]
12 sander.quick.cuda.MPI(runmin_+0x7de) [0x693ace]
13 sander.quick.cuda.MPI(sander_+0x90cb) [0x67e625]
14 sander.quick.cuda.MPI() [0x6747cf]
15 sander.quick.cuda.MPI(main+0x34) [0x6748a7]
16 /lib64/libc.so.6(__libc_start_main+0xe5) [0x7f263f4e3d85]
17 sander.quick.cuda.MPI(_start+0x2e) [0x46143e]
=================================

Program received signal SIGSEGV: Segmentation fault - invalid memory reference.

Backtrace for this error:
#0 0x7f263fcef0c0 in ???
#1 0x7f263fcee2f5 in ???
#2 0x7f26419b3cef in ???
#3 0x7f2643051db9 in ???
#4 0x7f264315b566 in ???
#5 0x7f264315b9ab in ???
#6 0x7f26430393b0 in ???
#7 0x7f2643039f72 in ???
#8 0x8325fb in ???
#9 0x7c8088 in ???
#10 0x61bafb in ???
#11 0x693acd in ???
#12 0x67e624 in ???
#13 0x6747ce in ???
#14 0x6748a6 in ???
#15 0x7f263f4e3d84 in ???
#16 0x46143d in ???
#17 0xffffffffffffffff in ???
--------------------------------------------------------------------------
Primary job terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpirun noticed that process rank 1 with PID 1576725 on node compute-2-01-03 exited on signal 11 (Segmentation fault).
--------------------------------------------------------------------------


Also, here is an example of my input file:
Minimization of everything
&cntrl
 imin=1,
 maxcyc=2000,
 ntmin=2,
 ntb=1,
 ntp=0,
 ntf=1,
 ntc=1,
 ntwe=1,
 ntpr=1,
 ntwr=1,
 cut=12.0,
 icfe = 1, ifsc = 1, clambda = 0.5, scalpha = 0.5, scbeta = 12.0,
 logdvdl = 0,
 scmask = ':657.CU',
 ifqnt = 1, ! Activate QM/MM
/
&qmmm
 iqmatoms=1220,1221,1222,1223,1224,1225,1226,1227,1228,1229,1230,1285,1286,1287,1288,1289,1290,1291,1292,1293,1294,1295,1296,4657,4658,4659,4660,4661,4682,4683,4684,4685,4686,10116 ! Selection of QM region
 qmcharge=1, ! Total charge of the QM region
 qm_ewald=0,
 qm_theory='quick', ! Specifies the QUICK API
 qmmm_int=1, ! Specifies electrostatic embedding
 qmshake=1,
 verbosity=1
/
&quick
 method = 'B3LYP', ! Method for QM calculation
 basis = 'A-PVDZ', ! Basis set for QM calculation
 debug=1,
/


And here is my Slurm submission script:
#!/bin/bash

#SBATCH --job-name=QM-Bound # Job name
#SBATCH --output=error.out
#SBATCH --nodes=1 # Total number of nodes requested
#SBATCH --ntasks=2 # Number of mpi tasks requested
#SBATCH --cpus-per-task=1
#SBATCH --mem=128G # Memory requested
#SBATCH --gres=gpu:2 # Number of gpus requested
#SBATCH -t 48:00:00 # Run time (hh:mm:ss) - 48 hours
#SBATCH --partition=torabifard

module load openmpi4
module load cuda
source /mfs/io/groups/torabifard/Amber20-mpi-Fernando/amber22/amber.sh
export PATH=/mfs/io/groups/torabifard/Amber20-mpi-Fernando/amber22_src/bin:$PATH
export LD_LIBRARY_PATH=/mfs/io/groups/torabifard/Amber20-mpi-Fernando/amber22_src/lib:$LD_LIBRARY_PATH

#mpirun -np 2 sander.MPI -ng 2 -groupfile min-lipids.group

mpirun -np 2 sander.quick.cuda.MPI -ng 2 -groupfile min-qm-mm.group
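
[For reference, a multisander groupfile such as min-qm-mm.group holds one line of sander command-line arguments per group; a minimal sketch, with purely illustrative file names for the two end states, would look like:]

-O -i min_v0.in -o min_v0.out -p v0.prmtop -c v0.rst7 -r min_v0_min.rst7
-O -i min_v1.in -o min_v1.out -p v1.prmtop -c v1.rst7 -r min_v1_min.rst7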

_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber
Received on Tue Apr 02 2024 - 20:30:01 PDT