Re: [AMBER] Quick QM/MM fails with GPUs but not with CPUs

From: Montalvillo, Fernando via AMBER <amber.ambermd.org>
Date: Wed, 6 Mar 2024 23:40:02 +0000

Hi Madu,

Thanks for your quick response. Here is what it prints when using sander.quick.cuda:

| TASK STARTS ON: Wed Mar 6 16:08:53 2024
| INPUT FILE : quick.in
| OUTPUT FILE: quick.out
| DATA FILE : quick.dat
| BASIS SET PATH: /mfs/io/groups/torabifard/Amber20-mpi-Fernando/amber22/AmberTools/src/quick/basis

|------------ GPU INFORMATION ---------------
| CUDA ENABLED DEVICE : 1
| CUDA DEVICE IN USE : 0
| CUDA DEVICE NAME : NVIDIA GeForce RTX 3090
| CUDA DEVICE PM : 82
| CUDA DEVICE CORE FREQ(GHZ) : 1.70
| CUDA DEVICE MEMORY SIZE (MB): 24259
| SUPPORTING CUDA VERSION : 8.6
|--------------------------------------------

 . Read Job And Atom

It seems I was doing something wrong before, but I cannot understand why I can no longer run sander.quick.cuda.MPI:

mpirun -np 4 sander.quick.cuda.MPI -O -i 01_Min1.in -o 01_Min1.out -p afCopAE1.prmtop -c Min0_lipids.rst -r Min1.rst -ref Min0_lipids.rst

The job enters the queue, but it never writes a quick.out, and the mdout does not print any steps (it looks like it is frozen).

However, if I run it directly, without mpirun -np 4:

sander.quick.cuda.MPI -O -i 01_Min1.in -o 01_Min1.out -p afCopAE1.prmtop -c Min0_lipids.rst -r Min1.rst -ref Min0_lipids.rst

I get this quick.out (so the executable itself should be fine, right? The problem should be my MPI?):

| TASK STARTS ON: Wed Mar 6 16:36:11 2024
| INPUT FILE : quick.in
| OUTPUT FILE: quick.out
| DATA FILE : quick.dat
| BASIS SET PATH: /mfs/io/groups/torabifard/Amber20-mpi-Fernando/amber22/AmberTools/src/quick/basis

| - MPI Enabled -
| TOTAL RANKS = 1
| MASTER NAME = compute-2-01-01

|------------ GPU INFORMATION -------------------------------
| CUDA ENABLED DEVICES : 1
|
| -- MPI RANK 0 --
| CUDA DEVICE IN USE : 0
| CUDA DEVICE NAME : NVIDIA GeForce RTX 3090
| CUDA DEVICE PM : 82
| CUDA DEVICE CORE FREQ(GHZ) : 1.70
| CUDA DEVICE MEMORY SIZE (MB): 24259
| SUPPORTING CUDA VERSION : 8.6
|------------------------------------------------------------


 . Read Job And Atom

I also tried running sander.quick.cuda and increasing the number of GPUs in the #SBATCH directives. It is able to detect 4 enabled devices, but of course it just chooses one.
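
As an extra sanity check on the launcher itself, independent of sander, I suppose I could run a trivial command under the same mpirun (assuming any standard MPI installation):

which mpirun && mpirun --version   # confirm which MPI the job picks up
mpirun -np 4 hostname              # trivial 4-process test

If that also hangs, the problem would be in my MPI setup rather than in the executable.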
________________________________
From: Manathunga Mudiyanselage, Madushanka <manathun.msu.edu>
Sent: Wednesday, March 6, 2024 3:58 PM
To: Montalvillo, Fernando <Fernando.Montalvillo.UTDallas.edu>
Cc: Goetz, Andreas <awgoetz.ucsd.edu>; AMBER Mailing List <amber.ambermd.org>
Subject: Re: [AMBER] Quick QM/MM fails with GPUs but not with CPUs

Hi,

You are launching the sander.quick.cuda.MPI executable correctly. Each MPI process should be assigned a GPU.
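
If the processes are not being mapped to separate GPUs on your cluster, one generic workaround (not something specific to QUICK; this assumes Open MPI, which exports OMPI_COMM_WORLD_LOCAL_RANK to every process) is to pin one device per rank with a small wrapper script:

#!/bin/bash
# gpu_bind.sh -- hypothetical wrapper: give each local MPI rank its own GPU
export CUDA_VISIBLE_DEVICES=${OMPI_COMM_WORLD_LOCAL_RANK}
exec "$@"

and then launch with: mpirun -np 4 ./gpu_bind.sh sander.quick.cuda.MPI -O ... (rest of the options)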

I have run sander.MPI before, so I know I am exporting all the libraries correctly, but it seems my job only runs on CPUs. Do you know what I am doing wrong?

This is odd. Can you check the quick output file and confirm this? GPU information is printed at the very beginning. You should see something similar to this: https://pastebin.com/tKMhvft0
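
For example, something like this should pull that block out of the output (assuming the file is named quick.out, as in your run):

grep -A 10 "GPU INFORMATION" quick.out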

Best regards,
Madu Manathunga

On Mar 6, 2024, at 1:45 PM, Montalvillo, Fernando <Fernando.Montalvillo.UTDallas.edu> wrote:

Andy,

Thanks for your response! Based on this information, I decided to use the LANL2DZ basis set, which has no f functions and is one of the most recommended for transition metal ions. It seems to be working so far, so again, thank you!

Also, I am not sure if this is a good question, but I don't know how to run sander.quick.cuda.MPI with GPUs; it seems I can only request CPUs. The HPC I use runs the SLURM manager.

#SBATCH --ntasks=4 # Number of mpi tasks requested
#SBATCH --gres=gpu:4 # Number of gpus requested

I also requested the maximum amount of memory on the node, just in case.

That should request 4 GPUs and 4 CPUs, and then:

mpirun -np 4 sander.quick.cuda.MPI -O .... (rest of the options)

I have run sander.MPI before, so I know I am exporting all the libraries correctly, but it seems my job only runs on CPUs. Do you know what I am doing wrong? Or could you let me know how you did it for your publication with Vinicius W. Cruzeiro et al.?

#!/bin/bash

#SBATCH --job-name=QM-LANL # Job name
#SBATCH --output=error.out
#SBATCH --nodes=1 # Total number of nodes requested
#SBATCH --ntasks=4 # Number of mpi tasks requested
#SBATCH --gres=gpu:4 # Number of gpus requested
#SBATCH -t 48:00:00 # Run time (hh:mm:ss) - 48 hours
#SBATCH --partition=torabifard
#SBATCH --mail-user=fxm200013.utdallas.edu
#SBATCH --mail-type=all

module load cuda
source /mfs/io/groups/torabifard/Amber20-mpi-Fernando/amber22/amber.sh
export PATH=/mfs/io/groups/torabifard/Amber20-mpi-Fernando/amber22_src/bin:$PATH
export LD_LIBRARY_PATH=/mfs/io/groups/torabifard/Amber20-mpi-Fernando/amber22_src/lib:$LD_LIBRARY_PATH

mpirun -np 4 sander.quick.cuda.MPI -O -i 01_Min1.in -o 01_Min1.out -p afCopAE1.prmtop -c Min0_lipids.rst -r Min1.rst -ref Min0_lipids.rst
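
In case it helps to diagnose this, I suppose I could add standard checks like these just before the mpirun line, to confirm what the allocation actually exposes (plain CUDA/SLURM checks, nothing QUICK-specific):

nvidia-smi   # should list all 4 requested GPUs
echo "CUDA_VISIBLE_DEVICES=$CUDA_VISIBLE_DEVICES"

I also understand that newer SLURM versions accept #SBATCH --gpus-per-task=1 as an explicit per-task alternative to --gres=gpu:4, though I have not tried that here.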

Best regards,
Fernando

________________________________
From: Goetz, Andreas <awgoetz.ucsd.edu>
Sent: Tuesday, March 5, 2024 3:44 PM
To: Montalvillo, Fernando <Fernando.Montalvillo.UTDallas.edu>; AMBER Mailing List <amber.ambermd.org>
Cc: Manathunga Mudiyanselage, Madushanka <manathun.msu.edu>
Subject: Re: [AMBER] Quick QM/MM fails with GPUs but not with CPUs

Hi Fernando,

The QUICK version that ships with AmberTools 23 does not support f functions. 6-31G(d), cc-pVTZ and def2-SVPD basis sets contain f functions for Cu. The error message that should be generated has inadvertently been deactivated. You would have to use a basis set that does not contain f functions.

F functions will be supported in the next releases, QUICK 24.03 and AmberTools 24: initially only for closed shells (e.g., Cu+ but not Cu2+) on GPUs, and for both closed and open shells on CPUs.
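
If you want to check whether a given basis file defines f functions before running, and assuming the QUICK basis files use the usual Gaussian-style shell labels (lines beginning with S, P, D, or F), a quick grep over the basis directory from your output should flag them:

# Hypothetical check: list basis set files that contain f shells
grep -il "^F " /mfs/io/groups/torabifard/Amber20-mpi-Fernando/amber22/AmberTools/src/quick/basis/*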

All the best,
Andy


Dr. Andreas W. Goetz
Associate Research Scientist
San Diego Supercomputer Center
Tel: +1-858-822-4771
Email: agoetz.sdsc.edu
Web: www.awgoetz.de

On Mar 5, 2024, at 8:58 PM, Montalvillo, Fernando via AMBER <amber.ambermd.org> wrote:

Hi,

I am trying to run some QM/MM calculations with Sander.quick.cuda, but the energy minimization of the QM region is being troublesome.

I did an MM minimization of the lipids and solvent molecules of my system (restraining the protein atoms and the Cu ion), because the lipids always have very bad clashes.

The next minimization is already with QM/MM. I have used multiple basis sets with HF, B3LYP, and O3LYP, and the results are as follows:


Regardless of whether I use HF or the other DFT methods, the QM energy explodes when using GPUs with higher-accuracy basis sets such as 6-31G(d), cc-pVDZ, or def2-SVPD. But if I use smaller basis sets such as 6-31G or 3-21G, it runs on GPUs and the energies look fine (the QM atoms don't seem to move during the minimization, only the MM atoms).


Using CPUs, I can run with the more accurate basis sets, but it is slow, and the QM atoms also don't seem to move during the minimization. So, when I use the restart file to continue to the next step with GPUs (maybe heating or another minimization), the energies explode again, despite using the same method and basis set that were used in the CPU minimization.

Can you point out what I am doing wrong, or what I should check? This is my first QM/MM simulation, so I don't really know what I am doing.

Thanks for your invaluable help and time.

Fernando

_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber
Received on Wed Mar 06 2024 - 16:00:02 PST