Re: [AMBER] AMBER-ORCA Interface

From: James Kress via AMBER <amber.ambermd.org>
Date: Mon, 28 Oct 2024 12:50:26 -0400

When ORCA was updated to v6, significant changes were made to the text of
the output file(s) and to the set of files it generates.

I'd suggest you post your problem on the ORCA forum

https://orcaforum.kofo.mpg.de/

Jim Kress

-----Original Message-----
From: Ramdhan,Peter A via AMBER <amber.ambermd.org>
Sent: Monday, October 28, 2024 9:52 AM
To: Martin Juhás <juhasm.faf.cuni.cz>; AMBER Mailing List
<amber.ambermd.org>
Subject: Re: [AMBER] AMBER-ORCA Interface

Hi Martin,

Thank you for the help! That did work; however, I am now running into another
issue when trying to use ORCA 6.0 coupled with AMBER 24. Here is my
input:

ASMD simulation
 &cntrl
   imin = 0, nstlim = 238, dt = 0.001,
   ntx = 1, temp0 = 310,tempi=310,
   ntt = 3, gamma_ln=5.0,
   ntc = 2, ntf = 2, ntb =1,
   ntwx = 5, ntwr = 5, ntpr = 5,
   cut = 8.0, ig=-1, ioutfm=1,
   irest = 0, jar=1,
   ifqnt=1, ! Turn on QM/MM
 /
 &qmmm
 
 qmmask=':MOL.C5,C6,C8,H8,C10,H9,S1|:CYP.SG,CB|:HEM.FE,O1,NA,NB,NC,ND,C1C,C2C,C3C,C4C,CHD,HHD,C1D,C2D,C3D,C4D,CHA,HHA,C1A,C2A,C3A,C4A,CHB,HHB,C1B,C2B,C3B,C4B,CHC,HHC', ! QM region, specifying residues 1 and 465
  qmmm_int=1, !
  qm_theory='EXTERN', !
  qmcharge=-2,
  spin=4,
  qmshake=0,
  qm_ewald = 0,
  qm_pme=0,
 /
 &orc
  method = 'b3lyp',
  basis = '6-31G**',
  num_threads=16,
  maxcore=3000,
/
 &wt type='DUMPFREQ', istep1=5 /
 &wt type='END' /
DISANG=dist.RST.dat.1
DUMPAVE=asmd_24.work.dat.1
LISTIN=POUT
LISTOUT=POUT

My slurm file:


#!/bin/bash

#SBATCH --job-name=ASMD_stage24
#SBATCH --mail-type=END,FAIL
#SBATCH --nodes=1
#SBATCH --ntasks=16
#SBATCH --cpus-per-task=1
#SBATCH --mem-per-cpu=4GB
#SBATCH --time=4-00:00:00

echo "Running orca test calculation on a with 16 CPU cores"
echo "Date = $(date)"
echo "Hostname = $(hostname -s)"

echo "Working Directory = $(pwd)"
echo ""
echo "Number of Nodes Allocated = $SLURM_JOB_NUM_NODES"
echo "Number of Tasks Allocated = $SLURM_NTASKS"
echo "Number of Cores/Task Allocated = $SLURM_CPUS_PER_TASK"
echo ""

# Load required modules
module purge
module load cuda/12.4.1 gcc/12.2.0 openmpi/4.1.6 amber/24

# Set ORCA path and ensure mpirun is accessible
export orcadir=/blue/lic/pramdhan1/orca-6.0.0
export PATH=$orcadir:$PATH
export PATH=/apps/mpi/cuda/12.4.1/gcc/12.2.0/openmpi/4.1.6/bin:$PATH
export LD_LIBRARY_PATH=/apps/mpi/cuda/12.4.1/gcc/12.2.0/openmpi/4.1.6/lib:$LD_LIBRARY_PATH

# Suppress CUDA-aware support warning in OpenMPI
export OMPI_MCA_opal_warn_on_missing_libcuda=0

# Check ORCA and mpirun paths
which mpirun
which orca

echo $PATH

echo $LD_LIBRARY_PATH

# Run Amber with sander in the local scratch directory
$AMBERHOME/bin/sander -O -i asmd_24.1.mdin -o asmd_24.1.out \
  -p "com.parm7" \
  -c "readySMD.ncrst" \
  -r asmd_24.1.ncrst -x asmd_24.1.nc \
  -ref "readySMD.ncrst" \
  -inf asmd_24.1.info

My error:


!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!! FATAL ERROR ENCOUNTERED !!!
!!! ----------------------- !!!
!!! I/O OPERATION FAILED !!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!! FATAL ERROR ENCOUNTERED !!!
!!! ----------------------- !!!
!!! I/O OPERATION FAILED !!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!! FATAL ERROR ENCOUNTERED !!!
!!! ----------------------- !!!
!!! I/O OPERATION FAILED !!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
--------------------------------------------------------------------------
Primary job terminated normally, but 1 process returned a non-zero exit
code. Per user-direction, the job has been aborted.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpirun detected that one or more processes exited with non-zero status, thus
causing the job to be terminated. The first process to do so was:

  Process name: [[36284,1],3]
  Exit code: 64
--------------------------------------------------------------------------
[file orca_tools/qcmsg.cpp, line 394]:
  .... aborting the run


It performs one calculation and then returns the above error. I'm not sure
whether this is an ORCA 6.0 issue or an mpirun issue. Other parts of a single
calculation also use mpirun and work fine there; the failure happens only
toward the end, just before a new calculation would start.




Sincerely,

Peter Ramdhan


________________________________
From: Martin Juhás <juhasm.faf.cuni.cz>
Sent: Friday, October 11, 2024 2:29 PM
To: Ramdhan,Peter A <pramdhan1.ufl.edu>; AMBER Mailing List
<amber.ambermd.org>
Subject: Re: AMBER-ORCA Interface

[External Email]
Hi, this looks like a problem with the MPI and/or UCX library. Try running
a small test job with ORCA alone, using more than one CPU, to see whether that works.
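For example, a minimal standalone parallel ORCA test could look like the
sketch below (the file name, geometry, and orca path are placeholders, not
taken from this thread):

  # test_pal.inp  (hypothetical file name)
  ! BP86 def2-SVP
  %pal nprocs 8 end
  %maxcore 2000
  * xyz 0 1
  O   0.000   0.000   0.000
  H   0.000   0.757   0.586
  H   0.000  -0.757   0.586
  *

run with the full path to the orca binary (required for ORCA's own MPI startup):

  /full/path/to/orca test_pal.inp > test_pal.out

If this already fails with the same UCX/shared-memory errors, the problem is
in the ORCA/OpenMPI/UCX stack rather than in Amber.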

Best,

Martin

Sent from Outlook for Android<https://aka.ms/AAb9ysg>
________________________________
From: Ramdhan,Peter A via AMBER <amber.ambermd.org>
Sent: Friday, October 11, 2024 3:08:47 PM
To: AMBER Mailing List <amber.ambermd.org>
Subject: [AMBER] AMBER-ORCA Interface

[EXTERNAL EMAIL]


Hi everyone,

I have a question about using QM/MM with ORCA as the external program for
AMBER. When I perform this calculation in serial it works fine; however,
when running in parallel it fails after a couple of steps. Does anyone have
experience with this?
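For context, here is a sketch of what the EXTERN interface effectively asks
ORCA to do, inferred from the &orc block below and the "mpirun -np 8" line in
the error output (not copied from the actual generated input file):

  ! BP86 SV(P)
  %pal nprocs 8 end     # from num_threads=8
  %maxcore 2000         # from maxcore=2000

So the parallelism in question is ORCA's internal MPI parallelism; sander
itself is launched as a single process in the job script below.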

Here is my mdin file:


&cntrl
 imin = 0, ! Perform MD, not minimization
 irest = 1, ! Restart simulation from previous run
 ntx = 5, ! Coordinates and velocities from the restart file
 nstlim = 100, ! Number of MD steps (100 x 0.5 fs = 0.05 ps)
 dt = 0.0005, ! Time step in picoseconds
 cut = 8.0, ! Non-bonded cutoff in angstroms

 ntr = 0, ! No positional restraints
 restraint_wt = 0.0, ! Weight of restraint (no restraints applied)

 ntb = 2, ! Constant pressure periodic boundary conditions
 ntp = 1, ! Isotropic position scaling (NPT ensemble)
 barostat = 1, ! Berendsen pressure control

 ntc = 2, ! SHAKE on bonds involving hydrogen
 ntf = 2, ! Bond interactions with hydrogens excluded

 ntt = 3, ! Langevin thermostat
 gamma_ln = 5.0, ! Collision frequency for Langevin dynamics
 tempi = 310, ! Initial temperature
 temp0 = 310, ! Target temperature

 ioutfm = 1, ! Write binary trajectory file
 ntpr = 1, ! Print energy information every step
 ntwx = 1, ! Write coordinates to trajectory file every step
 ntwr = 1, ! Write restart file every step

 ifqnt=1, ! Turn on QM/MM
/

&qmmm
 
 qmmask=':CYP.SG,CB|:HEM.FE,O1,NA,NB,NC,ND,C1C,C2C,C3C,C4C,CHD,HHD,C1D,C2D,C3D,C4D,CHA,HHA,C1A,C2A,C3A,C4A,CHB,HHB,C1B,C2B,C3B,C4B,CHC,HHC', ! QM region, specifying residues 1 and 465
 qmmm_int=1, !
 qm_theory='EXTERN', !
 qmcharge=-2,
 spin=4,
 qmshake=0,
 qm_ewald = 0,
 qm_pme=0,
/
&orc
 method = 'bp86',
 basis = 'sv(p)',
 num_threads=8,
 maxcore=2000,
/

&wt
 type='END'
&end



And here is my slurm file:


#!/bin/bash
#SBATCH --job-name=clop_qm
#SBATCH --nodes=1
#SBATCH --ntasks=8
#SBATCH --cpus-per-task=1
#SBATCH --mem-per-cpu=4GB
#SBATCH --partition=gpu
#SBATCH --gres=gpu:a100:1
#SBATCH --time=4-00:00:00
#SBATCH --output=job.%j.out
#SBATCH --error=job.%j.err

module purge
ml gcc
#ml openmpi/4.1.1
export PATH=/apps/gcc/12.2.0/openmpi/4.1.1/orca/5.0.4:$PATH
export LD_LIBRARY_PATH=/apps/gcc/12.2.0/openmpi/4.1.1/orca/5.0.4:$LD_LIBRARY_PATH
export PATH=/apps/mpi/gcc/12.2.0/openmpi/4.1.1/bin:$PATH
export LD_LIBRARY_PATH=/apps/mpi/gcc/12.2.0/openmpi/4.1.1/lib:$LD_LIBRARY_PATH
source $AMBERHOME/amber.sh

$AMBERHOME/bin/sander -O -i step8_qm.mdin -o step8_qm.out -p com.parm7 \
  -c step6.ncrst -r step7_qm.ncrst -x step7_qm.nc \
  -ref step6.ncrst -inf step7_qm.info


I am encountering this error after a couple of steps:


------------------------- --------------------
FINAL SINGLE POINT ENERGY -2762.899076990582
------------------------- --------------------

[1728651922.027067] [c0800a-s17:1471414:0] mm_posix.c:234 UCX ERROR
open(file_name=/proc/1471416/fd/71 flags=0x0) failed: No such file or
directory
[1728651922.028197] [c0800a-s17:1471414:0] mm_posix.c:234 UCX ERROR
open(file_name=/proc/1471416/fd/71 flags=0x0) failed: No such file or
directory
[1728651922.028735] [c0800a-s17:1471414:0] mm_sysv.c:59 UCX ERROR
shmat(shmid=3145743) failed: Invalid argument
[1728651922.028742] [c0800a-s17:1471414:0] mm_ep.c:189 UCX ERROR
mm ep failed to connect to remote FIFO id 0x30000f: Shared memory error
[1728651922.025391] [c0800a-s17:1471415:0] mm_posix.c:234 UCX ERROR
open(file_name=/proc/1471416/fd/71 flags=0x0) failed: No such file or
directory
[1728651922.026517] [c0800a-s17:1471415:0] mm_posix.c:234 UCX ERROR
open(file_name=/proc/1471416/fd/71 flags=0x0) failed: No such file or
directory
[1728651922.027067] [c0800a-s17:1471415:0] mm_sysv.c:59 UCX ERROR
shmat(shmid=3145743) failed: Invalid argument
[1728651922.027073] [c0800a-s17:1471415:0] mm_ep.c:189 UCX ERROR
mm ep failed to connect to remote FIFO id 0x30000f: Shared memory error
[1728651922.032593] [c0800a-s17:1471417:0] mm_posix.c:234 UCX ERROR
open(file_name=/proc/1471426/fd/71 flags=0x0) failed: No such file or
directory
[1728651922.033773] [c0800a-s17:1471417:0] mm_posix.c:234 UCX ERROR
open(file_name=/proc/1471426/fd/71 flags=0x0) failed: No such file or
directory
[1728651922.034341] [c0800a-s17:1471417:0] mm_sysv.c:59 UCX ERROR
shmat(shmid=3145746) failed: Invalid argument
[1728651922.034347] [c0800a-s17:1471417:0] mm_ep.c:189 UCX ERROR
mm ep failed to connect to remote FIFO id 0x300012: Shared memory error
[1728651922.030778] [c0800a-s17:1471419:0] mm_posix.c:234 UCX ERROR
open(file_name=/proc/1471426/fd/71 flags=0x0) failed: No such file or
directory
[1728651922.031930] [c0800a-s17:1471419:0] mm_posix.c:234 UCX ERROR
open(file_name=/proc/1471426/fd/71 flags=0x0) failed: No such file or
directory
[1728651922.032472] [c0800a-s17:1471419:0] mm_sysv.c:59 UCX ERROR
shmat(shmid=3145746) failed: Invalid argument
[1728651922.032479] [c0800a-s17:1471419:0] mm_ep.c:189 UCX ERROR
mm ep failed to connect to remote FIFO id 0x300012: Shared memory error
[1728651922.025267] [c0800a-s17:1471420:0] mm_posix.c:234 UCX ERROR
open(file_name=/proc/1471426/fd/71 flags=0x0) failed: No such file or
directory
[1728651922.026408] [c0800a-s17:1471420:0] mm_posix.c:234 UCX ERROR
open(file_name=/proc/1471426/fd/71 flags=0x0) failed: No such file or
directory
[1728651922.026952] [c0800a-s17:1471420:0] mm_sysv.c:59 UCX ERROR
shmat(shmid=3145746) failed: Invalid argument
[1728651922.026959] [c0800a-s17:1471420:0] mm_ep.c:189 UCX ERROR
mm ep failed to connect to remote FIFO id 0x300012: Shared memory error
[1728651922.032055] [c0800a-s17:1471413:0] mm_posix.c:234 UCX ERROR
open(file_name=/proc/1471416/fd/71 flags=0x0) failed: No such file or
directory
[1728651922.033186] [c0800a-s17:1471413:0] mm_posix.c:234 UCX ERROR
open(file_name=/proc/1471416/fd/71 flags=0x0) failed: No such file or
directory
[1728651922.033720] [c0800a-s17:1471413:0] mm_sysv.c:59 UCX ERROR
shmat(shmid=3145743) failed: Invalid argument
[1728651922.033727] [c0800a-s17:1471413:0] mm_ep.c:189 UCX ERROR
mm ep failed to connect to remote FIFO id 0x30000f: Shared memory error

ORCA finished by error termination in SCF gradient
Calling Command: mpirun -np 8 /apps/gcc/12.2.0/openmpi/4.1.1/orca/5.0.4/orca_scfgrad_mpi orc_job.scfgrad.inp orc_job
[file orca_tools/qcmsg.cpp, line 465]:
  .... aborting the run


Am I not allocating enough memory? According to the job file, it uses about
200-300 MB per step, and since maxcore is the memory in MB allocated per CPU
(ntasks is 8 and cpus-per-task is 1, so 8 cores in total), I figured a
maxcore of 2000 would be enough.
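
Spelling that estimate out (assuming maxcore is, as in standalone ORCA, the
per-process memory limit in MB):

  ORCA memory request : 8 processes x 2000 MB (maxcore)     = 16 GB
  SLURM allocation    : 8 tasks x 4 GB (--mem-per-cpu=4GB)  = 32 GB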

Sincerely,

Peter Ramdhan, PharmD




_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber
Received on Mon Oct 28 2024 - 10:00:02 PDT