[AMBER] Early termination of parallel MD

From: Valentina Romano <valentina.romano.unibas.ch>
Date: Mon, 30 Jun 2014 07:52:24 +0000

Dear Amber users

I want to run an MD simulation in parallel.
The job submission script is:

#!/bin/bash -l
#$ -N PknGAde_md
#$ -l membycore=1G
#$ -l runtime=50:00:00
#$ -pe ompi 32
#$ -cwd
##$ -o $HOME/queue/stdout
##$ -e $HOME/queue/stderr

module load ictce/6.2.5

export AMBERHOME=/import/bc2/home/schwede/romanov/amber12-amd

#echo "Got $NSLOTS processors."
mpirun -v -np $NSLOTS pmemd.MPI -O -i PknGAde_md.in -o PknGAde_md01.out -p ../PknGAde_params/PknGHAdeH_ion_wt.prmtop -c PknGAde_equil.rst -r PknGAde_md01.rst -x PknGAde_md01.mdcrd
mpirun -v -np $NSLOTS pmemd.MPI -O -i PknGAde_md.in -o PknGAde_md02.out -p ../PknGAde_params/PknGHAdeH_ion_wt.prmtop -c PknGAde_md01.rst -r PknGAde_md02.rst -x PknGAde_md02.mdcrd
mpirun -v -np $NSLOTS pmemd.MPI -O -i PknGAde_md.in -o PknGAde_md03.out -p ../PknGAde_params/PknGHAdeH_ion_wt.prmtop -c PknGAde_md02.rst -r PknGAde_md03.rst -x PknGAde_md03.mdcrd
mpirun -v -np $NSLOTS pmemd.MPI -O -i PknGAde_md.in -o PknGAde_md04.out -p ../PknGAde_params/PknGHAdeH_ion_wt.prmtop -c PknGAde_md03.rst -r PknGAde_md04.rst -x PknGAde_md04.mdcrd
mpirun -v -np $NSLOTS pmemd.MPI -O -i PknGAde_md.in -o PknGAde_md05.out -p ../PknGAde_params/PknGHAdeH_ion_wt.prmtop -c PknGAde_md04.rst -r PknGAde_md05.rst -x PknGAde_md05.mdcrd
mpirun -v -np $NSLOTS pmemd.MPI -O -i PknGAde_md.in -o PknGAde_md06.out -p ../PknGAde_params/PknGHAdeH_ion_wt.prmtop -c PknGAde_md05.rst -r PknGAde_md06.rst -x PknGAde_md06.mdcrd
mpirun -v -np $NSLOTS pmemd.MPI -O -i PknGAde_md.in -o PknGAde_md07.out -p ../PknGAde_params/PknGHAdeH_ion_wt.prmtop -c PknGAde_md06.rst -r PknGAde_md07.rst -x PknGAde_md07.mdcrd
mpirun -v -np $NSLOTS pmemd.MPI -O -i PknGAde_md.in -o PknGAde_md08.out -p ../PknGAde_params/PknGHAdeH_ion_wt.prmtop -c PknGAde_md07.rst -r PknGAde_md08.rst -x PknGAde_md08.mdcrd
mpirun -v -np $NSLOTS pmemd.MPI -O -i PknGAde_md.in -o PknGAde_md09.out -p ../PknGAde_params/PknGHAdeH_ion_wt.prmtop -c PknGAde_md08.rst -r PknGAde_md09.rst -x PknGAde_md09.mdcrd
mpirun -v -np $NSLOTS pmemd.MPI -O -i PknGAde_md.in -o PknGAde_md10.out -p ../PknGAde_params/PknGHAdeH_ion_wt.prmtop -c PknGAde_md09.rst -r PknGAde_md10.rst -x PknGAde_md10.mdcrd

Where PknGAde_md.in is:

  tempi=300.0, temp0=300.0,
  ntt=3, gamma_ln=1.0,
  nstlim=500000, dt=0.002,
  ntpr=500, ntwx=500, ntwr=1000

Since I want to run a 10 ns MD simulation, each run with PknGAde_md.in covers 500000 steps (dt=0.002 ps, i.e. 1 ns), and it is run 10 times.
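
As a quick check of the arithmetic, here is a minimal sketch using nstlim and dt from the input file above. The function name sim_time_ns is only illustrative, and it assumes dt is given in picoseconds, as Amber does.

```shell
#!/bin/bash
# Simulated time in ns for a chain of runs:
# nstlim steps of dt picoseconds each, repeated nruns times.
sim_time_ns() {
    local nstlim=$1 dt_ps=$2 nruns=$3
    awk -v n="$nstlim" -v dt="$dt_ps" -v r="$nruns" \
        'BEGIN { printf "%.3f\n", n * dt * r / 1000 }'
}

sim_time_ns 500000 0.002 1    # length of one segment
sim_time_ns 500000 0.002 10   # length of all ten segments together
```

With nstlim=500000 each segment is 1 ns, so ten segments give the intended 10 ns.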

When I submit the script, the first run completes fine, but the second run never starts and I do not understand why.
I did not get any error messages. It looks to me as if the input for the parallel job is not correct, so the job stops after the first run (the first 500000 steps).
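
To make a silent stop easier to spot, the ten mpirun lines could be written as a loop that checks each run's exit status. This is only a sketch, not a diagnosis: DRY_RUN and run_segment are hypothetical names, the pmemd.MPI flags are copied unchanged from the script above, and it assumes mpirun returns a non-zero status when pmemd.MPI fails. With DRY_RUN=1 (the default here) the commands are only printed.

```shell
#!/bin/bash
# Sketch: chain the ten segments in a loop and stop with a message if one fails.
DRY_RUN=${DRY_RUN:-1}      # hypothetical switch: 1 = print commands, 0 = run them
NSLOTS=${NSLOTS:-32}       # set by the queue system; 32 matches "-pe ompi 32"
PRMTOP=../PknGAde_params/PknGHAdeH_ion_wt.prmtop

run_segment() {
    local i=$1 prev cur
    cur=$(printf 'PknGAde_md%02d' "$i")
    if [ "$i" -eq 1 ]; then
        prev=PknGAde_equil                      # first segment starts from equilibration
    else
        prev=$(printf 'PknGAde_md%02d' "$((i - 1))")
    fi
    local cmd=(mpirun -v -np "$NSLOTS" pmemd.MPI -O -i PknGAde_md.in
               -o "$cur.out" -p "$PRMTOP" -c "$prev.rst" -r "$cur.rst" -x "$cur.mdcrd")
    if [ "$DRY_RUN" -eq 1 ]; then
        echo "${cmd[@]}"
    else
        "${cmd[@]}"
    fi
}

for i in $(seq 1 10); do
    if ! run_segment "$i"; then
        echo "Segment $i failed; stopping here" >&2
        exit 1
    fi
done
```

Setting DRY_RUN=0 in the job script would run the segments for real, and the message on stderr would show which segment stopped the chain.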

Any suggestions?

