[AMBER] Run Amber as a job array

From: Anna Sekuła <anna.sekula.ikifp.edu.pl>
Date: Thu, 23 Apr 2020 16:32:56 +0200

Is it possible to submit an Amber job on a cluster as a Slurm job array? I
tried this with MMPBSA.py.MPI and some of my jobs failed. I don't
understand why only some of them failed. Also, each time I relaunch the
jobs, the error messages are quite different. I'm using the following
script to analyse a couple of ns of my MD simulation.


My batch script:
#!/bin/bash
#SBATCH --array=48-62
#SBATCH --job-name=mmpbsa
#SBATCH --ntasks-per-node=20
#SBATCH --time=24:00:00
NUMBER=$SLURM_ARRAY_TASK_ID
mpirun -np 20 MMPBSA.py.MPI -O \
     -i  mmpbsa.in \
     -o  bss-fum-tol-${NUMBER}ns.dat \
     -sp com-bss-fum-tol-wat.prmtop \
     -cp com-bss-fum-tol.prmtop \
     -rp rec-bss-fum.prmtop \
     -lp lig-toluen.prmtop \
     -y  $SCRATCH/20.03.18-30_md/bss-fum-tol-md${NUMBER}.crd


Some (truncated) error messages:

forrtl: severe (24): end-of-file during read, unit 24, file
_MMPBSA_complex.mdcrd.19
Image              PC                Routine Line        Source
libifcore.so.5     00002AF4B5C02947  Unknown Unknown  Unknown
libifcore.so.5     00002AF4B5C3BA33  Unknown Unknown  Unknown
sander             000000000050679B  Unknown Unknown  Unknown
sander             0000000000503263  Unknown Unknown  Unknown
sander             00000000004F81D2  Unknown Unknown  Unknown
sander             000000000046A4CE  Unknown Unknown  Unknown
libc.so.6          00002AF4B6B4E3D5  Unknown Unknown  Unknown
sander             000000000046A3C9  Unknown Unknown  Unknown
...
CalcError: /net/software/local/amber/amber16/bin/sander failed with
prmtop com-bss-fum-tol.prmtop!
Error occured on rank 0.
Exiting. All files have been retained.
     self.prmtop))
CalcError: /net/software/local/amber/amber16/bin/sander failed with
prmtop com-bss-fum-tol.prmtop!
Error occured on rank 4.
Exiting. All files have been retained.
     calc.run(rank, stdout=stdout, stderr=stderr)
   File
"/net/software/local/amber/amber16/lib/python2.7/site-packages/MMPBSA_mods/calculation.py",
line 157, in run
     self.prmtop))
application called MPI_Abort(MPI_COMM_WORLD, 1) - process 0
     calc.run(rank, stdout=stdout, stderr=stderr)
   File
"/net/software/local/amber/amber16/lib/python2.7/site-packages/MMPBSA_mods/calculation.py",
line 157, in run
     self.prmtop))
application called MPI_Abort(MPI_COMM_WORLD, 1) - process 4

==============================================

Beginning PB calculations with /net/software/local/amber/amber16/bin/sander
   calculating complex contribution...
   calculating receptor contribution...
   calculating ligand contribution...
   File "/net/software/local/amber/amber16/bin/MMPBSA.py.MPI", line 108,
in <module>
     app.parse_output_files()
   File
"/net/software/local/amber/amber16/lib/python2.7/site-packages/MMPBSA_mods/main.py",
line 930, in parse_output_files
     self.using_chamber)}
   File
"/net/software/local/amber/amber16/lib/python2.7/site-packages/MMPBSA_mods/amber_outputs.py",
line 708, in __init__
     AmberOutput._read(self)
   File
"/net/software/local/amber/amber16/lib/python2.7/site-packages/MMPBSA_mods/amber_outputs.py",
line 343, in _read
     self._get_energies(output_file)
   File
"/net/software/local/amber/amber16/lib/python2.7/site-packages/MMPBSA_mods/amber_outputs.py",
line 737, in _get_energies
     self.data['VDWAALS'].append(float(words[2]))
ValueError: could not convert string to float: *************
Error occured on rank 0.
Exiting. All files have been retained.
application called MPI_Abort(MPI_COMM_WORLD, 1) - process 0

==============================================

Beginning PB calculations with /net/software/local/amber/amber16/bin/sander
   calculating complex contribution...
   calculating receptor contribution...
   calculating ligand contribution...

Timing:
Total setup time:                           0.063 min.
Creating trajectories with cpptraj:         0.154 min.
Total calculation time:                    11.878 min.

Total GB calculation time:                  0.513 min.
Total PB calculation time:                 11.253 min.

Statistics calculation & output writing:    0.000 min.
Total time taken:                          12.150 min.
   File "/net/software/local/amber/amber16/bin/MMPBSA.py.MPI", line 110,
in <module>
     app.finalize()
   File
"/net/software/local/amber/amber16/lib/python2.7/site-packages/MMPBSA_mods/main.py",
line 681, in finalize
     self.remove(self.INPUT['keep_files'])
   File
"/net/software/local/amber/amber16/lib/python2.7/site-packages/MMPBSA_mods/main.py",
line 869, in remove
     utils.remove(flag, mpi_size=self.mpi_size, fnpre=self.pre)
   File
"/net/software/local/amber/amber16/lib/python2.7/site-packages/MMPBSA_mods/utils.py",
line 123, in remove
     os.remove(fil)
OSError: [Errno 2] No such file or directory: '_MMPBSA_pb.mdin'
Error occured on rank 0.
Exiting. All files have been retained.
application called MPI_Abort(MPI_COMM_WORLD, 1) - process 0
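
One variant that may be worth testing is sketched below. It assumes (this is
not confirmed by the logs above) that the array tasks all run in the same
directory and overwrite or delete each other's _MMPBSA_* intermediate files,
which would be consistent with the missing '_MMPBSA_pb.mdin' and the
truncated '_MMPBSA_complex.mdcrd.19'. The directory name mmpbsa_task_N and
the helper task_workdir are hypothetical; the input and topology names are
the ones from the script above.

```shell
#!/bin/bash
#SBATCH --array=48-62
#SBATCH --job-name=mmpbsa
#SBATCH --ntasks-per-node=20
#SBATCH --time=24:00:00

# Print a per-task working directory for array index $1.
# The "mmpbsa_task_" naming is a hypothetical choice for this sketch.
task_workdir() {
    printf '%s/mmpbsa_task_%s\n' "${SLURM_SUBMIT_DIR:-$PWD}" "$1"
}

# Only launch the calculation when running inside a Slurm array task.
if [ -n "${SLURM_ARRAY_TASK_ID:-}" ]; then
    NUMBER=$SLURM_ARRAY_TASK_ID
    TOPDIR=${SLURM_SUBMIT_DIR:-$PWD}
    WORKDIR=$(task_workdir "$NUMBER")

    # Each task gets its own directory, so the _MMPBSA_* intermediate
    # files of different tasks cannot collide (assumed cause, see above).
    mkdir -p "$WORKDIR" && cd "$WORKDIR" || exit 1

    # Inputs are referenced by absolute path because the working
    # directory is no longer the submit directory.
    mpirun -np 20 MMPBSA.py.MPI -O \
         -i  "$TOPDIR/mmpbsa.in" \
         -o  "$TOPDIR/bss-fum-tol-${NUMBER}ns.dat" \
         -sp "$TOPDIR/com-bss-fum-tol-wat.prmtop" \
         -cp "$TOPDIR/com-bss-fum-tol.prmtop" \
         -rp "$TOPDIR/rec-bss-fum.prmtop" \
         -lp "$TOPDIR/lig-toluen.prmtop" \
         -y  "$SCRATCH/20.03.18-30_md/bss-fum-tol-md${NUMBER}.crd"
fi
```

If the installed MMPBSA.py supports the -prefix option for intermediate
file names, giving each task a distinct prefix might achieve the same
separation without changing directories; that has not been checked against
amber16 here.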

_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber
Received on Thu Apr 23 2020 - 08:00:02 PDT