Re: [AMBER] MMPBSA.py.MPI and MMPBSA.py

From: Jason Swails <jason.swails.gmail.com>
Date: Wed, 20 Jun 2012 15:53:32 -0400

PB calculations take significantly more memory than GB calculations. Did you
run a single frame somewhere to look at the memory requirement of the PB
calculation?
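If not, that is an easy test to set up: the &general section of mmpbsa.in
lets you restrict the run to a single frame. A rough sketch (I have not seen
your actual input file, so keep your own &gb/&pb sections below it):

&general
   startframe=1, endframe=1, interval=1,
/

Running that once should tell you roughly what one PB frame costs in memory.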

I would also suggest splitting your job in two (separate GB and PB
calculations), so you can at least get the GB results. The amount of memory
required depends on how large your system is. Normal mode calculations
require the most memory, PB requires somewhat less (but still a lot for
large systems and fine grids), and GB requires the least.

As far as I know there is no setting in Amber to explicitly limit memory use
(the PB developers may correct me here if I'm wrong), so you will need to do
a little investigating on your own. I believe all of the output files record
how much space is allocated for the calculation, so you should be able to
determine the memory requirement from that.
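If you want a number before submitting a full job, you can also watch the
process directly. Something like the line below (a sketch -- it assumes GNU
time is installed as /usr/bin/time on your cluster, and mmpbsa_1frame.in /
test_memory.dat are placeholder names for a one-frame input and its output)
will print a "Maximum resident set size" line when the run finishes:

/usr/bin/time -v $AMBERHOME/bin/MMPBSA.py -O -i mmpbsa_1frame.in \
    -o test_memory.dat -cp M11-L112A.gas.prmtop -rp M11.gas.prmtop \
    -lp L112A.gas.prmtop -y *.mdcrd

Compare that peak against the 6GB per core your administrator quoted, and
keep in mind that with MMPBSA.py.MPI each MPI process runs its own PB
calculation, so the per-node requirement is roughly that number times ppn.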

HTH,
Jason

On Wed, Jun 20, 2012 at 3:24 PM, Delwar Hossain <hossaind2004.yahoo.com> wrote:

> Hi Amber users,
> I tried to use the following scripts to calculate the delta G value. Some
> jobs finished successfully, but the system administrator just deleted my
> remaining jobs. I have included their mail, which explains why they killed
> my jobs. Could you please help me figure out which parameters I need to
> change in my scripts?
> Script:
> #!/bin/bash
> #PBS -q default
> #PBS -N Amber
> #PBS -j oe
> #PBS -l nodes=1:ppn=8
> #PBS -l walltime=500:00:00
> #PBS -l pmem=1GB
> #PBS -M
> #PBS -m abe
> #
> export AMBERHOME=/usr/local/amber12-pgi
> #
> cd /chpchome/dhossain/SHARED/DHPROJECT/AMBER/L112A-GBimp/TEST
> #
> $AMBERHOME/bin/MMPBSA.py -O -i mmpbsa.in -o FINAL_RESULTS_MMPBSA.dat \
>   -cp M11-L112A.gas.prmtop -rp M11.gas.prmtop -lp L112A.gas.prmtop -y *.mdcrd
>
> The administrator's explanation:
> Hi Delwar,
> Did you change any of the parameters in Amber before resubmitting your
> jobs? I will need to kill job 6720 running on node cluster3-95 as it is
> running like the ones we saw this morning.
>
> Please understand that we cannot let the systems be affected or fail due
> to erroneous jobs. I did kindly inform you that your program was
> unfortunately killed to protect our computing resources, and I asked you
> to change the Amber parameters so that your simulation fits our computing
> resources.
>
> In this case the jobs needed more memory than the compute nodes physically
> have, and that produced swapping. This is not a parameter in the queue
> submission script, but one within Amber. We are not experts in Amber. It
> may be quicker, and you will get better support for this specific issue,
> if you contact Amber support and ask which parameter to change to reduce
> the amount of memory so that your simulation can run on our cluster.
>
> FYI: Each compute node of cluster3 has 48GB of physical memory shared by
> all eight processors (6GB per processor), and we need to reserve at least
> several GB for the system.
>
> As you know, we are trying to do our best to provide as much computing
> resource as possible.
>
> Secondly, when I used the parallel version with the following script, I
> got the following error message:
> #!/bin/sh
> #PBS -N rasraf_parallel
> #PBS -o parallel.out
> #PBS -e parallel.err
> #PBS -m abe
> #PBS -M
> #PBS -q default
> #PBS -l nodes=4:ppn=8
> #PBS -l pmem=2GB
> module load amber12-gcc
> SANDER=MMPBSA.py.MPI
> MDIN=mmpbsa.in
> OUTPUT=FINAL_RESULTS_MMPBSA.dat
> CPTOP=M11-bec.gas.prmtop
> RPTOP=M11.gas.prmtop
> LPTOP=beclin.gas.prmtop
> MDCRD=M11-bec_md1.mdcrd
> PROG=progress.log
> #
> cd $PBS_O_WORKDIR
> #
> #
> export NUM_PROCS=`cat $PBS_NODEFILE | wc -l`
> #
> mpirun --mca mpi_paffinity_alone 1 -np $NUM_PROCS -machinefile $PBS_NODEFILE \
>   -x MX_RCACHE=0 --mca pml cm $SANDER -i $MDIN -o $OUTPUT \
>   -cp $CPTOP -rp $RPTOP -lp $LPTOP -y $MDCRD > $PROG 2>&1
> #
>
> progress.log
> Running MMPBSA.MPI on 32 processors
> Reading command-line arguments and input files...
> Loading and checking parameter files for compatibility...
> mmpbsa_py_energy found! Using /usr/local/amber12-gcc/bin/mmpbsa_py_energy
> cpptraj found! Using /usr/local/amber12-gcc/bin/cpptraj
> ptraj found! Using /usr/local/amber12-gcc/bin/ptraj
> Preparing trajectories for simulation...
> 50 frames were processed by cpptraj for use in calculation.
> Beginning GB calculations with /usr/local/amber12-gcc/bin/mmpbsa_py_energy
> calculating complex contribution...
> calculating receptor contribution...
> calculating ligand contribution...
> Beginning PB calculations with /usr/local/amber12-gcc/bin/mmpbsa_py_energy
> calculating complex contribution...
> CalcError: /usr/local/amber12-gcc/bin/mmpbsa_py_energy failed with prmtop
> M11-bec.gas.prmtop!
> Error occured on rank 1.
> Exiting. All files have been retained.
> --------------------------------------------------------------------------
> MPI_ABORT was invoked on rank 1 in communicator MPI_COMM_WORLD
> with errorcode 1.
> NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
> You may or may not see output from other processes, depending on
> exactly when Open MPI kills them.
> --------------------------------------------------------------------------
> --------------------------------------------------------------------------
> mpirun has exited due to process rank 1 with PID 25991 on
> node cluster3-126.chpc.ndsu.nodak.edu exiting without calling "finalize".
> This may
> have caused other processes in the application to be
> terminated by signals sent by mpirun (as reported here).
> --------------------------------------------------------------------------
> Thank you.
> With best regards
> Delwar



-- 
Jason M. Swails
Quantum Theory Project,
University of Florida
Ph.D. Candidate
352-392-4032
_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber
Received on Wed Jun 20 2012 - 13:00:04 PDT