Re: [AMBER] MMPBSA errors: "failed with prmtop" and "failed when querying netcdf trajectory"

From: Vlad Cojocaru <>
Date: Tue, 12 Nov 2013 16:41:41 +0100

Ok .. Thanks a lot ..
I got it right in the end ...

Maybe one more question .. There is a huge difference between the VIRT
and RES memory usage reported by "top" while running a single-frame MMPBSA
analysis ..

I get something like 7500 MB VIRT and 3700 MB RES ...

It's the RES value I should count, isn't it?

Thanks again

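[For reference: RES (resident set size) is the figure actually occupying physical RAM; VIRT also counts mapped-but-unused address space, so RES is the number to budget against. A minimal sketch of reading a process's own peak RSS, assuming Python on a Unix-like system (the standard-library `resource` module is Unix-only):]

```python
import resource

# Peak resident set size of the current process -- the "RES"-style
# number that corresponds to physical RAM actually in use.
# Units differ by platform: Linux reports ru_maxrss in kilobytes,
# macOS in bytes.
peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
print(f"peak resident set size: {peak}")
```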
On 11/12/2013 04:22 PM, Jason Swails wrote:
> On Tue, 2013-11-12 at 14:36 +0100, Vlad Cojocaru wrote:
>> Hi Jason,
>> I do not have much control over the cluster, as it's not a local cluster.
>> I was indeed using the 256 cores as requested (at least to my knowledge I
>> cannot do it differently on this machine) .... Well, it seems that I
>> don't fully understand how MMPBSA deals with memory ... I was
>> thinking that the memory usage per job should not change with the
>> number of cores, since the number of frames analyzed per core
>> decreases as the number of cores increases ...
> MMPBSA.py analyzes frames sequentially. If you are running in serial,
> there is never more than 1 frame being analyzed at a time (and therefore
> only one frame in memory). So regardless of how many frames are being
> analyzed, the memory consumption will not change.
> In parallel with N threads, MMPBSA.py splits up the whole trajectory
> into N equal-sized (or as close as possible) smaller trajectories which
> are each then analyzed sequentially. As a result, with N threads you
> are analyzing N frames at a time, and therefore using N times the memory
> used in serial.
> The alternative, which would give far poorer scaling, would be to
> analyze each frame using the number of requested cores, which would in
> turn depend on the parallelizability of the requested algorithm. For GB
> this is OK, but for PB it is quite limiting. The approach of
> parallelizing over frames takes advantage of the embarrassingly parallel
> property of MM/PBSA calculations and is why you can get nearly ideal
> scaling up to ca. nframes/2 processors.
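[The ca. nframes/2 figure follows directly from the chunking scheme: wall time is set by the most-loaded thread. A quick illustrative sketch (Python, made-up frame counts, hypothetical function name):]

```python
import math

def frames_on_busiest_thread(nframes, nthreads):
    """Frames handled by the most-loaded thread when a trajectory is
    split into nthreads near-equal chunks (the scheme described above)."""
    return math.ceil(nframes / nthreads)

# With 500 frames, adding threads beyond nframes/2 = 250 no longer
# shrinks the busiest thread's share below 2 frames, so speedup flattens:
for n in (100, 250, 400):
    print(n, frames_on_busiest_thread(500, n))  # -> 5, 2, 2 frames
```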
>> Obviously, my thinking is flawed as from what you are saying the memory
>> requirements increase with the number of cores ...
>> So, if I get the memory usage for a single frame on a single core, can I
>> actually calculate how much memory I need for, let's say, 10000 frames on
>> 128 cores?
>> I will do some single-core, single-frame tests now ..
> As I said above, the memory requirements depend on how many frames are
> being analyzed concurrently---not how many frames are being analyzed
> total. With 128 cores, you are analyzing 128 frames at once, so you
> have to make sure you have enough memory for that. If each node has,
> say, 32 GB of memory for 16 cores, you will need to ask for all 16
> cores, but run no more than 4 threads (which will use all 32 GB of RAM)
> on that node. [I would actually err on the side of caution and only run
> 3 threads per node to allow up to 8 GB of overrun for each thread.]
> Many queuing systems also allow memory to be requested as a resource,
> which means you can specify how much memory you want made available to
> your job per processor. Other clusters may require you to use a full
> node, so setting per-process memory limits wouldn't make as much sense.
> This is where the cluster documentation helps significantly.
> Good luck,
> Jason
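[Jason's node-budgeting rule can be written down directly. A hedged sketch (Python; the 32 GB / 16-core / ~8 GB-per-thread numbers come from the thread above, the function name is made up):]

```python
def max_threads_per_node(node_mem_gb, per_thread_gb, ncores):
    """Most MMPBSA threads one node can safely hold: each thread
    analyzes its own frame concurrently, so memory scales with the
    thread count.  per_thread_gb should be measured from a serial,
    single-frame test run (the RES figure, not VIRT)."""
    return min(ncores, int(node_mem_gb // per_thread_gb))

# Thread's example: 32 GB node, 16 cores, ~8 GB resident per thread.
print(max_threads_per_node(32, 8, 16))  # -> 4 threads per node
# Erring on the side of caution (leave ~8 GB of headroom) -> run 3.
```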

Dr. Vlad Cojocaru
Max Planck Institute for Molecular Biomedicine
Department of Cell and Developmental Biology
Röntgenstrasse 20, 48149 Münster, Germany
Tel: +49-251-70365-324; Fax: +49-251-70365-399
Email: vlad.cojocaru[at]
AMBER mailing list
Received on Tue Nov 12 2013 - 08:00:02 PST