Re: [AMBER] cpptraj.MPI versus cpptraj

From: Daniel Roe <>
Date: Fri, 20 Dec 2019 08:22:39 -0500


The answer is to benchmark, benchmark, benchmark.

How much speedup you can get with cpptraj.MPI depends a lot on how
many nodes you're using, what your IO and network bandwidth are,
what your underlying filesystem is, etc. On the systems I was
testing, I found that the IO became saturated at 2 processes per node
(basically using more processes on a node than there were sockets led
to less efficiency): see discussion in for more
details. Of course, that's not to say you can't use more processes per
node, just that the efficiency drops. When not using all available
cores, if you're using an OpenMP-enabled action (e.g. hbond or
rdf) you can use the remaining cores for OpenMP threads, although that
doesn't help in this particular case.
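
As a rough illustration of splitting a node between MPI processes and
OpenMP threads, here is a small Python sketch. It assumes a hybrid
MPI+OpenMP build of cpptraj (called cpptraj.MPI.OMP here, which is an
assumption about your installation) and a hypothetical machine with 16
cores per node. The helper names are made up for this example; only
OMP_NUM_THREADS is the standard OpenMP environment variable.

```python
import os


def threads_per_process(cores_per_node, procs_per_node):
    """OpenMP threads each MPI process can use without oversubscribing a node."""
    if procs_per_node < 1 or procs_per_node > cores_per_node:
        raise ValueError("procs_per_node must be between 1 and cores_per_node")
    return cores_per_node // procs_per_node


def hybrid_launch(nodes, cores_per_node, procs_per_node, input_file,
                  exe="cpptraj.MPI.OMP"):  # executable name is an assumption
    """Return (command, environment) for a hybrid MPI/OpenMP run.

    How processes are placed on nodes is launcher-specific and is not
    handled here; check your MPI's mpiexec documentation.
    """
    threads = threads_per_process(cores_per_node, procs_per_node)
    env = dict(os.environ, OMP_NUM_THREADS=str(threads))
    cmd = ["mpiexec", "-n", str(nodes * procs_per_node), exe, "-i", input_file]
    return cmd, env


if __name__ == "__main__":
    # Hypothetical layout: 2 nodes, 16 cores each, 2 MPI processes per node.
    cmd, env = hybrid_launch(2, 16, 2, "analysis.in")
    print(" ".join(cmd), "with OMP_NUM_THREADS =", env["OMP_NUM_THREADS"])
```

With 2 processes on a 16-core node, each process gets 8 threads. This
only pays off for actions that are actually OpenMP-parallelized.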

So maybe benchmark on a small subset of the trajectory (e.g. 100 frames)
using 2, 4, 6 processes, but be aware that if it's a NetCDF
trajectory, the second time you run through it will likely be faster
due to caching.
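
A minimal sketch of such a benchmark in Python, assuming mpiexec and
cpptraj.MPI are on your PATH and that a hypothetical input file
bench.in limits the trajectory to the first 100 frames (e.g. with
"trajin traj.nc 1 100"). The file name, the helper names, and the
choice of the smallest process count as the efficiency baseline are
assumptions for illustration. The untimed warm-up pass is one way to
keep caching from favoring whichever run happens to go second.

```python
import subprocess
import time


def build_command(nprocs, input_file, mpiexec="mpiexec", exe="cpptraj.MPI"):
    """Return the launch command as an argument list."""
    return [mpiexec, "-n", str(nprocs), exe, "-i", input_file]


def time_run(nprocs, input_file):
    """Run cpptraj.MPI once and return the wall-clock time in seconds."""
    start = time.perf_counter()
    subprocess.run(build_command(nprocs, input_file), check=True,
                   stdout=subprocess.DEVNULL)
    return time.perf_counter() - start


def relative_efficiency(timings):
    """Map process count -> (speedup, efficiency) relative to the smallest count.

    timings is a dict of {nprocs: seconds}.
    """
    base_n = min(timings)
    base_t = timings[base_n]
    result = {}
    for n in sorted(timings):
        speedup = base_t / timings[n]
        # Ideal speedup over the baseline is n / base_n.
        result[n] = (speedup, speedup * base_n / n)
    return result


if __name__ == "__main__":
    input_file = "bench.in"  # hypothetical 100-frame input
    time_run(2, input_file)  # untimed warm-up so later runs see the same cache state
    timings = {n: time_run(n, input_file) for n in (2, 4, 6)}
    for n, (speedup, eff) in relative_efficiency(timings).items():
        print(f"{n} procs: {timings[n]:.1f} s, "
              f"speedup {speedup:.2f}, efficiency {eff:.2f}")
```

If efficiency falls off sharply between two process counts, the lower
count is probably the better choice for the full trajectory.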

Hope this helps,


My recommendation if you don't want to benchmark is

On Thu, Dec 19, 2019 at 2:52 PM Debarati DasGupta
<> wrote:
> Dear Users,
> I have ~30000 distance-based calculations to perform using the AMBER18 cpptraj package. I am definitely using the cpptraj.MPI version, as it's multi-threaded and will be faster than a single-processor cpptraj job.
> Any idea how many cores should work best, i.e. should I choose 8 or 12? Will choosing 12 make my calculations drastically faster? Is there any route which will work better? My trajectories are approx. 2 microseconds.
> $MPI_HOME/bin/mpiexec -n 8 cpptraj.MPI -i $input
> Thanks
> _______________________________________________
> AMBER mailing list

Received on Fri Dec 20 2019 - 05:30:03 PST