Hi,
Thanks for the response, Carlos.
The GB 304-atom TRPCage GPU benchmark runs fine
on 8 nodes with 2 GPUs each once the 32x limit is removed.
For context, this is a scaling study of the underlying communication
layers, not an Amber use case.
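
For anyone reproducing this: the guard that emits the error behaves
roughly like the sketch below. This is an illustration of the pattern
only, not the actual gb_parallel.F90 source; the variable names
(atm_cnt, numtasks) are assumptions. Removing the limit just means
commenting out or relaxing this test.

  ! Illustrative sketch only -- not the actual gb_parallel.F90 code.
  ! atm_cnt = total atom count, numtasks = MPI task count (assumed names).
  if (atm_cnt < 32 * numtasks) then
    write(6, '(a)') 'Must have 32x more atoms than processors!'
    stop
  end if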
scott
On Fri, Apr 30, 2021 at 09:33:32AM -0400, Carlos Simmerling wrote:
> It seems like 304 atoms spread over 16 GPUs is not an efficient case. Is
> this something other than a single MD run? I would think the best you
> could do would be a single GPU, and even then it would not use the GPU
> efficiently; see, for example, Dave Cerutti's work running multiple MD
> runs of trp-cage on a single GPU.
> Carlos
>
> On Thu, Apr 29, 2021 at 9:27 PM Scott Brozell <sbrozell.comcast.net> wrote:
> > The GB 304-atom TRPCage GPU benchmark emits
> > 'Must have 32x more atoms than processors!'
> > when run on 8 nodes with 2 GPUs each.
> >
> > Based on the comments in gb_parallel.F90, this limit looks arbitrary.
> > Has anyone tried the benchmark after disabling this check?
> >
> >
> > Alternatively, do we have a canned protocol for creating a distant
> > dimer? This would appear to be easy work for LEaP's translate and
> > combine commands. This is for benchmarking, but I still want to be
> > cagey enough to avoid a blow-up, e.g., spacetime folding as in a binary
> > neutron star death spiral: https://apod.nasa.gov/apod/ap171016.html
> > so advice on the distance, other comments, and suggestions are welcome.
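> >
> > Something like this minimal LEaP sketch is what I have in mind; the
> > force field, file names, and the 100 Angstrom offset are placeholders,
> > not a tested recipe:
> >
> >   source leaprc.protein.ff14SB
> >   mol1 = loadPdb trpcage.pdb
> >   mol2 = copy mol1
> >   translate mol2 { 100.0 0.0 0.0 }
> >   dimer = combine { mol1 mol2 }
> >   saveAmberParm dimer dimer.prmtop dimer.inpcrd
> >   quit
> >
> > Here translate moves the duplicate along x, and combine merges the two
> > units into a single system for saveAmberParm.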
> >
> > Does that benchmark stem from
> > http://ambermd.org/tutorials/basic/tutorial3/index.php ?
_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber