Dear AMBER users:
I have a system with ~200,000 atoms that scales quite well on 4 GPUs on a
DGX machine with Amber16. I now have access to a different node for testing
purposes that has 3 Tesla P100 GPUs. I find that 1 GPU gives 21 ns/day, 2
GPUs give 31 ns/day and 3 GPUs give 21 ns/day. The strange thing is that 2 GPUs
give a consistent speed whether I use GPUs 0,1 or 1,2 or 0,2 -- leading me to
think that there is PCI-based peer-to-peer access across all 3 GPUs (though I
don't know how to verify that). So then why does performance drop off with
3 GPUs? I don't currently have the ability to re-test with 3 GPUs on a DGX,
though I will look into testing that, since it could give a definitive answer.
I'm wondering whether there is something obviously inherent to the code
that doesn't like 3 GPUs (vs. 2 or 4)? Any thoughts?
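
To put numbers on the scaling described above, here is a minimal Python sketch, not part of Amber. It computes speedup and parallel efficiency from the three reported ns/day figures, and it also dumps the GPU interconnect matrix by running `nvidia-smi topo -m` if that tool is on the PATH. The helper names `scaling` and `gpu_topology` are made up for illustration. The topology matrix only shows how each GPU pair is connected; it is a hint about peer-to-peer support, not proof of it.

```python
import shutil
import subprocess

# Throughput reported in the post, in ns/day, keyed by number of GPUs.
NS_PER_DAY = {1: 21.0, 2: 31.0, 3: 21.0}


def scaling(ns_per_day):
    """Return {n_gpus: (speedup, efficiency)} relative to the 1-GPU run."""
    base = ns_per_day[1]
    return {
        n: (rate / base, rate / (base * n))
        for n, rate in sorted(ns_per_day.items())
    }


def gpu_topology():
    """Return the text printed by `nvidia-smi topo -m`, or None if unavailable."""
    if shutil.which("nvidia-smi") is None:
        return None
    result = subprocess.run(
        ["nvidia-smi", "topo", "-m"],
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout if result.returncode == 0 else None


if __name__ == "__main__":
    for n, (speedup, eff) in scaling(NS_PER_DAY).items():
        print(f"{n} GPU(s): speedup {speedup:.2f}x, efficiency {eff:.0%}")
    topo = gpu_topology()
    print(topo if topo is not None else "nvidia-smi not available on this machine")
```

With the figures above, 2 GPUs come out at about 1.48x (74% efficiency) and 3 GPUs at 1.00x (33% efficiency), so the third GPU adds nothing over a single-GPU run.
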
Thank you for your help,
AMBER mailing list
Received on Mon Apr 03 2017 - 20:00:03 PDT