Just to clarify, can you confirm the following points:
- you have two CUDA devices in your machine (both are listed by nvidia-smi)
- you run two entirely independent pmemd.cuda processes (not pmemd.cuda.MPI)
- you have tried to make sure that the two pmemd.cuda processes use
different CUDA devices (e.g. by setting the devices to process-exclusive
compute mode or by using the CUDA_VISIBLE_DEVICES environment variable)
- you have verified that the two pmemd.cuda processes actually use
different CUDA devices (via nvidia-smi and/or the mdout file; there are
also indirect indicators such as GPU temperature)
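
For reference, here is a minimal sketch of the CUDA_VISIBLE_DEVICES approach mentioned above, written as a small Python launcher. It only illustrates the idea: the helper names (per_gpu_env, launch_independent_runs, example_commands) and all input/output file names are made up for this example, and it assumes pmemd.cuda is on your PATH. Each process gets its own copy of the environment with a single GPU index exposed, so the CUDA runtime inside that process sees only that one card (as device 0).

```python
import os
import subprocess


def per_gpu_env(gpu_index, base_env=None):
    """Return a copy of the environment that exposes only one GPU.

    base_env defaults to the current process environment; the input
    mapping is never modified.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["CUDA_VISIBLE_DEVICES"] = str(gpu_index)
    return env


def launch_independent_runs(commands):
    """Start command number i with only GPU i visible to it.

    Returns the list of subprocess.Popen handles so the caller can wait
    on them or inspect their return codes.
    """
    procs = []
    for gpu_index, cmd in enumerate(commands):
        procs.append(subprocess.Popen(cmd, env=per_gpu_env(gpu_index)))
    return procs


def example_commands():
    """Two hypothetical pmemd.cuda command lines (file names are placeholders)."""
    return [
        ["pmemd.cuda", "-O", "-i", "md.in", "-p", "sys.prmtop",
         "-c", "run0.inpcrd", "-o", "run0.mdout", "-r", "run0.rst"],
        ["pmemd.cuda", "-O", "-i", "md.in", "-p", "sys.prmtop",
         "-c", "run1.inpcrd", "-o", "run1.mdout", "-r", "run1.rst"],
    ]
```

Calling launch_independent_runs(example_commands()) and then wait() on each returned handle would start both runs, one per card. While they are running, nvidia-smi should show one pmemd.cuda process on each GPU; if both show up on the same card, that alone could explain a slowdown of roughly 50%.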
Thanks,
Jan-Philip
On 01/28/2014 05:37 PM, Guanglei Cui wrote:
> Dear AMBER users,
>
> I am doing some benchmarking on a node with two M2090 cards. For my test
> system (~26K atoms, NVT), I get 36.7 ns/day (pmemd.cuda on 1 GPU) and
> 43.5 ns/day (pmemd.cuda.MPI on both GPUs). So it makes sense to run two
> separate simulations, 1 on each GPU. From what I read, amber12 GPU code
> should perform almost equally well in such situations. However, I observe a
> performance drop (almost 50%). I have limited experience with the code. I
> wonder if someone could give me some hints as to what might be causing the
> performance degradation. I don't have a lot of details on the hardware specs
> of the node, but I can ask if certain factors are more important.
>
> Thanks in advance!
>
> Best regards,
>
Received on Tue Jan 28 2014 - 09:00:03 PST