Re: [AMBER] Simulations using pmemd.cuda

From: Ross Walker <ross.rosswalker.co.uk>
Date: Wed, 07 May 2014 08:44:36 -0700

Hi James,

Jason's info is good - I personally work off the memory usage rather than
messing with a hacked nvidia-smi. I would suggest, though, reading the
following pages:

http://ambermd.org/gpus/ - this is for AMBER 14 although I have not
completely finished updating it yet.

http://ambermd.org/gpus12/ - This contains an archive of the AMBER 12 info.

Take a read of them - I think they will help you understand things better.
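Working off the memory usage can be sketched like this. This is only an
illustration, not an AMBER tool: the 300 MB "busy" threshold is Jason's
heuristic from this thread, and the here-doc holds a sample of the
`nvidia-smi -a` log quoted below - on a real machine you would replace it
with live output via `LOG=$(nvidia-smi -a)`:

```shell
#!/bin/sh
# Sketch: classify each GPU as busy or idle by parsing `nvidia-smi -a`
# output. Heuristic from this thread: >300 MiB of framebuffer memory in
# use suggests a compute process (e.g. pmemd.cuda) is running on the GPU.
# Sample log below is abridged from the thread; swap in:
#   LOG=$(nvidia-smi -a)
LOG=$(cat <<'EOF'
Attached GPUs : 2
GPU 0000:01:00.0
    FB Memory Usage
        Total : 6143 MiB
        Used : 393 MiB
        Free : 5750 MiB
EOF
)
echo "$LOG" | awk '
    /^GPU /     { gpu = $2 }                     # remember the PCI bus ID
    /Used *:/   { used = $(NF-1)                 # MiB value before "MiB"
                  printf "%s: %d MiB used -> %s\n", gpu, used, (used > 300 ? "busy" : "idle") }
'
```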

Note, if you are running just single-GPU jobs you can set the following as
root:

nvidia-smi -c 3
nvidia-smi -pm 1

pmemd.cuda will then automatically pick an unused GPU each time you run
it, and will quit with an error if they are all in use. It prevents
AMBER 14 from running in parallel, though (for that you need to set
nvidia-smi -c 0 to allow peer-to-peer communication - for AMBER 12 it
makes no difference).
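As a sketch of what those two commands do: the mode numbers below follow
nvidia-smi's documented compute modes, and the AMBER implications are the
ones stated above; the helper function is purely illustrative, not part of
any AMBER or NVIDIA tool:

```shell
#!/bin/sh
# Illustrative helper: what each `nvidia-smi -c <mode>` number means for
# AMBER runs (mode names per nvidia-smi documentation).
describe_mode() {
    case "$1" in
        0) echo "DEFAULT: shared access; needed for multi-GPU (peer-to-peer) AMBER 14 runs" ;;
        1) echo "EXCLUSIVE_THREAD: one compute context per GPU" ;;
        2) echo "PROHIBITED: no compute contexts allowed" ;;
        3) echo "EXCLUSIVE_PROCESS: one process per GPU; single-GPU pmemd.cuda auto-selects a free GPU" ;;
        *) echo "unknown compute mode" ;;
    esac
}

# The single-GPU setup described above (run as root on the real machine):
#   nvidia-smi -c 3    # EXCLUSIVE_PROCESS - see describe_mode 3
#   nvidia-smi -pm 1   # persistence mode: keeps the driver loaded so the
#                      # compute-mode setting is not reset between jobs
describe_mode 3
```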

All the best
Ross


On 5/7/14, 4:34 AM, "Jason Swails" <jason.swails.gmail.com> wrote:

>
>On May 7, 2014, at 6:39 AM, James Starlight <jmsstarlight.gmail.com>
>wrote:
>
>> Also I wonder about possible ways to monitor the load on each
>>GPU
>> while performing simulations (it's strange, but the device-info script
>> found at http://ambermd.org/gpus/#Running does not detect any GPUs):
>>
>> [snip]
>> own.drunk_telecaster ~/Desktop/check_CUDA $ nvidia-smi -a
>> ==============NVSMI LOG==============
>>
>> Timestamp : Wed May 7 14:38:41 2014
>> Driver Version : 331.67
>>
>> Attached GPUs : 2
>> GPU 0000:01:00.0
>> Product Name : GeForce GTX TITAN
>> Display Mode : N/A
>> Display Active : N/A
>> Persistence Mode : Disabled
>> Accounting Mode : N/A
>> Accounting Mode Buffer Size : N/A
>> Driver Model
>> Current : N/A
>> Pending : N/A
>> Serial Number : N/A
>> GPU UUID :
>> [snip]
>> FB Memory Usage
>> Total : 6143 MiB
>> Used : 393 MiB
>> Free : 5750 MiB
>> [snip]
>> Temperature
>> Gpu : 80 C
>> [snip]
>> could someone detect anything unusual in these logs?
>
>The temperature of 80 C is typical for TITANs under load -- other than
>memory usage (>300 MB used also indicates a process is running) and
>temperature/fan speed, nvidia-smi does not print much information for
>GeForce cards. There is a "bug" in NVIDIA's NVML library that causes
>nvidia-smi to think none of the introspective properties are supported by
>these cards.
>
>You can patch NVML with a hack that tricks the library into realizing the
>properties _are_ supported via this fix:
>https://github.com/CFSworks/nvml_fix
>
>Unless you have a strong desire for more comprehensive nvidia-smi output,
>it's probably not worthwhile.
>
>HTH,
>Jason
>
>--
>Jason M. Swails
>BioMaPS,
>Rutgers University
>Postdoctoral Researcher
>
>
>_______________________________________________
>AMBER mailing list
>AMBER.ambermd.org
>http://lists.ambermd.org/mailman/listinfo/amber



Received on Wed May 07 2014 - 09:00:04 PDT