Re: [AMBER] AMBER14 pmemd.cuda.MPI : *** in mdout files

From: Ross Walker <ross.rosswalker.co.uk>
Date: Fri, 3 Apr 2015 22:18:58 -0700

Hi Joshi,

I may have a fix for you. However, I need some key information first to confirm this is the issue I think it is. Can you send me the output of the following commands:

1) nvidia-smi

2) cat /proc/cpuinfo

3) lspci -t -v
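
If it is easier, you can capture all three in one file and attach that instead
(a quick sketch, assuming a Linux box with the NVIDIA driver installed; the
output file name is just a placeholder):

{
  nvidia-smi
  cat /proc/cpuinfo
  lspci -t -v
} > amber_diagnostics.txt 2>&1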

Thanks.

All the best
Ross

> On Mar 31, 2015, at 11:33 PM, Ross Walker <ross.rosswalker.co.uk> wrote:
>
> Hi Joshi,
>
> Can you send me the output of the following commands on your machine please:
>
> 1) nvidia-smi
>
> 2) cat /proc/cpuinfo
>
> Thanks,
>
> All the best
> Ross
>
>> On Mar 31, 2015, at 9:13 PM, Himanshu Joshi <himanshuphy87.gmail.com> wrote:
>>
>> OK Ross,
>> I am using the latest version with all the recent updates.
>> Here are the details:
>> | GPU (CUDA) Version of PMEMD in use: NVIDIA GPU IN USE.
>> |                   Version 14.0.1
>> |
>> |                   06/20/2014
>>
>> Thanks for the helpful suggestion.
>>
>> On Wed, Apr 1, 2015 at 9:09 AM, Ross Walker <ross.rosswalker.co.uk> wrote:
>>
>>> OK, this matches behavior I have seen on other Kepler-based systems
>>> (notably K80s) when running multiple GPUs. My current theory is that one of
>>> the more recent updates must have broken things. I'll investigate some more
>>> when I get time. In the meantime I suggest just running single-GPU runs.
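>>>
>>> For example, something along these lines (the device ID and file names here
>>> are just placeholders):
>>>
>>> export CUDA_VISIBLE_DEVICES=0
>>> $AMBERHOME/bin/pmemd.cuda -O -i md.in -p prmtop -c md.rst -o md.out -r md_new.rst -x md.nc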
>>>
>>> All the best
>>> Ross
>>>
>>>> On Mar 31, 2015, at 8:32 PM, Himanshu Joshi <himanshuphy87.gmail.com>
>>> wrote:
>>>>
>>>> Hello Ross,
>>>> I ran the simulation with pmemd.cuda and found it printing correctly,
>>>> so the problem lies somewhere in pmemd.cuda.MPI.
>>>>
>>>> Thanks for the help.
>>>>
>>>> On Tue, Mar 31, 2015 at 11:36 PM, Ross Walker <ross.rosswalker.co.uk>
>>> wrote:
>>>>
>>>>> Hi Joshi,
>>>>>
>>>>> If you run with just pmemd.cuda instead of cuda.MPI do you see different
>>>>> behavior?
>>>>>
>>>>> All the best
>>>>> Ross
>>>>>
>>>>>> On Mar 31, 2015, at 10:58 AM, Himanshu Joshi <himanshuphy87.gmail.com>
>>>>> wrote:
>>>>>>
>>>>>> Dear Friends,
>>>>>> I am running pmemd.cuda.MPI (AMBER14) on Tesla K40 GPU cards for an
>>>>>> explicit-water DNA system with the SPFP precision model. After running a
>>>>>> few steps successfully, I am getting **** for the temperature and energy
>>>>>> values, as shown below, followed by the md input file and a sketch of the
>>>>>> launch command.
>>>>>>
>>>>>>
>>>>>>  NSTEP =    45000   TIME(PS) =   54230.000  TEMP(K) =   300.15  PRESS =     0.0
>>>>>>  Etot   =  -1102980.4479  EKtot   =    216143.9219  EPtot      =  -1319124.3698
>>>>>>  BOND   =      8297.0959  ANGLE   =     17306.9385  DIHED      =     22929.3885
>>>>>>  1-4 NB =      8627.8031  1-4 EEL =    -71177.7576  VDWAALS    =    134764.0741
>>>>>>  EELEC  =  -1439871.9123  EHBOND  =         0.0000  RESTRAINT  =         0.0000
>>>>>> ------------------------------------------------------------------------------
>>>>>>
>>>>>> check COM velocity, temp:             NaN          NaN(Removed)
>>>>>>
>>>>>>  NSTEP =    50000   TIME(PS) =   54240.000  TEMP(K) =      NaN  PRESS =     0.0
>>>>>>  Etot   =            NaN  EKtot   =            NaN  EPtot      = **************
>>>>>>  BOND   =         0.0000  ANGLE   =   4770613.6638  DIHED      =         0.0000
>>>>>>  1-4 NB = **************  1-4 EEL = **************  VDWAALS    = **************
>>>>>>  EELEC  = **************  EHBOND  =         0.0000  RESTRAINT  =         0.0000
>>>>>> ------------------------------------------------------------------------------
>>>>>> Here is my md input file:
>>>>>>
>>>>>> Production run with constant volume
>>>>>>  &cntrl
>>>>>>   ntx=6, irest=1,
>>>>>>   nmropt=0,
>>>>>>   ntrx=1, ntxo=2,
>>>>>>   ntpr=5000, ntwx=5000, ntwr=10000,
>>>>>>   ntwv=0,
>>>>>>   ntf=2, ntb=1,
>>>>>>   cut=9.0, nsnb=10,
>>>>>>   nstlim=1000000, nscm=10000,
>>>>>>   t=0.0, dt=0.002,
>>>>>>   ntt=1, tautp=0.5,
>>>>>>   ntp=0, pres0=1.0, taup=0.5, comp=44.6,
>>>>>>   tempi=300.0, temp0=300.0,
>>>>>>   ipol=0, ntc=2, ioutfm=1,
>>>>>>  /
>>>>>>  &ewald
>>>>>>  /
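>>>>>>
>>>>>> For completeness, the job is launched along these lines (a sketch only;
>>>>>> the file names and GPU count here are placeholders):
>>>>>>
>>>>>> export CUDA_VISIBLE_DEVICES=0,1
>>>>>> mpirun -np 2 $AMBERHOME/bin/pmemd.cuda.MPI -O -i md.in -p prmtop -c md.rst -o md.out -r md_new.rst -x md.nc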
>>>>>>
>>>>>> However, I am running another NPT run, using Langevin dynamics with
>>>>>> anisotropic pressure coupling and the MC barostat, with the same
>>>>>> pmemd.cuda.MPI, and it is running fine.
>>>>>>
>>>>>> Is this a known problem, or am I missing something? Please comment.
>>>>>> Thanks in advance for your valuable time.
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> With Regards,
>>>>>> HIMANSHU JOSHI
>>>>>> Graduate Scholar, Center for Condensed Matter Theory
>>>>>> Department of Physics, IISc., Bangalore, India 560012
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>>
>>>>
>>>>
>>>> With Regards,
>>>> HIMANSHU JOSHI
>>>> Graduate Scholar, Center for Condensed Matter Theory
>>>> Department of Physics, IISc., Bangalore, India 560012
>>>
>>>
>>>
>>
>>
>>
>> --
>>
>>
>>
>> With Regards,
>> HIMANSHU JOSHI
>> Graduate Scholar, Center for Condensed Matter Theory
>> Department of Physics, IISc., Bangalore, India 560012
>


_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber
Received on Fri Apr 03 2015 - 22:30:03 PDT