Re: [AMBER] cudaMemcpy GpuBuffer error in pmemd.cuda

From: Ross Walker <ross.rosswalker.co.uk>
Date: Thu, 20 Sep 2012 10:08:57 -0700

Hi Fabricio

Can you post your input files? I looked through the threads and couldn't
find any. Without these it is a little difficult to help. I suspect the
problem is some kind of illegal (or marginal) bonding or other subtle
problem with the initial topology file. Sander tends to be much more
tolerant of slightly dodgy topologies than the CUDA code, which also has
little to no error checking for things that should not really be allowed,
or are assumed not to occur, e.g. hydrogens with multiple bonds that
are shaken, atoms bonded to themselves, etc.
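
To make the two examples above concrete, here is a minimal sketch of that
kind of check. It is not part of AMBER and does not read a prmtop file: the
function name, the element list and the bond list are all hypothetical
inputs that would have to be extracted from the topology by other means.

```python
from collections import Counter

def find_suspect_bonds(elements, bonds):
    """Flag self-bonds and hydrogens that appear in more than one bond.

    elements: element symbol per atom (0-based index), hypothetical input.
    bonds:    iterable of (i, j) atom-index pairs, hypothetical input.
    """
    bonds = list(bonds)
    # An atom bonded to itself shows up as a pair with identical indices.
    self_bonds = [(i, j) for i, j in bonds if i == j]

    # Count how many real (non-self) bonds each atom takes part in.
    bond_count = Counter()
    for i, j in bonds:
        if i != j:
            bond_count[i] += 1
            bond_count[j] += 1

    multi_bonded_h = sorted(
        atom for atom, n in bond_count.items()
        if elements[atom] == "H" and n > 1
    )
    return self_bonds, multi_bonded_h

# Toy data (made up): a water with an extra H-H bond and a self-bonded zinc.
elements = ["O", "H", "H", "ZN"]
bonds = [(0, 1), (0, 2), (1, 2), (3, 3)]
self_bonds, multi_bonded_h = find_suspect_bonds(elements, bonds)
print("self bonds:", self_bonds)
print("hydrogens with more than one bond:", multi_bonded_h)
```

Hits from a check like this still need to be interpreted by hand. Rigid
three-point waters, for instance, normally carry an H-H bond in the
topology, so they would be flagged even when nothing is wrong.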

Did you take a look at the starting structure that gives the vlimit error?
Is there anything obviously wrong with it? This is going to take quite a
bit of debugging to figure out what is going on. I would start by looking
very carefully at what is going wrong with the initial structure.

Note, the GPU code is deterministic, so if you use the same random seed you
will get identical (forever!!!) results, which is what you see. You could
try different random seeds and see if it still blows up, but I think
the problem is much deeper rooted than a simple instability. Also note
that you will not be able to make sander match the GPU code - you can
converge it to the same ensemble but not get it to give identical
trajectories. You can make NVE match for a few thousand steps before
roundoff error causes divergence, but with anything that uses random
numbers it will be impossible, even with the same random seed, since the
actual random number generators used in sander and in pmemd.cuda are
different. Hence the same random seed will unfortunately not generate the
same sequence of random numbers.
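
The seed point can be illustrated with a small, generic sketch. Neither
generator below is the one used in sander or pmemd.cuda: Python's built-in
Mersenne Twister and a toy linear congruential generator simply stand in
for "two different random number generators", and the helper names are
made up for this example.

```python
import random

SEED = 10703  # the ig value quoted in this thread

def mt_stream(seed, n):
    """First n numbers from Python's built-in Mersenne Twister."""
    rng = random.Random(seed)
    return [rng.random() for _ in range(n)]

def lcg_stream(seed, n):
    """First n numbers from a toy linear congruential generator."""
    state = seed
    values = []
    for _ in range(n):
        state = (1103515245 * state + 12345) % 2**31
        values.append(state / 2**31)
    return values

run_a = mt_stream(SEED, 5)   # generator A, seed 10703
run_b = mt_stream(SEED, 5)   # generator A again, same seed
other = lcg_stream(SEED, 5)  # generator B, same seed

print("same generator, same seed, identical:", run_a == run_b)
print("different generator, same seed, identical:", run_a == other)
```

Rerunning one generator with the same seed reproduces the sequence exactly,
which is why a repeated pmemd.cuda run stops at the same step. Feeding the
same seed to a different generator gives an unrelated sequence.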

Post your input files and I'll try to take a look.

All the best
Ross


On 9/20/12 8:39 AM, "Fabrício Bracht" <bracht.iq.ufrj.br> wrote:

>Hello.
>I have submitted the same calculation using the same ig = 10703 twice
>with pmemd.cuda in order to see if I could reproduce the error. The
>simulation stopped at the exact same time. I used the restrt file of
>this run to submit a new calculation using sander (serial) and got the
>following error message:
>
>vlimit exceeded for step 2; vmax = 26.1201
>
> Coordinate resetting (SHAKE) cannot be accomplished,
> deviation is too large
> NITER, NIT, LL, I and J are : 0 3 2691 5317 5318
>
> Note: This is usually a symptom of some deeper
> problem with the energetics of the system.
>
>I guess that 5317 and 5318 are the atoms involved. These atoms are
>part of the water molecule that is bound to the zinc atom. Could this
>"symptom of some deeper problem with the energetics of the system"
>mean that the parameters for the active site are not ok?
>Thank you
>Fabrício
>
>2012/9/18 Fabrício Bracht <bracht.iq.ufrj.br>:
>> Hello. The simulation which ended with the error message "cudaMemcpy
>> GpuBuffer error in pmemd.cuda" has not failed with sander.MPI. I have
>> run the same simulation using the same ig = 10703. Shouldn't I expect
>> the simulation to fail also with sander.MPI at the same stage?
>> Thank you
>> Fabrício
>>
>> 2012/9/17 Fabrício Bracht <bracht.iq.ufrj.br>:
>>> The distance between the Zn and the O in the water molecule is
>>> consistent throughout the entire simulation. Though it is a bit closer
>>> than I would expect it to be, it stands at around 1.8 A on average (the
>>> reference value in the frcmod file is 2.2 A).
>>>
>>> Fabrício
>>>
>>> 2012/9/17 M. L. Dodson <mldodson.comcast.net>:
>>>> This may be wildly off base, but is the water still at a distance from
>>>> the Zn consistent with the distance you found when you parameterized
>>>> the system? What I am getting at is: has the Zn-associated water
>>>> diffused away? What is the distance in the last restart file before the
>>>> simulation stopped compared to the parameterized distance?
>>>>
>>>> Bud Dodson
>>>>
>>>> On Sep 17, 2012, at 3:46 PM, Fabrício Bracht wrote:
>>>>
>>>>> Hello. I have already had long discussions regarding this system in
>>>>> specific (http://archive.ambermd.org/201208/0174.html). The last
>>>>> discussion ended with my email:
>>>>>> I have deleted the bonds between the hydrogen atoms for this
>>>>>> particular water molecule (bound to the zinc atom). I used the
>>>>>> flexible water model and now the angle parameter for this water
>>>>>> molecule is in use. The md simulation has worked and the problem
>>>>>> seems to be solved. Is there any further advice or perhaps some
>>>>>> reading material I could use to better understand the difference
>>>>>> between using the flexible water model or not?
>>>>>> Thank you again
>>>>>> Best regards
>>>>>> Fabrício
>>>>> I have not found the link for this particular discussion yet. The
>>>>> system is composed of an enzyme with a zinc atom in its catalytic
>>>>> site. I have successfully run several hundred ns for a very
>>>>> similar system, in which a hydroxyl group is bound to the zinc atom.
>>>>> This system now has a water molecule bound to the zinc atom. In my
>>>>> last discussion I found that the only way for me to continue
>>>>> using the angle force parameters for this particular water molecule
>>>>> (which were the parameters that the MTK++ procedure gave) was to
>>>>> turn the flexible water model on. Now, the system has no apparent
>>>>> structural problems (bad geometry, collapsing hydrogen atoms, etc.),
>>>>> and yet I still get the following error when I try to run
>>>>> pmemd.cuda.
>>>>>
>>>>> cudaMemcpy GpuBuffer::Download failed unspecified launch failure
>>>>>
>>>>> The error does not, however, stop the simulation from being restarted
>>>>> from the point where it ended. I am currently running the same
>>>>> simulation under sander.MPI, but the system is quite large and I do
>>>>> not expect the results to be in until the end of this week. By changing
>>>>> the seed I can sometimes run the entire simulation without any error
>>>>> messages. But then again, sometimes it only takes a few steps for the
>>>>> simulation to stop. In previous discussions, I discovered that
>>>>> the error is precision model independent (using either SPFP, SPDP or
>>>>> DPDP). The parameters for the similar system with the hydroxyl ion are
>>>>> very similar to this one, except for some specific bond constants
>>>>> (like the ZN-HOH bond and the H --- O bond) and, of course, the
>>>>> charge distribution. But other than that, the two systems are very
>>>>> alike.
>>>>> Any help here would be appreciated
>>>>> Fabrício
>>>>>
>>>>> _______________________________________________
>>>>> AMBER mailing list
>>>>> AMBER.ambermd.org
>>>>> http://lists.ambermd.org/mailman/listinfo/amber
>>>>
>>>> --
>>>> M. L. Dodson
>>>> Business email: activesitedynamics-at-gmail-dot-com
>>>> Personal email: mldodson-at-comcast-dot-net
>>>> Gmail: mlesterdodson-at-gmail-dot-com
>>>> Phone: eight_three_two-five_63-386_one
>>>>



Received on Thu Sep 20 2012 - 10:30:03 PDT