Re: [AMBER] cudaMemcpy GpuBuffer error in pmemd.cuda

From: Fabrício Bracht <bracht.iq.ufrj.br>
Date: Thu, 20 Sep 2012 18:36:13 -0300

Hello Ross. I have sent an email to the amber mailing list with the
files attached. I do not know if you block emails with attached files
or not. Could you please confirm if the email arrived or not?
Thank you
Fabrício

2012/9/20 Ross Walker <ross.rosswalker.co.uk>:
> Hi Fabricio
>
> Can you post your input files? I looked through the threads and couldn't
> find any. Without these it is a little difficult to help. I suspect the
> problem is some kind of illegal (or marginal) bonding or other subtle
> problem with the initial topology file. Sander tends to be much more
> tolerant of slightly dodgy topologies than the CUDA code, it also has
> little to no error checking of things that should not really be allowed,
> or are assumed should not occur, e.g. hydrogens with multiple bonds that
> are shaken, atoms bonded to themselves, etc.
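
As a rough illustration of the kind of topology problems Ross lists (atoms bonded to themselves, hydrogens with more than one bond), here is a minimal Python sketch. It assumes the bond list and element symbols have already been extracted from the topology by some other means; it does not read a prmtop file, and the function name and toy data are hypothetical. A hit is something to inspect rather than proof of an error.

```python
from collections import Counter

def find_suspicious_bonds(elements, bonds):
    """Return (self_bonds, multiply_bonded_hydrogens).

    elements: list of element symbols, one per atom (0-based index).
    bonds:    iterable of (i, j) pairs of 0-based atom indices.
    """
    # Atoms bonded to themselves.
    self_bonds = [(i, j) for i, j in bonds if i == j]

    # Count bonds per atom, ignoring the self-bonds found above.
    degree = Counter()
    for i, j in bonds:
        if i != j:
            degree[i] += 1
            degree[j] += 1

    # Hydrogens taking part in more than one bond.
    multi_h = sorted(a for a, n in degree.items()
                     if elements[a] == "H" and n > 1)
    return self_bonds, multi_h

# Toy data (made up): a water with an extra H-H bond, plus a self-bonded zinc.
elements = ["O", "H", "H", "ZN"]
bonds = [(0, 1), (0, 2), (1, 2), (3, 3)]

self_bonds, multi_h = find_suspicious_bonds(elements, bonds)
print(self_bonds)  # [(3, 3)]
print(multi_h)     # [1, 2]
```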
>
> Did you take a look at the starting structure that gives the limit error -
> is there anything obviously wrong with it? - This is going to take quite a
> bit of debugging to figure out what is going on. I would start by trying
> to look very carefully at what is going wrong with the initial structure.
>
> Note, the GPU code is deterministic, so if you use the same random seed you
> will get identical (forever!!!) results, which is what you see. You could
> try different random seeds and see if it still blows up, but I think
> the problem is much more deeply rooted than a simple instability. Also note
> that you will not be able to make sander match the GPU code - you can
> converge it to the same ensemble but not get it to give identical
> trajectories. You can make NVE match for a few thousand steps before
> roundoff error causes divergence, but with anything that uses random
> numbers it will be impossible, even with the same random seed, since the
> actual random number generators used in sander and in pmemd.cuda are
> different. Hence the same random seed will unfortunately not generate the
> same sequence of random numbers.
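
The point about seeds can be shown with a small Python sketch. The two generators below are stand-ins chosen only for illustration (Python's built-in Mersenne Twister and a toy linear congruential generator); they are not the generators used by sander or pmemd.cuda. The same generator with the same seed repeats exactly, while two different generators given the same seed produce unrelated sequences.

```python
import random

SEED = 10703  # the ig value quoted in this thread

def lcg_stream(seed, n):
    """Toy linear congruential generator (illustrative stand-in only)."""
    state = seed
    out = []
    for _ in range(n):
        state = (1664525 * state + 1013904223) % 2**32
        out.append(state / 2**32)
    return out

def mt_stream(seed, n):
    """Python's built-in Mersenne Twister, seeded explicitly."""
    rng = random.Random(seed)
    return [rng.random() for _ in range(n)]

same_a = mt_stream(SEED, 5)
same_b = mt_stream(SEED, 5)
other = lcg_stream(SEED, 5)

print(same_a == same_b)  # True: same generator, same seed
print(same_a == other)   # False: different generator, same seed
```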
>
> Post your input files and I'll try to take a look.
>
> All the best
> Ross
>
>
> On 9/20/12 8:39 AM, "Fabrício Bracht" <bracht.iq.ufrj.br> wrote:
>
>>Hello.
>>I have submitted the same calculation using the same ig = 10703 twice
>>with pmemd.cuda in order to see if I could reproduce the error. The
>>simulation stopped at the exact same time. I used the restrt file of
>>this run to submit a new calculation using sander (serial) and got the
>>following error message:
>>
>>vlimit exceeded for step 2; vmax = 26.1201
>>
>> Coordinate resetting (SHAKE) cannot be accomplished,
>> deviation is too large
>> NITER, NIT, LL, I and J are : 0 3 2691 5317 5318
>>
>> Note: This is usually a symptom of some deeper
>> problem with the energetics of the system.
>>
>>I guess that 5317 and 5318 are the atoms involved. These atoms are
>>part of the water molecule that is bound to the zinc atom. Could this
>>"symptom of some deeper problem with the energetics of the system"
>>mean that the parameters for the active site are not ok?
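
For anyone who wants to pull the numbers out of such a message automatically, here is a minimal Python sketch that parses the line quoted above. Reading I and J as the atoms involved follows the guess made in this message; the function name is hypothetical, and the pattern is based only on the output shown here.

```python
import re

# The sander output quoted in this message.
LOG = """\
vlimit exceeded for step 2; vmax = 26.1201

 Coordinate resetting (SHAKE) cannot be accomplished,
 deviation is too large
 NITER, NIT, LL, I and J are : 0 3 2691 5317 5318
"""

PATTERN = re.compile(
    r"NITER, NIT, LL, I and J are\s*:\s*(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)"
)

def parse_shake_failure(text):
    """Return the five integers from a SHAKE failure line, or None."""
    match = PATTERN.search(text)
    if match is None:
        return None
    niter, nit, ll, i, j = map(int, match.groups())
    return {"NITER": niter, "NIT": nit, "LL": ll, "I": i, "J": j}

info = parse_shake_failure(LOG)
print(info["I"], info["J"])  # 5317 5318
```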
>>Thank you
>>Fabrício
>>
>>2012/9/18 Fabrício Bracht <bracht.iq.ufrj.br>:
>>> Hello. The simulation which ended with the error message "cudaMemcpy
>>> GpuBuffer error in pmemd.cuda" did not fail with sander.MPI. I ran
>>> the same simulation using the same ig = 10703. Shouldn't I expect
>>> the simulation to also fail with sander.MPI at the same stage?
>>> Thank you
>>> Fabrício
>>>
>>> 2012/9/17 Fabrício Bracht <bracht.iq.ufrj.br>:
>>>> The distance between the Zn and the O in the water molecule is
>>>> consistent throughout the entire simulation. Though it is a bit closer
>>>> than I would expect it to be, it stays around 1.8 A on average (the
>>>> reference value in the frcmod file is 2.2 A).
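
A comparison like this one can be sketched in a few lines of Python. The coordinates below are made up for illustration and are not taken from the actual trajectory; only the 2.2 A reference value comes from this message. In practice the Zn and O positions would come from whatever trajectory analysis tool is in use.

```python
import math

REFERENCE_ZN_O = 2.2  # Angstrom, the frcmod reference value quoted above

# Made-up (Zn, O) coordinates for three frames, in Angstrom.
frames = [
    ((0.0, 0.0, 0.0), (1.8, 0.0, 0.0)),
    ((0.0, 0.0, 0.0), (0.0, 1.7, 0.0)),
    ((0.0, 0.0, 0.0), (0.0, 0.0, 1.9)),
]

def mean_distance(frames):
    """Average Zn-O distance over a list of (zn_xyz, o_xyz) pairs."""
    distances = [math.dist(zn, o) for zn, o in frames]
    return sum(distances) / len(distances)

mean_d = mean_distance(frames)
print(f"mean Zn-O distance: {mean_d:.2f} A")
print(f"deviation from reference: {mean_d - REFERENCE_ZN_O:+.2f} A")
```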
>>>>
>>>> Fabrício
>>>>
>>>> 2012/9/17 M. L. Dodson <mldodson.comcast.net>:
>>>>> This may be wildly off base, but is the water still at a distance from
>>>>> the Zn consistent with the distance you found when you parameterized
>>>>> the system? What I am getting at is: has the Zn-associated water
>>>>> diffused away? What is the distance in the last restart file before the
>>>>> simulation stopped compared to the parameterized distance?
>>>>>
>>>>> Bud Dodson
>>>>>
>>>>> On Sep 17, 2012, at 3:46 PM, Fabrício Bracht wrote:
>>>>>
>>>>>> Hello. I have already had long discussions regarding this specific
>>>>>> system (http://archive.ambermd.org/201208/0174.html). The last
>>>>>> discussion ended with my email:
>>>>>>> I have deleted the bonds between the hydrogen atoms for this
>>>>>>> particular water molecule (bound to the zinc atom). I used the
>>>>>>> flexible water model and now the angle parameter for this water
>>>>>>> molecule is in use. The md simulation has worked and the problem
>>>>>>> seems to be solved. Is there any further advice, or perhaps some
>>>>>>> reading material I could use to better understand the difference
>>>>>>> between using the flexible water model or not?
>>>>>>> Thank you again
>>>>>>> Best regards
>>>>>>> Fabrício
>>>>>> I have not found the link for this particular discussion yet. The
>>>>>> system is composed of an enzyme with a zinc atom in its catalytic
>>>>>> site. I have successfully run several hundred ns for a very
>>>>>> similar system, in which a hydroxyl group is bound to the zinc atom.
>>>>>> This system now has a water molecule bound to the zinc atom. In my
>>>>>> last discussion I found that the only way for me to continue
>>>>>> using the angle force parameters for this particular water molecule
>>>>>> (which were the parameters that the MTK++ procedure gave) was to
>>>>>> turn the flexible water model on. Now the system has no apparent
>>>>>> structural problems (bad geometry, collapsing hydrogen atoms, etc.),
>>>>>> and yet I still get the following error when I try to run
>>>>>> pmemd.cuda:
>>>>>>
>>>>>> cudaMemcpy GpuBuffer::Download failed unspecified launch failure
>>>>>>
>>>>>> The error does not, however, stop the simulation from being restarted
>>>>>> from the point where it ended. I am currently running the same
>>>>>> simulation under sander.MPI, but the system is quite large and I do
>>>>>> not expect results until the end of this week. By changing
>>>>>> the seed I can sometimes run the entire simulation without any
>>>>>> error messages. But then again, sometimes it only takes a few steps
>>>>>> for the simulation to stop. In previous discussions, I discovered that
>>>>>> the error is independent of the precision model (SPFP, SPDP, or
>>>>>> DPDP). The parameters for the similar system with the hydroxyl ion
>>>>>> are very similar to this one, except for some specific bond constants
>>>>>> (like the ZN-HOH bond and the H --- O bond) and, of course, the
>>>>>> charge distribution. Other than that, the two systems are very
>>>>>> alike.
>>>>>> Any help here would be appreciated.
>>>>>> Fabrício
>>>>>>
>>>>>> _______________________________________________
>>>>>> AMBER mailing list
>>>>>> AMBER.ambermd.org
>>>>>> http://lists.ambermd.org/mailman/listinfo/amber
>>>>>
>>>>> --
>>>>> M. L. Dodson
>>>>> Business email: activesitedynamics-at-gmail-dot-com
>>>>> Personal email: mldodson-at-comcast-dot-net
>>>>> Gmail: mlesterdodson-at-gmail-dot-com
>>>>> Phone: eight_three_two-five_63-386_one

Received on Thu Sep 20 2012 - 15:00:03 PDT