Re: [AMBER] CUDA and restraint

From: Charles Lin <clin92.ucsd.edu>
Date: Fri, 29 Apr 2016 11:52:35 +0000

Can you check whether your simulation behaves properly on the GPU without the restraints?

-Charlie
________________________________________
From: Domenico Marson [domenico87.gmail.com]
Sent: Friday, April 29, 2016 2:35 AM
To: AMBER Mailing List
Subject: Re: [AMBER] CUDA and restraint

Hello Bill, thanks for your answer!

I tried on the two different K20c cards I have available; the simulation stops
at different timesteps (ntpr=1):
GPU1: step 43
GPU2: step 42
GPU2 (rerun): step 47

The driver and CUDA versions on the machine are:
NVIDIA driver version: 352.79
CUDA version: 7.5.17

Other simulations run fine on the GPU, both with AMBER and with LAMMPS!

Regards,
Domenico

On Fri, Apr 29, 2016 at 10:28 AM, Bill Ross <ross.cgl.ucsf.edu> wrote:

> Also, it would be interesting to run on other GPUs of the same or a different
> model.
>
> The things that come to mind are a compiler bug, a GPU firmware bug, or
> a flaky GPU.
>
> Bill
>
> On 4/29/16 1:25 AM, Bill Ross wrote:
> > Does it blow up at the same place each time on the GPU? E.g., tail the
> > .out files.
> >
> > Bill
> >
> >
> > On 4/29/16 1:19 AM, Domenico Marson wrote:
> >> Thank you, Charles, for your answer!
> >> Unfortunately, as I stated in my previous message, after running on the CPU
> >> for 7 ns the system seems to behave quite well; there are no noticeable
> >> anomalies in the energies or other physical variables.
> >> I can't understand why the system explodes only on the GPU.
> >>
> >> Regards,
> >> Domenico
> >>
> >> On Thu, Apr 28, 2016 at 6:27 PM, Charles Lin <clin92.ucsd.edu> wrote:
> >>
> >>> If the CPU simulations are also showing increasingly unfavorable energies
> >>> with each step, there may still be something undesirable in your simulation
> >>> (box size, overlap in LJ radii, etc.) causing it to explode. My guess (which
> >>> could also be wrong) would be to weaken the restraints you're using. It may
> >>> be that your molecule strongly prefers distances < r1 or > r4, where that
> >>> portion of the restraint curve follows a linear path, so the further it
> >>> deviates the higher the force and energy contributions become. A much higher
> >>> force constant can then cascade into a quick acceleration of your system.
> >>> Preferably, for restraints you want the distance to stay within r1 and r4.
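> >>>
> >>> To make the r1/r4 point concrete, here is a minimal Python sketch of the
> >>> flat-well restraint energy as the Amber manual describes it (parabolic with
> >>> rk2/rk3 between r1-r2 and r3-r4, flat between r2 and r3, linear beyond r1
> >>> and r4); this is an illustration, not pmemd source code:
> >>>
> >>> def flat_well_energy(r, r1, r2, r3, r4, rk2, rk3):
> >>>     """Piecewise flat-well distance restraint energy (illustrative sketch)."""
> >>>     if r < r1:
> >>>         # linear continuation below r1, slope matched to the parabola at r1
> >>>         return 2.0 * rk2 * (r1 - r2) * (r - r1) + rk2 * (r1 - r2) ** 2
> >>>     elif r < r2:
> >>>         return rk2 * (r - r2) ** 2   # parabolic between r1 and r2
> >>>     elif r <= r3:
> >>>         return 0.0                   # flat bottom between r2 and r3
> >>>     elif r <= r4:
> >>>         return rk3 * (r - r3) ** 2   # parabolic between r3 and r4
> >>>     else:
> >>>         # linear continuation above r4: the force no longer decays with distance
> >>>         return 2.0 * rk3 * (r4 - r3) * (r - r4) + rk3 * (r4 - r3) ** 2
> >>>
> >>> Once the distance is past r4 the pulling force stays at its maximum constant
> >>> value while the energy keeps growing, which is the cascade described above.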
> >>>
> >>> The error itself likely means that something in your simulation is exploding
> >>> and the code is having issues downloading the data from GPU memory to CPU
> >>> memory.
> >>>
> >>> Charlie
> >>>
> >>> ________________________________________
> >>> From: Domenico Marson [domenico87.gmail.com]
> >>> Sent: Thursday, April 28, 2016 3:31 AM
> >>> To: amber.ambermd.org
> >>> Subject: [AMBER] CUDA and restraint
> >>>
> >>> Hello everyone, I'm copying a message I sent one week ago without receiving
> >>> any answer. I know it's a busy time of the year with the upcoming release,
> >>> so it probably went unnoticed!
> >>>
> >>> In the meantime I've run 7 ns of simulation on the CPU, and the trajectory
> >>> seems fine both in the data and on visual inspection!
> >>>
> >>> Copied text:
> >>>
> >>> Hello everyone,
> >>>
> >>> I'm sorry to bother you just before the release of the "nextgen"
> >>> Amber, but I have some trouble with pmemd.cuda.
> >>>
> >>> I'm trying to run a system with Cartesian restraints applied to all the
> >>> atoms of a nanoparticle in explicit TIP3P water, while with nmropt I'm
> >>> restraining the distance of each of 56 different atoms on the surface of
> >>> this nanoparticle. To achieve this, I restrain the distance between the COM
> >>> of the nanoparticle and each atom on its surface.
> >>> So, to restrain the distances I have a restraint file containing, for each
> >>> N in [1, 56] (with dist1-4 varying), an entry like:
> >>> &rst
> >>> iat= -1, N,
> >>> r1=dist1, r2=dist2, r3=dist3, r4=dist4,
> >>> rk2=10.00, rk3=10.00,
> >>> ialtd=0,
> >>> igr1=1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100,101,102,103,104,105,106,107,108,109,110,111,112,113,114,115,116,117,118,119,120,121,122,123,124,125,126,127,128,129,130,131,132,133,134,135,136,137,138,139,140,141,142,143,144,145,146,147,
> >>> /
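> >>>
> >>> A minimal sketch of how a file of 56 such entries could be generated (the
> >>> surface-atom indices, distances, and output file name below are placeholders,
> >>> not the actual values of this system; the COM group 1-147 matches the igr1
> >>> list above):
> >>>
> >>> # write_restraints.py -- sketch: one COM-to-atom distance restraint per
> >>> # surface atom, in the nmropt/DISANG &rst format (placeholder data)
> >>> com_group = ",".join(str(i) for i in range(1, 148))   # COM group = atoms 1..147
> >>>
> >>> # (surface_atom_index, r1, r2, r3, r4) -- placeholder values only
> >>> surface_atoms = [
> >>>     (148, 8.0, 9.0, 11.0, 12.0),
> >>>     (149, 8.5, 9.5, 11.5, 12.5),
> >>>     # ... one tuple per surface atom, 56 in total
> >>> ]
> >>>
> >>> with open("com_dist.rst", "w") as fh:
> >>>     for n, r1, r2, r3, r4 in surface_atoms:
> >>>         fh.write("&rst\n")
> >>>         fh.write(f" iat=-1, {n},\n")   # iat=-1: first member is the COM of igr1
> >>>         fh.write(f" r1={r1}, r2={r2}, r3={r3}, r4={r4},\n")
> >>>         fh.write(" rk2=10.00, rk3=10.00,\n")
> >>>         fh.write(" ialtd=0,\n")
> >>>         fh.write(f" igr1={com_group},\n")
> >>>         fh.write("/\n")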
> >>>
> >>> I have Amber patched (and recompiled) with all the patches available as of
> >>> today, and my GPU is a Tesla K20c with driver version 352.79 and CUDA 7.5.
> >>>
> >>> I performed minimisation, heating to 300 K (20 ps), and a first density
> >>> equilibration (50 ps) on the CPU without any problem. The energies are fine
> >>> and the trajectory/behaviour also seems fine.
> >>> Then I wanted to continue on the GPU but, with either ntc=2 + ntf=1 or
> >>> ntc=2 + ntf=2, my simulation blows up in just a few steps with the error:
> >>> "cudaMemcpy GpuBuffer::Download failed an illegal memory access was
> >>> encountered".
> >>> I tried to output every frame in the mdout and trajectory, but I can't see
> >>> the reason why it's blowing up. I see only my nanoparticle "exploding" step
> >>> by step, and the restraint and bond energies increasing until they reach
> >>> "******" in the first 5-10 steps.
> >>> Moreover, if I continue with the same settings on the CPU no problem arises
> >>> (at least not in a reasonable time; I have only 6 cores available).
> >>>
> >>> I know many patches were released for COM restraints on the GPU; maybe
> >>> something else is still missing? Or am I just asking for too much?
> >>> Thank you all for your help!
> >>>
> >>> Regards,
> >>> Domenico
> >>>
> >>> --
> >>> Domenico Marson, Ph.D.
> >>> Department of Engineering and Architecture (DEA) Postdoctoral Fellow
> >>> Molecular Simulation Engineering (MOSE) Laboratory
> >>>
> >>> University of Trieste
> >>>
> >>> Skype: domenicomars
> >>>
> >>
> >
>
>
>



--
*Domenico Marson, Ph.D.*
Department of Engineering and Architecture (DEA) Postdoctoral Fellow
Molecular Simulation Engineering (MOSE) Laboratory
University of Trieste
Skype: domenicomars
_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber
Received on Fri Apr 29 2016 - 05:00:03 PDT