The simulation runs fine without the restraints, on the GPU, for at least 28
ns without any problem!
I updated the Nvidia drivers to version 361.42, ran a "yum update", and
recompiled AMBER, but the problem is still there!
On Fri, Apr 29, 2016 at 1:52 PM, Charles Lin <clin92.ucsd.edu> wrote:
> Can you check if your simulation exhibits proper behavior without the
> restraints for GPU?
>
> -Charlie
> ________________________________________
> From: Domenico Marson [domenico87.gmail.com]
> Sent: Friday, April 29, 2016 2:35 AM
> To: AMBER Mailing List
> Subject: Re: [AMBER] CUDA and restraint
>
> Hello Bill, thanks for your answer!
>
> I tried on the 2 different K20c cards I've got available; the simulation
> stops at different timesteps (ntpr=1):
> GPU1: step 43
> GPU2: step 42
> GPU2.rerun: 47
>
> The driver and CUDA versions I have on the machine are:
> NVIDIA driver version: 352.79
> CUDA version: 7.5.17
>
> Other simulations run fine on the GPU, with both AMBER and LAMMPS!
>
> Regards,
> Domenico
>
> On Fri, Apr 29, 2016 at 10:28 AM, Bill Ross <ross.cgl.ucsf.edu> wrote:
>
> > Also, it would be interesting to run on other GPUs of the same or a
> > different model.
> >
> > The things that come to mind are a compiler bug, a GPU firmware bug, or
> > a flaky GPU.
> >
> > Bill
> >
> > On 4/29/16 1:25 AM, Bill Ross wrote:
> > > Does it blow up at the same place each time on the GPU? E.g. tail the
> > > .out files.
> > >
> > > Bill
> > >
> > >
> > > On 4/29/16 1:19 AM, Domenico Marson wrote:
> > >> Thank you Charles for your answer!
> > >> Unfortunately, as I stated in my previous message, when running on the
> > >> CPU for 7 ns the system seems to behave quite well; there are no
> > >> noticeable anomalies in the energies or physical variables.
> > >> I can't understand why the system explodes only on the GPU.
> > >>
> > >> Regards,
> > >> Domenico
> > >>
> > >> On Thu, Apr 28, 2016 at 6:27 PM, Charles Lin <clin92.ucsd.edu> wrote:
> > >>
> > >>> If the CPU simulations are also getting more unfavorable energies with
> > >>> each step, there may still be something undesirable in your simulation
> > >>> (box size, overlap in LJ radii, etc.) causing it to explode. My guess
> > >>> (which could also be wrong) would be to weaken the restraints you're
> > >>> using. It may be that your molecule strongly prefers being at distances
> > >>> below r1 or above r4; in that portion of the curve the restraint follows
> > >>> a linear path, so the further it deviates the higher the force and
> > >>> energy contributions will be. A much higher force constant can likely
> > >>> cascade into a quick acceleration of your system. Preferably, for
> > >>> restraints, you want the distance to stay within r1 and r4.
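
For reference, here is a minimal sketch of the flat-well form described above,
following the piecewise nmropt distance-restraint definition in the AMBER
manual (the function and variable names are only illustrative):

    def restraint_energy(r, r1, r2, r3, r4, rk2, rk3):
        # Flat well between r2 and r3, parabolic walls out to r1/r4,
        # and linear continuation beyond r1 and r4 (slopes match the parabolas).
        if r < r1:
            return 2.0 * rk2 * (r1 - r2) * (r - r1) + rk2 * (r1 - r2) ** 2
        elif r < r2:
            return rk2 * (r - r2) ** 2
        elif r <= r3:
            return 0.0
        elif r <= r4:
            return rk3 * (r - r3) ** 2
        else:
            return 2.0 * rk3 * (r4 - r3) * (r - r4) + rk3 * (r4 - r3) ** 2

Beyond r1 or r4 the restraint force stays constant (magnitude 2*rk2*(r2-r1) or
2*rk3*(r4-r3)), so a large excursion keeps adding energy linearly rather than
quadratically, which is why a strongly preferred distance outside the well
keeps pumping energy into the system.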
> > >>>
> > >>> The error itself is likely saying that something in your simulation is
> > >>> exploding and it's having issues downloading the data off GPU memory to
> > >>> CPU memory.
> > >>>
> > >>> Charlie
> > >>>
> > >>> ________________________________________
> > >>> From: Domenico Marson [domenico87.gmail.com]
> > >>> Sent: Thursday, April 28, 2016 3:31 AM
> > >>> To: amber.ambermd.org
> > >>> Subject: [AMBER] CUDA and restraint
> > >>>
> > >>> Hello everyone, I'm copying a message I sent one week ago without
> > >>> receiving any answer. I know it's a busy time of the year, with the
> > >>> upcoming release, so it probably went unnoticed!
> > >>>
> > >>> In the meantime I've run 7 ns of simulation on the CPU, and the
> > >>> trajectory seems fine both in the data and when visualizing it!
> > >>>
> > >>> Copied text:
> > >>>
> > >>> Hello everyone,
> > >>>
> > >>> I'm sorry to bother you just before the release of the "nextgen"
> > >>> Amber, but I have some trouble with pmemd.cuda.
> > >>>
> > >>> I'm trying to run a system with Cartesian restraints applied to all the
> > >>> atoms of a nanoparticle in explicit TIP3P water, while with nmropt I'm
> > >>> restraining the distance of each of 56 different atoms on the surface
> > >>> of this nanoparticle. To achieve this I restrain the distance between
> > >>> the COM of the nanoparticle and each atom on its surface.
> > >>> So, to restrain the distances I have a restraint file containing, for N
> > >>> in [1, 56] and dist1-4 varying:
> > >>> &rst
> > >>> iat= -1, N,
> > >>> r1=dist1, r2=dist2, r3=dist3, r4=dist4,
> > >>> rk2=10.00, rk3=10.00,
> > >>> ialtd=0,
> > >>> igr1=1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100,101,102,103,104,105,106,107,108,109,110,111,112,113,114,115,116,117,118,119,120,121,122,123,124,125,126,127,128,129,130,131,132,133,134,135,136,137,138,139,140,141,142,143,144,145,146,147,
> > >>> /
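
As an aside, a restraint file in that form could be generated with a short
script along these lines; this is only a sketch, and the atom indices, the COM
group (assumed here to be atoms 1-147), and the distance values are
placeholders:

    def write_restraints(dists, path="distance_restraints.rst"):
        # dists maps each surface-atom index N to its (r1, r2, r3, r4) distances.
        com_group = ",".join(str(i) for i in range(1, 148))  # atoms defining the COM (igr1)
        with open(path, "w") as f:
            for n, (d1, d2, d3, d4) in sorted(dists.items()):
                f.write(" &rst\n")
                f.write("  iat=-1,%d,\n" % n)  # -1 means "use the COM of the igr1 group"
                f.write("  r1=%.2f, r2=%.2f, r3=%.2f, r4=%.2f,\n" % (d1, d2, d3, d4))
                f.write("  rk2=10.00, rk3=10.00,\n")
                f.write("  ialtd=0,\n")
                f.write("  igr1=%s,\n" % com_group)
                f.write(" /\n")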
> > >>>
> > >>> I have AMBER patched (and recompiled) with all the patches available
> > >>> as of today, and my GPU is a Tesla K20c with driver version 352.79 and
> > >>> CUDA 7.5.
> > >>>
> > >>> I performed minimisation, heating to 300 K (20 ps), and a first
> > >>> equilibration of the density (50 ps) on the CPU without any problem.
> > >>> The energies are fine and the trajectory/behaviour also seems fine.
> > >>> Then I wanted to continue on the GPU but, no matter which combination
> > >>> of ntc=2 + ntf=1 or ntc=2 + ntf=2 I use, my simulation blows up within
> > >>> just a few steps with the error:
> > >>> "cudaMemcpy GpuBuffer::Download failed an illegal memory access was
> > >>> encountered".
> > >>> I tried to output every frame in the mdout and trajectory, but I
> > >>> can't see the reason why it's blowing up.
> > >>> I only see my nanoparticle "exploding" step by step, with the restraint
> > >>> and bond energies increasing until they reach "******" within the first
> > >>> 5-10 steps.
> > >>> Moreover, if I continue with the same settings on the CPU no problem
> > >>> arises (at least not in a reasonable time; I only have 6 cores
> > >>> available).
> > >>>
> > >>> I know many patches were released for COM restraints on the GPU; maybe
> > >>> something else is still missing? Or am I just trying to do too much?
> > >>> Thank you all for your help!
> > >>>
> > >>> Regards,
> > >>> Domenico
> > >>>
> > >>> --
> > >>> Domenico Marson, Ph.D.
> > >>> Department of Engineering and Architecture (DEA) Postdoctoral Fellow
> > >>> Molecular Simulation Engineering (MOSE) Laboratory
> > >>>
> > >>> University of Trieste
> > >>>
> > >>> Skype: domenicomars
> > >>>
> > >>
> > >
> >
> >
> >
>
>
>
> --
> *Domenico Marson, Ph.D.*
> Department of Engineering and Architecture (DEA) Postdoctoral Fellow
> Molecular Simulation Engineering (MOSE) Laboratory
>
> University of Trieste
>
> Skype: domenicomars
>
--
*Domenico Marson, Ph.D.*
Department of Engineering and Architecture (DEA) Postdoctoral Fellow
Molecular Simulation Engineering (MOSE) Laboratory
University of Trieste
Skype: domenicomars
_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber
Received on Fri Apr 29 2016 - 07:00:03 PDT