Re: [AMBER] RTX 2080 Super GPU Random Memory Errors from Stephan Schott on 2020-03-21 (Amber Archive Mar 2020)

From: Stephan Schott <schottve.hhu.de>
Date: Sun, 22 Mar 2020 00:44:27 +0100

Hi George,
Indeed, those errors are not very informative, but they are also fairly
common with membrane systems. Maybe David can find something wrong there,
but one tip that usually helps is to run the minimization with the CPU code
rather than the GPU code; sometimes, and rather randomly, an atom can get
"lost". Increasing the number of atoms included in the nonbonded pairlist
with skinnb during the first equilibration steps also helps in some cases.
For that, you can add something like this at the end of your input file
(the default is 2 Å):
&ewald
skinnb = 5
&end
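
As a rough sketch of how the two tips could look in practice: the first
input is a minimization meant to be run with the CPU executable (pmemd or
sander) instead of pmemd.cuda, and the second is a first equilibration
stage with the wider pairlist skin. The cycle counts, cutoff, temperature,
step count and seed are placeholder values I am assuming for illustration,
not taken from your setup, so adapt them to your own protocol. A "/" closes
a namelist in the same way as &end.

```
Minimization, run with the CPU code (illustrative values)
 &cntrl
  imin = 1,        ! energy minimization
  maxcyc = 10000,  ! total number of minimization cycles
  ncyc = 5000,     ! steepest descent first, then conjugate gradient
  ntb = 1,         ! constant-volume periodic boundaries
  cut = 10.0,      ! nonbonded cutoff in Angstrom
  ntpr = 100,      ! print energies every 100 cycles
 /
```

```
First equilibration stage with a larger skinnb (illustrative values)
 &cntrl
  imin = 0, irest = 0, ntx = 1,  ! new MD run starting from coordinates only
  nstlim = 2500, dt = 0.002,     ! short run with a 2 fs time step
  ntc = 2, ntf = 2,              ! SHAKE on bonds involving hydrogen
  ntt = 3, gamma_ln = 1.0,       ! Langevin thermostat
  tempi = 0.0, temp0 = 100.0,    ! low target temperature for the first stage
  ntb = 1, cut = 10.0,
  ig = 12345,                    ! fixed seed; ig = -1 seeds from the wall clock
  ntpr = 500, ntwx = 500,
 /
 &ewald
  skinnb = 5,                    ! pairlist skin in Angstrom (default is 2)
 /
```

Once the first stages run cleanly, skinnb can go back to its default for
the later stages, since a larger skin makes the pairlist more expensive.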

On Sat, 21 Mar 2020 at 23:58, David Cerutti (<dscerutti.gmail.com>)
wrote:

> If your random seed is set to -1, that is one possible source of the
> randomness (bases the PRNG on wall clock time). But I suspect that there
> is something else amiss with your system. Perhaps a strained bond or clash
> that is giving SHAKE problems. Can you reply (just to me) with your
> topology and inpcrd?
>
> Dave
>
>
> On Sat, Mar 21, 2020 at 3:43 PM Giorgos Lambrinidis <
> lambrinidis.pharm.uoa.gr> wrote:
>
> > Dear Amber Users
> >
> > I am facing a strange problem, regarding MSI RTX 2080 Super GPU.
> >
> > I am working on a transmembrane GPCR protein with 69385 atoms including
> > the Lipids and water molecules. I have created the system using the
> > Amber Tutorial 16 for Lipid14 ForceField.
> >
> > I run the equilibration protocol + the production simulation on a
> > computer with the following characteristics:
> >
> > AMD Ryzen 7 2700 Eight-Core Processor, 24GB RAM, GeForce GTX 1060 with
> > 6GB, nvidia driver 418.43, GNU compilers and cuda 10.1. I am using
> > Amber 18 with Ambertools19.
> >
> > A few days ago, I bought a new GPU, an MSI RTX 2080 Super 8GB, and
> > installed it in the following system:
> >
> > Intel(R) Core(TM) i7-3770 CPU @ 3.40GHz, 16GB RAM, with nvidia driver
> > 435.21, GNU compilers and cuda 10.1. I am using Amber 18 with
> > Ambertools19.
> >
> > When I run the same GPCR protein with 69385 atoms on the new GPU, I
> > randomly get the following error:
> >
> > “cudamemcpy gpubuffer::download failed an illegal memory access was
> > encountered”
> >
> > The job terminates but the next steps (hold or production based on Amber
> > Tutorial 16) are running normally until the next error etc.
> >
> > I know that this kind of error is very general. I am open to suggestions
> > how to determine if the error is because of the hardware, or in the
> > compilation process.
> >
> > I tried a bigger system produced by CHARMM-GUI for amber, and the
> > equilibration + production was run without any error.
> >
> > As I said, the error occurs randomly. If I repeat the same job
> > with the same parameters, I get the error at a different step (in the
> > hold or production step).
> >
> > I can share input files if necessary.
> >
> > Thank you in advance
> >
> > Dr. George Lamprinidis
> >
> > --
> > ---------------------------------------------
> > Dr George Lambrinidis
> > Researcher & Laboratory Assistant Staff
> > School of Health Sciences
> > Faculty of Pharmacy
> > National & Kapodistrian University of Athens
> > Greece
> > tel: +30 2107274304
> > +30 2107274521
> > fax: +30 2107274747
> > e-mail: lambrinidis.pharm.uoa.gr
> > geolampr.gmail.com
> > ---------------------------------------------
> >
> > _______________________________________________
> > AMBER mailing list
> > AMBER.ambermd.org
> > http://lists.ambermd.org/mailman/listinfo/amber
> >
>

-- 
Stephan Schott Verdugo
Biochemist
Heinrich-Heine-Universitaet Duesseldorf
Institut fuer Pharm. und Med. Chemie
Universitaetsstr. 1
40225 Duesseldorf
Germany

Received on Sat Mar 21 2020 - 17:00:01 PDT