Re: [AMBER] AMBER11, pmemd.cuda: the system crashed.

From: Ross Walker <ross.rosswalker.co.uk>
Date: Mon, 12 Dec 2011 20:09:36 -0800

Hi Chinh,

Can you send me (offlist) all of your input files, please, along with
details of your computer system: OS, NVIDIA compiler and driver versions,
and hardware spec (especially the GPU model)? I need to be able to
replicate this in order to investigate what is going wrong.

Please also include the output from the run that gave what looked like a
disk error.
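
For gathering the system details requested above, a minimal sketch in
Python follows. It assumes a Linux workstation where the standard NVIDIA
tools `nvcc` and `nvidia-smi` may or may not be on the PATH; the helper
names `run_or_note` and `collect_system_report` are hypothetical, chosen
only for this illustration, and the script is not part of AMBER.

```python
import platform
import shutil
import subprocess


def run_or_note(cmd):
    """Return a command's output, or a short note if it cannot be run."""
    # Assumption: a tool that is not on PATH is reported, not treated as fatal.
    if shutil.which(cmd[0]) is None:
        return f"{cmd[0]}: not found on PATH"
    try:
        result = subprocess.run(cmd, capture_output=True, text=True, timeout=30)
    except (OSError, subprocess.TimeoutExpired) as exc:
        return f"{cmd[0]}: failed to run ({exc})"
    # Some tools print their banner to stderr, so fall back to it.
    return (result.stdout or result.stderr).strip()


def collect_system_report():
    """Collect OS, CUDA compiler, and GPU/driver details as plain strings."""
    return {
        "os": platform.platform(),
        "kernel": platform.release(),
        "nvcc": run_or_note(["nvcc", "--version"]),
        "gpu_and_driver": run_or_note(["nvidia-smi"]),
    }


if __name__ == "__main__":
    for key, value in collect_system_report().items():
        print(f"== {key} ==\n{value}\n")
```

Redirecting the printed report to a text file and attaching it alongside
the input files and the run output should cover most of the details asked
for here.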

Thank you.

All the best
Ross

> -----Original Message-----
> From: Chinh Su Tran To [mailto:chinh.sutranto.gmail.com]
> Sent: Monday, December 12, 2011 8:04 PM
> To: AMBER Mailing List
> Subject: Re: [AMBER] AMBER11, pmemd.cuda: the system crashed.
>
> Dear Dr. Walker,
>
> As you suggested, we ran the check for the hard disk, but both were
> clean!
> I tried to run the same code using pmemd only, and it was fine, i.e. no
> crash, no error.
>
> But when I went back to using pmemd.cuda, it crashed again.
>
> There were also 2 problems that I noticed when I was using pmemd.cuda:
>
> 1. When I used *iwrap=0* (in the input file below), it showed a
> "segmentation fault" immediately. I knew this was an old error that I
> had encountered before (but I wanted to try it to test pmemd.cuda only).
> 2. Then I switched to *iwrap=1* with some modifications in *gpu.cpp*
> (the solution that I found in the AMBER forum), and it crashed.
> (*However, it also crashed before these modifications.*)
>
> Please help. We did not know what was wrong.
>
> The input is:
>
> &cntrl
> imin=0,
> * iwrap=1 => I also tried iwrap=0*
> irest=0,
> ntx=1,
> ntb=1,
> cut=10.0,
> ntr=0,
> ntc=2,
> ntf=2,
> tempi=500.0,
> temp0=500.0,
> ntt=3,
> gamma_ln=1.0,
> nstlim=3000000, dt=0.002,
> ntpr=1500, ntwx=1500,ntwr=1000
> /
>
>
> Thank you.
> Chinsu
>
>
>
>
> On Tue, Dec 6, 2011 at 1:01 PM, Ross Walker <ross.rosswalker.co.uk>
> wrote:
>
> > Hi Chinsu,
> >
> > This looks like a hard drive failure to me (or pending hard drive
> > failure). Please try things with the CPU version of the code and see
> > what happens. I can't see how this could be generated by the GPU code.
> > You might want to try booting the machine in single-user (or recovery)
> > mode and running an fsck on the file system. You could also try
> > running a smartctl check on the hard drive to see what its diagnostics
> > are reporting.
> >
> > All the best
> > Ross
> >
> > > -----Original Message-----
> > > From: Chinh Su Tran To [mailto:chinh.sutranto.gmail.com]
> > > Sent: Monday, December 05, 2011 7:33 PM
> > > To: AMBER Mailing List
> > > Subject: [AMBER] AMBER11, pmemd.cuda: the system crashed.
> > >
> > > Dear AMBER users,
> > >
> > > I was running pmemd.cuda using Amber11 on a GPU which is installed
> > > in a workstation.
> > > The process consisted of 2 short minimization steps and 6 ns of
> > > heating the protein (270 residues) at 500 K.
> > >
> > > The minimizations were fine, but when I ran the heating, my system
> > > "crashed". The errors are as below:
> > >
> > > [2932683.628873] EXT4-fs (sda1): previous I/O error to superblock detected
> > > [2932683.646789] EXT4-fs (device sda1): ext4_find_entry:933: inode#9699494: comm init: reading directory lblock 0
> > >
> > > I rebooted the system and tried to run it again. The same errors
> > > appeared.
> > > My input file is:
> > >
> > > &cntrl
> > > imin=0,
> > > * iwrap=1 => I also tried iwrap=0*
> > > irest=0,
> > > ntx=1,
> > > ntb=1,
> > > cut=10.0,
> > > ntr=0,
> > > ntc=2,
> > > ntf=2,
> > > tempi=500.0,
> > > temp0=500.0,
> > > ntt=3,
> > > gamma_ln=1.0,
> > > nstlim=3000000, dt=0.002,
> > > ntpr=1500, ntwx=1500,ntwr=1000
> > > /
> > >
> > > We tried to find out what was going on, but we could not determine
> > > where the crash came from.
> > > Please help.
> > >
> > > Thank you.
> > >
> > > Regards,
> > > Chinsu
> > > _______________________________________________
> > > AMBER mailing list
> > > AMBER.ambermd.org
> > > http://lists.ambermd.org/mailman/listinfo/amber

Received on Mon Dec 12 2011 - 20:30:04 PST