RE: AMBER: amber 9 sander crashed with "forrtl: severe (174): SIGSEGV, segmentation fault occurred"

From: Ross Walker <ross.rosswalker.co.uk>
Date: Mon, 17 Dec 2007 15:20:23 -0800

Hi James,

With SHAKE (NTF=2, NTC=2) you should be okay with 2 fs. Note that if you are
not using SHAKE then you need 1 fs or preferably less. However, most
triangulated water models are designed entirely around SHAKE, so you should
not be running them without it.
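
As a rough illustration of that rule of thumb, here is a minimal Python
sketch. The function name and the exact cut-offs are my own assumptions for
this example (they simply encode the 2 fs / 1 fs guidance above); this is not
part of sander or any AMBER tool.

```python
def timestep_warning(dt_ps, ntc=1, ntf=1):
    """Return a warning string if dt looks too large, else None.

    Encodes the guidance from this thread only (hypothetical helper):
    2 fs is fine with SHAKE (ntc=2, ntf=2); without SHAKE use 1 fs or less.
    dt_ps is the time step in picoseconds, as in the sander input.
    """
    shake_on = (ntc == 2 and ntf == 2)
    limit_ps = 0.002 if shake_on else 0.001
    # Small tolerance so that dt exactly at the limit is not flagged.
    if dt_ps > limit_ps + 1e-9:
        state = "with" if shake_on else "without"
        return f"dt = {dt_ps} ps is above {limit_ps} ps {state} SHAKE"
    return None


# 2 fs without SHAKE is flagged; 2 fs with SHAKE is not.
print(timestep_warning(0.002, ntc=1, ntf=1))
print(timestep_warning(0.002, ntc=2, ntf=2))
```

Treat the limits as starting points rather than hard rules; a system can
still be unstable below them for other reasons.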

Checking through the input file in your earlier email, I see something else
that may be a problem:

pres0 = 0.7

Do you really want to equilibrate to a pressure of 0.7 bar? This would seem
to be incorrect to me and could be what is causing the instability.
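
If you want to catch this kind of thing before submitting a job, something
along the lines of the Python sketch below could help. It is only an
illustration: parse_cntrl and check_pres0 are hypothetical helpers written for
this email, the parser only understands simple numeric "key = value" pairs,
and the 1.0 bar target and 0.05 bar tolerance are assumptions you should
adjust to what you actually intend.

```python
import re


def parse_cntrl(text):
    """Tiny parser for numeric 'key = value' pairs inside an &cntrl block."""
    match = re.search(r"&cntrl(.*?)(?:^\s*/|&end)", text, re.S | re.M | re.I)
    body = match.group(1) if match else ""
    params = {}
    for key, value in re.findall(r"(\w+)\s*=\s*([^,\s/]+)", body):
        try:
            params[key.lower()] = float(value)
        except ValueError:
            pass  # skip non-numeric values; this sketch ignores them
    return params


def check_pres0(params, target_bar=1.0, tol_bar=0.05):
    """Warn if constant pressure is on and pres0 is far from target_bar."""
    if params.get("ntp", 0.0) == 0:
        return None  # no pressure coupling requested, pres0 is not used
    pres0 = params.get("pres0", 1.0)  # assumed default of 1.0 bar
    if abs(pres0 - target_bar) > tol_bar:
        return f"pres0 = {pres0} bar; did you mean {target_bar} bar?"
    return None


# Abbreviated copy of the relevant lines from the input in this thread.
EXAMPLE = """NO3-.(H2O)600: 100ps MD NPT
 &cntrl
  ntb = 2, pres0 = 0.7, ntp = 1, taup = 5.0,
  ntc = 2, ntf = 2,
  nstlim = 100000, dt = 0.001
 /
"""

print(check_pres0(parse_cntrl(EXAMPLE)))
```

A deliberate non-ambient pressure is of course legitimate; the point is only
to make an unusual value visible before a long run.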

All the best
Ross

/\
\/
|\oss Walker

| HPC Consultant and Staff Scientist |
| San Diego Supercomputer Center |
| Tel: +1 858 822 0854 | EMail:- ross.rosswalker.co.uk |
| http://www.rosswalker.co.uk | PGP Key available on request |

Note: Electronic Mail is not secure, has no guarantee of delivery, may not
be read every day, and should not be used for urgent or sensitive issues.

> -----Original Message-----
> From: owner-amber.scripps.edu
> [mailto:owner-amber.scripps.edu] On Behalf Of Shuzhi Wang
> Sent: Monday, December 17, 2007 14:50
> To: amber.scripps.edu
> Subject: Re: AMBER: amber 9 sander crashed with "forrtl:
> severe (174): SIGSEGV, segmentation fault occurred"
>
> Hi Dr. Walker,
>
> Thank you very much for your help. I have been doing the tests you
> suggested last week and I think I found the problem: the time step I
> used was too large. When I change dt from 2 fs to 1 fs, everything
> works. So we can at least rule out a compiler bug, which would be much
> more serious.
>
> Shuzhi "James" Wang
>
> Please see below for my response:
>
> >Hi Shuzhi
> >
> >My first question is a simple one. Have you run the test cases both in
> >serial and in parallel? If so do they all pass? Do other simulations all
> >run fine?
>
> >
> >You need to do this step before we can debug any further since from what
> >you have said so far it suggests that it may be hardware problems -
> >possible interconnect failure if it only happens in parallel - or
> >possibly a compiler bug.
> >
> >Have you tried PMEMD? Does the same problem occur in both PMEMD and in
> >sander.MPI?
> >
> >Also if you set ntpr=1 and ntwx=1 what happens? Does it still fail?
>
> >It may be possible that you have a bad structure - sometimes this only
> >shows up when you switch to constant pressure. If you run with ntwx=1
> >and ntpr=1 you may be able to see the structure start to blow up before
> >some division by zero or similar infinite energy problem is leading to
> >the segfault. However, the fact it runs okay in amber 8 and 7 suggests
> >it is most probably a compiler bug issue and running the test cases
> >might help identify it.
> >
>
> ----- Original Message -----
> From: "Ross Walker" <ross.rosswalker.co.uk>
> To: <amber.scripps.edu>
> Sent: Tuesday, December 11, 2007 2:32 PM
> Subject: RE: AMBER: amber 9 sander crashed with "forrtl: severe (174):
> SIGSEGV, segmentation fault occurred"
>
>
> > Hi Shuzhi
> >
> > My first question is a simple one. Have you run the test cases both in
> > serial and in parallel? If so do they all pass? Do other simulations
> > all run fine?
> >
> > You need to do this step before we can debug any further since from
> > what you have said so far it suggests that it may be hardware problems -
> > possible interconnect failure if it only happens in parallel - or
> > possibly a compiler bug.
> >
> > Have you tried PMEMD? Does the same problem occur in both PMEMD and in
> > sander.MPI?
> >
> > Also if you set ntpr=1 and ntwx=1 what happens? Does it still fail? It
> > may be possible that you have a bad structure - sometimes this only
> > shows up when you switch to constant pressure. If you run with ntwx=1
> > and ntpr=1 you may be able to see the structure start to blow up before
> > some division by zero or similar infinite energy problem is leading to
> > the segfault. However, the fact it runs okay in amber 8 and 7 suggests
> > it is most probably a compiler bug issue and running the test cases
> > might help identify it.
> >
> > Good luck,
> > Ross
> >
> >> -----Original Message-----
> >> From: owner-amber.scripps.edu
> >> [mailto:owner-amber.scripps.edu] On Behalf Of Shuzhi Wang
> >> Sent: Tuesday, December 11, 2007 13:07
> >> To: amber.scripps.edu
> >> Cc: Shuzhi Wang
> >> Subject: AMBER: amber 9 sander crashed with "forrtl: severe
> >> (174): SIGSEGV, segmentation fault occurred"
> >>
> >> Dear all,
> >>
> >> (Sorry for the long email, but my problem is complicated and I cannot
> >> shorten this.)
> >>
> >> I am a new user of Amber, and I bumped into a very frustrating problem
> >> in my first try of running Amber 9: SANDER keeps crashing after a
> >> varying number of steps with the following error message:
> >> ----------error message with output context---------------
> >>  NSTEP =    17800   TIME(PS) =      37.800  TEMP(K) =   285.13  PRESS =  -656.4
> >>  Etot   =     -2390.0295  EKtot   =      1023.2938  EPtot      =     -3413.3233
> >>  BOND   =         1.2793  ANGLE   =         0.4961  DIHED      =         0.0002
> >>  1-4 NB =         0.0000  1-4 EEL =         0.0000  VDWAALS    =       209.2466
> >>  EELEC  =     -3624.3456  EHBOND  =         0.0000  RESTRAINT  =         0.0000
> >>  EKCMT  =       506.8383  VIRIAL  =       996.4304  VOLUME     =     34547.6103
> >>                                                     Density    =         0.5226
> >>  Ewald error estimate:   0.3956E-03
> >>  ------------------------------------------------------------------------------
> >>
> >> forrtl: severe (174): SIGSEGV, segmentation fault occurred
> >> Image              PC                Routine            Line        Source
> >> sander             0000000000548A0C  Unknown               Unknown  Unknown
> >> sander             00000000004FAB86  Unknown               Unknown  Unknown
> >> sander             00000000006BE194  Unknown               Unknown  Unknown
> >> sander             00000000004DBE6B  Unknown               Unknown  Unknown
> >> sander             00000000004ADF9E  Unknown               Unknown  Unknown
> >> sander             00000000004AA218  Unknown               Unknown  Unknown
> >> sander             0000000000404062  Unknown               Unknown  Unknown
> >> libc.so.6          0000003BA081D8A4  Unknown               Unknown  Unknown
> >> sander             0000000000403FA9  Unknown               Unknown  Unknown
> >>
> >>  NSTEP =    17900   TIME(PS) =      37.900  TEMP(K) =      NaN  PRESS =     NaN
> >>  Etot   =            NaN  EKtot   =            NaN  EPtot      =            NaN
> >>  BOND   =         1.5918  ANGLE   =         0.6282  DIHED      =         0.2988
> >>  1-4 NB =         0.0000  1-4 EEL =         0.0000  VDWAALS    =            NaN
> >>  EELEC  =            NaN  EHBOND  =         0.0000  RESTRAINT  =         0.0000
> >>  EKCMT  =       532.8891  VIRIAL  =            NaN  VOLUME     =     34531.6889
> >>                                                     Density    =         0.5228
> >>  Ewald error estimate:          NaN
> >>  ------------------------------------------------------------------------------
> >>
> >> The whole situation is as follows:
> >>
> >> I want to run an NVT MD at 300 K on a nitrate ion in a cubic box of 600
> >> POL3 waters with periodic boundary conditions. I first generated the
> >> prmtop and inpcrd files using Leap. I minimized the system first, and
> >> then heated it up from 0 K to 300 K using NVT MD. In the third step, I
> >> did an NPT MD at 300 K to get the correct density (~1 g/cc). It was at
> >> this step that I found the problem. The input file is attached below
> >> together with the command to start the simulation:
> >> ---------------input-----------------
> >> NO3-.(H2O)600: 100ps MD NPT
> >> &cntrl
> >> imin = 0,
> >> irest = 1, ntx = 7,
> >> ntb = 2, pres0 = 0.7, ntp = 1, taup = 5.0,
> >> ipol = 0,
> >> cut = 12.0,
> >> ntc = 2, ntf = 2,
> >> tempi = 300.0, temp0 = 300.0,
> >> ntt = 3, gamma_ln = 1.0,
> >> nstlim = 100000, dt = 0.001
> >> ntpr = 100, ntwx = 100, ntwr = 1000
> >> /
> >> ---------bash script to run sander--------------
> >> sander -O -i nit_600pol3_cube_md2.in -o nit_600pol3_cube_md2.out \
> >>   -p nit_600pol3_cube.prmtop -c nit_600pol3_cube_md1.rst \
> >>   -r nit_600pol3_cube_md2.rst -x nit_600pol3_cube_md2.mdcrd
> >>
> >>
> >> I searched the mail archive and only found a similar problem about
> >> DIVCON, which has already been corrected by a bugfix for Amber 9. This
> >> Amber 9 was compiled using Intel Fortran compiler 10.0.023. All bug
> >> fixes for Amber 9 had been applied before compilation.
> >>
> >> I tried the following things:
> >> 1) Changing the parameters, which didn't help at all. Amber still
> >> crashed, although not after exactly the same number of steps.
> >> 2) Doing the same simulation on H2O in a 600 POL3 water box (i.e. a 601
> >> POL3 water box), in which the same problem occurred.
> >> 3) Using Amber 8 (compiled with Intel Fortran compiler v9) and Amber 7
> >> (compiled with some other Fortran compiler, but I don't know which
> >> one). Amber 7 worked and finished the simulation, but it was slower
> >> than Amber 9, cannot do NTT=3 temperature scaling, and there was no
> >> parallel sander I can use. Amber 8 displayed the same problem as
> >> Amber 9.
> >>
> >> I wonder if anyone can kindly help me out of this frustrating
> >> situation.
> >>
> >> Thanks,
> >> Shuzhi "James" Wang
> >> --------------------------------------------------------------
> >> ---------
> >> The AMBER Mail Reflector
> >> To post, send mail to amber.scripps.edu
> >> To unsubscribe, send "unsubscribe amber" to majordomo.scripps.edu
> >>


-----------------------------------------------------------------------
The AMBER Mail Reflector
To post, send mail to amber.scripps.edu
To unsubscribe, send "unsubscribe amber" to majordomo.scripps.edu
Received on Wed Dec 19 2007 - 06:07:23 PST