Re: AMBER: parallel pmemd with intel 9 fc

From: Robert Duke <rduke.email.unc.edu>
Date: Sat, 12 Aug 2006 14:58:59 -0400

Bala -
Have you tried setting the stack size to unlimited? Failure to do this is the usual reason for seg faults in pmemd 8; in pmemd 9 I actually whack the stacksize limit in the code. Anyway, from the C shell you need a "limit stacksize unlimited" command in your .cshrc, and from a Bourne or bash shell you need "ulimit -s unlimited" (in .bashrc for bash). If this does not work, there could be other issues, but so far we have had no memory exceptions due to bugs in the code (at least as far as I can recollect). This stack problem is a notable nuisance with the ifort compiler and Itanium machines, but it can also occur on Pentium/Opteron.
Regards - Bob Duke
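
The stack-size advice above can be sketched for a bash login file or job script as follows. This is only an illustration: the helper name raise_stack_limit and the fallback to the hard limit are assumptions added here, not part of the original advice, and whether the setting reaches every compute node depends on how the cluster launches jobs.

```shell
# Bourne/bash: place in ~/.bashrc (or the job script) so it runs before pmemd starts.
# raise_stack_limit is a hypothetical helper name used only for this sketch.
raise_stack_limit() {
    # Try "unlimited" first; if the system refuses, settle for the current hard limit.
    ulimit -s unlimited 2>/dev/null || ulimit -s "$(ulimit -H -s)"
}

raise_stack_limit
echo "stack size limit: $(ulimit -s)"

# C shell equivalent for ~/.cshrc:
#   limit stacksize unlimited
```

If the echoed value is a number rather than "unlimited", the hard limit on that machine is capped and only an administrator can raise it further.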

  ----- Original Message -----
  From: bala
  To: ambermail
  Sent: Saturday, August 12, 2006 11:56 AM
  Subject: AMBER: parallel pmemd with intel 9 fc


  Dear Amber users,

   

  I am using Amber8 and running simulations on a cluster with "pmemd" on 64 processors. I submit jobs through the bsub command. The simulation stops partway through. I have pasted the errors I got in three different runs of the same job. I checked the Intel website for the runtime error [error code forrtl: severe (174): SIGSEGV, segmentation fault occurred]. It says that this can happen if the program attempts an invalid memory reference. Kindly suggest how to get rid of this problem.

   

  My input file is given below:

  &cntrl
    imin = 0, irest = 1, ntx = 7,
    ntb = 2, pres0 = 1.0, ntp = 1,
    cut = 10, ntr = 0,
    ntc = 2, ntf = 2,
    tempi = 300.0, temp0 = 300.0,
    ntt = 3, gamma_ln = 1.0,
    nstlim = 250000, dt = 0.002,
    ntpr = 100, ntwx = 100
  /

  Error files

  Error-file 1:

  forrtl: severe (174): SIGSEGV, segmentation fault occurred
  Image       PC                Routine  Line     Source
  libvapi.so  0000002A96BF74AF  Unknown  Unknown  Unknown
  srun: error: n31: task49: Exited with exit code 174
  srun: Terminating job
  srun: error: n1: task0: Exited with exit code 174

   ---------------------------------------------------------------------------------

  Error-file 2:

  forrtl: severe (174): SIGSEGV, segmentation fault occurred
  Image      PC                Routine  Line     Source
  pmemd      00000000004480F6  Unknown  Unknown  Unknown
  pmemd      000000000044A114  Unknown  Unknown  Unknown
  pmemd      000000000045F1E3  Unknown  Unknown  Unknown
  pmemd      00000000004051B6  Unknown  Unknown  Unknown
  libc.so.6  0000002A95E20197  Unknown  Unknown  Unknown
  pmemd      00000000004050EA  Unknown  Unknown  Unknown
  srun: error: n26: task41: Exited with exit code 174
  srun: Terminating job
  srun: error: n1: task0: Exited with exit code 174

  -----------------------------------------------------------------------------------

  Error-file 3:

  forrtl: severe (174): SIGSEGV, segmentation fault occurred
  Image      PC                Routine  Line     Source
  pmemd      00000000004480F6  Unknown  Unknown  Unknown
  pmemd      000000000044A114  Unknown  Unknown  Unknown
  pmemd      000000000045F1E3  Unknown  Unknown  Unknown
  pmemd      00000000004051B6  Unknown  Unknown  Unknown
  libc.so.6  0000002A95E20197  Unknown  Unknown  Unknown
  pmemd      00000000004050EA  Unknown  Unknown  Unknown
  srun: error: n26: task41: Exited with exit code 174
  srun: Terminating job
  srun: error: n1: task0: Exited with exit code 174

   

-----------------------------------------------------------------------
The AMBER Mail Reflector
To post, send mail to amber.scripps.edu
To unsubscribe, send "unsubscribe amber" to majordomo.scripps.edu
Received on Sun Aug 13 2006 - 06:07:22 PDT