Re: [AMBER] help with TIP4P and mpi pmemd

From: Hashem Taha <hashemt.gmail.com>
Date: Fri, 4 Dec 2009 14:44:44 -0700

Hi Bob and Dave,

Thanks for your help. It was a SHAKE problem. I missed this in my input. All
is well now. Thanks
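
For anyone who finds this thread later, here is a rough sketch of a pre-flight check for an mdin file. It is only an illustration: the helper names parse_cntrl and shake_notes are made up for this example and are not part of AMBER, and the parsing is deliberately naive (it expects "&end" or "/" on a line of its own). It lists the lines sitting between the title and &cntrl (the input quoted below has a "# Control section" line there) and reports whether ntc and ntf, which as far as I know are the SHAKE-related keywords, appear in &cntrl at all.

```python
import re


def parse_cntrl(mdin_text):
    """Return (settings, extra_lines) for the &cntrl namelist in mdin text.

    settings    -- dict mapping lower-cased keyword -> value string
    extra_lines -- non-blank lines found between the title line and &cntrl
    """
    lines = mdin_text.splitlines()
    start = next((i for i, ln in enumerate(lines)
                  if ln.strip().lower().startswith("&cntrl")), None)
    if start is None:
        raise ValueError("no &cntrl namelist found")

    # Line 0 is treated as the title; anything else before &cntrl is flagged.
    extra_lines = [ln.strip() for ln in lines[1:start] if ln.strip()]

    body = []
    for ln in lines[start:]:
        stripped = ln.strip()
        # Naive assumption: the terminator sits on its own line.
        if stripped.lower() in ("&end", "/"):
            break
        body.append(stripped)

    # Drop the leading "&cntrl" token, then pick out keyword = value pairs.
    text = " ".join(body)[len("&cntrl"):]
    settings = {key.lower(): value
                for key, value in re.findall(r"(\w+)\s*=\s*([^,\s]+)", text)}
    return settings, extra_lines


def shake_notes(settings):
    """Return one short note per SHAKE-related keyword (ntc, ntf)."""
    notes = []
    for key in ("ntc", "ntf"):
        if key in settings:
            notes.append(f"{key} = {settings[key]}")
        else:
            notes.append(f"{key} is not set, so the program default applies")
    return notes


if __name__ == "__main__":
    import sys

    # Usage: python check_mdin.py minwat.in
    with open(sys.argv[1]) as handle:
        settings, extra_lines = parse_cntrl(handle.read())
    for line in extra_lines:
        print("line before &cntrl:", line)
    for note in shake_notes(settings):
        print(note)
```

This only reports what is in the file. Which values are appropriate for a given water model and run type is something to confirm against the manual.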

HT

On Fri, Dec 4, 2009 at 2:26 PM, Robert Duke <rduke.email.unc.edu> wrote:

> Okay, could be a blowup, as Dave Case suggests. Set ntpr to 1 and look at
> the per-step data. Run it in uniprocessor mode so you can get a reliable
> stderr. I suspect it is something about your input, but that includes
> prmtop/inpcrd, and I don't have those. It could be an as-yet undiscovered
> problem in the darden initialization code (since it happens in both sander
> and pmemd), but I think this is a lot less likely than some problem with the
> model/run conditions, since tip4p water has been used pretty extensively by
> at least two groups I know of. If you want to send all your inputs to me, I
> will consider doing a bit of debugging (don't send to list, just to me).
>
> Regards - Bob
> ----- Original Message ----- From: "Hashem Taha" <hashemt.gmail.com>
> To: "AMBER Mailing List" <amber.ambermd.org>
> Sent: Friday, December 04, 2009 4:01 PM
>
> Subject: Re: [AMBER] help with TIP4P and mpi pmemd
>
>
>> Yes, pmemd 10. And yes, all the bugfixes have been applied. Moreover, this
>> problem also affects sander.
>>
>> On Thu, Dec 3, 2009 at 10:58 PM, Robert Duke <rduke.email.unc.edu> wrote:
>>
>>> Are we talking pmemd 10? If so, has bugfix 8 been applied?
>>> Regards - Bob
>>>
>>> ----- Original Message ----- From: "Hashem Taha" <hashemt.gmail.com>
>>> To: "AMBER Mailing List" <amber.ambermd.org>
>>> Sent: Thursday, December 03, 2009 9:26 PM
>>> Subject: Re: [AMBER] help with TIP4P and mpi pmemd
>>>
>>>
>>>
>>>> Hi Bob,
>>>>
>>>> I have tried this tip4p system before with the same molecule, and it
>>>> worked fine (using serial sander, parallel sander and parallel pmemd).
>>>> The same exact input files were used in this case. There are no comment
>>>> lines in the input file before &cntrl.
>>>>
>>>> I tried running the same job using a serial version of sander but I
>>>> encountered the same problem. I've recompiled sander using gcc with
>>>> debugging flags and this is what I get when I run sander in GDB:
>>>>
>>>> (gdb) run -O -i minwat.in -o minwat.out -p alpha_ara_ome_tip4p.top -c
>>>> alpha_ara_ome_tip4p.crd -r minwat.rst -ref alpha_ara_ome_tip4p.crd
>>>> Starting program: /home/john/amber10/bin/sander -O -i minwat.in -o
>>>> minwat.out -p alpha_ara_ome_tip4p.top -c alpha_ara_ome_tip4p.crd -r
>>>> minwat.rst -ref alpha_ara_ome_tip4p.crd
>>>>
>>>> Program received signal SIGSEGV, Segmentation fault.
>>>> 0x00000000004bb74e in nb_adjust_ ()
>>>> (gdb) backtrace
>>>> #0 0x00000000004bb74e in nb_adjust_ ()
>>>> #1 0x00000000004bdd42 in ewald_force_ ()
>>>> #2 0x00000000005f8259 in force_ ()
>>>> #3 0x0000000000483797 in runmin_ ()
>>>> #4 0x00000000004734e3 in sander () at _sander.f:1296
>>>> #5 0x0000000000470124 in MAIN__ () at _multisander.f:291
>>>> #6 0x0000000000a2c6ae in main ()
>>>>
>>>> I don't have much experience with gdb but from the looks of it the error
>>>> is originating from nb_adjust().
>>>>
>>>> I've tried recompiling sander and pmemd with different MPI libraries
>>>> (openmpi and mpich2) and no MPI, with and without MKL, and using gfortran
>>>> and ifort; all of these combinations resulted in a SIGSEGV error.
>>>> However, I only added the debug flags to the gfortran/no-parallel
>>>> version.
>>>>
>>>>
>>>>
>>>> On Thu, Dec 3, 2009 at 3:37 PM, Robert Duke <rduke.email.unc.edu>
>>>> wrote:
>>>>
>>>>> Have you done this (tip4p) before? Try your prmtop/inpcrd/mdin with
>>>>> single processor sander, then single processor pmemd, and then pmemd
>>>>> mpi. I bet you have setup problems, or pmemd build problems, but this
>>>>> will sort that out. I will let others expound on setting up an extra
>>>>> points simulation if that is the problem. As an aside, why did you
>>>>> modify the elec and vdw screening parms for 1-4 interactions, scnb and
>>>>> scee? This is, I believe, generally not recommended, but maybe you are
>>>>> doing something I don't know about... Also, do you really have two
>>>>> comment lines in front of &cntrl? I have never tried that; maybe it is
>>>>> inconsequential, but I don't know... (Because there are multiple reading
>>>>> passes, namelist i/o combined with group i/o, I would not do anything
>>>>> nonstandard. It may work fine, but namelist read errors can be really
>>>>> obscure, especially in parallel - one reason to switch to a single
>>>>> processor test case if something weird happens.)
>>>>> Regards - Bob Duke
>>>>> ----- Original Message ----- From: "Hashem Taha" <hashemt.gmail.com>
>>>>> To: <amber.ambermd.org>
>>>>> Sent: Thursday, December 03, 2009 5:16 PM
>>>>> Subject: [AMBER] help with TIP4P and mpi pmemd
>>>>>
>>>>>
>>>>>> I have a problem with trying to run some jobs using TIP4P water as the
>>>>>> solvent. I have tried running the same exact files with TIP3P water and
>>>>>> the calculations started and completed perfectly. However, upon changing
>>>>>> from TIP3P to TIP4P, my calculations would stop without reason. The file
>>>>>> that I am trying to run is just a water minimization and it results in
>>>>>> the following errors. The input file is also included below. The
>>>>>> calculations start but after a few steps they come to a halt. Any help
>>>>>> would be appreciated, and if you require further information please let
>>>>>> me know...
>>>>>>
>>>>>> HT
>>>>>>
>>>>>> the errors are:
>>>>>>
>>>>>> forrtl: severe (174): SIGSEGV, segmentation fault occurred
>>>>>> Image      PC                Routine  Line     Source
>>>>>> pmemd      000000000048265A  Unknown  Unknown  Unknown
>>>>>> pmemd      00000000004777C3  Unknown  Unknown  Unknown
>>>>>> pmemd      00000000004AA1D5  Unknown  Unknown  Unknown
>>>>>> pmemd      00000000004CA1CE  Unknown  Unknown  Unknown
>>>>>> pmemd      000000000040744C  Unknown  Unknown  Unknown
>>>>>> libc.so.6  0000003F4D81D8B4  Unknown  Unknown  Unknown
>>>>>> pmemd      0000000000407359  Unknown  Unknown  Unknown
>>>>>> rank 7 in job 55 compute-0-8.local_45343 caused collective abort of
>>>>>> all ranks
>>>>>> exit status of rank 7: killed by signal 9
>>>>>>
>>>>>> the input file...
>>>>>>
>>>>>> Constant Volume Minimization
>>>>>> # Control section
>>>>>> &cntrl
>>>>>> ntwx = 50, ntpr = 1, ntwr = 1,
>>>>>> scnb = 1.0, scee = 1.0, nsnb = 25, dielc = 1, cut = 8.0,
>>>>>> ntb = 1,
>>>>>> maxcyc = 1000, ntmin = 0, dx0 = 0.01, drms = 0.0001,
>>>>>> ntp = 0,
>>>>>> ibelly = 0, ntr = 1,
>>>>>> imin = 1,
>>>>>> &end
>>>>>> Group Input for restrained atoms
>>>>>> 5.0
>>>>>> RES 1 2
>>>>>> END
>>>>>> END
>>>>>> _______________________________________________
>>>>>> AMBER mailing list
>>>>>> AMBER.ambermd.org
>>>>>> http://lists.ambermd.org/mailman/listinfo/amber
Received on Fri Dec 04 2009 - 14:00:02 PST