Re: [AMBER] Amber LES.MPI crash

From: Kirill Nuzhdin <knuzhdin.nd.edu>
Date: Wed, 02 Jan 2013 14:54:37 -0500

Thank you for the reply! Initially I was more concerned about the program's
behavior: in my opinion a program should not crash under any circumstances
(even when used in an unintended way), and if it does, something is seriously
wrong with it. I will try experimenting with the tests.
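
For example, a rerun of the LES tests with a smaller processor count might
look like the following (assuming the test Makefile picks up DO_PARALLEL and
TESTsanderLES the same way as in the log further down; the -np 8 is only an
illustration, anything at or below the reported Max = 12 should get past the
"too many processors" check):
=============================
export TESTsanderLES=$AMBERHOME/bin/sander.LES.MPI
export DO_PARALLEL="mpirun -np 8"
cd $AMBERHOME/test && make -k test.sander.LES
=============================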


On 1/2/2013 12:04 PM, Carlos Simmerling wrote:
> I think you need to ask people about the method you're using, not the
> program. Since email subject lines are so important, send a new email with
> a subject related to the method you're using and the error you get. With
> your current subject, only people who know about LES will read your email
> in detail, and your problem actually seems to relate to PIMD.
>
> The LES test case (it won't test PIMD in the example below) tells you that
> you need to use a different number of processors for it to work. You could
> try doing that, but again, it's not the PIMD test.
>
>
> On Wed, Jan 2, 2013 at 11:08 AM, Kirill Nuzhdin <knuzhdin.nd.edu> wrote:
>
>> Does anyone know if I'm using sander.LES.MPI correctly?
>> Are there any reasonable examples? The only example, Malonaldehyde
>> from the Amber12 manual (and the test suite), is rather brief and not very useful.
>>
>> As for the test cases, I found the test logs, where it looks like the
>> LES.MPI tests are skipped:
>> =============================
>> export TESTsanderLES=/opt/crc/amber/amber12/intel/bin/sander.LES.MPI;
>> make -k test.sander.LES
>> make[3]: Entering directory
>> `/afs/crc.nd.edu/x86_64_linux/amber/amber12/intel/test'
>> cd LES_noPME && ./Run.LESmd
>> SANDER: LES MD gas phase
>> DO_PARALLEL set to mpirun -np 16
>> too many processors for this test, exiting (Max = 12)
>> ============================================================
>> cd LES_noPME && ./Run.LESmd.rdiel
>> SANDER: LES MD gas phase rdiel
>> DO_PARALLEL set to mpirun -np 16
>> too many processors for this test, exiting (Max = 12)
>> =============================
>>
>>
>> On 12/26/2012 1:39 PM, Carlos Simmerling wrote:
>>> Did the test case pass?
>>> On Dec 26, 2012 12:27 PM, "Kirill Nuzhdin" <knuzhdin.nd.edu> wrote:
>>>
>>>> I am trying to run sander.LES.MPI like this:
>>>> mpiexec -n 4 $AMBERHOME/bin/sander.LES.MPI -ng 4 -groupfile
>>>> gf_Hqspcfw.pimd > sander_Hqspcfw.pimd.out
>>>>
>>>> where
>>>>
>>>>
>>>> gf_Hqspcfw.pimd:
>>>> =============================
>>>> -O -i Hqspcfw.pimd.in -p Hqspcfw.pimd.prmtop -c spcfw.pimd.rst.1 -o
>>>> bead.pimd1.out -r bead.pimd1.rst -x bead.pimd1.crd -v bead.pimd1.vel
>>>> -inf bead.pimd1.info -pimdout rpmd.pimd.out
>>>> -O -i Hqspcfw.pimd.in -p Hqspcfw.pimd.prmtop -c spcfw.pimd.rst.2 -o
>>>> bead.pimd2.out -r bead.pimd2.rst -x bead.pimd2.crd -v bead.pimd2.vel
>>>> -inf bead.pimd2.info -pimdout rpmd.pimd.out
>>>> -O -i Hqspcfw.pimd.in -p Hqspcfw.pimd.prmtop -c spcfw.pimd.rst.3 -o
>>>> bead.pimd3.out -r bead.pimd3.rst -x bead.pimd3.crd -v bead.pimd3.vel
>>>> -inf bead.pimd3.info -pimdout rpmd.pimd.out
>>>> -O -i Hqspcfw.pimd.in -p Hqspcfw.pimd.prmtop -c spcfw.pimd.rst.4 -o
>>>> bead.pimd4.out -r bead.pimd4.rst -x bead.pimd4.crd -v bead.pimd4.vel
>>>> -inf bead.pimd4.info -pimdout rpmd.pimd.out
>>>> =============================
>>>>
>>>>
>>>> Hqspcfw.pimd.in:
>>>> =============================
>>>> &cntrl
>>>> ipimd = 4
>>>> ntx = 1, irest = 0
>>>> ntt = 0
>>>> jfastw = 4
>>>> nscm = 0
>>>> temp0 = 300.0, temp0les = -1.
>>>> dt = 0.0002, nstlim = 10
>>>> cut = 7.0
>>>> ntpr = 1, ntwr = 5, ntwx = 1, ntwv = 1
>>>> /
>>>> =============================
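>>>>
>>>> (If I understand the multi-group setup correctly, the total number of MPI
>>>> processes must be a multiple of the -ng group count and is split evenly
>>>> over the groups, so the command above gives each of the four beads a
>>>> single process.) A run with, for example, two processes per bead would be:
>>>>
>>>> =============================
>>>> mpiexec -n 8 $AMBERHOME/bin/sander.LES.MPI -ng 4 -groupfile
>>>> gf_Hqspcfw.pimd > sander_Hqspcfw.pimd.out
>>>> =============================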
>>>>
>>>>
>>>> The non-MPI LES version runs fine with Hqspcfw.pimd.in, Hqspcfw.pimd.prmtop,
>>>> and spcfw.pimd.rst.*,
>>>>
>>>> while sander.LES.MPI crashes with the following error as soon as any of the
>>>> four tasks from the group file finishes:
>>>>
>>>> =============================
>>>> *** glibc detected *** /opt/crc/amber/amber12/intel/bin/sander.LES.MPI:
>>>> munmap_chunk(): invalid pointer: 0x00000000206132b0 ***
>>>> ======= Backtrace: =========
>>>> /lib64/libc.so.6(cfree+0x166)[0x31060729d6]
>>>> /afs/crc.nd.edu/x86_64_linux/intel/12.0/lib/intel64/libifcore.so.5(for__free_vm+0x1b)[0x2b0a7266249b]
>>>> /afs/crc.nd.edu/x86_64_linux/intel/12.0/lib/intel64/libifcore.so.5(for__deallocate_lub+0x13a)[0x2b0a726309da]
>>>> /afs/crc.nd.edu/x86_64_linux/intel/12.0/lib/intel64/libifcore.so.5(for_close+0x448)[0x2b0a72608b08]
>>>> /opt/crc/amber/amber12/intel/bin/sander.LES.MPI(close_dump_files_+0x122)[0x588ad2]
>>>> /opt/crc/amber/amber12/intel/bin/sander.LES.MPI(sander_+0xa961)[0x506ac5]
>>>> /opt/crc/amber/amber12/intel/bin/sander.LES.MPI(MAIN__+0x1c4b)[0x4fc0cb]
>>>> /opt/crc/amber/amber12/intel/bin/sander.LES.MPI(main+0x3c)[0x46cbec]
>>>> /lib64/libc.so.6(__libc_start_main+0xf4)[0x310601d994]
>>>> /opt/crc/amber/amber12/intel/bin/sander.LES.MPI[0x46caf9]
>>>> =============================
>>>>
>>>> How can I avoid that?
>>>>
>>>> Thank you!
>>

-- 
Best regards,
Kirill Nuzhdin
_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber
Received on Wed Jan 02 2013 - 12:00:03 PST