Re: [AMBER] sander mpi error when unfolding the protein

From: Chinh Su Tran To <chinh.sutranto.gmail.com>
Date: Mon, 10 Oct 2011 10:45:31 +0800

Dear Amber users,

I am confused by the error saying that the rst file was NOT generated.
Could someone please help explain it?

As reported previously, sander.MPI gave this error:

>> > Unit 30 Error on OPEN:
>> > ./ndm_heat4.rst

because the file ndm_heat3.rst was not generated. However, when I checked the
results, the output files are there, with these sizes in bytes:

ndm_heat3.mdcrd 1319492081 (~1.3 GB)
ndm_heat3.out 1047322
ndm_heat3.rst 1982053

Note that these sizes are the same as those of the corresponding ndm_heat2
files, and heat2 was fine.
Then I loaded the mdcrd and rst files in VMD. Only heat3.mdcrd
showed up (with 2000 frames); heat3.rst did not (0 frames). The image
is attached.

We suspected that the problem came from the MPI build of sander because it
showed "the mpi exit without calling finalize" when the job ran on 4 nodes
(8 ppn). Hence, this time the job ran on only 1 node. There was no such error
this time, but heat3.rst still shows nothing in VMD!

I then tried to proceed with the heat4 run, and this is the error:

forrtl: severe (64): input conversion error, unit 9, file
/gpfs/home/sutr0003/NDM1/NDM-simulation/heat-explicit/./ndm_heat3.rst
Image PC Routine Line Source
sander.MPI 000000000093AF9D Unknown Unknown Unknown
sander.MPI 0000000000939AA5 Unknown Unknown Unknown
sander.MPI 00000000008D6EF9 Unknown Unknown Unknown
sander.MPI 0000000000877A0D Unknown Unknown Unknown
sander.MPI 000000000087725A Unknown Unknown Unknown
sander.MPI 000000000089E96B Unknown Unknown Unknown
sander.MPI 000000000053C26E Unknown Unknown Unknown
sander.MPI 00000000004FE6A9 Unknown Unknown Unknown
sander.MPI 00000000004DE8CA Unknown Unknown Unknown
sander.MPI 00000000004D9A17 Unknown Unknown Unknown
sander.MPI 000000000042F3AC Unknown Unknown Unknown
libc.so.6 0000003B4481D974 Unknown Unknown Unknown
sander.MPI 000000000042F2B9 Unknown Unknown Unknown
--------------------------------------------------------------------------
mpirun has exited due to process rank 0 with PID 26040 on
node compute247 exiting without calling "finalize". This may
have caused other processes in the application to be
terminated by signals sent by mpirun (as reported here).
--------------------------------------------------------------------------
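Since the error is an input conversion error while reading ndm_heat3.rst, one thing that could be checked is whether the file contains fields that cannot be read as numbers. Below is a minimal Python sketch of such a check. It rests on assumptions not confirmed in this thread: that the restart file is the formatted (ASCII) kind, with a title on line 1, the atom count at the start of line 2, and all following values in fixed-width 12-character fields. The names `scan_restart` and `FIELD_WIDTH` are made up for illustration. Finding bad fields would only be a hint, not proof of the cause.

```python
import math
from typing import List, Tuple

# Assumption: every value after the two header lines occupies a
# fixed-width field of 12 characters.
FIELD_WIDTH = 12


def scan_restart(text: str) -> Tuple[int, List[Tuple[int, str]]]:
    """Return (atom count from line 2, list of (line number, bad field)).

    A field is reported as bad if it cannot be parsed as a float
    (for example a run of asterisks) or if it is NaN or infinite.
    """
    lines = text.splitlines()
    if len(lines) < 2:
        raise ValueError("too short to be a formatted restart file")

    header = lines[1].split()
    if not header:
        raise ValueError("line 2 is empty, expected the atom count")
    natom = int(header[0])

    bad: List[Tuple[int, str]] = []
    # Data starts on line 3 (1-based numbering).
    for lineno, line in enumerate(lines[2:], start=3):
        for start in range(0, len(line), FIELD_WIDTH):
            field = line[start:start + FIELD_WIDTH]
            if not field.strip():
                continue  # skip blank padding
            try:
                value = float(field)
            except ValueError:
                bad.append((lineno, field))
                continue
            if not math.isfinite(value):
                bad.append((lineno, field))
    return natom, bad
```

It could be used as `natom, bad = scan_restart(open("ndm_heat3.rst").read())`, after which `bad` lists the line numbers and contents of any unreadable fields. An empty list would suggest looking elsewhere, for example at the header line or at the options in the heat4 input.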


Please help. Could you tell me what went wrong?

Thank you.

Regards,
Chinh


_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber

(image/png attachment: heat6ns-3_md.png)

Received on Sun Oct 09 2011 - 20:00:03 PDT