Re: Re: Re: [AMBER] sander.MPI parallel run problem

From: Dechang Li <li.dc06.gmail.com>
Date: Mon, 6 Apr 2009 12:09:04 +0100



        They were all occupying the CPU (load above 90%), so I killed them.

Actually, all of them were accumulating time, as you can see below:

PID USER PRI NI SIZE RSS SHARE STAT %CPU %MEM TIME COMMAND
9279 user1 25 0 37928 6088 1592 R 101 0.1 11:03.15 /hpexport2/home/user1/
9281 user1 25 0 38864 6128 1592 R 101 0.1 11:03.07 /hpexport2/home/user1/
9253 user1 16 0 37928 6088 1592 R 99 0.1 10:45.95 /hpexport2/home/user1/
9280 user1 25 0 37928 6068 1592 R 99 0.1 11:03.23 /hpexport2/home/user1/
9207 user1 16 0 38864 6128 1592 R 97 0.1 10:44.25 /hpexport2/home/user1/
9230 user1 16 0 37928 6068 1592 R 97 0.1 10:41.93 /hpexport2/home/user1/
9202 user1 16 0 35872 4848 2184 R 91 0.1 10:09.37 /hpexport2/home/user1/
Do you mean this accumulated time (the TIME column above)?
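
(For reference, a minimal way to list only the sander.MPI processes together
with their accumulated CPU time, assuming a Linux node with the procps "ps":

    ps -C sander.MPI -o pid,stat,%cpu,cputime,etime,args

The "cputime" field is the same accumulated TIME that top reports.)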

>Why kill them? Did you check to see how many are actually accumulating time?
>Use something like the Unix "top" command. It's not clear that there is a
>problem.
>
>
>2009/4/6 Dechang Li <li.dc06.gmail.com>
>
>> Dear Dr. Simmerling,
>>
>> Thank you for your reply!
>> I checked the .out file; it displayed "Running AMBER/MPI version on 4
>> nodes".
>> If some of them are not actually doing work, however, why did the simulation
>> go down immediately when I killed any one of them?
>>
>>
>>
>>
>>
>> >How many nodes does sander say it is using (it really means MPI threads,
>> >not nodes)? Also, are all 7 collecting CPU time? Some may be things not
>> >actually doing work, such as one that starts up the other MPI jobs.
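>> >
>> >(A quick way to check this, assuming the MPICH-1 "ch_p4" device that the
>> >-p4amslave flag in the listing suggests: ch_p4 can fork extra helper
>> >"listener" processes besides the actual ranks, so "-np 4" may legitimately
>> >show more than 4 sander.MPI entries. Something like
>> >
>> >    ps -C sander.MPI -o pid,ppid,cputime,args
>> >
>> >shows the parent/child relationships and which entries are really
>> >accumulating CPU time; the idle helpers normally stay near zero.)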
>> >
>> >2009/4/6 Dechang Li <li.dc06.gmail.com>
>> >
>> >> Dear all,
>> >>
>> >> I ran a parallel simulation using sander.MPI on a cluster.
>> >> The command I used was:
>> >>
>> >> mpirun -np 4 -machinefile myhosts ...
>> >>
>> >> I requested 4 CPUs for the parallel simulation, but in the end
>> >> there were 7 processes running on the node:
>> >>
>> >> 9279 user1 25 0 37928 6088 1592 R 101 0.1 11:03.15
>> >> /hpexport2/home/user1/amber9/exe/sander.MPI cn27 56237 -p4amslave
>> >> -p4yourname cn27
>> >> 9281 user1 25 0 38864 6128 1592 R 101 0.1 11:03.07
>> >> /hpexport2/home/user1/amber9/exe/sander.MPI cn27 56237 -p4amslave
>> >> -p4yourname cn27
>> >> 9253 user1 16 0 37928 6088 1592 R 99 0.1 10:45.95
>> >> /hpexport2/home/user1/amber9/exe/sander.MPI cn27 56237 -p4amslave
>> >> -p4yourname cn27
>> >> 9280 user1 25 0 37928 6068 1592 R 99 0.1 11:03.23
>> >> /hpexport2/home/user1/amber9/exe/sander.MPI cn27 56237 -p4amslave
>> >> -p4yourname cn27
>> >> 9207 user1 16 0 38864 6128 1592 R 97 0.1 10:44.25
>> >> /hpexport2/home/user1/amber9/exe/sander.MPI cn27 56237 -p4amslave
>> >> -p4yourname cn27
>> >> 9230 user1 16 0 37928 6068 1592 R 97 0.1 10:41.93
>> >> /hpexport2/home/user1/amber9/exe/sander.MPI cn27 56237 -p4amslave
>> >> -p4yourname cn27
>> >> 9202 user1 16 0 35872 4848 2184 R 91 0.1 10:09.37
>> >> /hpexport2/home/user1/amber9/exe/sander.MPI -O -i
>> >> /hpexport2/home/user1/abc/water/
>> >>
>> >>
>> >> What is the problem?
>> >>
>> >> Best regards,
>> >> 2009-4-6
>> >>
>> >>
>> >>
>> >> =========================================
>> >> Dechang Li, Ph.D Candidate
>> >> Department of Engineering Mechanics
>> >> Tsinghua University
>> >> Beijing 100084
>> >> P.R. China
>> >>
>> >> Tel: +86-10-62773574(O)
>> >> Email: lidc02 at mails.tsinghua.edu.cn
>> >> =========================================
>> >>
>> >>
>> >>
>> >>

= = = = = = = = = = = = = = = = = = = =
                        

Regards,

Dechang Li
Li.DC06.gmail.com
2009-04-06

Received on Wed Apr 08 2009 - 01:08:14 PDT