Re: [AMBER] If this is normal for 3 or 4 sander.MPI jobs appearing on one node

From: Carlos Simmerling <carlos.simmerling.gmail.com>
Date: Tue, 2 Feb 2010 14:51:57 -0500

This is not about sander but about your queue software. If you request 16
procs on 4 nodes, you are right that there should be 4 per node. In many
cases this is fine and desirable. The questions are:
1) why does the admin think this is incorrect?
2) why are the sander.MPI processes not getting ~100% cpu time?
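
To make the arithmetic concrete, here is a rough Python sketch using only the
figures from the showq and top output quoted below. The variable names are
made up for illustration, and it assumes the one node shown by top is
representative of the other three.

```python
# Figures copied from the quoted showq/top output; names are illustrative only.
requested_procs = 16   # processors given to mpirun
active_nodes = 4       # "4 of 9 Nodes Active" in showq
cores_per_node = 4     # 36 processors spread over 9 nodes

# Expected number of sander.MPI processes on each active node.
procs_per_node = requested_procs // active_nodes
print(procs_per_node)

# %CPU column for the four sander.MPI processes that top shows on one node.
cpu_percent = [51, 37, 36, 29]

# Average load per process; a compute-bound MPI rank would normally sit near 100.
mean_cpu = sum(cpu_percent) / len(cpu_percent)
print(mean_cpu)

# Fraction of the node's total CPU capacity these processes are using.
node_utilisation = sum(cpu_percent) / (100 * cores_per_node)
print(node_utilisation)
```

So 4 processes per node is what the request implies, but each one averages
under 40% of a core, which is the part that still needs explaining.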


On Tue, Feb 2, 2010 at 2:46 PM, MUHAMMAD IMTIA SHAFIQ <
imtiazshafiq.gmail.com> wrote:

> Dear All,
>
> In our cluster we have 36 processors on 9 nodes. I am running sander.MPI
> with mpirun, specifying 16 processors.
>
> I have received an email from our network administrator saying that I am
> running 3 to 4 sander.MPI jobs on each node. According to him this is not
> correct and I am preventing others' jobs from running.
>
> Here is the output of showq:
> 414039  mis9  Running  16  2:03:42:46  Sun Jan 31 19:24:12
> 1 Active Job    16 of 36 Processors Active (44.44%)
>                 4 of 9 Nodes Active (44.44%)
>
>
> I have logged in to the cluster and nodes over SSH and seen that 3 or 4
> sander.MPI jobs are running on different nodes with a CPU usage of about
> 30% to 45%.
>
> As per my knowledge and understanding this seems to be normal: I have
> specified 16 processors for the sander.MPI run, so on a single node (having
> 4 processors) it is expected to have 4 jobs. So I have no issues, as I am
> getting correct output files according to the tutorial and everything seems
> fine to me.
>
> Please tell me whether it is normal for 3 or 4 sander.MPI jobs to appear on
> one node, or whether something is wrong. If something is wrong, please
> suggest how to correct it.
>
> Here is the output of the top command:
>
> Tasks: 80 total, 4 running, 76 sleeping, 0 stopped, 0 zombie
>
> Cpu(s): 30.0% us, 7.2% sy, 0.0% ni, 59.5% id, 0.0% wa, 0.1% hi, 3.2% si
>
> Mem: 3990416k total, 1573272k used, 2417144k free, 152048k buffers
>
> Swap: 3911788k total, 0k used, 3911788k free, 375496k cached
>
> PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
> 6137 mis9 16 0 354m 207m 3880 S 51 5.3 414:57.95 sander.MPI
> 6139 mis9 16 0 354m 206m 3356 S 37 5.3 413:06.71 sander.MPI
> 6138 mis9 16 0 354m 206m 3880 R 36 5.3 414:25.87 sander.MPI
> 6136 mis9 16 0 354m 208m 4972 R 29 5.3 353:33.69 sander.MPI
> _______________________________________________
> AMBER mailing list
> AMBER.ambermd.org
> http://lists.ambermd.org/mailman/listinfo/amber
>
Received on Tue Feb 02 2010 - 12:00:03 PST