Re: AMBER: Problems with CPU scaling on cluster

From: David E. Konerding <dekonerding.lbl.gov>
Date: Fri, 16 Jul 2004 08:46:01 -0700

Ross Walker wrote:

>>Are there any obvious bottlenecks there with regards to amber you can
>>spot?
>
>I suspect this is your problem:
>
>6x up to 36-port gigabit ethernet switch
>
>Depending on how those 6 switches are interlinked you could have a serious
>bottleneck here. See if you can find out how they have configured these
>switches: do they have a full-speed backplane connecting all the switches, or
>have they just chained them together with gigabit? The problem comes if you
>have, say, 8 cpus on one switch and 8 on another. If the switches do not have
>a non-blocking connection between them then you will have the equivalent of
>a single 1Gbit link serving 8 machines...
>
>One option would be to speak to the admins that control the queue system on
>your cluster and see if they can configure it to guarantee that the 16 cpus
>you are allocated for your job are all connected to the same switch. This
>may improve things, since you will then not be communicating over the
>connections between the switches, which are most likely heavily contended.
>
>

As a diagnostic, you should download and run 'link-checker' from Microway.
These sorts of imbalances are easy to detect visually with it, and the
results can be used to demonstrate to the administrator the importance of
switch configuration.
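
To put a rough number on the scenario quoted above (8 cpus on one switch, 8 on
another, a single 1Gbit link between them), here is a small back-of-the-envelope
sketch. It assumes all 8 nodes send across the uplink at the same time, that the
link is shared evenly, and that protocol overhead is ignored. The function names
are made up for illustration and are not part of AMBER or link-checker.

```python
def per_node_uplink_mbit(uplink_gbit: float, nodes_sharing: int) -> float:
    """Even-share bandwidth per node (Mbit/s) when nodes share one uplink.

    Assumption: every node transmits across the uplink simultaneously and
    the switch divides the link evenly; protocol overhead is ignored.
    """
    if nodes_sharing <= 0:
        raise ValueError("nodes_sharing must be positive")
    return uplink_gbit * 1000.0 / nodes_sharing


def oversubscription_ratio(node_link_gbit: float, nodes_sharing: int,
                           uplink_gbit: float) -> float:
    """Total bandwidth the nodes could offer divided by uplink capacity."""
    if uplink_gbit <= 0:
        raise ValueError("uplink_gbit must be positive")
    return node_link_gbit * nodes_sharing / uplink_gbit


if __name__ == "__main__":
    # 8 gigabit-attached nodes behind one 1Gbit inter-switch link
    print(per_node_uplink_mbit(1.0, 8))         # 125.0
    print(oversubscription_ratio(1.0, 8, 1.0))  # 8.0
```

Under those assumptions each node gets about 125 Mbit/s to the other switch
instead of 1 Gbit/s, i.e. the uplink is 8:1 oversubscribed. Real traffic will
not be this uniform, so treat it as an upper bound on how bad it can get, not
a measurement.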

Dave
-----------------------------------------------------------------------
The AMBER Mail Reflector
To post, send mail to amber.scripps.edu
To unsubscribe, send "unsubscribe amber" to majordomo.scripps.edu
Received on Fri Jul 16 2004 - 16:52:59 PDT