I think it should be okay to have the executables and input on an NFS share,
but I have not done the experiment. I am not sure what the executable paging
systems look like on these machines, but you shouldn't be paging the
executable in from disk after the initial load anyway (sorry, I have not kept
up with all these little fine points on the gazillion systems I deal with).
Going over to high "system utilization" is probably spinlocking in system
space - i.e., it's MPI system overhead. The less-than-total gigabit ethernet
saturation can be all kinds of things, including the switches. Gigabit
ethernet is just not the way to go (you can get away with it, but you won't
realize the full potential of the rest of your hardware; then it is just a
matter of the cost equation as to what's the best buy).
Best Regards - Bob
----- Original Message -----
From: "Nikola Trbovic" <nt2146.columbia.edu>
To: <amber.scripps.edu>
Sent: Wednesday, June 06, 2007 2:43 PM
Subject: RE: AMBER: problem running parallel jobs
> Thanks, I'll do that right away. But let me just ask a potentially stupid
> question first: is it a bad idea to have the executables and input files on
> a shared nfs volume as well, or is that fine?
>
> Also, what I forgot to mention in my previous email, I've noticed that with
> 4 or more nodes, when performance starts dropping extremely, my cluster
> management software reports about 70% of the overall CPU usage as "System
> CPU", and only the remaining 30% or so as "User CPU". With 2 or fewer nodes,
> however, when performance still scales nicely, virtually all of the CPU
> usage is classified as "User CPU". On the other hand, the network load
> always stays well below the theoretical capacity of 1 Gbps (~0.4 Gbps
> regardless of CPU count). Does this still support the notion of an NFS or
> general gigabit issue?
>
> Nikola
>
> -----Original Message-----
> From: owner-amber.scripps.edu [mailto:owner-amber.scripps.edu] On Behalf Of
> Robert Duke
> Sent: Wednesday, June 06, 2007 2:03 PM
> To: amber.scripps.edu
> Subject: Re: AMBER: problem running parallel jobs
>
> Nikola -
> Yes, a shared nfs volume is a very bad place to write files. Very, very
> bad. Write to a local volume if at all possible. An nfs-shared volume can
> stall the master process for SECONDS on an attempted write, totally hosing
> performance across all your processes. Regarding pmemd performance, you
> should not expect too much from gigabit ethernet, but please run the
> factor_ix and jac benchmarks that ship with the source code to get results
> that can be compared to results on other systems. The only number I would
> suggest changing is nstlim, and as the cpu count goes above say 8, I would
> start bumping that value up to between 2000 and 5000. With pmemd, all sorts
> of load balancing goes on, and you won't get optimal times for runs of less
> than 1000 steps - the code is still doing a ton of load-balance adjusting,
> especially at higher cpu count.
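> As a rough sketch (paths, filenames and process count here are just
> placeholders for whatever your setup uses), keeping the master's output on
> a local disk looks something like:
>
>    mpiexec -n 8 pmemd -O -i mdin -p prmtop -c inpcrd \
>            -o /scratch/mdout -x /scratch/mdcrd -r /scratch/restrt
>
> and for the benchmark inputs the only edit would be the nstlim line in the
> &cntrl namelist of the mdin file, e.g. "nstlim = 2000,".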
> Regards - Bob Duke
>
> ----- Original Message -----
> From: "Nikola Trbovic" <nt2146.columbia.edu>
> To: <amber.scripps.edu>
> Sent: Wednesday, June 06, 2007 1:31 PM
> Subject: RE: AMBER: problem running parallel jobs
>
>
>> Thanks a lot Robert! That solved it! No more network errors!
>>
>> Now I've performed a few pmemd benchmarks (explicit water, ~16000 atoms)
>> overnight and obtained terrible performance. First of all let me repeat
>> that I'm running on a gigabit cluster with four cores per node. Here are
>> the benchmark results:
>>
>> Cores  Nodes   Time
>>     4      1  16113
>>     8      2  10251
>>    16      4  24143
>>    32      8  48138
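>>
>> (For reference, relative to the single-node time those numbers work out to
>> roughly:
>>
>>    2 nodes: 16113 / 10251 ~= 1.6x
>>    4 nodes: 16113 / 24143 ~= 0.67x
>>    8 nodes: 16113 / 48138 ~= 0.33x
>> )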
>>
>> The odd thing is that while a 2-node job still achieves a 1.6-fold speedup
>> over a single-node job, a 4-node job achieves no speedup at all but instead
>> takes more than twice as long as a 2-node job, and an 8-node job four times
>> as long. So above 2 nodes the runtime scales linearly with the number of
>> nodes! I've read the recent note on pushing the limits with gigabit and
>> multiple cores, but I haven't seen any benchmarks reporting such an extreme
>> drop in performance. I will run new benchmarks after increasing the network
>> buffers and checking my switch settings, but I still wanted to make sure
>> that this type of performance scaling is not perhaps indicative of
>> remaining problems with my network drivers, mpich2 installation or amber
>> installation.
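>> (By increasing the network buffers I mean the usual kernel TCP buffer
>> settings -- the exact values below are just placeholders:
>>
>>    sysctl -w net.core.rmem_max=4194304
>>    sysctl -w net.core.wmem_max=4194304
>>    sysctl -w net.ipv4.tcp_rmem="4096 87380 4194304"
>>    sysctl -w net.ipv4.tcp_wmem="4096 65536 4194304"
>> )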
>> I am using NFS on the cluster, and the trajectories were being saved
>> through NFS on the head node. From the latest note on gigabit parallel
>> computing it sounds like that is a really bad idea. Could it explain the
>> observed scaling?
>>
>> Thanks again to Robert, and in advance for any thoughts about the scaling
>> issue,
>>
>> Nikola
>>
>> -----Original Message-----
>> From: owner-amber.scripps.edu [mailto:owner-amber.scripps.edu] On Behalf Of
>> Robert Konecny
>> Sent: Tuesday, June 05, 2007 5:35 PM
>> To: amber.scripps.edu
>> Subject: Re: AMBER: problem running parallel jobs
>>
>> Hi Nikola,
>>
>> try to disable the tcp segmentation offload on your eth0:
>>
>> /usr/sbin/ethtool -K eth0 tso off
>>
>> some versions of the tg3 driver choke on heavier traffic.
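>>
>> (you can check the current offload settings first with
>>
>>    /usr/sbin/ethtool -k eth0
>>
>> and note that the ethtool change does not survive a reboot, so it may need
>> to go into a network start-up script)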
>>
>> robert
>>
>>
>>
>> On Tue, Jun 05, 2007 at 04:43:02PM -0400, Nikola Trbovic wrote:
>>> Dear all,
>>>
>>> I'm having problems running pmemd and sander with mpi on more than 2
>>> nodes over gigabit ethernet. Shortly after starting the job, one of the
>>> nodes (which one is random) reports a network error associated with the
>>> tg3 driver:
>>>
>>> tg3: eth0: transmit timed out, resetting
>>> tg3: tg3_stop_block timed out, ofs=2c00 enable_bit=2
>>> ...
>>>
>>> This node then disappears from the network for a couple of minutes and
>>> the job stalls, although it doesn't terminate.
>>>
>>> Running 4 processes on one node, or even 8 on two nodes works fine,
>>> however. I've tried using mpich2 and mpich, with fftw and without - it
>>> made no difference. I'm compiling pmemd with ifort on RHEL 4. I know
>>> this all indicates that it is not a problem with amber, but instead with
>>> my OS/tg3 driver. But I was wondering if anybody had experienced the
>>> same previously and could give advice on how to fix it.
>>>
>>> Thanks a lot in advance,
>>> Nikola Trbovic
>>>
-----------------------------------------------------------------------
The AMBER Mail Reflector
To post, send mail to amber.scripps.edu
To unsubscribe, send "unsubscribe amber" to majordomo.scripps.edu
Received on Sun Jun 10 2007 - 06:07:15 PDT