Zhihong -
Carlos gave you incorrect advice. Use:
>>| Master NonSetup wall time: 54 seconds
Why? Three reasons.

First, this number excludes setup time, the time required to get the run
set up at the beginning. Setup time should not grow beyond what you see
here (3 seconds) even if you run 1,000,000 steps, so at short timings it is
misleading, and it does not really reflect "run time" - the time to run a
step of dynamics.

Second, use a "wall time" (wallclock time) rather than a CPU time like the
one Carlos quoted. Why? When you are waiting on a job, or being billed for
use of a computer, it is wall-clock time that matters. The CPU time will
always be less (aside from rounding-down issues in the wall-clock time,
which depend on when the code takes its initial and final times). The
reason the CPU time is less is that if your processors are stalled waiting
on i/o - either MPI or disk, depending on the system - your job may give
the CPU back to the system, so to speak, and that time is typically
"charged" to the idle loop or other system tasks (assuming the only job
running on each CPU is yours, which should be the case). For many systems
CPU time and wall time are pretty close; that is the case where i/o is
handled by spin-locking, which basically means the code polls the i/o
device continuously while it waits, so the CPU time is charged to you.
Spin-locking is the appropriate model for the kind of jobs we run, but a
lot of Linux systems with cheap interconnects seem to release the CPU when
waiting on MPI i/o - probably a concession to the fact that MPI is
sometimes used in a multiuser environment. (A small sketch of the
wall-versus-CPU distinction follows below.)

Third, the master's time is the one to use - not even an average of the
total times from the log (which are CPU times in any case). The master is
in control of everybody else: it initiates reading of the input,
distributes the necessary data, turns the slaves loose to work on it, and
ultimately collects the final data from all the other processes and
reports it (writes the various output files). So the master is who you are
waiting on, and the master determines the elapsed wall time for the job.
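To make the wall-versus-CPU point concrete, here is a minimal sketch in
Python (just an illustration, not anything from pmemd): a job that spends
much of its time blocked waiting accumulates far less CPU time than wall
time, which is exactly why the CPU figure understates what you actually
wait for.

    # Minimal illustration: wall time vs. CPU time for a job that waits.
    # time.sleep() stands in for a process that yields the CPU while it
    # waits on MPI or disk i/o.
    import time

    wall_start = time.perf_counter()   # wall-clock timer
    cpu_start = time.process_time()    # CPU time charged to this process

    total = 0.0
    for _ in range(5):
        # "compute" phase: burns CPU, counts toward both timers
        total += sum(i * i for i in range(200_000))
        # "wait" phase: the CPU goes back to the system; only wall time grows
        time.sleep(0.5)

    wall = time.perf_counter() - wall_start
    cpu = time.process_time() - cpu_start
    print(f"wall time: {wall:6.2f} s")  # ~2.5 s of waiting plus the compute time
    print(f"cpu  time: {cpu:6.2f} s")   # roughly just the compute time

On a spin-locking system the two numbers would come out nearly equal,
since the waiting itself would burn CPU.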
I would double nstlim to get a better time sample - on a 54 second timing,
the one second clock resolution alone gives you roughly a 2% error. Carlos
is right that the data distribution time is high; this is, I bet, gigabit
ethernet in a small Linux cluster, somewhere between 8 and 16 processors?
There are tuning tips on the Amber web site that may help some, but gigabit
ethernet does not have the bandwidth to handle PME at large processor
counts - the distributed transposes for the FFTs involve a LOT of data, and
that adds up to a lot of communication time if the interconnect is slow.
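For the original question - turning a timing into ps/day - here is a quick
sketch. The nstlim and dt values below are just assumed examples; plug in
whatever your mdin actually uses. The 54 second figure is the master
NonSetup wall time from your output.

    # ps/day from the master NonSetup wall time.
    # nstlim and dt are assumed example values - use your own mdin settings.
    nstlim = 1000          # number of MD steps in the benchmark (assumed)
    dt = 0.002             # time step in ps (assumed)
    nonsetup_wall = 54.0   # Master NonSetup wall time in seconds

    simulated_ps = nstlim * dt                    # ps of dynamics actually run
    ps_per_day = simulated_ps * 86400.0 / nonsetup_wall
    print(f"{ps_per_day:.1f} ps/day")

    # Clock resolution note: the wall times are reported in whole seconds,
    # so a 54 s measurement carries roughly 1/54, i.e. about 2%, uncertainty
    # from the clock alone - which is why doubling nstlim gives a better
    # sample.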
Regards - Bob Duke
----- Original Message -----
From: "Carlos Simmerling" <carlos.csb.sunysb.edu>
To: <amber.scripps.edu>
Sent: Monday, May 29, 2006 5:50 AM
Subject: Re: AMBER: which time should I use to calculate the number of
"ps/Day"
> use the total time (52.79).
> HOWEVER it looks like you are spending more time
> in data distribution than in calculation - you might be using too many
> CPUs. Try doing benchmarks with different numbers of processors.
> more is not always better.
>
> Zhihong Yu wrote:
>
>>Dear all,
>>
>> I've done a benchmark of the JAC system with parallel pmemd (Amber 9); the
>> list of timings at the end of the out file is as follows. Which time
>> should I use to calculate the number of "ps/Day": 52.79, 29.90 or 57?
>> Thanks very much!
>>
>>
>>--------------------------------------------------------------------------------
>> 5. TIMINGS
>>--------------------------------------------------------------------------------
>>
>>| NonSetup CPU Time in Major Routines, Average for All Tasks:
>>|
>>| Routine Sec %
>>| ------------------------------
>>| DataDistrib 36.37 68.90
>>| Nonbond 15.67 29.69
>>| Bond 0.00 0.01
>>| Angle 0.03 0.05
>>| Dihedral 0.12 0.23
>>| Shake 0.15 0.29
>>| RunMD 0.41 0.78
>>| Other 0.03 0.05
>>| ------------------------------
>>| Total 52.79
>>
>>| PME Nonbond Pairlist CPU Time, Average for All Tasks:
>>|
>>| Routine Sec %
>>| ---------------------------------
>>| Set Up Cit 0.20 0.38
>>| Build List 0.75 1.42
>>| ---------------------------------
>>| Total 0.95 1.81
>>
>>| PME Direct Force CPU Time, Average for All Tasks:
>>|
>>| Routine Sec %
>>| ---------------------------------
>>| NonBonded Calc 9.98 18.90
>>| Exclude Masked 0.22 0.41
>>| Other 1.06 2.01
>>| ---------------------------------
>>| Total 11.26 21.33
>>
>>| PME Reciprocal Force CPU Time, Average for All Tasks:
>>|
>>| Routine Sec %
>>| ---------------------------------
>>| 1D bspline 0.20 0.38
>>| Grid Charges 0.36 0.69
>>| Scalar Sum 0.23 0.43
>>| Gradient Sum 0.45 0.84
>>| FFT 2.22 4.21
>>| ---------------------------------
>>| Total 3.46 6.56
>>
>>| PME Load Balancing CPU Time, Average for All Tasks:
>>|
>>| Routine Sec %
>>| ------------------------------------
>>| Atom Reassign 0.02 0.03
>>| Image Reassign 0.09 0.18
>>| FFT Slab Reassign 1.33 2.52
>>| ------------------------------------
>>| Total 1.44 2.73
>>
>>| Master Setup CPU time: 0.59 seconds
>>| Master NonSetup CPU time: 29.31 seconds
>>| Master Total CPU time: 29.90 seconds 0.01 hours
>>
>>| Master Setup wall time: 3 seconds
>>| Master NonSetup wall time: 54 seconds
>>| Master Total wall time: 57 seconds 0.02 hours
>>
>>
>>
-----------------------------------------------------------------------
The AMBER Mail Reflector
To post, send mail to amber.scripps.edu
To unsubscribe, send "unsubscribe amber" to majordomo.scripps.edu