Re: [AMBER] Question about mdinfo file. from Robert Duke on 2010-08-17 (Amber Archive Aug 2010)

From: Robert Duke <rduke.email.unc.edu>
Date: Tue, 17 Aug 2010 13:25:17 -0400

I would not call it incorrect accounting. It is a matter of whether the MPI
implementation waits on communications in spin-locks, for which the user is
then charged CPU time, or actually blocks and awaits an async signal, in which
case the CPU can be used by another process. The latter mode is more
appropriate for a machine in multi-user mode (or one multitasking across
different jobs, like a workstation, say) than for nodes in a cluster treated
as a "supercomputer" or SMP, where the CPUs are for all intents and purposes
dedicated to one process. I have not delved into the guts of MPI to confirm
this, but that is my best guess.
Regards - Bob
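
As a rough illustration of the distinction drawn above (this is not MPI code, and it does not show how any MPI implementation is actually written), the Python sketch below contrasts a spin-wait with a blocking wait. The helper names `measure`, `spin_wait`, and `blocking_wait` are hypothetical, and `time.sleep` merely stands in for a process that blocks at a barrier until it is signalled.

```python
import time


def measure(wait_fn, seconds):
    """Return (wall_seconds, cpu_seconds) consumed while running wait_fn(seconds)."""
    wall_start = time.perf_counter()   # real-world elapsed time
    cpu_start = time.process_time()    # user + system CPU time of this process
    wait_fn(seconds)
    wall = time.perf_counter() - wall_start
    cpu = time.process_time() - cpu_start
    return wall, cpu


def spin_wait(seconds):
    """Busy-wait: poll the clock in a tight loop, so the CPU is charged throughout."""
    deadline = time.perf_counter() + seconds
    while time.perf_counter() < deadline:
        pass


def blocking_wait(seconds):
    """Blocking wait: the process sleeps and the CPU is free for other work."""
    time.sleep(seconds)


if __name__ == "__main__":
    for name, fn in (("spin", spin_wait), ("blocking", blocking_wait)):
        wall, cpu = measure(fn, 0.5)
        print(f"{name:9s} wall = {wall:.3f} s   cpu = {cpu:.3f} s")
```

On a typical system both waits should report about the same wall time, while only the spin-wait accumulates appreciable CPU time. That is the sense in which CPU-time counters can diverge from wallclock time depending on how the waiting is done.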
----- Original Message -----
From: "Ross Walker" <ross.rosswalker.co.uk>
To: "'AMBER Mailing List'" <amber.ambermd.org>
Sent: Tuesday, August 17, 2010 1:04 PM
Subject: Re: [AMBER] Question about mdinfo file.

> Hi Mayank,
>
> The only metric you should worry about here is the wallclock time. This
> tells you the time to run the actual calculation. The division between time
> spent on the CPU vs. the GPU is not meaningful, since the CPU blocks while
> the GPU is running and vice versa. Some OSes actually accrue CPU time
> incorrectly depending on how the process blocks. A similar thing happens
> with MPI installations that don't use interrupts when sitting at a barrier,
> e.g. Microsoft MPI. This will accrue minimal CPU time in the counters. For
> example, with some MPI installations you can keep increasing the number of
> CPUs in use and the CPU time keeps going down and down, making the
> performance look better and better, when really what is happening is that
> lots of time is being accumulated in barriers that don't show up in the CPU
> time counters. So the CPU time shows the performance getting better and
> better, but really your calculation is taking longer and longer. This is
> mostly a recent occurrence, in the last year or so, as some MPI
> implementations have changed the way they sit at barriers. I can envision a
> similar thing with GPU runs.
>
> However, the wallclock time is always correct, since it is real-world time,
> so this is what you should look at to see your performance. The mdinfo file
> uses wallclock time in its calculations of performance.
>
> All the best
> Ross
>
>> -----Original Message-----
>> From: Mayank Daga [mailto:mdaga.vt.edu]
>> Sent: Tuesday, August 17, 2010 9:42 AM
>> To: AMBER Mailing List
>> Subject: Re: [AMBER] Question about mdinfo file.
>>
>> Thanks for the replies.
>>
>> I tried looking at the Timing values in mdout.
>>
>> It contains the Total CPU Time as well as the Total Wall Time. For my
>> simulation, both are 50 secs. Do these times include the time spent on the
>> GPU? If not, where do I find the Total GPU Time? If I use 50 secs to
>> calculate ns/day, I get amazingly high numbers.
>>
>> Thanks again,
>>
>> ~mayank
>>
>> On Tue, Aug 17, 2010 at 12:00 PM, Jason Swails <jason.swails.gmail.com> wrote:
>>
>> > It's certainly possible that the simulation becomes more efficient in
>> > later steps, especially since the first steps include the setup and such.
>> > However, it's trivial to calculate the ns/day that you achieve by looking
>> > at the mdout file. Simply look at how long it took to complete your 10000
>> > steps and convert that into ns/day. (Simple dimensional analysis that we
>> > do in our 1st chemistry class.) In fact, that's how it's done for the
>> > mdinfo.
>> >
>> > On Tue, Aug 17, 2010 at 11:55 AM, Mayank Daga <mdaga.vt.edu> wrote:
>> >
>> > > What I am concerned about is how the ns/day would be affected depending
>> > > on whether or not the simulation runs to completion. The mdinfo file
>> > > states the ns/day obtained over the last 'x' steps; hence, if 'x' =
>> > > 10000 and not 1000, is there a chance the average would be better for
>> > > 10000 steps?
>> > > ~mayank
>> > >
>> > > On Tue, Aug 17, 2010 at 11:31 AM, Jason Swails <jason.swails.gmail.com> wrote:
>> > >
>> > > > Hello,
>> > > >
>> > > > I believe that pmemd does not update the mdinfo file as often as it
>> > > > updates the mdout file due to performance implications. You can figure
>> > > > out the source of this discrepancy by digging through the pmemd code,
>> > > > but this has no effect on your results.
>> > > >
>> > > > Hope this helps,
>> > > > Jason
>> > > >
>> > > > On Tue, Aug 17, 2010 at 11:10 AM, Mayank Daga <mdaga.vt.edu> wrote:
>> > > >
>> > > > > Hi,
>> > > > >
>> > > > > I am a newbie using AMBER on GPUs.
>> > > > > When I run my simulations, I see two output files, mdout and mdinfo.
>> > > > > In the mdinfo file, I see the timing details showing how many ns/day
>> > > > > I get. The issue is that, according to this file, some steps are
>> > > > > always left uncompleted, while mdout shows that all the steps have
>> > > > > been completed. Why is there this discrepancy?
>> > > > > For example, if I run a simulation for 10000 steps, mdinfo shows 9000
>> > > > > steps remaining while mdout lists energy values for all 10000 steps.
>> > > > >
>> > > > > I am using the input files as downloaded from the AMBER website, and
>> > > > > I run the simulation with:
>> > > > > ~/amber11/bin/pmemd.cuda -O -i mdin -o mdout -p prmtop -c inpcrd -r restrt -x mdcrd -gpu 0
>> > > > >
>> > > > > Please explain this behaviour.
>> > > > >
>> > > > > Thanks,
>> > > > > ~mayank
>> > > > >
>> > > > >
>> > > > > --
>> > > > > Mayank Daga | SyNeRGy Laboratory | Dept. of Computer Science
>> > > > > Virginia Tech | http://synergy.cs.vt.edu | http://www.cs.vt.edu
>> > > > > _______________________________________________
>> > > > > AMBER mailing list
>> > > > > AMBER.ambermd.org
>> > > > > http://lists.ambermd.org/mailman/listinfo/amber
>> > > > >
>> > > >
>> > > >
>> > > >
>> > > > --
>> > > > Jason M. Swails
>> > > > Quantum Theory Project,
>> > > > University of Florida
>> > > > Ph.D. Graduate Student
>> > > > 352-392-4032
>> > > >
>> > >
>> > >
>> > >
>> > >
>> >
>> >
>> >
>> >
>>
>>
>>
>
>
>
>
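
The ns/day conversion described in the quoted thread (steps completed, divided by wallclock time, scaled to a day) can be sketched as follows. The function name `ns_per_day` is hypothetical, and the 2 fs timestep is an assumption: the thread does not state the value of dt in the mdin file. Only the 10000 steps and the 50 s wall time come from the messages.

```python
SECONDS_PER_DAY = 86400.0


def ns_per_day(n_steps, dt_ps, wall_seconds):
    """Convert a completed run into simulated nanoseconds per wallclock day.

    n_steps      -- number of MD steps completed
    dt_ps        -- timestep in picoseconds (ASSUMED here; not given in the thread)
    wall_seconds -- elapsed wallclock time in seconds, as reported in mdout
    """
    if wall_seconds <= 0:
        raise ValueError("wall_seconds must be positive")
    simulated_ns = n_steps * dt_ps / 1000.0          # ps -> ns
    return simulated_ns * SECONDS_PER_DAY / wall_seconds


if __name__ == "__main__":
    # 10000 steps in 50 s of wall time, with an assumed dt of 0.002 ps (2 fs).
    print(f"{ns_per_day(10000, 0.002, 50.0):.2f} ns/day")  # prints "34.56 ns/day"
```

With a different timestep the result scales linearly, so the figure above is only as good as the assumed dt. The wall time, not the CPU time, is the quantity to plug in.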

Received on Tue Aug 17 2010 - 10:30:26 PDT