Re: [AMBER] slow : amber job

From: Aron Broom <broomsday.gmail.com>
Date: Sat, 3 Mar 2012 23:03:51 -0500

Honestly, if you really have to solve this problem, I think you'll have to
do some trouble-shooting yourself. That is, you'll have to read through
the amber manual, do some controls with her system when it's running fast
and when it's not, and compare that against known problems.

It's just too open-ended a problem at the moment. I think if you
are going to get any help from the list at this point, you need to give
extremely specific information concerning the fast vs. slow runs. At the
moment, from following this thread, I have only the vaguest sense that as a
serial job it was fine, and now as a parallel MPI job there is a problem.
But what are the simulation speeds in both cases? What does the
configuration file look like (that is, can you post the actual config
file)? How many atoms are in her system? Has she tried scaling it up
incrementally? That is, since it runs as expected on 1 core, what happens
when she scales up to 2, 4, 8 cores on the same CPU? Does the problem only
come about when the job is spread over multiple nodes? How often is there
I/O happening? So many things, and I realize you're not the one running
it, but a molecular dynamics simulation is quite a complicated system of
calculations; without specifics there is no hope of finding the answer.
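
As a rough sketch of the kind of numbers that would make the incremental
scaling test above useful, something like the following could turn raw
timings into throughput (ns/day), speedup, and parallel efficiency. The
function names are hypothetical helpers, not part of Amber, and the
per-core-count figures are placeholders invented only to show the output
format. The only real number is the "1 ns in 3 days" reported earlier in
this thread.

```python
# Hypothetical helpers for summarising a scaling test; nothing here is an Amber API.

def ns_per_day(simulated_ns, wall_hours):
    """Simulated nanoseconds per 24 hours of wall-clock time."""
    return simulated_ns * 24.0 / wall_hours


def scaling_table(throughput_by_cores):
    """Return (cores, ns/day, speedup, efficiency) rows.

    Speedup and efficiency are measured relative to the run with the
    smallest core count in the input.
    """
    base_cores = min(throughput_by_cores)
    base_rate = throughput_by_cores[base_cores]
    rows = []
    for cores in sorted(throughput_by_cores):
        rate = throughput_by_cores[cores]
        speedup = rate / base_rate
        efficiency = speedup * base_cores / cores
        rows.append((cores, rate, speedup, efficiency))
    return rows


# The one figure reported in this thread: 1 ns completed in 3 days.
reported = ns_per_day(1.0, 3 * 24.0)
print(f"reported throughput: {reported:.3f} ns/day")

# PLACEHOLDER measurements (cores -> ns/day), made up for illustration only.
example = {1: 0.5, 2: 0.9, 4: 1.6, 8: 2.4}
for cores, rate, speedup, eff in scaling_table(example):
    print(f"{cores:>3} cores  {rate:6.2f} ns/day  "
          f"speedup {speedup:5.2f}  efficiency {eff:5.2f}")
```

An efficiency that stays near 1 on a single node but collapses once the
job spans several nodes would point toward the interconnect or the MPI
setup rather than toward Amber itself.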

~Aron

On Sat, Mar 3, 2012 at 10:21 PM, akshar bhosale <akshar.bhosale.gmail.com> wrote:

> hi,
> but the issue has been reported to us, hence we have to sort it out. any
> pointers will be helpful
> thanks...
>
> On 3/4/12, Carlos Simmerling <carlos.simmerling.gmail.com> wrote:
> > The user should contact the list herself since she will or should know
> > many more details that we will need.
> > On Mar 3, 2012 1:27 PM, "akshar bhosale" <akshar.bhosale.gmail.com> wrote:
> >
> >> thanks for the detailed info Jason...
> >>
> >> however, the job is running 100x slower. when she ran the job earlier
> >> with the same conf, it performed fast; now this one is slower. other
> >> jobs (i mean non amber jobs) are fine using the same MPI and env. what
> >> tuning needs to be done? where can the problem be? process spawning is
> >> also proper. we have 8 cpu nodes, and cpu load is 8; memory consumption
> >> is around 3 GB out of 24 GB RAM per node. there are 52 such nodes.
> >>
> >> On 3/3/12, Jason Swails <jason.swails.gmail.com> wrote:
> >> > On Fri, Mar 2, 2012 at 10:14 PM, akshar bhosale
> >> > <akshar.bhosale.gmail.com> wrote:
> >> >
> >> >> hi,
> >> >> sorry to give incomplete info, but even i don't have complete info as
> >> >> i am sysadmin and our client is complaining about a job running very
> >> >> slow. she says that when she runs a classical amber job, it runs fast,
> >> >> but when she runs a replica exchange job, it runs very slow.
> >> >>
> >> >
> >> > There are too many degrees of freedom here. First, the issue could be
> >> > completely unrelated to Amber (it could be a poorly configured MPI). It
> >> > could also be that the user doesn't know how to properly run MPI jobs on
> >> > the cluster (that is, she may be trying to run 16 threads across 2
> >> > 8-core nodes, for instance, but is binding all 16 threads to the same
> >> > node).
> >> >
> >> > She may also be comparing sander performance to pmemd performance, which
> >> > accounts for a factor of 2-3 in speed and scaling. She could also be
> >> > attempting exchanges of replicas every 2 steps, which will cause
> >> > significant slowdown (around 2x slower or so) but is completely
> >> > expected.
> >> >
> >> > This is only a small sample of the possible things that could be
> >> > happening, all of which easily explain the available info. However,
> >> > "very slow" and "runs fast" could mean anything. Is it a 1000x slowdown?
> >> > Is it a 2x slowdown? Each implies a different set of explanations (but
> >> > simply giving us that number is still not enough to diagnose).
> >> >
> >> > My suggestion would be to rule out factors that are _unrelated_ to Amber
> >> > first, since those are most likely the root causes of such observations.
> >> > For instance -- how are her MPI threads distributed? Is she using the
> >> > correct MPI executables that correspond to the ones that were used to
> >> > build parallel Amber in the first place? Etc.
> >> >
> >> > You could also check the Amber installation itself and make sure that
> >> > the benchmark suite gives results expected based on comparisons to
> >> > http://ambermd.org.
> >> >
> >> > HTH,
> >> > Jason
> >> >
> >> >
> >> >> On 3/1/12, Cannon, John F. <CannonJ.health.missouri.edu> wrote:
> >> >> > Dear Akshar,
> >> >> >
> >> >> > You have provided practically no useful information about your
> >> >> > simulation for diagnosis. How many atoms? What is the nonbonded
> >> >> > cutoff, etc? What were the benchmarks on other simulations on your
> >> >> > computer?
> >> >> >
> >> >> > John Cannon
> >> >> > Genetics Program Chair and
> >> >> > Associate Professor of
> >> >> > Molecular Microbiology and Immunology
> >> >> > University of Missouri
> >> >> > 1 Hospital Drive
> >> >> > Columbia, Missouri 65212
> >> >> >
> >> >> >
> >> >> > -----Original Message-----
> >> >> > From: akshar bhosale [mailto:akshar.bhosale.gmail.com]
> >> >> > Sent: Thursday, March 01, 2012 11:44 AM
> >> >> > To: amber.ambermd.org
> >> >> > Subject: [AMBER] slow : amber job
> >> >> >
> >> >> > hi,
> >> >> >
> >> >> > my amber jobs are running very slow and have completed only 1 ns in
> >> >> > 3 days. i am using amber 10. job is bigger.
> >> >> >
> >> >> > _______________________________________________
> >> >> > AMBER mailing list
> >> >> > AMBER.ambermd.org
> >> >> > http://lists.ambermd.org/mailman/listinfo/amber
> >> >
> >> >
> >> >
> >> > --
> >> > Jason M. Swails
> >> > Quantum Theory Project,
> >> > University of Florida
> >> > Ph.D. Candidate
> >> > 352-392-4032



-- 
Aron Broom M.Sc
PhD Student
Department of Chemistry
University of Waterloo
Received on Sat Mar 03 2012 - 20:30:02 PST