Re: AMBER: memory SC45 from Robert Duke on 2003-10-21 (Amber Archive Oct 2003)

From: Robert Duke <rduke.email.unc.edu>
Date: Tue, 21 Oct 2003 09:20:59 -0400

Mu -
One mistake in the mail below: ulimit is the command in the Bourne shell and
its derivatives; limit is the command in csh and its derivatives. So I
actually use limit, not ulimit, since I use csh. - Bob
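
For reference, the stack limit is displayed with `ulimit -s` in sh/bash and with `limit stacksize` in csh/tcsh. To see which limit a process actually inherits on a given node, it can also be queried from inside a process. The following is only a minimal sketch using Python's standard `resource` module (Unix only); the helper name `describe_limit` is made up for illustration and is not part of AMBER.

```python
import resource

def describe_limit(value):
    """Render an rlimit value given in bytes; RLIM_INFINITY means no limit."""
    if value == resource.RLIM_INFINITY:
        return "unlimited"
    return f"{value / (1024 * 1024):.1f} MB"

# The soft limit is what the process actually gets; the hard limit is the
# ceiling up to which a non-root user may raise the soft limit.
soft, hard = resource.getrlimit(resource.RLIMIT_STACK)
print("stack soft limit:", describe_limit(soft))
print("stack hard limit:", describe_limit(hard))
```

Running this through the same launch mechanism as the real job (rather than from an interactive login) would show whether the setting from the shell startup files reached the remote processes.
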
----- Original Message -----
From: "Robert Duke" <rduke.email.unc.edu>
To: <amber.scripps.edu>
Sent: Tuesday, October 21, 2003 8:55 AM
Subject: Re: AMBER: memory SC45

> Mu -
> A 58K atom system should run in about 51 MB on a single processor, which is
> nothing. Once again, HOW MANY PROCESSORS are you using? If you are using
> 128 on a shared memory machine, then there could be global allocation maxima
> you are bumping into. If you are using 16, there should be no way you are
> having problems. Also, knowing how much physical memory is available per
> cpu, how it is shared, what the maximum physical memory really is, all are
> important if you push the limits. Also, it is possible there are issues
> with what else is running, though I just don't know enough about Tru64 SMPs
> to tell you. I have had problems with ulimits on machines I did not have
> root privileges on, so depending on how your facility is set up, you may be
> having problems with propagating the ulimit setting to the other nodes you
> are running on. You would have to talk to a sysadmin on that one, but I
> would think that doing the ulimit in a .login or .cshrc (assuming csh) ought
> to work (but it actually didn't for me here on the UNC Linux cluster, which
> is why I never changed the ifc build to actually use stack memory under
> Linux - not that I couldn't get it fixed, but that I didn't want to imagine
> hordes of users dealing with sysadmins to get the stacksize ulimit changed).
> I HAVE seen memory problems on Tru64 shared memory machines, but it was with
> 91K atoms and 32 processors, and in a sense it is a measure of the
> unreasonableness of a shared memory setup as you scale larger and larger (or
> at least using MPI on these systems, but the world is going to MPI because
> it is one standard everyone can get to run everywhere, and SMP systems
> ultimately bottleneck on cache coherence problems, so you eventually have to
> shift to a networked cluster paradigm anyway). So the salient points:
> 1) You should be able to run at least 16 processors on this problem. If you
> can't at least do this, then there is some sort of memory management issue
> at your facility. Actually, one other possibility could be that you have
> set the direct force calculation cutoff value very high. It defaults to 8,
> with a skin of 1 (the skin is a region of space from which atoms are put in
> the pairlist because they are apt to move into the cutoff in the next few
> simulation steps). If you increased these to an arbitrarily large number,
> it could take lots of memory. I expect this is not the case but I am trying
> to think of everything.
> 2) Talk to your sysadmin about ulimits and how much memory is really
> available.
> 3) Please send me all the information requested.
> Regards - Bob
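
The remark in point 1 about a large cutoff taking lots of memory can be made concrete with a rough estimate. This is a back-of-the-envelope sketch, not how any AMBER program actually stores its pairlist: it assumes a uniform density of about 0.1 atoms per cubic angstrom (roughly liquid water), a half list in which each pair is stored once, and 4 bytes per entry. The function name and all default values are hypothetical.

```python
import math

def pairlist_bytes(n_atoms, cutoff, skin, density=0.1, bytes_per_entry=4):
    """Rough pairlist size in bytes for a uniform system.

    cutoff and skin are in angstroms; density is in atoms per cubic
    angstrom (assumed value, not taken from the original message).
    """
    radius = cutoff + skin
    # Average number of atoms inside a sphere of the list radius.
    neighbors = density * (4.0 / 3.0) * math.pi * radius ** 3
    # Half list: each pair is counted once, not once per partner.
    pairs = n_atoms * neighbors / 2.0
    return pairs * bytes_per_entry

for cutoff in (8.0, 12.0, 20.0):
    mb = pairlist_bytes(58_000, cutoff, skin=1.0) / 1e6
    print(f"cutoff {cutoff:4.1f} A: ~{mb:.0f} MB")
```

Under these assumptions the estimate grows with the cube of (cutoff + skin), so doubling that radius multiplies the pairlist memory by eight.
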

-----------------------------------------------------------------------
The AMBER Mail Reflector
To post, send mail to amber.scripps.edu
To unsubscribe, send "unsubscribe amber" to majordomo.scripps.edu
Received on Tue Oct 21 2003 - 14:53:00 PDT