Re: AMBER: memory SC45

From: Robert Duke <rduke.email.unc.edu>
Date: Tue, 21 Oct 2003 08:55:22 -0400

Mu -
A 58K atom system should run in about 51 MB on a single processor, which is
nothing. Once again, HOW MANY PROCESSORS are you using? If you are using
128 on a shared memory machine, then there could be global allocation maxima
you are bumping into. If you are using 16, there should be no way you are
having problems. Also, how much physical memory is available per CPU, how it
is shared, and what the maximum physical memory really is are all important
to know if you push the limits. Also, it is possible there are issues with
whatever else is running, though I just don't know enough about Tru64 SMPs to
tell you. I have had problems with ulimits on machines where I did not have
root privileges, so depending on how your facility is set up, you may be
having trouble propagating the ulimit setting to the other nodes you are
running on. You would have to talk to a sysadmin on that one, but I would
think that setting the ulimit in a .login or .cshrc (assuming csh) ought to
work. (It actually didn't for me here on the UNC Linux cluster, which is why
I never changed the ifc build to actually use stack memory under Linux - not
that I couldn't get it fixed, but that I didn't want to imagine hordes of
users dealing with sysadmins to get the stacksize ulimit changed.)
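Just as a sketch (and assuming csh, a home directory that is visible on all
the nodes, and hard limits that actually permit it), the lines in ~/.cshrc
would look something like:

    # raise per-process limits for every shell, including the non-interactive
    # shells the MPI job starts up on the other nodes
    limit stacksize unlimited
    limit datasize unlimited
    limit memoryuse unlimited

Whether these take effect is exactly the sysadmin question; if the hard
limits are set lower, csh will refuse to raise them for an ordinary user.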
I HAVE seen memory problems on Tru64 shared memory machines, but that was
with 91K atoms and 32 processors, and in a sense it is a measure of the
unreasonableness of a shared memory setup as you scale larger and larger (or
at least of using MPI on these systems; but the world is going to MPI because
it is the one standard everyone can get to run everywhere, and SMP systems
ultimately bottleneck on cache coherence problems, so you eventually have to
shift to a networked cluster paradigm anyway). So the salient points:
1) You should be able to run at least 16 processors on this problem. If you
can't at least do this, then there is some sort of memory management issue
at your facility. Actually, one other possibility could be that you have
set the direct force calculation cutoff value very high. It defaults to 8
Angstroms, with a skin of 1 (the skin is a region of space from which atoms
are put in the pairlist because they are apt to move inside the cutoff over
the next few simulation steps). If you increased these to arbitrarily large
values, it could take a lot of memory; see the input excerpt after these
points. I expect this is not the case, but I am trying to think of
everything.
2) Talk to your sysadmin about ulimits and how much memory is really
available.
3) Please send me all the information requested.
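On the cutoff in point 1, and purely as a sketch of what to look for
(assuming the usual &cntrl namelist input to pmemd), the relevant piece of
your mdin file, stripped down to the part that matters here, would look like:

    &cntrl
      cut = 8.0,
    /

with 8.0 being the default. The per-atom pairlist grows roughly with the cube
of the cutoff-plus-skin distance, so an unusually large value there shows up
directly in the dynamic memory allocation.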
Regards - Bob

----- Original Message -----
From: "Mu Yuguang (Dr)" <YGMu.ntu.edu.sg>
To: <amber.scripps.edu>
Sent: Monday, October 20, 2003 11:46 PM
Subject: RE: AMBER: memory SC45


> Thanks Bill, Bob.
> My system is 57999 atoms with PME using PMEMD. The system is a Compaq Tru64
> SC45.
>
> FATAL global dynamic memory setup allocation error!
>
> I tried a smaller system, 331 atoms, and it works,
> with the memory used printed in the mdout file:
>
> | Dynamic Memory, Types Used:
> | Reals 95863
> | Integers 6957
>
> | Nonbonded Pairs Initial Allocation: 3472
>
> Also I tried
>
> unlimit memoryuse
> with no effect.
>
> What is the Dynamic Memory, and how does it scale with the number of atoms?
> Here 331 atoms take ~100K.
> How about 57999 atoms?
>
>
> -----Original Message-----
> From: Robert Duke [mailto:rduke.email.unc.edu]
> Sent: Tuesday, October 21, 2003 11:17 AM
> To: amber.scripps.edu
> Subject: Re: AMBER: memory SC45
>
> Mu -
> Are you talking pmemd here, or sander 6, or sander 7? How many atoms? How
> many CPUs? How many processors sharing memory? How much physical memory
> total? You can run a 90906 atom problem in about 79 MB on a single
> processor, and since the pairlist is divided when running in parallel,
> memory requirements growth will be less than linear in processor count.
> Thus, about 25 processes would run in 2 GB on a shared memory machine
> (rough estimate); that is half the memoryuse listed. It is possible, but
> unlikely, for weird things to happen with MPI buffering. Without knowing
> more about your problem size and memory configuration, it is not possible
> to determine whether it is reasonable for you to be running out of memory.
> Regards - Bob
> ----- Original Message -----
> From: "Mu Yuguang (Dr)" <YGMu.ntu.edu.sg>
> To: <amber.scripps.edu>
> Sent: Monday, October 20, 2003 10:39 PM
> Subject: AMBER: memory SC45
>
>
> > Dear all,
> > Thank you very much for your help.
> >
> > Now I have been successful in
> >
> > prun -
> >
> > But when I ask for more memory, in the case where my program handles more
> > atoms, the program fails to allocate memory.
> >
> > I checked my limits; they read as:
> >
> >
> > cputime unlimited
> >
> > filesize unlimited
> >
> > datasize 4194304 kbytes
> >
> > stacksize 3465214 kbytes
> >
> > coredumpsize 0 kbytes
> >
> > memoryuse 4089768 kbytes
> >
> > vmemoryuse 4194304 kbytes
> >
> > descriptors 4096
> >
> > Could I ask the system administrator to set memoryuse and
> > vmemoryuse to unlimited?
> >
> >
> >
>
>
>



-----------------------------------------------------------------------
The AMBER Mail Reflector
To post, send mail to amber.scripps.edu
To unsubscribe, send "unsubscribe amber" to majordomo.scripps.edu
Received on Tue Oct 21 2003 - 13:53:02 PDT