Re: [AMBER] memory issue in mmpbsa_py_nabnmode

From: Jason Swails <jason.swails.gmail.com>
Date: Mon, 9 Feb 2015 22:51:23 -0500

On Mon, Feb 9, 2015 at 8:28 PM, David A Case <case.biomaps.rutgers.edu>
wrote:

> On Tue, Feb 10, 2015, Marek Maly wrote:
> >
> > I am trying to do calculation of the entropic part of the binding energy
> > of my system (big protein + small ligand). Complex has 12819 atoms.
> >
> > I know it is pretty big system for such analysis but I have personal
> > machine with 74GB RAM and also access to our brand new cluster with
> > nodes having 130 GB RAM.
>
> This looks like enough RAM.
>

Agreed. The full Hessian takes (12819*3)**2 doubles -- roughly 11 GB. Since
the Hessian is symmetric, only the upper triangle needs to be stored, so
roughly half of that is enough; you also need scratch space and space for the
eigenvectors, but I can't imagine this blowing through 74 GB, let alone 130 GB
of RAM.

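For a quick sanity check on those numbers, here is a tiny back-of-the-envelope
program (nothing Amber-specific, just the arithmetic; the 12819 atoms is taken
from your message, and it assumes a 64-bit build so the size_t products don't
wrap):

#include <stdio.h>

int main(void) {
    size_t n3   = 3 * 12819;                           /* Cartesian degrees of freedom */
    size_t full = n3 * n3 * sizeof(double);            /* full (3N x 3N) Hessian */
    size_t tri  = n3 * (n3 + 1) / 2 * sizeof(double);  /* upper triangle only */
    printf("full Hessian:   %.1f GB\n", full / 1e9);   /* ~11.8 GB */
    printf("upper triangle: %.1f GB\n", tri  / 1e9);   /* ~5.9 GB  */
    return 0;
}

Even with eigenvectors and scratch arrays on top of that, it is nowhere near
74 GB.
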
> I'd advise against running the system through mmpbsa: take a complex structure
> and see if you can minimize it "by hand" (i.e. using the NAB code).
>

To further support this suggestion: the nab program that computes the normal
modes is launched as a subprocess from a Python thread... I have no idea what
restrictions, if any, get placed on the spawned subprocess as a result (my
initial suspicion is 'none', but this is worth checking).

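If you want to rule that out empirically, one option (just a sketch) is to
check what address-space limit a process actually inherits, e.g. by compiling
and running something like this from the same shell/queue environment that
MMPBSA.py runs in:

#include <stdio.h>
#include <sys/resource.h>

int main(void) {
    /* print the virtual-memory (address space) limits this process inherited;
       RLIM_INFINITY means no cap was imposed */
    struct rlimit rl;
    if (getrlimit(RLIMIT_AS, &rl) != 0) {
        perror("getrlimit");
        return 1;
    }
    if (rl.rlim_cur == RLIM_INFINITY)
        printf("soft limit: unlimited\n");
    else
        printf("soft limit: %llu bytes\n", (unsigned long long) rl.rlim_cur);
    if (rl.rlim_max == RLIM_INFINITY)
        printf("hard limit: unlimited\n");
    else
        printf("hard limit: %llu bytes\n", (unsigned long long) rl.rlim_max);
    return 0;
}

If the soft limit comes back smaller than the Hessian estimate above, that
would explain an allocation failure regardless of how much physical RAM the
node has.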


> > MIN:   Iter = 96   NFunc = 1492   E = -32142.52278   RMSG = 4.6957045e-02
> > ----------------------------------------------------------------
> > END: :-)   E = -32142.52278   RMSG = 0.0469570
> >
> > ----Convergence Satisfied----
> >
> >      iter      Total       bad       vdW     elect  nonpolar   genBorn      frms
> > ff:     1  -32142.52  10756.97  -4400.07 -28725.03      0.00  -9774.39  4.70e-02
> >
> > Energy = -3.2142522781e+04
> > RMS gradient = 4.6957044817e-02
> > allocation failure in vector: nh = -1336854855
>
> Not sure what is going on here (maybe Jason knows), but you will need
> a much lower gradient than 0.04 to proceed...the gradient should at least
> be in the range of 10**-6 to 10**-8.
>

I agree that a tolerance of 0.04 is far too loose. It's not a bad value for a
quick test (it limits the time spent in minimization), but a much stricter
tolerance should be used for normal mode calculations. Did you change the drms
value in the MMPBSA.py input file?

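If not, something along these lines in the &nmode section should help (the
variable names are the ones I remember from the MMPBSA.py manual; double-check
the exact names and defaults for your Amber version, and expect the
minimization to take correspondingly longer):

&nmode
  drms=0.000001, maxcyc=50000,
/

Here drms is the target RMS gradient (in line with Dave's 1e-6 suggestion
below) and maxcyc just gives the minimizer enough cycles to actually get
there.
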
> Once this is done, (if it can be), try a normal mode calculation. I'm
> guessing the error above comes from line 1838 of nmode.c (but you could
> use print statements or a debugger to be sure.) In some way, you have to
> figure out how a negative value for nh is being passed to the "vector()"
> routine. This looks like integer overflow, but that should not be
> happening.
>

Except that the vector constructor uses size_t, not int (and as far as I can
tell, all of the sizes in nmode.c that end up getting passed to vector() are
size_t as well, so I don't see an implicit conversion going on). My
understanding is that size_t should not overflow for any legal memory address
range...

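Just to make Dave's suspicion concrete (this is purely illustrative -- I have
not checked what expression nmode.c actually evaluates for nh), any size that
passes through 32-bit signed arithmetic somewhere along the way goes negative
for a system of this size:

#include <stdio.h>
#include <stdint.h>

int main(void) {
    int64_t m = 3 * 12819;                 /* 38457 degrees of freedom */
    /* hypothetical work-array size of order 2*m^2 -- not necessarily what
       nmode.c computes, just something of the right magnitude */
    int64_t nh64 = 2 * m * m + 6 * m + 1;  /* = 2,958,112,441 */
    int32_t nh32 = (int32_t) nh64;         /* what a 32-bit signed int would
                                              hold (two's-complement wrap) */
    printf("as 64-bit: %lld   squeezed through 32 bits: %d\n",
           (long long) nh64, (int) nh32);
    return 0;
}

That particular expression happens to land exactly on the reported
nh = -1336854855, but treat that as a coincidence until someone actually
checks the source.
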
>
> Is there any chance you are using a 32-bit OS or compiler?
>

It can't be, if the program is already using 24 GB at some point... In any
case, you would need at LEAST a 32-bit integer to even overflow back to
-1.3 billion.

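For reference: -1,336,854,855 interpreted as an unsigned 32-bit value is
2^32 - 1,336,854,855 = 2,958,112,441, i.e. just under 3 billion -- a plausible
element count for this system, and too big for a signed 32-bit int (INT_MAX is
2,147,483,647). That is exactly the signature of a 32-bit wraparound.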


> Compile the following program and see what result you get:
>
> #include <stdio.h>
> int main(){
> printf( "%zu\n", sizeof(size_t));
> }
>
> [However, no machine with over 70 GB of RAM would have a 32-bit OS....]
>

It *could*... only the first 4 GB would be addressable, though :). For a long
time Windoze machines shipped with 8+ GB of RAM on top of a 32-bit OS...

All the best,
Jason

-- 
Jason M. Swails
BioMaPS,
Rutgers University
Postdoctoral Researcher
_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber
Received on Mon Feb 09 2015 - 20:00:02 PST