Re: [AMBER] memory issue in mmpbsa_py_nabnmode

From: Marek Maly <marek.maly.ujep.cz>
Date: Thu, 12 Feb 2015 23:14:51 +0100

Hi Jason,

thank you for your comments and the thorough explanation!

Please see my remaining questions inline below.



On Thu, 12 Feb 2015 14:48:49 +0100, Jason Swails <jason.swails.gmail.com>
wrote:

>
>> On Feb 10, 2015, at 10:49 AM, Marek Maly <marek.maly.ujep.cz> wrote:
>>
>> Dear David and Jason,
>>
>> first of all, thank you for the really quick response!
>>
>> Since in this case it seems there really is some memory bug inside the
>> "mmpbsa_py_nabnmode" code, which might not be trivial to identify (or
>> even to fix), especially for an ordinary Amber user like me, I am
>> sending you the relevant files off this mailing list so that you can
>> i) reproduce the issue I reported and ii) analyze the problem (surely
>> 1000% more effectively than I could).
>>
>> Of course, on both of the machines mentioned we run 64-bit Linux, and
>> the compilation was done with these compilers:
>>
>> gcc version 4.4.7 20120313 (Red Hat 4.4.7-4) (GCC) (my machine)
>> gcc version 4.8.3 (Gentoo 4.8.3 p1.1, pie-0.5.9) (cluster)
>>
>> using 64-bit libraries (otherwise it would, among other things, be
>> impossible for the first nmode phase (right after the initial
>> minimization) to run for several hours with a 24 GB RAM footprint).
>>
>> Regarding drms, 0.05 was set just to avoid wasting much time on the
>> minimization and to get to the problem with the nmode phase. I know the
>> recommended value you mentioned (i.e. at least 1e-6), but for bigger
>> systems that can be a "killing" requirement ... so in such cases I use
>> 0.01.
>>
>> The problem is that in the current version of mmpbsa_py, all the work
>> (minimization + nmode) on a given frame is done using just 1 CPU core
>> (or even less if multithreading is activated).
>>
>> So the only way to overcome this weakness in the current version is
>> probably to minimize the selected snapshots in parallel (before the
>> mmpbsa_py analysis) using sander, to reach drms 1e-6 or less in some
>> acceptable time (especially for bigger molecular systems). Am I right?
>
> Yes, but then you can’t use those frames to compute implicit solvent
> energies, since they will no longer have a Boltzmann distribution (you
> have taken them *off* the free energy surface and dropped them to the
> potential energy surface).

Even if, for the purpose of the sander/pmemd minimization, the igb
parameter is set to a suitable value (the same as nmode_igb in mmpbsa.py),
so that the minimization includes the same type of implicit solvent effect
that mmpbsa.py uses (e.g. igb = nmode_igb = 1)?

If the problem you describe remains even when the proper igb value is used
during the sander/pmemd minimization, how is it possible that minimizing
the frames directly within mmpbsa.py is OK, while a minimization done with
sander/pmemd is problematic because the frames are "taken *off* the free
energy surface and dropped to the potential energy surface"?

What is the difference between the two possible minimization approaches?
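
Just to be concrete about what I mean, the sander input for such a
pre-minimization would look roughly like this (only a sketch; the cycle
limits are made up, and igb would have to match nmode_igb):

Minimize one snapshot in GB implicit solvent before nmode (sketch)
 &cntrl
   imin=1, maxcyc=100000, ncyc=1000,
   drms=1.0e-6,
   ntb=0, cut=999.0,
   igb=1,
 /

Here imin=1 requests a minimization, maxcyc/ncyc are only illustrative
limits, drms is the target RMS gradient, ntb=0 with a large cut turns off
periodicity, and igb=1 is meant to match nmode_igb.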


>
>> Anyway, why exactly are such low drms values (at least 1e-6)
>> recommended before the nmode analysis?
>
> This has to do with the very nature of the normal mode approximation
> itself. It is based on diagonalizing the Hessian matrix and treating
> each of the eigenvectors as an orthogonal “vibrational” mode that is
> approximately harmonic at the minimum, and using the frequency
> (eigenvalue) -- which is related to the force constant of the harmonic
> well -- to calculate the vibrational entropy for that mode.
>
> Now there are 6 modes (5 for linear molecules) that represent purely
> translational or purely rotational motions -- these should have
> eigenvalues of 0, ideally, at the true minimum (since there is no
> potential energy well to rigid body translation or rotation in the
> absence of external fields). Other than those 6 modes, there will be
> modes with low frequencies (small force constants) that contribute
> greatly to the entropy, up to those with very high force constants that
> contribute very little.
>
> So the modes that contribute *most* to the entropy are those that are
> closest to 0. These modes are the *most* sensitive to deviations from
> the minimum, and very small changes to these eigenvalues can result in
> much *larger* changes to their contributions to the entropy. That’s one
> overarching problem with normal modes -- the most imprecise eigenvalues
> contribute the most to the final answer. (Another, of course, being that
> biomolecules are “soft” and globular and therefore do not rotate
> rigidly.)
>
> If you are far enough away from the minimum, then more than just those 6
> omitted modes will have negative eigenvalues (imaginary frequencies) --
> in those cases, nab omits those modes from the entropy calculation
> altogether... but those modes *would* be among the largest contributors
> if the structure was more fully minimized, so you can see where the big
> errors begin to creep in.
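
As a sanity check on what you describe, I wrote down the textbook
harmonic-oscillator entropy of a single mode (just a sketch with
approximate constants; it is not the actual nab code, but it reproduces
the S column of the nmode output below):

import math

H  = 6.62607e-34    # Planck constant, J*s
KB = 1.38065e-23    # Boltzmann constant, J/K
C  = 2.99792458e10  # speed of light in cm/s (frequencies are in cm**-1)
R  = 1.9872         # gas constant, cal/(mol*K)

def mode_entropy(freq_cm, T=298.15):
    """Vibrational entropy of one harmonic mode, in cal/(mol*K)."""
    x = H * C * freq_cm / (KB * T)   # h*nu/kT, dimensionless
    return R * (x / math.expm1(x) - math.log(1.0 - math.exp(-x)))

for nu in (2.962, 3.466, 30.0, 300.0):
    print("%8.3f cm**-1  ->  S = %6.3f cal/mol-K" % (nu, mode_entropy(nu)))

For the 2.962 and 3.466 cm**-1 modes this gives roughly 10.4 and 10.1
cal/mol-K, in line with the S column below, and it also illustrates your
point: as the frequency approaches zero the per-mode entropy diverges, so
small errors in the lowest eigenvalues translate into large errors in TdS.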

So if I understood correctly: if I find (in the nmode output file) records
like the ones below, where evidently just 6 modes are omitted from the
vibrational analysis, the calculation is OK. If more than 6 omitted modes
are listed, the structure was not minimized enough and the final results
are not reliable. Did I understand you correctly?

--------------------------------------------------------------------
                     freq.          E          Cv           S
                    cm**-1     kcal/mol   cal/mol-K   cal/mol-K
Total:                         4295.950    2398.211    2979.436
translational:                    0.888       2.979      52.783
rotational:                       0.888       2.979      52.433
vibrational:                   6988.720    2392.254    2874.220
ff energy:                    -2694.546
      1     -0.180
      2     -0.000
      3      0.000
      4      0.000
      5      0.068
      6      0.155
      7      2.962       0.592       1.986      10.420
      8      3.466       0.592       1.986      10.108
      .        .           .           .           .
      .        .           .           .           .
--------------------------------------------------------------------
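
To make that check easy on long outputs, something like this quick script
should do it (a sketch: the parsing assumes the layout above, where a
dropped mode prints only the index and the frequency, while kept modes
also print E, Cv and S):

import sys

def count_omitted_modes(path):
    """Count modes printed without E/Cv/S columns in an nmode output file."""
    omitted = 0
    with open(path) as fh:
        for line in fh:
            fields = line.split()
            # per-mode lines start with an integer mode index
            if fields and fields[0].isdigit() and len(fields) == 2:
                omitted += 1   # index + frequency only -> mode was dropped
    return omitted

if __name__ == "__main__":
    # more than 6 omitted modes (5 for a linear molecule) would suggest
    # the structure was not minimized tightly enough
    print("omitted modes:", count_omitted_modes(sys.argv[1]))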

>
>> Just out of curiosity, I did the entropy analysis of a much smaller
>> complex (1174 atoms):
>
> For every atom you add, you add *three* additional degrees of freedom,
> and three additional possible modes. Therefore, I don’t think that
> comparisons like this are that scalable.

I will check the difference for my medium-size systems (several thousand
atoms) and will post the results here.

>
>> a) with drms = 0.01
>>
>> b) with drms = 1e-6
>>
>> The mmpbsa_py results (TdS) for these cases are:
>>
>> a)
>> DELTA S total=      -49.9888       9.2120       3.2569
>>
>> Total calculation time: 31.836 min.
>> Total normal mode calculation time: 16.327 min.
>>
>>
>> b)
>> DELTA S total=      -48.2373       8.3468       2.9510
>>
>> Total calculation time: 1.146 hr.
>> Total normal mode calculation time: 0.532 hr.
>>
>> As we can see, the results are quite similar (a difference of about 2%), but the
>
> I encourage you to look at the actual nmode output files. Look at the
> summary of the eigenvectors and eigenvalues -- how many are omitted? Is
> it just the 6? How much do those “smaller” modes differ in their
> entropy contributions between the drms=0.01 and drms=1e-6 cases?
> Certainly if nmode needed to omit more than the 6 initial modes, you
> have not minimized enough...
>
>> time requirements are 2.2x higher in case b). The interesting point is
>> that for the better-minimized system (b) not only did the initial
>> minimization part last longer (which is easy to understand), but the
>> "Total normal mode calculation time" is also significantly (about 2x)
>> longer. Why is that?
>
> I think what you are seeing may be an artifact. The minimization time
> is actually included in the normal mode timing, since they are done in
> the same program at the same time. So all of the time difference should
> have been attributed to the normal mode calculation time. If you use
> MPI, though, then the normal mode calculation time may be misleading.
> The MPI timings are reported individually for *just* the master node.
> This isn’t a big deal, since usually one frame takes the same amount of
> time as any other. For normal modes, this isn’t true, since some frames
> may take *much* longer to minimize. If this frame(s) is being done by a
> processor *besides* an arbitrarily defined “master”, then the master
> thread will spend most of its time waiting for the other CPUs to
> finish. So the amount of time reported by the master on normal modes
> will not reflect the amount of time spent by the other CPUs. But I
> never really thought it was that important to get the timers working in
> parallel...
>
>>
>> Just out of curiosity, I also started the mmpbsa_py minimization with
>> the drms=1e-6 condition for my problematic big systems (ca. 12k atoms),
>> but I am afraid it will take a week (maybe weeks) to reach drms=1e-6,
>> considering the times already known for reaching drms=0.01.
>
> Your estimates are probably good -- weeks is probably not too far off.

MIN: Iter = 504 NFunc = 15287 E = -32264.08459 RMSG = 4.0505793e-03
   CG: It= 50 ( 1.003)q :-(
   LS: i= 1 lhs_f= -0.016299724 rhs_f= -3.257741e-06
             lhs_g= 2.8932305e-05 rhs_g= 0.029319669
             rel_s= 1 abs_s= 2.3411719
             max_d= 0.061657238 i_xyz= 10463x

This is the state after two days of minimization :)), so let's see ...


> You may want to consider the quasi-harmonic approximation (which is
> PCA-based), as there is no minimization (you build a mass-weighted
> covariance matrix as a pseudo-Hessian).

OK, I will try.
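
Just so I am sure I understand the idea: build the mass-weighted
covariance matrix of the Cartesian fluctuations over the trajectory, treat
it as a pseudo-Hessian, turn its eigenvalues into effective frequencies
and feed them to the same harmonic-oscillator entropy formula. A rough
numerical sketch of that (units and array shapes are my own assumptions,
not the actual cpptraj/nab code):

import numpy as np

KB   = 1.380649e-23   # Boltzmann constant, J/K
HBAR = 1.054572e-34   # reduced Planck constant, J*s

def quasiharmonic_entropy(coords, masses, T=298.15):
    """Quasi-harmonic entropy estimate in J/K per molecule (sketch only).

    coords: (n_frames, n_atoms, 3) coordinates in meters, already RMS-fitted
            so that overall rotation/translation has been removed
    masses: (n_atoms,) atomic masses in kg
    """
    n_frames, n_atoms, _ = coords.shape
    x = coords.reshape(n_frames, 3 * n_atoms)
    x = x - x.mean(axis=0)                # fluctuations about the mean structure
    w = np.repeat(np.sqrt(masses), 3)     # sqrt-mass weight per Cartesian coordinate
    cov = np.cov((x * w).T)               # mass-weighted covariance (pseudo-Hessian)
    lam = np.linalg.eigvalsh(cov)         # eigenvalues, ascending
    lam = lam[lam > lam.max() * 1e-8]     # drop the (near-)zero external modes
    omega = np.sqrt(KB * T / lam)         # effective angular frequencies, rad/s
    xr = HBAR * omega / (KB * T)
    return KB * np.sum(xr / np.expm1(xr) - np.log1p(-np.exp(-xr)))

In practice I would of course let cpptraj build and analyze the
mass-weighted covariance matrix rather than do it by hand, but the sketch
at least shows why no minimization is needed: the effective force
constants come from the observed fluctuations, not from the Hessian at a
minimum.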


    Best wishes,

        Marek

>
> HTH,
> Jason
>
> --
> Jason M. Swails
> BioMaPS,
> Rutgers University
> Postdoctoral Researcher
>
> _______________________________________________
> AMBER mailing list
> AMBER.ambermd.org
> http://lists.ambermd.org/mailman/listinfo/amber


-- 
This message was created with Opera's revolutionary e-mail client:  
http://www.opera.com/mail/
_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber
Received on Thu Feb 12 2015 - 15:00:02 PST