Re: [AMBER] Have any one used the Nmode to calculate the entropy?

From: Matthew Tessier <matthew.tessier.gmail.com>
Date: Mon, 20 Aug 2012 11:42:43 -0400

Ren,

This article in J. Chem. Inf. Model. explores the entropy difference between full and truncated normal-mode calculations. You should probably take a look at it.


http://pubs.acs.org/doi/abs/10.1021/ci3001919

The Normal-Mode Entropy in the MM/GBSA Method: Effect of System Truncation, Buffer Region, and Dielectric Constant. Samuel Genheden, Oliver Kuhn, Paulius Mikulskis, Daniel Hoffmann, and Ulf Ryde. J. Chem. Inf. Model. (2012)

We have performed a systematic study of the entropy term in the MM/GBSA (molecular mechanics combined with generalized Born and surface-area solvation) approach to calculate ligand-binding affinities. The entropies are calculated by a normal-mode analysis of harmonic frequencies from minimized snapshots of molecular dynamics simulations. For computational reasons, these calculations have normally been performed on truncated systems. We have studied the binding of eight inhibitors of blood clotting factor Xa, nine ligands of ferritin, and two ligands of HIV-1 protease and show that removing protein residues farther than 8–16 Å from the ligand, including a 4 Å shell of fixed protein residues and water molecules, changes the absolute entropies by 1–5 kJ/mol on average. However, the change is systematic, so relative entropies for different ligands change by only 0.7–1.6 kJ/mol on average. Consequently, entropies from truncated systems give relative binding affinities that are identical to those obtained for the whole protein within statistical uncertainty (1–2 kJ/mol). We have also tested using a distance-dependent dielectric constant in the minimization and frequency calculation (ε = 4r), but it typically gives slightly different entropies and poorer binding affinities. Therefore, we recommend entropies calculated with the smallest truncation radius (8 Å) and ε = 1. Such an approach also gives an improved precision for the calculated binding free energies.


-Matt


From: Matthew Tessier [mailto:matthew.tessier.gmail.com]
Sent: Thursday, August 09, 2012 3:13 PM
To: AMBER Mailing List
Subject: Re: [AMBER] Have any one used the Nmode to calculate the entropy?


Ren,

We've found the best way to do nmode calculations on large systems is to break them up into single-frame, single-core jobs that can be submitted across multiple computing cores (and nodes) on a cluster. Jason is right that each of these jobs takes a while, especially for a system that large. You may want to try reducing the system size by truncating the protein outside a certain distance from the ligand. This does alter your entropy numbers slightly (we noticed a systematic 2 kcal/mol reduction in the entropy term in our particular test case), but it gives you an approximation in considerably less compute time. We were able to reduce our time from 24 hours/frame to about 8 hours/frame (these times are hardware-dependent). We ended up going with the full-system approach because we have a lot of cores at our disposal, but your system is about three times the size of ours. A sketch of how the splitting can be scripted is below.
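For what it's worth, here is a minimal sketch of the splitting step (the prmtop/trajectory names are placeholders for your own files, and it assumes nmstartframe/nmendframe count within the frames already selected by startframe/endframe):

#!/usr/bin/env python
# Sketch: generate one single-frame, single-core MMPBSA.py nmode job
# per snapshot, to be submitted separately on a cluster.
import os

FRAMES = range(100, 1001, 100)  # e.g. the 10 snapshots in your input

TEMPLATE = """&general
   startframe={frame}, endframe={frame},
   keep_files=2,
   receptor_mask=':1-692', ligand_mask=':693',
/
&nmode
   nmstartframe=1, nmendframe=1, nminterval=1,
   nmode_igb=1, nmode_istrng=0.1,
/
"""

for frame in FRAMES:
    workdir = 'nmode_frame_%04d' % frame
    os.makedirs(workdir, exist_ok=True)
    with open(os.path.join(workdir, 'mmpbsa.in'), 'w') as fh:
        fh.write(TEMPLATE.format(frame=frame))
    # Wrap this command in whatever batch script your queue expects
    # (qsub/sbatch) and submit one serial job per frame:
    print('cd %s && MMPBSA.py -O -i mmpbsa.in -o frame_%d.dat '
          '-cp complex.prmtop -rp receptor.prmtop -lp ligand.prmtop '
          '-y ../prod.mdcrd' % (workdir, frame))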


Also, Jason made the point that it does use a lot of RAM. When I submit these, I don't fill a compute node's processors, because there isn't enough RAM on the node to do so. You'll want to gauge your memory usage before submitting a lot of these. The disadvantage of doing one frame per job is that you'll have to set up a script to post-process the statistics that MMPBSA.py normally computes for you, though you can hack at the MMPBSA.py code to do this; a minimal averaging sketch follows.
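Something like this is enough for the averaging step (a sketch; it assumes you have already grepped one total -TdS value in kcal/mol per frame into entropies.txt, one value per line):

#!/usr/bin/env python
# Sketch: recombine per-frame nmode entropies into the mean and standard
# deviation that a single MMPBSA.py run would normally report.
import math

values = [float(line) for line in open('entropies.txt') if line.strip()]

n = len(values)
mean = sum(values) / n
std = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))  # sample std

print('frames: %d' % n)
print('-TdS = %.4f +/- %.4f kcal/mol (std. err. %.4f)'
      % (mean, std, std / math.sqrt(n)))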


Good luck

-Matthew Tessier

On Thu, Aug 9, 2012 at 1:35 PM, Jason Swails <jason.swails.gmail.com> wrote:

On Thu, Aug 9, 2012 at 12:59 PM, Kong, Ren <rkong.tmhs.org> wrote:

> Dear amber users,
>
> This is my first time using nmode to calculate entropy.
> I have run into two problems:
>
> 1. I extracted 10 snapshots to do the calculation, and the system is a
> protein-ligand complex with 10550 atoms. It seems extremely time-consuming:
> I submitted the MPI job with 8 threads, and it has kept running for more
> than one week. The job is still running. Is this normal for this
> calculation? No errors have been reported. How long will it take for
> a system like that?
>

This is a *huge* system. Normal-mode calculations have to do two things:
each snapshot must be minimized to a local minimum, and then the normal
modes at that minimum have to be calculated.

I would not be surprised if the minimizations are taking a very long time.
I have no idea how long that kind of system would take (it will largely
depend on how long it takes to minimize to a local minimum).



>
> 2. I tried to use 5 snapshots to do the calculation. The job quit
> abnormally.
>
> The output file is:
>
> Running MMPBSA.MPI on 4 processors...
>
> Reading command-line arguments and input files...
>
> Loading and checking parameter files for compatibility...
>
> ptraj found! Using /home/rkong/amber11/bin/ptraj
>
> nmode program found! Using /home/rkong/amber11/bin/mmpbsa_py_nabnmode
>
> Preparing trajectories for simulation...
>
> 1000 frames were read in and processed by ptraj for use in calculation.
>
>
>
> Beginning nmode calculations with mmpbsa_py_nabnmode...
>
> Master thread is calculating normal modes for 2 frames
>
>
>
> calculating complex contribution for frame 0
>
> FATAL: allocation failure in vector()
>
> FATAL: allocation failure in vector()
>
> close failed in file object destructor:
>
> IOError: [Errno 9] Bad file descriptor
>
> FATAL: allocation failure in vector()
>
> close failed in file object destructor:
>
> IOError: [Errno 9] Bad file descriptor
>
> FATAL: allocation failure in vector()
>
> close failed in file object destructor:
>
> IOError: [Errno 9] Bad file descriptor
>
> The input file for 10 snapshots is as following:
> &general
> startframe=1,endframe=1000
> keep_files=2,
> receptor_mask=':1-692',ligand_mask=':693'
> /
> &nmode
> nmstartframe=100, nmendframe=1000,
> nminterval=100, nmode_igb=1, nmode_istrng=0.1,
> /
>
> The input file for 5 snapshots is as following:
> &general
> startframe=1,endframe=1000
> keep_files=2,
> receptor_mask=':1-692',ligand_mask=':693'
> /
> &nmode
> nmstartframe=100, nmendframe=1000,
> nminterval=200, nmode_igb=1, nmode_istrng=0.1,
> /
>
> The only difference between the input files is "nminterval". I just don't
> know why the 5-snapshot job cannot run normally when the 10-snapshot job can.
>
> Could anyone give some comments?
>

The errors you're getting suggest a lack of memory. Nmode calculations
require storing a 3N x 3N Hessian matrix (although only an upper-triangular
portion is saved), as well as substantial scratch space for the work the
diagonalizer has to do. When you run 4 threads that all happen to be
diagonalizing at the same time, you'll need 4x the amount of RAM
required for a single calculation.
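
For a rough sense of the scale involved (back-of-the-envelope arithmetic for your system, assuming double precision and packed upper-triangular storage, and ignoring the diagonalizer's scratch space):

# Hessian memory estimate for a 10550-atom complex
N = 10550                        # atoms
dim = 3 * N                      # 31650 degrees of freedom
elements = dim * (dim + 1) // 2  # upper triangle incl. diagonal
print('%.1f GB' % (elements * 8 / 1e9))  # ~4.0 GB per calculation

Four threads diagonalizing at once would therefore want roughly 16 GB for the Hessians alone.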

HTH,
Jason

--
Jason M. Swails
Quantum Theory Project,
University of Florida
Ph.D. Candidate
352-392-4032
-- 
Matthew B. Tessier
Complex Carbohydrate Research Center / Chemistry Dept.
University of Georgia
mbt3911.uga.edu
matthew.tessier.gmail.com
1-706-542-3508
_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber
Received on Mon Aug 20 2012 - 09:00:06 PDT