Hey,
please see below for further comments on this issue. In particular, we
have now compared the Python and the Perl version of MMPBSA on the same
molecular system. Although both use sander from Amber 10, the
decomposition energy values printed by sander differ drastically between
the Python and the Perl run.
On 10/18/2011 08:08 PM, Jason Swails wrote:
> On Tue, Oct 18, 2011 at 1:03 PM, Jan-Philip Gehrcke <jgehrcke.googlemail.com> wrote:
>
>> Sorry for the late reply. I have some more details regarding this issue:
>>
>> 1) weird: parsing error depends on MPI process count
>> ====================================================
>> An MMPBSA.py.MPI run using `mpirun -np 8` on 101 frames succeeds
>> (executed locally on my 8 core machine). An MMPBSA.py.MPI run using
>> `mpirun -np 2` using the *same files* with the *same input
>> configuration* on the same machine fails with the parsing error
>> indicated in the opening mail (due to "*********" tokens in the
>> _MMPBSA_ligand_gb.mdout).
>>
>> Let me provide more detail on the differences between the two
>> _MMPBSA_ligand_gb.mdout files created by these two independent runs:
>>
>>
>> - The "crashtest1/_MMPBSA_ligand_gb.mdout" corresponds to the -np 2 run.
>> - The file "crashtest3/_MMPBSA_ligand_gb.mdout" corresponds to the -np 8
>> run.
>>
>> First of all, I checked the number of lines containing "***" in both files:
>>
>> $ cat crashtest1/_MMPBSA_ligand_gb.mdout | grep "\*\*\*" | wc -l
>> 297
>>
>
> Which lines were these on (specifically, which terms were blowing up)? My
> suggestion here is to visualize the trajectories _MMPBSA_ligand.mdcrd(.x)
> with your ligand topology file to make sure that these are created properly.
>
For the file mentioned above, these are the "blowup counts" for all the
data columns:
internal: 297
vdw: 0
eel: 18
pol: 0
sas: 0
I visualized the _MMPBSA_ligand.mdcrd using the ligand topology file and
everything looked fine.
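To make the counting transparent: per-column blowup counts of this kind
can be obtained with a quick script along the following lines. This is
only a sketch (the file name count_blowups.py is made up): it assumes
that the decomposition lines start with "TDC" and list the residue
number followed by the internal, vdw, eel, pol and sas columns, so the
field slice may need adjusting for a different mdout layout.

$ cat count_blowups.py
# Sketch: count '****'-filled (blown-up) fields per decomposition column.
# Assumed line layout:  TDC <resnum> <internal> <vdw> <eel> <pol> <sas>
import sys

COLUMNS = ["internal", "vdw", "eel", "pol", "sas"]
counts = dict((name, 0) for name in COLUMNS)

with open(sys.argv[1]) as mdout:
    for line in mdout:
        if not line.startswith("TDC"):
            continue
        fields = line.split()[2:2 + len(COLUMNS)]
        for name, value in zip(COLUMNS, fields):
            if "*" in value:  # sander prints asterisks on format overflow
                counts[name] += 1

for name in COLUMNS:
    print("%s: %d" % (name, counts[name]))

$ python count_blowups.py crashtest1/_MMPBSA_ligand_gb.mdout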
>
>>
>> $ cat crashtest3/_MMPBSA_ligand_gb.mdout | grep "\*\*\*" | wc -l
>> 0
>>
>>
>> Then, I took "crashtest1/_MMPBSA_ligand_gb.mdout" (from the -np 2 run)
>> as reference. In this file, I grepped all lines starting with VDWAALS,
>> 1-4, BOND, or ESURF. For each grepped line, I checked whether an
>> identical line exists in "crashtest3/_MMPBSA_ligand_gb.mdout" (the -np
>> 8 run). If it does, it counts as a match:
>>
>> $ cat crashtest1/_MMPBSA_ligand_gb.mdout | grep VDWAALS | while read
>> LINE; do grep "${LINE}" crashtest3/_MMPBSA_ligand_gb.mdout; done | wc -l
>> 101
>>
>> $ cat crashtest1/_MMPBSA_ligand_gb.mdout | grep 1-4 | while read LINE;
>> do grep "${LINE}" crashtest3/_MMPBSA_ligand_gb.mdout; done | wc -l
>> 101
>>
>> $ cat crashtest1/_MMPBSA_ligand_gb.mdout | grep BOND | while read LINE;
>> do grep "${LINE}" crashtest3/_MMPBSA_ligand_gb.mdout; done | wc -l
>> 101
>>
>> $ cat crashtest1/_MMPBSA_ligand_gb.mdout | grep ESURF | while read
>> LINE; do grep "${LINE}" crashtest3/_MMPBSA_ligand_gb.mdout; done | wc -l
>> 80
>>
>> I believe that the two runs should not differ in their results, so
>> each grepped line from the reference file should find a matching line
>> in the comparison file. As MMPBSA analyzed 101 frames, 101 should be
>> the expected count for each term, right?
>>
>> However, the matching above shows that the ESURF results differ
>> significantly in 21 of 101 cases, solely due to the change from -np 2
>> to -np 8. How can this be explained?
>>
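(As an aside: grepping for byte-identical lines is of course a rather
blunt comparison. A numerical check of the ESURF values, e.g. along the
lines of the sketch below, would be less brittle. The script is only a
sketch: it assumes one "ESURF = <value>" entry per frame and the same
frame order in both files, and the 1e-4 kcal/mol tolerance is an
arbitrary choice.)

$ cat compare_esurf.py
# Sketch: compare the ESURF values of two mdout files numerically.
import re
import sys

def esurf_values(path):
    values = []
    with open(path) as mdout:
        for line in mdout:
            m = re.search(r"ESURF\s*=\s*(\S+)", line)
            if not m:
                continue
            try:
                values.append(float(m.group(1)))
            except ValueError:  # '********' overflow field
                values.append(None)
    return values

a = esurf_values(sys.argv[1])
b = esurf_values(sys.argv[2])
bad = sum(1 for x, y in zip(a, b)
          if x is None or y is None or abs(x - y) > 1e-4)
print("frames compared: %d, mismatching ESURF values: %d"
      % (min(len(a), len(b)), bad))

$ python compare_esurf.py crashtest1/_MMPBSA_ligand_gb.mdout \
    crashtest3/_MMPBSA_ligand_gb.mdout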
>
> I'll preface this by saying that decomposition uses gbsa=2 and all other GB
> calculations use gbsa=1 (LCPO).
>
> Are you using decomposition?
Yes, sorry that I did not provide this information in the first place.
> If so, the gbsa=2 code has known issues (at
> least 1 bug). If you run a 1-step minimization using imin=1 (so it's just
> an energy extraction) using gbsa=2, the ESURF energy printed out in the
> first step differs from the ESURF energy printed out in the summary, which
> is indicative of an issue. All other energies are identical, as you would
> expect. This was raised in a thread within the last year or so, but I'm too
> lazy to go after it now... I'm not exactly sure what causes this issue, but
> I'm not keen on hunting through the (sander) code looking for it.
Hence, we do not know exactly where the problem is. But I would like to
ask: is this just a guess, or is it clear to you that the bug you
describe explains why the resulting numerical values depend on the
number of MPI processes?
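Just to make sure I understand the single-point test you describe: an
input roughly like the one below, run on a single frame, and then
comparing the ESURF of step 1 against the ESURF in the final summary?
This is only a sketch -- igb=5 is chosen to match our setup, and the
topology and coordinate file names are placeholders.

$ cat gbsa2_singlepoint.in
single-point energy with gbsa=2 (sketch)
&cntrl
  imin = 1, maxcyc = 1, ncyc = 0,
  ntb = 0, cut = 999.0,
  igb = 5, gbsa = 2,
/

$ sander -O -i gbsa2_singlepoint.in -p ligand.prmtop \
    -c ligand_frame1.inpcrd -o gbsa2_singlepoint.out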
>
>
>> I performed both tests three times and reproduced the result.
>>
>>
>> 2) The initial trajectory
>> =========================
>> The initial trajectory (from which the frames are extracted) looks
>> fine in VMD. Also, the RMSD of the whole complex relative to the
>> initial structure is quite stable over time.
>>
>> Jason, you wrote "Try visualizing the created trajectories with VMD or
>> something to make sure that they look OK". So you mean I should also
>> visualize the extracted frames, yes? Can't we rely on ptraj having done
>> the extraction correctly? How would I check whether "they look OK"?
>>
>
> Extracted frames for the complex, receptor, and ligand, yes. By "look OK" I
> mean they don't look completely warped and ridiculous (caused by the fact
> that the atoms in the trajectory don't correspond to the atoms in the
> associated topology file).
>
>
>>
>>
>> 3) Various other frame count / -np X combinations
>> =================================================
>> I've performed some more tests under constant conditions. All MMPBSA
>> runs are based on the same trajectory file. Let me list the test
>> results as a function of the X in `-np X` and of the `interval`
>> variable in the `&general` block of the MMPBSA input file:
>>
>> - 11 Frames / np 2: success
>> - 101 Frames / np 2: parsing error
>>
>> - 11 Frames / np 8: success
>> - 101 Frames / np 8: success
>> - 10001 Frames / np 8: parsing error
>>
>>
>> Hence, just using "-np 8" is not a solution. I/we have to understand
>> what is going on here. What can I do now? Thanks for your help!
>>
>
> Try using NetCDF trajectories as intermediates (this won't work for Alanine
> scanning, but it will for everything else). You can do this via the
> "netcdf=1" variable in the MM/PBSA input file. Are your original
> trajectories NetCDF?
My original trajectory is saved in NetCDF format. I have not used
intermediate NetCDF trajectories so far, but I will test that.
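If I understand it correctly, switching them on just means adding
netcdf=1 to the &general section, roughly like this (sketch; the masks
are the ones from my input further below):

&general
  netcdf = 1,
  receptor_mask = :1-205,
  ligand_mask = :206-355,
/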
Comparison Perl/Python version
==============================
A colleague of mine, who has some experience with the Perl version of
MMPBSA, ran the analysis with equivalent input files and corresponding
settings:
1) From my original NetCDF trajectory, I extracted 101 ASCII frames.
2) Using this trajectory, I ran MMPBSA (Python, non-MPI) with the
following input:
$ cat mmpbsa_decomp.in
&general
receptor_mask = :1-205,
ligand_mask = :206-355,
/
&gb
igb = 5,
/
&decomp
idecomp = 1,
dec_verbose = 3,
/
This run resulted in the following per-column blowup counts:
internal: 1030
vdw: 0
eel: 615
pol: 34
sas: 0
The sander version used by this MMPBSA run was from Amber 10.
3) My colleague ran the Perl version of MMPBSA, also with sander from
Amber 10 and with input settings corresponding to the input file above.
All data input files were the *same* as in the Python MMPBSA run.
During the Perl MMPBSA run, an error was printed for only one of the 101
frames. Overall, the analysis apparently completed properly.
All energy values written by sander (Amber 10) when driven by the
Python MMPBSA are very different from those written by the same sander
when driven by the Perl MMPBSA. The values produced "by the Python
MMPBSA" are far too large. I've analyzed the output files:
- snapshots_lig.all.out (Perl version)
- _MMPBSA_ligand_gb.mdout (Python version)
All lines starting with "TDC" were extracted; they provide the columns
carrying the individual energy contributions. For each column, all
values that had not "blown up" were averaged (i.e., averaged over
residues and frames). The results are presented in the following form:
NAME (VALUE COUNT): MEAN OF ABSVALUE +- STDDEV OF ABSVALUE
Perl version:
internal (15150): 22.9 +- 6.0
vdw (15150): 4.4 +- 2.4
eel (15150): 36.8 +- 39.4
pol (15150): 16.1 +- 23.3
sas (15150): 59.1 +- 54.6
Python version:
internal (14714): 4226 +- 2890
vdw (15150): 417 +- 292
eel (14642): 3962 +- 2439
pol (15135): 746 +- 1312
sas (15150): 3007 +- 3563
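For reference, the averaging procedure described above can be
implemented with a short script along these lines (again only a sketch,
with the same TDC column-layout assumption as in the counting sketch
further up; fields containing '*' are skipped):

$ cat tdc_stats.py
# Sketch: per-column mean and standard deviation of the absolute TDC
# values, skipping blown-up ('****') fields.
# Assumed layout:  TDC <resnum> <internal> <vdw> <eel> <pol> <sas>
import sys
from math import sqrt

COLUMNS = ["internal", "vdw", "eel", "pol", "sas"]
values = dict((name, []) for name in COLUMNS)

with open(sys.argv[1]) as out:
    for line in out:
        if not line.startswith("TDC"):
            continue
        fields = line.split()[2:2 + len(COLUMNS)]
        for name, field in zip(COLUMNS, fields):
            if "*" not in field:
                values[name].append(abs(float(field)))

for name in COLUMNS:
    v = values[name]
    if not v:
        continue
    mean = sum(v) / len(v)
    std = sqrt(sum((x - mean) ** 2 for x in v) / len(v))
    print("%s (%d): %.1f +- %.1f" % (name, len(v), mean, std))

$ python tdc_stats.py snapshots_lig.all.out
$ python tdc_stats.py _MMPBSA_ligand_gb.mdout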
If we have not made a mistake in this comparison, it tells us that the
Python and the Perl version drive sander in different ways, and the
"Perl way" seems to be the more correct one... here you have to weigh
in!
Thanks for your help,
Jan-Philip
>
> All the best,
> Jason
>