Re: [AMBER] MMPBSA.py: ECAVITY / ENPOLAR discrepancy from Jan-Philip Gehrcke on 2013-05-03 (Amber Archive May 2013)

From: Jan-Philip Gehrcke <jgehrcke.googlemail.com>
Date: Fri, 03 May 2013 17:28:53 +0200

On 05/03/2013 04:12 PM, Jason Swails wrote:
>
>> Also, I have seen in the code that there is a print_summary_csv() method
>> somewhere whose purpose seems to be to write the same information as in
>> FINAL_RESULTS_MMPBSA.dat to a CSV file -- but from the docs and from the
>> code I was not able to find the external interface to this method. Do
>> you have plans to provide the results in FINAL_RESULTS_MMPBSA.dat in a
>> machine-readable output file (not surprisingly the
>> FINAL_RESULTS_MMPBSA.dat 'format' itself has changed between MMPBSA.py
>> versions...)?
>>
>
> The CSV format for the standard output was an idea initially, then I
> decided against it since a large amount of the information printed in the
> output file is not really data, but more just information about the
> calculation. As a result, I never implemented the summary CSV file. The
> decomposition results, on the other hand, are almost entirely data, so
> those have been implemented in CSV format. So too have the energy dumps
> (-eo and -deo flags). If there's general interest in a particular feature,
> I can work on that for the next release.

I have this request :-) The FINAL..dat file definitely makes sense; it
indeed contains a lot of useful information besides just data. For the
data itself, in addition to what we have now, a machine-readable file
would be great for further post-processing. JSON would be quite an
appropriate output format here: instead of forcing structured data into
columns of a CSV file, one could and should keep the structure within a
JSON representation. Also, JSON has a clear definition of data types and
excellent language support. As a bonus, using JSON from Python is more
straightforward than using CSV. If you like, I could work on a patch.
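
To illustrate the point about keeping structure and data types, here is a
minimal Python sketch of a JSON round trip. The key names ("delta",
"TOTAL", "mean", "std", "sem") are purely hypothetical and not an
existing MMPBSA.py output format; the sketch also assumes the three
numbers on a result line are average, standard deviation, and standard
error of the mean.

```python
import json

# Hypothetical nested layout; none of these key names come from MMPBSA.py.
results = {
    "delta": {
        "TOTAL": {"mean": -20.4747, "std": 7.4389, "sem": 0.7439},
    },
}

# Serialize to text and read it back. The nesting and the float type
# survive without any column-position conventions.
text = json.dumps(results, indent=2, sort_keys=True)
restored = json.loads(text)
```

A consumer can then address a value by name, e.g.
restored["delta"]["TOTAL"]["mean"], instead of by column index.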

> I haven't changed the actual format of the output appreciably since the
> original version 3 to 4 years ago. I have added new terms as they became
> necessary (like improper, urey-bradley, and CMAP terms for chamber
> topologies, the SCF terms for QM/MM, the RISM solvation terms, EDISPER,
> etc.), but hopefully nothing that breaks existing parsers (am I wrong here?)

:-) At least one thing comes to my mind. What once was

DELTA G binding = -43.2307 +/- 4.2578 0.4258

now is

DELTA TOTAL -20.4747 7.4389 0.7439

This changes both the keyword and the number of columns a parser would
key on. I think the point here is that a file like the FINAL..dat is
simply not meant to guarantee a stable format for parsers. So you are
definitely not to blame for changing things there.
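
For what it's worth, a parser can be made tolerant of both variants
quoted above. The following is only a sketch: the function name
parse_delta_total is made up, and it assumes the two line shapes shown in
this message are the only ones that matter, which may not hold for other
MMPBSA.py versions.

```python
import re

# A float needs at least one digit, so the bare "+/-" token is skipped.
_FLOAT = r"[-+]?\d+(?:\.\d+)?(?:[eE][-+]?\d+)?"
# Accept either the old keyword ("DELTA G binding") or the new one ("DELTA TOTAL").
_DELTA_LINE = re.compile(r"^\s*DELTA\s+(?:G\s+binding|TOTAL)\b(.*)$")


def parse_delta_total(line):
    """Return the three numbers on a DELTA result line as a tuple, or None."""
    m = _DELTA_LINE.match(line)
    if m is None:
        return None
    values = [float(x) for x in re.findall(_FLOAT, m.group(1))]
    if len(values) != 3:
        # Unexpected column count: refuse rather than guess.
        return None
    return tuple(values)
```

Returning None on an unexpected column count is a deliberate choice, so
that a further format change fails visibly instead of yielding wrong
numbers.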

>
> On the other hand, there's a new API (not _really_ an API, but an easy way
> to extract data from the _MMPBSA_ intermediate files into your own python
> script with a single function call) so you can do your own types of data
> analysis in Python using the numpy/scipy machinery (documented in the
> AmberTools 13 manual).

I was actually trying this out. I like the idea, but for my workflow in
particular there is one major disadvantage: since you decided to
re-calculate results within MMPBSA_API.load_mmpbsa_info(), this call
is slow. I have lots and lots of MMPBSA runs and am playing with
their data and merging it. When I just parse the FINAL..dat of all
MMPBSA runs myself, I get my merged result instantaneously. Using
the API, it feels like forever.
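
One possible workaround for the slow load is to cache each run's
extracted numbers on disk, so the expensive call happens only once per
run. The sketch below is generic: load_cached and the ".cache.json"
suffix are made up, the loader argument stands for any slow function
(for example a wrapper around the API call), and it assumes the loader's
result has already been reduced to plain JSON-serializable data.

```python
import json
import os


def load_cached(result_path, loader, cache_suffix=".cache.json"):
    """Return loader(result_path), reusing a JSON cache file when it is fresh.

    `loader` is any callable taking a path and returning JSON-serializable
    data; the cache is stored next to the input file.
    """
    cache_path = result_path + cache_suffix
    src_mtime = os.path.getmtime(result_path)
    # Reuse the cache only if it is at least as new as the source file.
    if os.path.exists(cache_path) and os.path.getmtime(cache_path) >= src_mtime:
        with open(cache_path) as fh:
            return json.load(fh)
    data = loader(result_path)
    with open(cache_path, "w") as fh:
        json.dump(data, fh)
    return data
```

The first call per run is still slow, but repeated merging over many
runs then only reads small JSON files.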

Thanks for your comments,

Jan-Philip

Received on Fri May 03 2013 - 09:00:03 PDT