Thanks for your comprehensive answer. I'm trying to understand as much as
possible about what is happening and why. Indeed, after checking the
trajectory file, I see that the first PDB written during generation of the
conformations is NOT the first frame in the trajectory. Its position in the
trajectory does match the position of the corresponding energy in the
energy file.

My alarm was actually raised because I was using MMPBSA 'stability
calculations' to check whether the (normalized) energies would match the QM
energies. The patterns were completely different, but that was because I
ran MMPBSA on each PDB in the order in which they were written and plotted
the results against the QM energies, which of course didn't correspond at
all. Once I reordered the QM energies accordingly (using sed), the QM and
MMPBSA energy profiles did correspond. As I stated earlier, though, I was
under the impression that the trajectory simply held every conformation in
the exact order in which it was initially written. Maybe I should have
checked this to begin with...
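
For illustration only, here is a minimal Python sketch of the kind of
reordering described above. The 'ConfN.oout' naming follows this thread, but
the listing order and the energy values are made up, and the helper names
(conf_index, reorder_to_listing) are hypothetical. It is not what mdgx does
internally; it only shows how a directory-listing order, a plain
alphabetical sort and a numeric sort can all differ, and how energies keyed
by conformation number can be put into a given listing order even when some
conformations are missing.

```python
import re

def conf_index(name):
    """Return the integer N embedded in a name such as 'Conf12.oout'."""
    match = re.search(r"(\d+)", name)
    if match is None:
        raise ValueError(f"no conformation number in {name!r}")
    return int(match.group(1))

def reorder_to_listing(energies_by_number, listing):
    """Arrange energies (keyed by conformation number) in listing order."""
    return [energies_by_number[conf_index(name)] for name in listing]

# Hypothetical order in which a directory scan might return the files.
listing = ["Conf10.oout", "Conf2.oout", "Conf1.oout", "Conf7.oout"]

# Made-up energies keyed by conformation number; gaps are allowed.
energies_by_number = {1: -0.5, 2: -0.1, 7: -0.9, 10: -0.3}

numeric_order = sorted(listing, key=conf_index)   # Conf1, Conf2, Conf7, Conf10
lexical_order = sorted(listing)                   # Conf1, Conf10, Conf2, Conf7
energies_in_listing_order = reorder_to_listing(energies_by_number, listing)

print(numeric_order)
print(lexical_order)
print(energies_in_listing_order)
```

Keying the energies by the number in the file name, rather than by position
in a list, means that missing conformations cannot shift the pairing.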

Thanks again for all your help and explanations.

Charles-Alexandre Mattelaer

2017-06-13 6:46 GMT+02:00 David Cerutti <dscerutti.gmail.com>:
> The function your input is calling is actually RegExpFileSearch() right
> above where I pointed you, but again it makes use of ent = readdir().
> There's a comment in there about "returns an alphabetized list" but I may
> not have gotten to that last bit, which would be what caused your alarm.
>
> Dave
>
>
> On Tue, Jun 13, 2017 at 12:23 AM, David Cerutti <dscerutti.gmail.com>
> wrote:
>
> > Sorry I didn't see this earlier! Here's what's happening. The energies
> > are not being extracted in random order, but rather in an order which
> > matches the resulting trajectory. In this sense, the energies will be
> > properly attributed to each conformation so long as you read the
> > energies_mdgx.dat file along with the trajectory. Try converting the
> > coordinates Coords.cdf to PDB / ASCII and verifying that the frames match:
> >
> > cat > conv.ptrj << EOF
> > trajin Coords.cdf
> > trajout Coords.pdb pdb
> > go
> > EOF
> >
> > There should be no problem; I've checked over this specifically. Back
> > when I did this same chore with a Python script, the ordering got mangled
> > by the imported glob library. When I actually sat down to write it in
> > proper C code, I began to see the issues. At bottom, the computer doesn't
> > alphabetize or numeratize things the way you do. If you wish to see under
> > the hood, look in AmberTools/src/mdgx/Parse.c for DirectoryFileSearch(),
> > and find the following:
> >
> > while ((ent = readdir(dir)) != NULL) {
> > ...
> > }
> >
> > It's that ent = readdir() operation that I'm reliant upon, which glob is
> > probably reliant upon, and which you might use yourself if you need to
> > write a directory scanning function at some time in the future. It will
> > list each file in the directory, but not in the order you might expect. I
> > simply make sure to extract coordinates and energies from one file at a
> > time and keep them in the order they were read, while culling the bad data
> > points.
> >
> > Try running a simple parameterization with energies_mdgx.dat and
> > Coords.cdf, and you should be able to get sane results (errors <= 3-4
> > kcal/mol once you include all the parameters relevant to the
> > configurations you've generated). Let me know if you don't!
> >
> > Dave
> >
> >
> > On Mon, Jun 12, 2017 at 9:36 AM, Charles-Alexandre Mattelaer <
> > camattelaer01.gmail.com> wrote:
> >
> >> Dear Amber users
> >>
> >> As I already mentioned, I'm using mdgx to generate parameters for
> >> modified nucleic acids. However, after extraction of the energies
> >> calculated by ORCA, the resulting 'energies_mdgx.dat' file listed the
> >> calculated energies in random(?) order. I'm not quite sure what is
> >> happening here, since the extraction should be fairly simple (enclosed
> >> file '04_ExtractEnergies'). Using a 'for' loop with the sed command, I
> >> generated my own 'energies_sed.dat' file for comparison.
> >>
> >> I enclosed the first of the ORCA outputs (Conf1.oout) so you can verify
> >> that this energy does not show up as the first entry in the
> >> 'energies_mdgx.dat' file. I also enclosed my own extracted
> >> 'energies_sed.dat' file where this is correctly performed.
> >>
> >> Additionally, I can add that during the generation of the conformers
> >> (also using mdgx) I lose several conformations, so that I don't have
> >> 'Conf1.orca' to 'ConfN.orca' consecutively. This results in several empty
> >> 'ConfX.oout' files, since ORCA is called consecutively in a 'for' loop. I
> >> don't know whether this could cause mdgx to shuffle the energies during
> >> extraction?
> >>
> >> I assume this influences the parameter fitting stage, since each energy
> >> should be attributed to its respective conformation in the trajectory
> >> file?
> >>
> >> Kind regards
> >>
> >> Charles-Alexandre Mattelaer
> >>
> >> _______________________________________________
> >> AMBER mailing list
> >> AMBER.ambermd.org
> >> http://lists.ambermd.org/mailman/listinfo/amber
Received on Wed Jun 14 2017 - 08:00:02 PDT