Re: [AMBER] leap cannot read MOL2 file that it has written before: small fix suggested

From: Jan-Philip Gehrcke <jgehrcke.googlemail.com>
Date: Fri, 11 Nov 2011 11:00:09 +0100

Thanks for your comment, Bud. I propose a new change (explanation follows):

--- src/leap/pdbFile.c.old 2011-11-11 10:54:44.552943823 +0100
+++ src/leap/pdbFile.c 2011-11-11 10:52:11.704288206 +0100
.. -1301,6 +1301,7 ..
      prPPdb->rRes = NULL;
      prPPdb->iPdbSequence = -9999;
      prPPdb->uUnit = (UNIT)oCreate(UNITid);
+ ContainerSetName(prPPdb->uUnit, "default_name");
      prPPdb->iNextUnit = 0;
      if ( prPPdb->iMaxSerialNum == 0 ) {
          VP0(( "\tNo atoms!\n" ));

Speaking in the leap code language, a molecular structure is represented
in a UNIT, which itself is contained in a CONTAINER. The `sName`
attribute of this CONTAINER in the end gets written to the MOL2 file as
molecule name. In container.c, line 127 `sName` is initialized as an
empty string

     strcpy(c->sName, "" );

*after* UNIT creation, i.e. defining a default name for the container
carrying the UNIT in the UNIT constructor does not make sense, since it
is overwritten later on.

Hence, the "XXX" you are proposing was chosen to be an "" by the
original author. This is also what ends up in the MOL2 file. We cannot
estimate which and how many consequences a change of this default
container name would have. I think another default string could change
the behavior of leap in may different places, since basically everything
in leap is represented as a CONTAINER. Therefore, I think it's risky to
change this default name. At least it would require a lot of testing
afterwards.

I suggest defining a new default name for UNITs created from PDB files.
The correct place to do this is in function `zPdbReadAndCreateUnit` in
pdbFile.c. In line 1303, the new UNIT -- which will carry the PDB
content -- is constructed:

     prPPdb->uUnit = (UNIT)oCreate(UNITid);

Consequently, one could add a new line 1304:

     ContainerSetName(prPPdb->uUnit, "default_name");

This is also shown in the diff above. I've tested this; "default_name"
ends up in the written MOL2 file as molecule name without any other
changes to the original code.

This would be entirely analogue to the behavior I see in tripos.c, which
defines how MOL2 files are read. There, the molecule name is read from
the input file and saved to the CONTAINER in the same way as now
proposed for the PDB case (ll 138 and 140):

     T_FSCANF(iRet, fIn, sLine, (sLine, "%s", sUnitName), FAIL);
     ContainerSetName(uUnit, sUnitName);

Jan-Philip

On 11/11/2011 12:35 AM, M. L. Dodson wrote:
>
> On Nov 10, 2011, at 4:32 PM, Jan-Philip Gehrcke wrote:
>
>> Hey,
>>
>> yes, obviously the `sName` attribute of the container corresponding to x
>> is not set after loading x via loadpdb.
>>
>> I've done a small test to illustrate what is missing. In mol2File.c,
>> around line 293, I've written:
>>
>> VP0(("sName before ContainerSetName: %s\n", uUnit->cHeader.sName));
>> ContainerSetName(&uUnit->cHeader, "powername");
>> VP0(("lol cHeader: %s\n", uUnit->cHeader.sName));
>>
>> The rest of the file is unchanged, especially this line is active again:
>>
>> strcpy(sTemp, sContainerName((CONTAINER) uUnit));
>>
>> The output is:
>>
>>> Writing mol2 file: test.mol2
>>> sName before ContainerSetName:
>>> sName after ContainerSetName: powername
>>
>> "powername" then appears in the MOL2 file, i.e.
>> `sContainerName((CONTAINER) uUnit)` returned a non-empty string.
>>
>>
>> Hence, we have to make sure that `ContainerSetName(&uUnit->cHeader,
>> name);` is called at some point after loading a PDB file to an object
>> with a proper name.
>>
>> Jan-Philip
>>
>
> That takes care of the pdb case, but there may be other ways of
> creating/reading molecule (maybe in the future).
>
> Don't know anything about leap code or the mol2 format, but I was
> thinking along the line of an additional line after the allocation
> of memory for the molecule structure. Just set the name to a default
> there. If it is never overwritten, it will show up as the default
> name. If other code (say loadFoo) overwrites it, then that is the
> correct behavior as well. I think it should be an obvious name,
> e.g, "XXX", which is pretty generally meant to indicate additional
> editing is needed at a later time.
>
> Bud
>
> PS, you need to get someone commit whatever you decide if you don't
> have privileges.
>
>>
>> On 10.11.2011 22:50, M. L. Dodson wrote:
>>> Replying to myself again, it appears that when a molecule
>>> is read in from a pdb, there is no molecule name assigned
>>> (in the mol2-format sense). Maybe we should initialize the
>>> molecule name in the code before we even get started. If we
>>> need it for a molecule read from a pdb file, then the default
>>> is there. I suggest something like REPLACEME. If the name
>>> is assigned by some other way, then it overwrites the default.
>>>
>>> Bud
>>>
>>> On Nov 10, 2011, at 3:30 PM, M. L. Dodson wrote:
>>>
>>>> Replying to myself.
>>>>
>>>> If you write a pdb with x, then use it the way you describe,
>>>> you get the behavior you indicated. Something is going
>>>> wrong. What leap has after the combine is different from
>>>> what it has from reading the pdb file:
>>>>
>>>> source leaprc.ff99SB
>>>> x = combine {ACE THR NME}
>>>> savepdb x test.pdb
>>>> savemol2 x test.mol2 0
>>>> y = loadmol2 test.mol2
>>>> quit
>>>>
>>>> versus
>>>>
>>>> source leaprc.ff99SB
>>>> x = loadpdb test.pdb
>>>> savemol2 x test.mol2 0
>>>> y = loadmol2 test.mol2
>>>> quit
>>>>
>>>> One thing I noticed is that x from combine has TER cards in
>>>> the written pdb file, whereas the pdb file I used in the
>>>> second case had the TER cards removed. (But the behavior
>>>> you described happened with or without the TER cards.)
>>>>
>>>> Puzzled,
>>>> Bud
>


_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber
Received on Fri Nov 11 2011 - 02:30:02 PST
Custom Search