Re: [AMBER] leap cannot read MOL2 file that it has written before: small fix suggested

From: M. L. Dodson <mldodson.comcast.net>
Date: Fri, 11 Nov 2011 09:00:38 -0600

On Nov 11, 2011, at 4:00 AM, Jan-Philip Gehrcke wrote:

> Thanks for your comment, Bud. I propose a new change (explanation follows):
>
> --- src/leap/pdbFile.c.old 2011-11-11 10:54:44.552943823 +0100
> +++ src/leap/pdbFile.c 2011-11-11 10:52:11.704288206 +0100
> .. -1301,6 +1301,7 ..
> prPPdb->rRes = NULL;
> prPPdb->iPdbSequence = -9999;
> prPPdb->uUnit = (UNIT)oCreate(UNITid);
> + ContainerSetName(prPPdb->uUnit, "default_name");
> prPPdb->iNextUnit = 0;
> if ( prPPdb->iMaxSerialNum == 0 ) {
> VP0(( "\tNo atoms!\n" ));
>
> Speaking in the leap code language, a molecular structure is represented
> in a UNIT, which itself is contained in a CONTAINER. The `sName`
> attribute of this CONTAINER in the end gets written to the MOL2 file as
> molecule name. In container.c, line 127 `sName` is initialized as an
> empty string
>
> strcpy(c->sName, "" );
>
> *after* UNIT creation, i.e. defining a default name for the container
> carrying the UNIT in the UNIT constructor does not make sense, since it
> is overwritten later on.
>
> Hence, the "XXX" you are proposing was chosen to be an "" by the
> original author. This is also what ends up in the MOL2 file. We cannot
> estimate which and how many consequences a change of this default
> container name would have. I think another default string could change
> the behavior of leap in may different places, since basically everything
> in leap is represented as a CONTAINER. Therefore, I think it's risky to
> change this default name. At least it would require a lot of testing
> afterwards.
>

Got it. Without reading the code, my guess at how it works was too
simplistic.

> I suggest defining a new default name for UNITs created from PDB files.
> The correct place to do this is in function `zPdbReadAndCreateUnit` in
> pdbFile.c. In line 1303, the new UNIT -- which will carry the PDB
> content -- is constructed:
>
> prPPdb->uUnit = (UNIT)oCreate(UNITid);
>
> Consequently, one could add a new line 1304:
>
> ContainerSetName(prPPdb->uUnit, "default_name");
>
> This is also shown in the diff above. I've tested this; "default_name"
> ends up in the written MOL2 file as molecule name without any other
> changes to the original code.
>
> This would be entirely analogue to the behavior I see in tripos.c, which
> defines how MOL2 files are read. There, the molecule name is read from
> the input file and saved to the CONTAINER in the same way as now
> proposed for the PDB case (ll 138 and 140):
>
> T_FSCANF(iRet, fIn, sLine, (sLine, "%s", sUnitName), FAIL);
> ContainerSetName(uUnit, sUnitName);
>
> Jan-Philip
>

Works for me!
Bud

> On 11/11/2011 12:35 AM, M. L. Dodson wrote:
>>
>> On Nov 10, 2011, at 4:32 PM, Jan-Philip Gehrcke wrote:
>>
>>> Hey,
>>>
>>> yes, obviously the `sName` attribute of the container corresponding to x
>>> is not set after loading x via loadpdb.
>>>
>>> I've done a small test to illustrate what is missing. In mol2File.c,
>>> around line 293, I've written:
>>>
>>> VP0(("sName before ContainerSetName: %s\n", uUnit->cHeader.sName));
>>> ContainerSetName(&uUnit->cHeader, "powername");
>>> VP0(("lol cHeader: %s\n", uUnit->cHeader.sName));
>>>
>>> The rest of the file is unchanged, especially this line is active again:
>>>
>>> strcpy(sTemp, sContainerName((CONTAINER) uUnit));
>>>
>>> The output is:
>>>
>>>> Writing mol2 file: test.mol2
>>>> sName before ContainerSetName:
>>>> sName after ContainerSetName: powername
>>>
>>> "powername" then appears in the MOL2 file, i.e.
>>> `sContainerName((CONTAINER) uUnit)` returned a non-empty string.
>>>
>>>
>>> Hence, we have to make sure that `ContainerSetName(&uUnit->cHeader,
>>> name);` is called at some point after loading a PDB file to an object
>>> with a proper name.
>>>
>>> Jan-Philip
>>>
>>
>> That takes care of the pdb case, but there may be other ways of
>> creating/reading molecule (maybe in the future).
>>
>> Don't know anything about leap code or the mol2 format, but I was
>> thinking along the line of an additional line after the allocation
>> of memory for the molecule structure. Just set the name to a default
>> there. If it is never overwritten, it will show up as the default
>> name. If other code (say loadFoo) overwrites it, then that is the
>> correct behavior as well. I think it should be an obvious name,
>> e.g, "XXX", which is pretty generally meant to indicate additional
>> editing is needed at a later time.
>>
>> Bud
>>
>> PS, you need to get someone commit whatever you decide if you don't
>> have privileges.
>>
>>>
>>> On 10.11.2011 22:50, M. L. Dodson wrote:
>>>> Replying to myself again, it appears that when a molecule
>>>> is read in from a pdb, there is no molecule name assigned
>>>> (in the mol2-format sense). Maybe we should initialize the
>>>> molecule name in the code before we even get started. If we
>>>> need it for a molecule read from a pdb file, then the default
>>>> is there. I suggest something like REPLACEME. If the name
>>>> is assigned by some other way, then it overwrites the default.
>>>>
>>>> Bud
>>>>
>>>> On Nov 10, 2011, at 3:30 PM, M. L. Dodson wrote:
>>>>
>>>>> Replying to myself.
>>>>>
>>>>> If you write a pdb with x, then use it the way you describe,
>>>>> you get the behavior you indicated. Something is going
>>>>> wrong. What leap has after the combine is different from
>>>>> what it has from reading the pdb file:
>>>>>
>>>>> source leaprc.ff99SB
>>>>> x = combine {ACE THR NME}
>>>>> savepdb x test.pdb
>>>>> savemol2 x test.mol2 0
>>>>> y = loadmol2 test.mol2
>>>>> quit
>>>>>
>>>>> versus
>>>>>
>>>>> source leaprc.ff99SB
>>>>> x = loadpdb test.pdb
>>>>> savemol2 x test.mol2 0
>>>>> y = loadmol2 test.mol2
>>>>> quit
>>>>>
>>>>> One thing I noticed is that x from combine has TER cards in
>>>>> the written pdb file, whereas the pdb file I used in the
>>>>> second case had the TER cards removed. (But the behavior
>>>>> you described happened with or without the TER cards.)
>>>>>
>>>>> Puzzled,
>>>>> Bud
>>
>
>
> _______________________________________________
> AMBER mailing list
> AMBER.ambermd.org
> http://lists.ambermd.org/mailman/listinfo/amber

-- 
M. L. Dodson
Business email: activesitedynamics-at-gmail-dot-com
Personal email: mldodson-at-comcast-dot-net
Gmail: mlesterdodson-at-gmail-dot-com
Phone: eight_three_two-five_63-386_one
_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber
Received on Fri Nov 11 2011 - 07:30:02 PST
Custom Search