Re: [AMBER] Generate a pdb with unique atom names

From: Jean-Patrick Francoia <jeanpatrick.francoia.gmail.com>
Date: Tue, 15 Apr 2014 09:52:37 +0200

Le 15/04/2014 07:31, FyD a écrit :
> Dear Jean-Patrick,
>
> If you do want to parameterize a polymer I would like to suggest you
> to use R.E.D. Server Development/R.E.D. Python at
> http://q4md-forcefieldtools.org/REDS-Development/
>
> You need to split your polymer into elementary building blocks...
>
> See examples in R.E.DD.B.:
> see http://q4md-forcefieldtools.org/REDDB/projects/F-93/
> http://q4md-forcefieldtools.org/REDDB/projects/F-90/
> http://q4md-forcefieldtools.org/REDDB/projects/F-60/
>
> Tutorials at http://q4md-forcefieldtools.org/Tutorial/
>
> Finally the OFF library file format has advantages, but also limitations;
> See http://q4md-forcefieldtools.org/Tutorial/leap-mol3.php
>
> We are about to release a new version of REDS-Development/R.E.D.
> Python and a new tutorial.
>
> If you are interested I can give you the address of the new web site
> and tutorial...
>
> regards, Francois
>
>
>> Le 14/04/2014 14:10, Jason Swails a écrit :
>>> On Sun, 2014-04-13 at 21:40 +0200, Jean-Patrick Francoia wrote:
>>>> Hello,
>>>>
>>>> I am a master student, and one of my course is molecular dynamic. So, in
>>>> class we use (the old) Amber 9 to minimize some structures, for example.
>>>> Also, one of our exercices is to build the structure of a lysine polymer
>>>> with Python (a programmation language), and then to minimize the
>>>> structure with Amber.
>>>>
>>>> The important thing is my Python program generates a SMILE string. I can
>>>> import this string with plenty of softwares, and then save it into the
>>>> pdb format. But in this pdb, each atom does not have a unique name, a
>>>> very common issue when using Amber.
>>>>
>>>> So, I would like to know if there is a solution to assign each atom a
>>>> unique name. I have seen a lot of threads dealing about this problem on
>>>> the mailing list, but I have never seen any solution. I can't obviously
>>>> rename each atom by hand.
>>> LEaP must know about every residue defined in an input structure so it
>>> knows what atom types to apply to each atom as well as what charges to
>>> apply. To that end, the atom names of every atom must match the names
>>> in the library files so that LEaP can map the atom names to the atom
>>> types, charges, and connectivities.
>>>
>>> The atom names of the amino acid and nucleic acid residues match the
>>> atom names used by the PDB so that almost any PDB you download from the
>>> official Protein Database will work. If you are creating your own PDB
>>> from SMILES strings, you will need to fix the atom names yourself. If
>>> you are already using Python to do some file manipulations, you can use
>>> Python to write a script to rename your atoms.
>>>
>>> Alternatively, you can use LEaP to create your Lysine polymer with the
>>> "sequence" command. If you want a 3-mer of lysine residues, the command
>>>
>>> lys3 = sequence {NLYS LYS CLYS}
>>>
>>> will generate a tripeptide with 3 lysine residues.
>>>
>>> HTH,
>>> Jason
>>>
>> I can't use leap to generate the polymer, sadly. Mine is a complex one,
>> branched on the epsilon and alpha amines, with several hundreds of amino
>> acids.
>>
>> Ok I get it. But how can I know how the atoms are named in the library ?
>> What is the naming algorythm ?
>>
>> And what about generating my own library for my molecule ? Like they do
>> here:
>> http://ambermd.org/tutorials/basic/tutorial4b/
>
>
> _______________________________________________
> AMBER mailing list
> AMBER.ambermd.org
> http://lists.ambermd.org/mailman/listinfo/amber


I'm answering to both of you, dear Jason and Francois:

"If you do want to parameterize a polymer I would like to suggest you to
use R.E.D. Server Development/R.E.D. Python at
http://q4md-forcefieldtools.org/REDS-Development/"

It is (almost..) what they do in this tutorial, isn't it ?
http://ambermd.org/tutorials/basic/tutorial4b/

And what do you mean by splitting my polymer ? It is composed of 48
Lysines right now (first generation), but in the end, it will be
composed of around 1000. What I really need to do is a honest
minimization, and maybe a bit of docking later. I don't need to be super
accurate.


.Jason:

Correct me if I'm wrong. I wrote a little Python script which "clean" a
bit the pdb generated by my visualization program, which transforms my
SMILES chain into a pdb. I now have a pdb, wherein each atom has a
unique name (element + a random alpha-numeric chain). I attached this
pdb with the name "names.pdb".

I can load this pdb with Amber:

G2 = loadpdb names.pdb
Loading PDB file: ./names.pdb
Created a new atom named: O5K3 within residue: .R<CLYS 0>
Created a new atom named: CB6H within residue: .R<CLYS 0>
Created a new atom named: OF0C within residue: .R<CLYS 0>
[...]
Added missing heavy atom: .R<CLYS 0>.A<OXT 23>
   Added missing heavy atom: .R<CLYS 0>.A<C 21>
   Added missing heavy atom: .R<CLYS 0>.A<CA 3>
   Added missing heavy atom: .R<CLYS 0>.A<O 22>
   Added missing heavy atom: .R<CLYS 0>.A<N 1>
   Added missing heavy atom: .R<CLYS 0>.A<CB 5>
   Added missing heavy atom: .R<CLYS 0>.A<CG 8>
   Added missing heavy atom: .R<CLYS 0>.A<CD 11>
   Added missing heavy atom: .R<CLYS 0>.A<CE 14>
   Added missing heavy atom: .R<CLYS 0>.A<NZ 17>
   total atoms in file: 1011
   Leap added 23 missing atoms according to residue templates:
        10 Heavy
        13 H / lone pairs
   The file contained 1011 atoms not in residue templates

I can then save the pdb, and everything seems all right (no duplicate),
except I think leap added a supplementary lysine. (And why the name of
the residue is mentionned as CLYS ?)

Of course you noticed in the pdb "names.pdb", atoms don't have a type,
so when I do:

saveamberparm G2 G2.prmtop G2.inpcrd
Checking Unit.
FATAL: Atom .R<CLYS 0>.A<O5K3 24> does not have a type.
[...]

I have something like this. So, I looked in the OFF library
documentation (thanks for the link by the way):

!entry.CLYS.unit.atoms table str name str type int typex int resx int
flags int seq int elmnt dbl chg

"N" "N" 0 1 131072 1 7 -0.463000
"H" "H" 0 1 131072 2 1 0.252000
"CA" "CT" 0 1 131072 3 6 0.035000
"HA" "HC" 0 1 131072 4 1 0.048000
"CB" "CT" 0 1 131072 5 6 -0.098000
"HB2" "HC" 0 1 131072 6 1 0.038000
"HB3" "HC" 0 1 131072 7 1 0.038000
"CG" "CT" 0 1 131072 8 6 -0.210000
"HG2" "HC" 0 1 131072 9 1 0.116000
"HG3" "HC" 0 1 131072 10 1 0.116000
"CD" "CT" 0 1 131072 11 6 -0.230000
"HD2" "HC" 0 1 131072 12 1 0.122000
"HD3" "HC" 0 1 131072 13 1 0.122000
"CE" "CT" 0 1 131072 14 6 -0.138000
"HE2" "HC" 0 1 131072 15 1 0.098000
"HE3" "HC" 0 1 131072 16 1 0.098000
"NZ" "N3" 0 1 131072 17 7 -0.138000
"HZ1" "H3" 0 1 131072 18 1 0.094000
"HZ2" "H3" 0 1 131072 19 1 0.094000
"HZ3" "H3" 0 1 131072 20 1 0.094000
"C" "C" 0 1 131072 21 6 0.624000
"O" "O2" 0 1 131072 22 8 -0.356000
"OXT" "O2" 0 1 131072 23 8 -0.356000


So it means each atom in the residue has an unique name, of course, and
a type. A type can be shared with other atoms. There are basically only
these types:
N
H
CT
HC
N3
H3
C
O2

I assume these types match the chemical types: aliphatic carbon, epsilon
amine, alpha amine, etc. So now, I just have to modify my parser to
assign the good type to each atom, right ?


Thank you anyway, both of you, for your detailled answers.








_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber

Received on Tue Apr 15 2014 - 01:00:02 PDT
Custom Search