Re: [AMBER] Generate a pdb with unique atom names from FyD on 2014-04-15 (Amber Archive Apr 2014)

From: FyD <fyd.q4md-forcefieldtools.org>
Date: Tue, 15 Apr 2014 14:51:15 +0200

Jean-Patrick,

> "If you do want to parameterize a polymer I would like to suggest you
> to use R.E.D. Server Development/R.E.D. Python at
> http://q4md-forcefieldtools.org/REDS-Development/"
>
> It is (almost..) what they do in this tutorial, isn't it ?
> http://ambermd.org/tutorials/basic/tutorial4b/

I do not think so...

In this 'basic/tutorial4b' tutorial a small/basic molecule is
parameterized - this is totally different from parameterizing a
biopolymer, where the later has to be decomposed into elementary
building blocks, and where molecular fragments have to be
designed/generated...

In this context the mol3 file format has some advantages (over the off
file format for instance)...

R.E.D. Server Dev/R.E.D. Python allows force field generation for a
small molecule, a bunch of small molecules as well as for any type of
polymers.

See once again the work of J. Sanders (Sanders et al. force field) in
REDDB for instance:
http://q4md-forcefieldtools.org/REDDB/projects/F-93/ where a force
field for the PNA biopolymer
(http://en.wikipedia.org/wiki/Peptide_nucleic_acid) is designed.

--

Concerning your PDB file; three points:
- when designing a force field library two atoms cannot share the same
name in a given residue; BUT FORTUNATELY two atoms can share the same
name if they belong to two different residues. Let's take an example:
NLYS-LYS-LYS-CLYS
there are four LYS residues and the four alpha-carbons have the same
name: CA because they belong to four different residues.

- you need to know at which pH you want to use your force field; thus
the amino group of the side chain of Lysine is protonated at pH = 7.

- your structure (names.pdb) just corresponds to the polymerization of
a small number of different residues; each one being represented by a
central, N-terminal and C-terminal fragments...

> And what do you mean by splitting my polymer ? It is composed of 48
> Lysines right now (first generation), but in the end, it will be
> composed of around 1000. What I really need to do is a honest
> minimization, and maybe a bit of docking later. I don't need to be
> super accurate.

obviously you can design a polymer composed of 48 or 1000 units using
the same set of molecular fragments...

regards, Francois

>> If you do want to parameterize a polymer I would like to suggest you
>> to use R.E.D. Server Development/R.E.D. Python at
>> http://q4md-forcefieldtools.org/REDS-Development/
>>
>> You need to split your polymer into elementary building blocks...
>>
>> See examples in R.E.DD.B.:
>> see http://q4md-forcefieldtools.org/REDDB/projects/F-93/
>> http://q4md-forcefieldtools.org/REDDB/projects/F-90/
>> http://q4md-forcefieldtools.org/REDDB/projects/F-60/
>>
>> Tutorials at http://q4md-forcefieldtools.org/Tutorial/
>>
>> Finally the OFF library file format has advantages, but also limitations;
>> See http://q4md-forcefieldtools.org/Tutorial/leap-mol3.php
>>
>> We are about to release a new version of REDS-Development/R.E.D.
>> Python and a new tutorial.
>>
>> If you are interested I can give you the address of the new web site
>> and tutorial...
>>
>> regards, Francois
>>
>>
>>> Le 14/04/2014 14:10, Jason Swails a écrit :
>>>> On Sun, 2014-04-13 at 21:40 +0200, Jean-Patrick Francoia wrote:
>>>>> Hello,
>>>>>
>>>>> I am a master student, and one of my course is molecular dynamic. So, in
>>>>> class we use (the old) Amber 9 to minimize some structures, for example.
>>>>> Also, one of our exercices is to build the structure of a lysine polymer
>>>>> with Python (a programmation language), and then to minimize the
>>>>> structure with Amber.
>>>>>
>>>>> The important thing is my Python program generates a SMILE string. I can
>>>>> import this string with plenty of softwares, and then save it into the
>>>>> pdb format. But in this pdb, each atom does not have a unique name, a
>>>>> very common issue when using Amber.
>>>>>
>>>>> So, I would like to know if there is a solution to assign each atom a
>>>>> unique name. I have seen a lot of threads dealing about this problem on
>>>>> the mailing list, but I have never seen any solution. I can't obviously
>>>>> rename each atom by hand.
>>>> LEaP must know about every residue defined in an input structure so it
>>>> knows what atom types to apply to each atom as well as what charges to
>>>> apply. To that end, the atom names of every atom must match the names
>>>> in the library files so that LEaP can map the atom names to the atom
>>>> types, charges, and connectivities.
>>>>
>>>> The atom names of the amino acid and nucleic acid residues match the
>>>> atom names used by the PDB so that almost any PDB you download from the
>>>> official Protein Database will work. If you are creating your own PDB
>>>> from SMILES strings, you will need to fix the atom names yourself. If
>>>> you are already using Python to do some file manipulations, you can use
>>>> Python to write a script to rename your atoms.
>>>>
>>>> Alternatively, you can use LEaP to create your Lysine polymer with the
>>>> "sequence" command. If you want a 3-mer of lysine residues, the command
>>>>
>>>> lys3 = sequence {NLYS LYS CLYS}
>>>>
>>>> will generate a tripeptide with 3 lysine residues.
>>>>
>>>> HTH,
>>>> Jason
>>>>
>>> I can't use leap to generate the polymer, sadly. Mine is a complex one,
>>> branched on the epsilon and alpha amines, with several hundreds of amino
>>> acids.
>>>
>>> Ok I get it. But how can I know how the atoms are named in the library ?
>>> What is the naming algorythm ?
>>>
>>> And what about generating my own library for my molecule ? Like they do
>>> here:
>>> http://ambermd.org/tutorials/basic/tutorial4b/
>>
>>
>> _______________________________________________
>> AMBER mailing list
>> AMBER.ambermd.org
>> http://lists.ambermd.org/mailman/listinfo/amber
>
>
> I'm answering to both of you, dear Jason and Francois:
>
> "If you do want to parameterize a polymer I would like to suggest you
> to use R.E.D. Server Development/R.E.D. Python at
> http://q4md-forcefieldtools.org/REDS-Development/"
>
> It is (almost..) what they do in this tutorial, isn't it ?
> http://ambermd.org/tutorials/basic/tutorial4b/
>
> And what do you mean by splitting my polymer ? It is composed of 48
> Lysines right now (first generation), but in the end, it will be
> composed of around 1000. What I really need to do is a honest
> minimization, and maybe a bit of docking later. I don't need to be
> super accurate.
>
>
> .Jason:
>
> Correct me if I'm wrong. I wrote a little Python script which "clean" a
> bit the pdb generated by my visualization program, which transforms my
> SMILES chain into a pdb. I now have a pdb, wherein each atom has a
> unique name (element + a random alpha-numeric chain). I attached this
> pdb with the name "names.pdb".
>
> I can load this pdb with Amber:
>
> G2 = loadpdb names.pdb
> Loading PDB file: ./names.pdb
> Created a new atom named: O5K3 within residue: .R<CLYS 0>
> Created a new atom named: CB6H within residue: .R<CLYS 0>
> Created a new atom named: OF0C within residue: .R<CLYS 0>
> [...]
> Added missing heavy atom: .R<CLYS 0>.A<OXT 23>
> Added missing heavy atom: .R<CLYS 0>.A<C 21>
> Added missing heavy atom: .R<CLYS 0>.A<CA 3>
> Added missing heavy atom: .R<CLYS 0>.A<O 22>
> Added missing heavy atom: .R<CLYS 0>.A<N 1>
> Added missing heavy atom: .R<CLYS 0>.A<CB 5>
> Added missing heavy atom: .R<CLYS 0>.A<CG 8>
> Added missing heavy atom: .R<CLYS 0>.A<CD 11>
> Added missing heavy atom: .R<CLYS 0>.A<CE 14>
> Added missing heavy atom: .R<CLYS 0>.A<NZ 17>
> total atoms in file: 1011
> Leap added 23 missing atoms according to residue templates:
> 10 Heavy
> 13 H / lone pairs
> The file contained 1011 atoms not in residue templates
>
> I can then save the pdb, and everything seems all right (no duplicate),
> except I think leap added a supplementary lysine. (And why the name of
> the residue is mentionned as CLYS ?)
>
> Of course you noticed in the pdb "names.pdb", atoms don't have a type,
> so when I do:
>
> saveamberparm G2 G2.prmtop G2.inpcrd
> Checking Unit.
> FATAL: Atom .R<CLYS 0>.A<O5K3 24> does not have a type.
> [...]
>
> I have something like this. So, I looked in the OFF library
> documentation (thanks for the link by the way):
>
> !entry.CLYS.unit.atoms table str name str type int typex int resx int
> flags int seq int elmnt dbl chg
>
> "N" "N" 0 1 131072 1 7 -0.463000
> "H" "H" 0 1 131072 2 1 0.252000
> "CA" "CT" 0 1 131072 3 6 0.035000
> "HA" "HC" 0 1 131072 4 1 0.048000
> "CB" "CT" 0 1 131072 5 6 -0.098000
> "HB2" "HC" 0 1 131072 6 1 0.038000
> "HB3" "HC" 0 1 131072 7 1 0.038000
> "CG" "CT" 0 1 131072 8 6 -0.210000
> "HG2" "HC" 0 1 131072 9 1 0.116000
> "HG3" "HC" 0 1 131072 10 1 0.116000
> "CD" "CT" 0 1 131072 11 6 -0.230000
> "HD2" "HC" 0 1 131072 12 1 0.122000
> "HD3" "HC" 0 1 131072 13 1 0.122000
> "CE" "CT" 0 1 131072 14 6 -0.138000
> "HE2" "HC" 0 1 131072 15 1 0.098000
> "HE3" "HC" 0 1 131072 16 1 0.098000
> "NZ" "N3" 0 1 131072 17 7 -0.138000
> "HZ1" "H3" 0 1 131072 18 1 0.094000
> "HZ2" "H3" 0 1 131072 19 1 0.094000
> "HZ3" "H3" 0 1 131072 20 1 0.094000
> "C" "C" 0 1 131072 21 6 0.624000
> "O" "O2" 0 1 131072 22 8 -0.356000
> "OXT" "O2" 0 1 131072 23 8 -0.356000
>
>
> So it means each atom in the residue has an unique name, of course, and
> a type. A type can be shared with other atoms. There are basically only
> these types:
> N
> H
> CT
> HC
> N3
> H3
> C
> O2
>
> I assume these types match the chemical types: aliphatic carbon,
> epsilon amine, alpha amine, etc. So now, I just have to modify my
> parser to assign the good type to each atom, right ?

_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber
Received on Tue Apr 15 2014 - 06:00:05 PDT