Re: [AMBER] Error in tutorial A16, Amber 14 from Benjamin D Madej on 2015-04-09 (Amber Archive Apr 2015)

From: Benjamin D Madej <bmadej.ucsd.edu>
Date: Thu, 9 Apr 2015 20:28:19 +0000

Hi Michael,

Given the information you sent, all I can do is explain how the script works and potential problems with the pdb file that would cause this issue.

In the meantime, let me go into more detail and list some things to check in your pdb file:
1. The error comes from checking the number of atoms in the residue against the template residue in charmmlipid2amber.csv. Each template residue in charmmlipid2amber.csv has an associated number of atoms it expects for the residue. If a residue in the pdb file doesn't match the template, the script gives this error.

2. This could happen because a residue in the pdb file differs from the template and has a different number of atoms. This is unlikely, as most of the time CHARMM-GUI pdb files convert fine with charmmlipid2amber.py. If there was a change to CHARMM-GUI pdb files, then that is another issue. It is also possible that there is a residue name collision with other components of the pdb file.

3. Based on the limited description of your case, it is likely an issue with how the script recognizes residues. I briefly tried to explain how the script does that in the last email, but let me go into more detail. If the residues were scanned by the script and incorrectly split, then the script would find a residue with an incorrect number of atoms.

4. Let me briefly explain the algorithm for finding residues. A new residue is identified either with a TER card and/or an increment in the residue sequence number (pdb file columns 23-27). If you want a full explanation, check out the source code.

Given a pdb file, the script scans through all the lines:
1. If the first line is an ATOM or HETATM, then that is the start of a residue.
2. For the rest of the lines:
  If the line is an ATOM or HETATM, we are in a residue and it has a sequence number.
  Else if the line is a TER, we are at the end of the residue, with the sequence number from the previous line.
  Else we are not in a residue.

  If the previous line's residue sequence number is not the same as the current line's residue sequence number:
    Decide if we are ending a residue, starting a residue, or both.
3. For the last line, if a residue has not ended yet, define the end.

So the key here is that the *residue sequence number* (pdb file columns 23-27) increments and/or there is a *TER line* at the end of the residue. That is how residues are found in the pdb file.
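
To make that concrete, here is a minimal Python sketch of the scan described above. It is not the actual code from charmmlipid2amber.py: the function names split_residues and count_summary are made up for illustration, and it assumes fixed-column PDB lines with the residue name in columns 18-21 (CHARMM-style four-character names) and the residue sequence number in columns 23-27. It also simplifies the rules slightly by letting any line that is not ATOM/HETATM close the current residue.

```python
from collections import defaultdict


def split_residues(lines):
    """Group ATOM/HETATM lines into residues.

    A residue is closed when the residue sequence field changes,
    or when a TER (or any other non-atom) line is reached.
    Returns a list of (resname, resseq, n_atoms) tuples.
    """
    residues = []
    current = None  # [resname, resseq, n_atoms] of the open residue

    for line in lines:
        record = line[:6].strip()
        if record in ("ATOM", "HETATM"):
            resname = line[17:21].strip()  # columns 18-21 (assumed)
            resseq = line[22:27].strip()   # columns 23-27
            # A change in sequence number ends the open residue.
            if current is not None and current[1] != resseq:
                residues.append(tuple(current))
                current = None
            if current is None:
                current = [resname, resseq, 0]
            current[2] += 1
        elif current is not None:
            # TER or any other record: we are no longer in a residue.
            residues.append(tuple(current))
            current = None

    # Last line: close a residue that has not ended yet.
    if current is not None:
        residues.append(tuple(current))
    return residues


def count_summary(residues):
    """Map each residue name to the set of atom counts seen for it."""
    summary = defaultdict(set)
    for resname, _resseq, n_atoms in residues:
        summary[resname].add(n_atoms)
    return dict(summary)
```

Running split_residues over the lines of your pdb file and then count_summary on the result gives, for each residue name, the set of atom counts that were found. A lipid, water, or ion name that shows up with more than one atom count is a likely sign that a residue was split or merged incorrectly.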

It is a *bit* difficult to diagnose the problem without an actual pdb file. If this email does not solve your problem you can send me the pdb file off the list.

Best,
Ben
________________________________________
From: Michael Shokhen [michael.shokhen.biu.ac.il]
Sent: Thursday, April 09, 2015 3:14 AM
To: AMBER Mailing List
Subject: Re: [AMBER] Error in tutorial A16, Amber 14

Hi Ben,

Thank you for your comment:
The error message indicates that one of your residues in your PDB has an
incorrect number of atoms. This is one of
the checks that the script does on the molecules while processing it.
Basically, the script does a substitution on each
atom in the residue and if it finds the incorrect number of atoms then it
stops.

It correlates well with the error message from charmmlipid2amber.py:

Error: Number of atoms in residue does not match number of atoms in residue
in replacement data file

Unfortunately, your comment and the error message are too general for me
to work around the problem.

I can't find any specific place in the step5_assembly.pdb file with an
incorrect atom number that would cause the error, because the atom numbers
increase smoothly from 1 to 91154, as generated by Charmm-Gui.

Regarding the TER insertion after the protein, as recommended in the A16
tutorial:

In the original step5_assembly.pdb generated by Charmm-Gui there is no TER
after the protein:

ATOM 5461 OT1 ALA 348 -10.030 -0.328 -36.334 1.00 0.00 PROA
ATOM 5462 OT2 ALA 348 -9.337 -1.565 -38.055 1.00 0.00 PROA
ATOM 5463 C3 CHL1 1 -28.933 18.011 17.293 1.00 0.00 MEMB
ATOM 5464 H3 CHL1 1 -27.890 18.323 17.067 1.00 0.00 MEMB

Following the A16 tutorial's directions, I added it manually with a text
editor:

ATOM 5461 OT1 ALA 348 -10.030 -0.328 -36.334 1.00 0.00 PROA
ATOM 5462 OT2 ALA 348 -9.337 -1.565 -38.055 1.00 0.00 PROA
TER
ATOM 5463 C3 CHL1 1 -28.933 18.011 17.293 1.00 0.00 MEMB
ATOM 5464 H3 CHL1 1 -27.890 18.323 17.067 1.00 0.00 MEMB

As you can see, the TER row interrupts the atom numbering: it contains no
atom number, whereas in the standard PDB format it should be:

TER 5463 ALA 348

Unfortunately, I can't manually renumber all of the remaining ~90000 atoms
after TER in a text editor, so I used the Chimera software to correct the
situation:

ATOM 5461 OT1 ALA 348 -10.030 -0.328 -36.334 1.00 0.00 O
ATOM 5462 OT2 ALA 348 -9.337 -1.565 -38.055 1.00 0.00 O
TER 5463 ALA 348
HETATM 5464 C3 CHL1 1 -28.933 18.011 17.293 1.00 0.00 C
HETATM 5465 H3 CHL1 1 -27.890 18.323 17.067 1.00 0.00 H

As you can see, Chimera changes the original Charmm-Gui format, which
could itself be a source of errors.

I have tried all three variants of the PDB file. The response from
charmmlipid2amber.py was identical in each case:

Error:
Number of atoms in residue does not match number of atoms in residue
in replacement data file

I would appreciate any help in solving this problem.

Thank you,

Michael

*****************************
Michael Shokhen, PhD
Associate Professor
Department of Chemistry
Bar Ilan University,
Ramat Gan, 52900
Israel
email: michael.shokhen.gmail.com
email: shokhen.mail.biu.ac.il

On Thu, Apr 9, 2015 at 6:15 AM, Benjamin D Madej <bmadej.ucsd.edu> wrote:

> Hi Michael,
>
> The error message indicates that one of your residues in your PDB has an
> incorrect number of atoms. This is one of the checks that the script does
> on the molecules while processing it. Basically, the script does a
> substitution on each atom in the residue and if it finds the incorrect
> number of atoms then it stops.
>
> Residues are split in the PDB by the residue number field of the PDB and
> identified by the residue sequence field and TER cards. The script only
> processes C36 lipids, water, and some ions. All other residues are ignored.
>
> So, first check if the residues to be converted (lipids, water, ions)
> match the template of substitution file (charmmlipid2amber.csv). If there
> is no problem there, then it is probably a problem with residue sequence or
> TER. The script assumes that the normal PDB residue sequence and TER cards
> are used.
>
> Best,
> Ben
> _______________________________________________
> AMBER mailing list
> AMBER.ambermd.org
> http://lists.ambermd.org/mailman/listinfo/amber
>
_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber

Received on Thu Apr 09 2015 - 13:30:03 PDT