Re: [AMBER] bash scripting for MD tasks

From: Anselm Horn <>
Date: Wed, 24 Sep 2014 12:43:17 +0200

Hi James,

To obtain a list of all CYX residues in your pdb file in consecutive
numbering (one count per residue), you could do something like the following:

grep ATOM XXXX.pdb | grep CA | awk 'BEGIN{n=0}{n++; if ($4=="CYX"){print n}}'
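
Note that 'grep ATOM' also matches HETATM records and 'grep CA' matches any
line that contains the string CA (a calcium ion, for example), which would
shift the count. A minimal sketch of a stricter variant, assuming the same
whitespace-separated columns as above (atom name in field 3, residue name in
field 4); the function name list_cyx is only illustrative and not part of
AMBER:

```shell
#!/bin/bash
# list_cyx: hypothetical helper name, not an AMBER tool.
# Prints the consecutive (1-based) index of every CYX residue in a pdb file.
# Assumes whitespace-separated fields: $1 record name, $3 atom name,
# $4 residue name.
list_cyx() {
    awk '
        # Count exactly one CA atom per residue; HETATM lines are skipped
        # because the record name must be exactly ATOM.
        $1 == "ATOM" && $3 == "CA" {
            n++
            if ($4 == "CYX") print n
        }
    ' "$1"
}
```

Calling 'list_cyx XXXX.pdb' prints one position per line, in file order.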

If you are sure that only two CYX entries exist in your file, you
can simply use the 'head' and 'tail' commands to extract the two numbers:

cyx1=`grep ATOM XXXX.pdb | grep CA | awk 'BEGIN{n=0}{n++; if ($4=="CYX"){print n}}' | head -n 1`
cyx2=`grep ATOM XXXX.pdb | grep CA | awk 'BEGIN{n=0}{n++; if ($4=="CYX"){print n}}' | tail -n 1`
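
The two numbers can then go straight into the tleap input from your first
mail. A rough sketch, not a tested workflow: the function name
write_leap_input and the output file name leap.in are assumptions of mine,
the tleap commands are copied from your printf line, and the awk filter
assumes whitespace-separated columns with the atom name in field 3 and the
residue name in field 4. It scans the file only once and stops if it does
not find exactly two CYX residues:

```shell
#!/bin/bash
# write_leap_input: hypothetical helper, not part of AMBER.
# Usage: write_leap_input model.pdb leap.in   (leap.in is an assumed name)
write_leap_input() {
    local pdb=$1 out=$2 positions cyx1 cyx2

    # One pass over the file: consecutive index of every CYX residue.
    positions=$(awk '$1 == "ATOM" && $3 == "CA" { n++; if ($4 == "CYX") print n }' "$pdb")

    # Refuse to guess unless exactly two CYX residues were found.
    if [ "$(printf '%s\n' "$positions" | grep -c .)" -ne 2 ]; then
        echo "expected exactly 2 CYX residues in $pdb" >&2
        return 1
    fi

    cyx1=$(printf '%s\n' "$positions" | head -n 1)
    cyx2=$(printf '%s\n' "$positions" | tail -n 1)

    # tleap commands taken from the original question in this thread.
    cat > "$out" <<EOF
source leaprc.ff03.r1
protein = loadpdb $pdb
setbox protein centers
bond protein.${cyx1}.SG protein.${cyx2}.SG
saveamberparm protein protein.parm7 protein.inpcrd
quit
EOF
}
```

In a loop over many models you would call it once per pdb file and skip
every model for which it returns a non-zero status.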

Maybe this helps.



On 24.09.2014 11:21, James Starlight wrote:
> In my case there are no markers for the SS bonds in the pdb besides the two CYX
> residues, because the pdb was previously processed with pdb2pqr to
> assign all protonation states. So the simplest approach is just to find
> the position of each CYX residue within the sequence using something like
> grep "ATOM.*CYX" $pdb
> but I have no idea how to quickly obtain the position of each CYX as an
> individual variable in bash.
> 2014-09-24 11:14 GMT+02:00 Hannes Loeffler <>:
>> The formats from the PDB have disulfide bridges marked. If your input
>> file doesn't, you could try distance-based assignment, but this may
>> lead to false negatives and false positives. Another problem is that
>> leap reassigns residue numbers, starting with the first residue it finds
>> and incrementing by one for each subsequent residue.
>> On Wed, 24 Sep 2014 10:59:04 +0200
>> James Starlight <> wrote:
>>> Dear Amber users,
>>> I wonder whether it is possible to define a disulphide bond between a
>>> pair of SG atoms of CYX residues using tleap scripts in an
>>> automatic fashion.
>>> In my case I use tleap as part of a larger script that processes many
>>> models for subsequent MD simulation. Each model contains a pair
>>> of CYX residues (assigned by pdb2pqr) at different positions in its
>>> sequence. So in the script I first need to know the position
>>> of each CYX residue in each model and then insert these numbers
>>> into the tleap input file for each model.
>>> In bash, for one model, it would look like this:
>>> # some command to scan the sequence of model.pdb and store the positions
>>> # of its two CYX residues in the variables k and i
>>> printf "source leaprc.ff03.r1\nprotein = loadpdb model.pdb\nsetbox protein centers\nbond protein.${k}.SG protein.${i}.SG\nsaveamberparm protein protein.parm7 protein.inpcrd\nquit" > ./
>>> So my task is only to find a command that scans the model and finds the
>>> positions of the CYX residues within its sequence, which can then be
>>> passed to tleap as two numbers. Ideally these two numbers would be
>>> obtained using the pdb as input and a unix command such as sed or grep.
>>> I would be very thankful for any suggestions!
>>> James
>> _______________________________________________
>> AMBER mailing list

AMBER mailing list
Received on Wed Sep 24 2014 - 04:00:02 PDT