Re: [AMBER] bash scripting for MD tasks from Anselm Horn on 2014-09-24 (Amber Archive Sep 2014)

From: Anselm Horn <Anselm.Horn.biochem.uni-erlangen.de>
Date: Wed, 24 Sep 2014 12:43:17 +0200

Hi James,

to obtain a list of all CYX residues in your pdb file in consecutive
numbering, you could do something like the following:

grep ATOM XXXX.pdb | grep CA | awk 'BEGIN{n=0}{n++; if ($4=="CYX"){print n}}'
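
The pipeline above counts one CA atom per residue, so n is the residue's position in the sequence regardless of the numbering stored in the file. A slightly more defensive variant could read the fixed PDB columns instead of whitespace-separated fields, which may matter when adjacent columns run together. This is only a sketch: the function name cyx_positions is made up for illustration, and it assumes the file follows the standard fixed-column PDB layout (record name in columns 1-6, atom name in 13-16, residue name in 18-20) and contains no alternate locations.

```shell
# Hypothetical helper: print the consecutive (1-based) position of every
# CYX residue in a PDB file, counting one " CA " atom per residue.
# Assumes fixed-column PDB records:
#   cols 1-6 record name, cols 13-16 atom name, cols 18-20 residue name.
cyx_positions() {
    awk 'substr($0, 1, 6) == "ATOM  " && substr($0, 13, 4) == " CA " {
             n++
             if (substr($0, 18, 3) == "CYX") print n
         }' "$1"
}
```

Called as `cyx_positions XXXX.pdb`, it prints one position per line, in the same form as the grep/awk pipeline.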

If you are sure that only two CYX entries exist in your file, you
could simply use the 'head' and 'tail' commands to extract the two numbers:

cyx1=`grep ATOM XXXX.pdb | grep CA | awk 'BEGIN{n=0}{n++; if ($4=="CYX"){print n}}' | head -n 1`
cyx2=`grep ATOM XXXX.pdb | grep CA | awk 'BEGIN{n=0}{n++; if ($4=="CYX"){print n}}' | tail -n 1`
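
For a script that loops over many models, it may be safer to run the pipeline once, check that exactly two CYX residues were found, and only then write the tleap input. The following is a rough sketch along those lines, not a tested production script: the function name write_tleap_in and its two arguments are invented for illustration, while the tleap commands are taken unchanged from James's original message.

```shell
# Hypothetical helper: write a tleap input file that bonds the two CYX
# residues of one model. Usage: write_tleap_in model.pdb tleap.in
write_tleap_in() {
    local pdb out all count cyx1 cyx2
    pdb=$1
    out=$2
    # Run the grep/awk pipeline once and keep every CYX position.
    all=$(grep ATOM "$pdb" | grep CA | awk '{n++; if ($4=="CYX") print n}')
    # Count the non-empty lines, i.e. the number of CYX residues found.
    count=$(printf '%s\n' "$all" | awk 'NF {c++} END {print c+0}')
    # Refuse to guess if the model does not contain exactly one CYX pair.
    if [ "$count" -ne 2 ]; then
        echo "expected 2 CYX residues in $pdb, found $count" >&2
        return 1
    fi
    cyx1=$(printf '%s\n' "$all" | head -n 1)
    cyx2=$(printf '%s\n' "$all" | tail -n 1)
    # tleap commands as given in the original question.
    printf '%s\n' \
        "source leaprc.ff03.r1" \
        "protein = loadpdb $pdb" \
        "setbox protein centers" \
        "bond protein.$cyx1.SG protein.$cyx2.SG" \
        "saveamberparm protein protein.parm7 protein.inpcrd" \
        "quit" > "$out"
}
```

In a loop over models this could be called as `write_tleap_in "$pdb" tleap.in || continue`, so that a model with an unexpected number of CYX residues is skipped rather than bonded incorrectly.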

Maybe this helps.

Regards,

Anselm

On 24.09.2014 11:21, James Starlight wrote:
> In my case the pdb contains no marks for the SS bonds besides the two CYX
> residues, because the pdb was previously processed with pdb2pqr to
> assign all protonation states. So the simplest approach is just to find
> the positions of the CYX residues within the sequence using something like
> grep "ATOM.*CYX" $pdb
> but I have no idea how to quickly obtain the position of each CYX as an
> individual variable in bash.
>
> 2014-09-24 11:14 GMT+02:00 Hannes Loeffler <Hannes.Loeffler.stfc.ac.uk>:
>
>> The formats from the PDB have disulfide bridges marked. If your input
>> file doesn't, you could try distance-based assignment, but this may
>> lead to false negatives and false positives. Another problem is that
>> leap reassigns residue numbers starting with the first it finds and
>> renumbering all subsequent ones by incrementing by one.
>>
>> On Wed, 24 Sep 2014 10:59:04 +0200
>> James Starlight <jmsstarlight.gmail.com> wrote:
>>
>>> Dear Amber users,
>>>
>>>
>>> I wonder whether it is possible to define disulphide bonds between
>>> pairs of SG atoms of CYX residues using tleap scripts in some
>>> automatic fashion.
>>>
>>> In my case I use tleap as part of a bigger script to process many
>>> models for further md simulation. Each model contains a pair
>>> of CYX residues (assigned by pdb2pqr) at different positions in its
>>> sequence. So in the script I first need to find the position
>>> of each CYX residue in each model and then insert these numbers
>>> into the tleap input file for each model.
>>>
>>> In bash, for one model it would look like:
>>>
>>> # some command to scan the sequence of model.pdb and assign the positions
>>> # of its two CYX residues to the variables k and i
>>> printf "source leaprc.ff03.r1\nprotein = loadpdb model.pdb\nsetbox protein centers\nbond protein.${k}.SG protein.${i}.SG\nsaveamberparm protein protein.parm7 protein.inpcrd\nquit" > ./tleap.in
>>>
>>>
>>> So my task is only to find a command which will scan the model and find
>>> the positions of the CYX residues within its sequence, so that they can
>>> be passed to tleap as two numbers. Ideally it would take the pdb as
>>> input and use some unix command like sed or grep to find the
>>> positions.
>>>
>>> I will be very thankful for any suggestions!
>>>
>>> James
>> _______________________________________________
>> AMBER mailing list
>> AMBER.ambermd.org
>> http://lists.ambermd.org/mailman/listinfo/amber

Received on Wed Sep 24 2014 - 04:00:02 PDT