Re: [AMBER] bash scripting for MD tasks

From: James Starlight <jmsstarlight.gmail.com>
Date: Wed, 24 Sep 2014 10:59:04 +0200

Dear Amber users,


I wonder whether it is possible to define a disulphide bond between a pair
of SG atoms of CYX residues automatically in a tleap script.

In my case tleap is part of a bigger script that processes many models
for subsequent MD simulation. Each model contains one pair of CYX
residues (assigned by pdb2pqr) at different positions in its sequence. So
the script first needs to find the position of each CYX residue in each
model, and then fill these numbers into the tleap input file for that
model.

In bash, for one model it would look like this:

# some command that scans model.pdb and stores the positions of its two
# CYX residues in the variables k and i
printf "source leaprc.ff03.r1\nprotein = loadpdb model.pdb\nsetbox protein centers\nbond protein.${k}.SG protein.${i}.SG\nsaveamberparm protein protein.parm7 protein.inpcrd\nquit\n" > ./tleap.in


So my only task is to find a command that scans the model and reports the
positions of the CYX residues within its sequence, which can then be passed
to tleap as two numbers. Ideally it would take the PDB file as input and use
a standard Unix tool such as sed or grep.
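
One possible approach, as a minimal sketch only: awk can read the residue
name and residue number from the fixed PDB columns (18-20 and 23-26). The
function names cyx_positions and make_tleap_input are made up for this
illustration. The sketch assumes that the residue numbers in the PDB file
match the numbering tleap uses after loadpdb (tleap counts residues
sequentially from 1), so a file that starts at another number or has gaps
would need renumbering first.

```shell
#!/bin/bash

# Print the residue number of every CYX residue in a PDB file, one per
# line, in order of first appearance.
# PDB fixed columns: 18-20 = residue name, 23-26 = residue number.
cyx_positions() {
    awk '/^(ATOM|HETATM)/ && substr($0, 18, 3) == "CYX" {
             resnum = substr($0, 23, 4) + 0    # "+ 0" drops the padding
             if (!(resnum in seen)) { seen[resnum] = 1; print resnum }
         }' "$1"
}

# Write a tleap input file (hypothetical helper) that bonds the SG atoms
# of the two CYX residues found in the given PDB file.
make_tleap_input() {
    local pdb=$1 out=${2:-tleap.in}
    # Word-split the residue numbers into the positional parameters.
    set -- $(cyx_positions "$pdb")
    if [ "$#" -ne 2 ]; then
        echo "expected 2 CYX residues in $pdb, found $#" >&2
        return 1
    fi
    printf '%s\n' \
        "source leaprc.ff03.r1" \
        "protein = loadpdb $pdb" \
        "setbox protein centers" \
        "bond protein.$1.SG protein.$2.SG" \
        "saveamberparm protein protein.parm7 protein.inpcrd" \
        "quit" > "$out"
}
```

Usage would be something like "make_tleap_input model.pdb tleap.in"
followed by "tleap -f tleap.in". The check for exactly two residues is
there so that a model with an unexpected number of CYX residues fails
loudly instead of producing a wrong bond.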

I will be very thankful for any suggestions!

James


2014-09-11 11:15 GMT+02:00 James Starlight <jmsstarlight.gmail.com>:

> Thanks for the suggestions!
> Yes, I'll add extra *if checks* at some steps to avoid
> recalculations; that's not a problem. The main problem is deciding the
> number of loops for each model and their order. In my example I need to
> prepare a big set of membrane receptors for MD simulation. So far I have
> written a script that (in one loop) prepares each model for tleap:
> pdb2pqr, superimposition onto a reference, copying lipids from the
> reference to each model, tleap, and generation of all input files for MD.
>
> Besides, before the above processing I need to dock each model
> with a number of ligands, which I'd like to define as the first 2 loops
> (over ligands and receptors) with the sequence: openbabel, superimposition
> onto the reference for which the XYZ of the cavity has been determined,
> autodock.
>
> Because it's a rather complex task, I'm looking for some workflow to
> simplify all of those steps and reduce the number of loops.
>
> James
>
> 2014-09-10 18:48 GMT+04:00 James Maier <jimbo.maier.gmail.com>:
>
>> On Mon, Sep 8, 2014 at 3:24 PM, James Starlight <jmsstarlight.gmail.com>
>> wrote:
>>
>> > In which case will my script be more flexible
>> > and well-controlled (e.g. when I need to add new models to existing
>> > sets or change some parameters for each preparatory stage)? What else
>> > should I take into account that could make my life better? :)
>> >
>>
>> This is a very general question and perhaps off topic for the
>> AMBER reflector (StackExchange may be a better forum for BASH questions).
>> But I'll share one practice that's helped me.
>> I've found putting lists of jobs in a file to be effective. Say you have
>> a script:
>>
>> #!/bin/bash
>>
>> while read -r i; do
>>     something "$i"
>> done < "$1"
>>
>>
>> This will run the 'something' command once for each line of the file
>> given as the first argument to the bash script. For different sets of
>> jobs, you can just supply a different job list as the argument. I guess
>> this fits into the "looping over each model" approach.
>>
>> Of course, change 'something' to whatever you need to do; you can write a
>> function to do all the steps if you'd like. You may also have your
>> 'something' function check whether it has already completed a job and just
>> return, to avoid redoing calculations.
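
The job-list idea with a "skip what is already done" check could be
sketched roughly as follows. This is only an illustration: the function
name run_job_list and the JOB.done marker files are assumptions, not part
of any Amber tool.

```shell
#!/bin/bash

# run_job_list FILE COMMAND [ARGS...]
# Run COMMAND once per non-empty line of FILE, passing the line as the
# last argument. A job is skipped if its marker file JOB.done exists,
# and the marker is created only when the command succeeds.
run_job_list() {
    local list=$1 job
    shift
    while IFS= read -r job; do
        [ -z "$job" ] && continue             # ignore blank lines
        if [ -e "$job.done" ]; then
            echo "skipping $job (already done)"
            continue
        fi
        # Redirect stdin so the command cannot swallow the job list.
        "$@" "$job" < /dev/null && touch "$job.done"
    done < "$list"
}
```

For example, "run_job_list models.txt ./prepare_model.sh" would call
./prepare_model.sh once for each model named in models.txt (both names are
placeholders). Rerunning the same line after adding new models to the list
only processes the new ones.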
>>
>> James
>>
>>
>> > Thanks for help,
>> >
>> > James
>> >
>> > 2014-07-21 17:32 GMT+04:00 Parker de Waal <Parker.deWaal.vai.org>:
>> >
>> > > Hi James,
>> > >
>> > > I would recommend reading up on GNU parallel. For a small example see
>> > > here:
>> > >
>> > > for i in {1..6}; do
>> > > sem -j 2 ./simulation_$i.sh
>> > > done
>> > > sem --wait
>> > >
>> > > Best,
>> > > Parker
>> > >
>> > > -----Original Message-----
>> > > From: James Starlight [mailto:jmsstarlight.gmail.com]
>> > > Sent: Monday, July 21, 2014 7:54 AM
>> > > To: AMBER Mailing List
>> > > Subject: Re: [AMBER] bash scripting for MD tasks
>> > >
>> > > Dear Amber users!
>> > >
>> > > For this example, my work dir contains 6 folders named sim_1, sim_2
>> > > ... sim_6, each holding all required input files including the sh
>> > > script that runs MD for that system. I need an idea for a main shell
>> > > script that runs several MD jobs at the same time. For instance, in
>> > > total I need to run 6 MD jobs (just run the six simulation_N.sh
>> > > launch files, N = 1-6, one in each sim folder).
>> > > Because my workstation has 2 GPUs, I have to run only 2 simulations
>> > > at a time, repeating the cycle 3 times (2 sims * 3 repetitions, like
>> > > in the gym when we do lifting :-) ). Below is my attempt at such a
>> > > main script. Using csh (or bash) I can use a loop, e.g. foreach or
>> > > while in csh:
>> > >
>> > > foreach i (1 2 3 4 5 6)
>> > > cd ./sim_$i
>> > > sh simulation_$i.sh
>> > > cd ..
>> > > end
>> > >
>> > > However, this algorithm runs the next (i+1) simulation (.sh file)
>> > > only after the previous one (i) has finished. How should I modify my
>> > > script to run 2 simulations in each repetition, while also assigning
>> > > each simulation to a specific GPU? (I guess in csh this could be done
>> > > by adding "setenv CUDA_VISIBLE_DEVICES 0,1" in the foreach loop of
>> > > my main script.)
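
A rough bash sketch of one way to do this: start one background worker per
GPU and let each worker run its share of the jobs one after another, with
CUDA_VISIBLE_DEVICES set per job. The function name run_on_gpus is made up
here, and the sim_N/simulation_N.sh layout is taken from the description
above. The split is static (GPU 0 gets jobs 1, 3, 5 and GPU 1 gets jobs
2, 4, 6), which is an assumption that works best when all jobs take about
the same time.

```shell
#!/bin/bash

# run_on_gpus NGPU NJOBS
# Start one background worker per GPU. Worker g runs jobs g+1,
# g+1+NGPU, ... sequentially, so each GPU has at most one simulation
# running at any time.
run_on_gpus() {
    local ngpu=$1 njobs=$2 gpu=0 i
    while [ "$gpu" -lt "$ngpu" ]; do
        (
            i=$((gpu + 1))
            while [ "$i" -le "$njobs" ]; do
                # CUDA_VISIBLE_DEVICES limits this job to a single GPU.
                ( cd "sim_$i" &&
                  CUDA_VISIBLE_DEVICES=$gpu sh "simulation_$i.sh" )
                i=$((i + ngpu))
            done
        ) &
        gpu=$((gpu + 1))
    done
    wait    # return only after every worker has finished
}
```

Calling "run_on_gpus 2 6" from the work dir would then cover the six
simulations on two GPUs. If the run times differ a lot, a dynamic
scheduler such as GNU parallel keeps both GPUs busier than this fixed
split.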
>> > >
>> > >
>> > > Thanks for help,
>> > >
>> > >
>> > > James
>> > >
>> > >
>> > > 2014-06-23 18:30 GMT+04:00 Parker de Waal <Parker.deWaal.vai.org>:
>> > >
>> > > > Hi James,
>> > > >
>> > > > I don't believe this is a de-novo approach; rather, it is an
>> > > > in-silico introduction of backbone-dependent rotamers. When
>> > > > introducing point mutations, the structure of the protein will not
>> > > > be altered; a residue is simply replaced with another.
>> > > >
>> > > > To your second point, yes, you could do that. You could simply loop
>> > > > over the pdb file and the subsequent mutant pdbs until all the
>> > > > desired mutations are applied. Ex:
>> > > > BASEPDB -> 1, LEU -> 1LEUPDB
>> > > > 1LEUPDB -> 2, ASN -> 2ASNPDB
>> > > > etc.
>> > > >
>> > > > I'm sure there are better ways to do this; however, this would be
>> > > > quite simple to write. Good luck.
>> > > >
>> > > > Best,
>> > > > Parker
>> > > > ________________________________________
>> > > > From: James Starlight [jmsstarlight.gmail.com]
>> > > > Sent: Monday, June 23, 2014 7:48 AM
>> > > > To: AMBER Mailing List
>> > > > Subject: Re: [AMBER] bash scripting for MD tasks
>> > > >
>> > > > Thanks for the suggestions!
>> > > >
>> > > > In fact this script uses the input WT pdb as the template plus the
>> > > > list of mutations in mutate.dat for de-novo modelling of the mutated
>> > > > protein using Modeller, doesn't it? If so, my questions are:
>> > > > 1) How will the conformation of flexible parts of the mutated
>> > > > protein (e.g. loops) be perturbed, compared to the WT model, when
>> > > > mutations are introduced in these regions by such modelling?
>> > > > 2) A trivial question, but it should be clarified: can this python
>> > > > script be used with multiple mutations.dat files to produce several
>> > > > mutants from 1 WT model at once (e.g. by means of a new looping
>> > > > script that takes mutate_model.py, the dat file for each mutant,
>> > > > and 1 WT pdb as inputs)?
>> > > >
>> > > > TFH,
>> > > >
>> > > >
>> > > > James
>> > > >
>> > > >
>> > > > 2014-06-22 18:07 GMT+04:00 Parker de Waal <Parker.deWaal.vai.org>:
>> > > >
>> > > > > while IFS=', ' read -r mutLoc aminoAcid; do
>> > > > > python mutate_model.py PDB_NAME "$mutLoc" "$aminoAcid" A > "$mutLoc.log"
>> > > > > done < mutations.dat
>> > > > >
>> > > > > Fixed line break.
>> > > > > -----Original Message-----
>> > > > > From: Parker de Waal [mailto:Parker.deWaal.vai.org]
>> > > > > Sent: Sunday, June 22, 2014 10:04 AM
>> > > > > To: 'AMBER Mailing List'
>> > > > > Subject: Re: [AMBER] bash scripting for MD tasks
>> > > > >
>> > > > > Hi James,
>> > > > >
>> > > > > I would highly recommend a quick google search for 'mutate pdb
>> > > > > residue'. You'll find the second or third link provides a nice
>> > > > > python script to interface with Modeller which you could easily
>> > > > > automate to introduce hundreds of mutations.
>> > > > >
>> > > > > Example:
>> > > > > #!/bin/bash
>> > > > > cat > mutations.dat <<'EOF'
>> > > > > 1, LEU
>> > > > > 2, ASP
>> > > > > 3, ASN
>> > > > > 4, LYS
>> > > > > EOF
>> > > > >
>> > > > > # split each "position, residue" line on the comma and space
>> > > > > while IFS=', ' read -r mutLoc aminoAcid; do
>> > > > > python mutate_model.py PDB_NAME "$mutLoc" "$aminoAcid" A > "$mutLoc.log"
>> > > > > done < mutations.dat
>> > > > >
>> > > > > Please note I did not test this script, however it should work.
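
Since the loop above was posted untested, a dry run that only prints the
commands is a cheap way to check how the "position, residue" lines are
parsed before Modeller is involved. This is a sketch: the function name
preview_mutations is hypothetical, and the mutate_model.py arguments are
simply copied from the example above.

```shell
#!/bin/bash

# preview_mutations FILE
# Print, without running it, the command that would be issued for each
# "position, residue" line of FILE. IFS=', ' makes read treat the comma
# plus any adjacent spaces as a single field separator.
preview_mutations() {
    local mutLoc aminoAcid
    while IFS=', ' read -r mutLoc aminoAcid; do
        [ -z "$mutLoc" ] && continue          # ignore blank lines
        echo "python mutate_model.py PDB_NAME $mutLoc $aminoAcid A > $mutLoc.log"
    done < "$1"
}
```

Running "preview_mutations mutations.dat" lists one command per mutation;
once the output looks right, the echo can be replaced by the real call.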
>> > > > >
>> > > > > Parker
>> > > > >
>> > > > > -----Original Message-----
>> > > > > From: James Starlight [mailto:jmsstarlight.gmail.com]
>> > > > > Sent: Sunday, June 22, 2014 9:25 AM
>> > > > > To: AMBER Mailing List
>> > > > > Subject: Re: [AMBER] bash scripting for MD tasks
>> > > > >
>> > > > > Thanks Dan,
>> > > > >
>> > > > > One of my tasks consists of quickly introducing point mutations
>> > > > > into a given PDB of a membrane receptor and then creating models
>> > > > > of the mutated protein with tleap. I can easily write a script
>> > > > > for the second part of this workflow, which takes as input the
>> > > > > mutated pdb and the solvated membrane in a merged pdb, but I have
>> > > > > no idea how to make the mutations in the receptor quickly without
>> > > > > using GUI programs like CHIMERA.
>> > > > >
>> > > > > I'll be thankful for any proposed solutions.
>> > > > >
>> > > > > James
>> > > > >
>> > > > >
>> > > > > 2014-06-20 18:47 GMT+04:00 Daniel Roe <daniel.r.roe.gmail.com>:
>> > > > >
>> > > > > > See the supporting info in this publication:
>> > > > > > http://scanmail.trustwave.com/?c=129&d=oNmm09uxxQhwv7nR-AzXvp7JpjTPPzPRoPkyOe7veg&u=http%3a%2f%2fpubs%2eacs%2eorg%2fdoi%2fabs%2f10%2e1021%2fjp4125099
>> > > > > >
>> > > > > > -Dan
>> > > > > >
>> > > > > >
>> > > > > > On Fri, Jun 20, 2014 at 8:25 AM, James Starlight
>> > > > > > <jmsstarlight.gmail.com>
>> > > > > > wrote:
>> > > > > >
>> > > > > > > One question about a possible algorithm for quickly checking
>> > > > > > > the convergence of my trajectory (or trajectories).
>> > > > > > > For instance, I'd like to write a simple script that takes as
>> > > > > > > input several trajectories of the same system with different
>> > > > > > > lengths and checks (for instance by means of principal mode
>> > > > > > > analysis or another not very expensive method) at which length
>> > > > > > > my system has fully converged. Are there any examples of such
>> > > > > > > scripts or ready workflows?
>> > > > > > >
>> > > > > > > James
>> > > > > > >
>> > > > > > >
>> > > > > > > 2014-06-19 17:52 GMT+04:00 James Starlight <jmsstarlight.gmail.com>:
>> > > > > > >
>> > > > > > > > Thanks, Jason!
>> > > > > > > >
>> > > > > > > > That's very useful advice, and you've made a great script
>> > > > > > > > library! I'll try to follow your basic ideas in my own
>> > > > > > > > studies.
>> > > > > > > >
>> > > > > > > > James
>> > > > > > > >
>> > > > > > > >
>> > > > > > > > 2014-06-17 23:25 GMT+04:00 Jason Swails <jason.swails.gmail.com>:
>> > > > > > > >
>> > > > > > > > On Tue, 2014-06-17 at 22:16 +0400, James Starlight wrote:
>> > > > > > > >> > Hi Dan,
>> > > > > > > >> >
>> > > > > > > >> > Many thanks for the bash guide; I've found it very useful.
>> > > > > > > >> > In general I'd like to look at some basic bash script
>> > > > > > > >> > examples suitable for typical MD jobs, such as running
>> > > > > > > >> > many simulations on clusters, because the most complicated
>> > > > > > > >> > examples like replica exchange simulation are already
>> > > > > > > >> > present in the Amber tutorials.
>> > > > > > > >>
>> > > > > > > >> You've gotten the most helpful responses you can possibly get
>> > > > > > > >> about your question so far, so I won't belabor the points
>> > > > > > > >> others have made. I'll relay my own opinions on the topic,
>> > > > > > > >> though.
>> > > > > > > >>
>> > > > > > > >> Scripting is, at its core, simply a tool we (as computational
>> > > > > > > >> scientists) use to increase our efficiency and productivity.
>> > > > > > > >> For example, high-throughput work like screening a database
>> > > > > > > >> of millions of compounds cannot be done unless "scripted."
>> > > > > > > >>
>> > > > > > > >> When you are designing an experiment or calculation you want
>> > > > > > > >> to perform, you have a list of tasks you need to get done.
>> > > > > > > >> Designing a script to carry out these tasks requires you to
>> > > > > > > >> divide your problem up into simpler chunks that can be easily
>> > > > > > > >> represented with common logic structures in
>> > > > > > > >> programming/scripting, like loops and simple conditionals.
>> > > > > > > >> Writing the script is easy -- if you don't know the syntax of
>> > > > > > > >> doing something like looping over a list, you can google your
>> > > > > > > >> question and see that it has most likely been asked and
>> > > > > > > >> answered several times on StackOverflow before.
>> > > > > > > >>
>> > > > > > > >> _Designing_ the script is the real challenge (it is an art).
>> > > > > > > >> It is not something easily taught in a tutorial (nor is there
>> > > > > > > >> any one "right" way to do it). You can use the existing
>> > > > > > > >> tutorials, and the scripts written therein, to try and
>> > > > > > > >> reverse-engineer the design and try to understand the thought
>> > > > > > > >> process that led the tutorial authors to write it that way.
>> > > > > > > >> Then if you're ambitious, try improving it.
>> > > > > > > >>
>> > > > > > > >> When you are doing your own project, focus on carrying out
>> > > > > > > >> your experiment. If you come up to a part that is
>> > > > > > > >> particularly repetitive or something that fits conceptually
>> > > > > > > >> into a scripting or programming paradigm, write a script to
>> > > > > > > >> handle that part (Googling your question when you don't know
>> > > > > > > >> how to do something). The more you do this, the better you
>> > > > > > > >> will get at scripting and the more you will be able to
>> > > > > > > >> automate your workflows.
>> > > > > > > >>
>> > > > > > > >> If you find yourself doing the same thing over and over for
>> > > > > > > >> different projects (like imaging a trajectory or RMS-fitting
>> > > > > > > >> your system with cpptraj or computing a distance and plotting
>> > > > > > > >> the result), try to write a script to automate that task. As
>> > > > > > > >> your experience in the field grows, so too will your library
>> > > > > > > >> of scripts you find useful and your scripting ability
>> > > > > > > >> overall. Mine is here:
>> > > > > > > >> http://scanmail.trustwave.com/?c=129&d=oNmm09uxxQhwv7nR-AzXvp7JpjTPPzPRoPlia-rteg&u=https%3a%2f%2fgithub%2ecom%2fswails%2fjmsscripts%2f
>> > > > > > > >> and a trained eye can clearly see which ones I wrote when I
>> > > > > > > >> was experienced and which I didn't.
>> > > > > > > >>
>> > > > > > > >> 6 years ago, I had never used Unix before. I was decent at
>> > > > > > > >> scripting within a few months and quite strong within a year
>> > > > > > > >> or two -- all following the above advice. That which is
>> > > > > > > >> self-learned is learned the best (and is remembered the
>> > > > > > > >> longest).
>> > > > > > > >>
>> > > > > > > >> Always rambling,
>> > > > > > > >> Jason
>> > > > > > > >>
>> > > > > > > >> --
>> > > > > > > >> Jason M. Swails
>> > > > > > > >> BioMaPS,
>> > > > > > > >> Rutgers University
>> > > > > > > >> Postdoctoral Researcher
>> > > > > > > >>
>> > > > > > > >>
>> > > > > > > >> _______________________________________________
>> > > > > > > >> AMBER mailing list
>> > > > > > > >> AMBER.ambermd.org
>> > > > > > > >> http://scanmail.trustwave.com/?c=129&d=odmm0_rRwF4t-ke72pHi04h-HhnsTJFbwn46wDPTCg&u=http%3a%2f%2flists%2eambermd%2eorg%2fmailman%2flistinfo%2famber
>> > > > > > > >>
>> > > > > > > >
>> > > > > > > >
>> > > > > >
>> > > > > >
>> > > > > >
>> > > > > > --
>> > > > > > -------------------------
>> > > > > > Daniel R. Roe, PhD
>> > > > > > Department of Medicinal Chemistry
>> > > > > > University of Utah
>> > > > > > 30 South 2000 East, Room 201
>> > > > > > Salt Lake City, UT 84112-5820
>> > > > > > http://scanmail.trustwave.com/?c=129&d=odmm0_rRwF4t-ke72pHi04h-HhnsTJFbwi45xzyEAg&u=http%3a%2f%2fhome%2echpc%2eutah%2eedu%2f%7echeatham%2f
>> > > > > > (801) 587-9652
>> > > > > > (801) 585-6208 (Fax)
_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber
Received on Wed Sep 24 2014 - 02:00:03 PDT