Re: [AMBER] bash scripting for MD tasks

From: James Starlight <jmsstarlight.gmail.com>
Date: Mon, 8 Sep 2014 23:24:42 +0400

Dear Amber users!

I need some advice concerning the design of a script for virtual screening
which I'm writing at the moment.
Briefly, it should loop over N model.pdb files and apply a sequence of bash
commands to each of them: editing, docking, superimposition, etc. The
workflow consists of 3 steps: i) preparation of the whole set of models
(each model is processed by tleap and made ready for the md run), ii) the md
run of each system, including heating, equilibration and the production run,
iii) some post-processing of the trajectories (mmgbsa).

Now I'd like to consider step i) (preparation of each system for the md),
which is the most complex one here. Briefly, I need to:
0) take each model from the work dir and make a separate folder for it,
1) run pdb2pqr on each model to assign protonation states and convert
residues to amber params,
2) superimpose (by means of ProFit for instance) each model against some
reference (for which I have docking parameters, e.g. the definition of the
cavity XYZ),
3) dock (one) ligand to each model using autodock and some input file,
4) process each model with tleap.
As the result: n work dirs, one per model, ready for the md.

What do you think: would it be better to run all of those commands from one
loop (loop over the models and run the whole sequence of preparatory events
in each dir), or to make several loops (the first for pdb2pqr processing of
the whole set of models, the second for superimposition, the third for
docking, etc.)? In which case will my script be more flexible and better
controlled (e.g. when I need to add new models to an existing set or change
some parameters of one preparatory stage)? What else should I take into
account that could make my life better? :)
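
One arrangement that might keep both options open is a single loop over the
models that calls one function per stage and records each finished stage in
a marker file. Below is a minimal sketch of that idea, not a tested
workflow. The stage bodies are placeholders that only write a log line (no
real pdb2pqr, ProFit, autodock or tleap command lines are shown), and the
names run_stage, prepare_all and the .done_* marker files are made up for
illustration.

```shell
#!/bin/bash
# Sketch only: every stage body is a placeholder, not a real tool call.

# Hypothetical stage functions: $1 = model pdb, $2 = work dir of that model.
stage_pdb2pqr()     { echo "pdb2pqr on $1"     > "$2/pdb2pqr.log"; }
stage_superimpose() { echo "superimpose $1"    > "$2/superimpose.log"; }
stage_dock()        { echo "dock ligand to $1" > "$2/dock.log"; }
stage_tleap()       { echo "tleap on $1"       > "$2/tleap.log"; }

STAGES="pdb2pqr superimpose dock tleap"

# Run one stage for one model unless its marker file already exists.
run_stage() {
    local stage="$1" model="$2" dir="$3"
    local marker="$dir/.done_$stage"
    [ -e "$marker" ] && return 0
    # Write the marker only if the stage itself succeeded.
    "stage_$stage" "$model" "$dir" && touch "$marker"
}

# Make one folder per model.pdb in $1 and run all stages in order.
prepare_all() {
    local workdir="$1" pdb name dir stage
    for pdb in "$workdir"/*.pdb; do
        [ -e "$pdb" ] || continue        # no pdb files: nothing to do
        name=$(basename "$pdb" .pdb)
        dir="$workdir/$name"
        mkdir -p "$dir"
        for stage in $STAGES; do
            run_stage "$stage" "$pdb" "$dir" || break   # stop this model on failure
        done
    done
}
```

With this layout, calling prepare_all on the work dir again after dropping
new model.pdb files there should only process the new models, and deleting
one .done_* marker re-runs just that stage for that model.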

Thanks for help,

James

2014-07-21 17:32 GMT+04:00 Parker de Waal <Parker.deWaal.vai.org>:

> Hi James,
>
> I would recommend reading up on GNU parallel. For a small example see here:
>
> for i in {1..6}; do
> sem -j 2 ./simulation_$i.sh
> done
> sem --wait
>
> Best,
> Parker
>
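The sem loop above limits how many jobs run at once but does not say which
GPU each job uses. A plain-bash alternative, sketched here under the
assumptions that the folders are named sim_1 ... sim_6, that each holds its
simulation_N.sh, and that the GPUs are numbered 0 and 1, is to start one
background worker per GPU and let each worker run its share of the
simulations in sequence. The function names run_sim, worker and run_all are
illustrative only.

```shell
#!/bin/bash
# Sketch: one background worker per GPU, each running its jobs one by one.

NGPU=2   # assumption: two GPUs, numbered 0 and 1
NSIM=6   # assumption: sim_1 ... sim_6, each holding simulation_N.sh

# Run simulation number $1 on GPU $2 from inside its own folder.
run_sim() {
    local i="$1" gpu="$2"
    ( cd "sim_$i" && CUDA_VISIBLE_DEVICES="$gpu" sh "./simulation_$i.sh" )
}

# Worker for GPU $1: takes simulations gpu+1, gpu+1+NGPU, ...
worker() {
    local gpu="$1" i
    i=$((gpu + 1))
    while [ "$i" -le "$NSIM" ]; do
        run_sim "$i" "$gpu"
        i=$((i + NGPU))
    done
}

# Start all workers in the background and wait for them to finish.
run_all() {
    local gpu=0
    while [ "$gpu" -lt "$NGPU" ]; do
        worker "$gpu" &
        gpu=$((gpu + 1))
    done
    wait
}
```

Running run_all from the work dir would put simulations 1, 3, 5 on GPU 0
and 2, 4, 6 on GPU 1, two at a time. CUDA_VISIBLE_DEVICES is the standard
CUDA variable for restricting which devices a process sees.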
> -----Original Message-----
> From: James Starlight [mailto:jmsstarlight.gmail.com]
> Sent: Monday, July 21, 2014 7:54 AM
> To: AMBER Mailing List
> Subject: Re: [AMBER] bash scripting for MD tasks
>
> Dear Amber users!
>
> For this example, in my work dir I have 6 folders named sim_1, sim_2 ...
> sim_6, each containing all required input files, including the sh script
> which runs MD for that system. I need an idea for a main shell script which
> will run several md jobs at the same time. For instance, in total I need to
> run 6 md jobs (just run the six simulation_N.sh launch files, N = 1-6, one
> in each sim folder). Because my workstation has 2 GPUs, I have to run only 2
> simulations at the same time, repeating each cycle 3 times (2 sims * 3
> repetitions, like in the GYM when we do lifting :-) ). Below you can see my
> suggestion for such a main script. Using csh (or bash) I can use a loop,
> e.g. foreach or while in csh:
>
> foreach i (1 2 3 4 5 6)
> cd ./sim_$i
> sh simulation_$i.sh
> cd ..
> end
>
> However, this algorithm will run the next (i+1) simulation (.sh file) only
> after the previous one (i) has finished. How should I modify my script to
> run 2 simulations in each repetition, also giving each simulation the
> path to its specified GPU? (I guess in csh it could be done by adding
> "setenv CUDA_VISIBLE_DEVICES 0" or "setenv CUDA_VISIBLE_DEVICES 1" in the
> foreach loop of my main script.)
>
>
> Thanks for help,
>
>
> James
>
>
> 2014-06-23 18:30 GMT+04:00 Parker de Waal <Parker.deWaal.vai.org>:
>
> > Hi James,
> >
> > I don't believe this is a de-novo approach, rather an in-silico
> > introduction of backbone-dependent rotamers. When introducing point
> > mutations the structure of the protein will not be altered; a residue
> > is simply replaced with another.
> >
> > To your second point, yes you could do that. You could simply loop
> > over the pdb file, and subsequent mutant pdbs, until all the desired
> > mutations are applied. Ex:
> > BASEPDB -> 1, LEU -> 1LEUPDB
> > 1LEUPDB -> 2, ASN -> 2ASNPDB
> > etc etc.
> >
> > I'm sure there are better ways to do this, however this would be quite
> > simple to write. Good luck.
> >
> > Best,
> > Parker
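The chained scheme above (BASEPDB -> 1LEUPDB -> 2ASNPDB) could be sketched
in bash as follows. This is only an illustration: mutate_one is a
hypothetical wrapper whose placeholder body copies the file and appends a
REMARK line, and it would have to be replaced by the real mutate_model.py
call, whose output file naming is not assumed here. The output names
(1LEU.pdb, 2ASN.pdb) simply follow the pattern in the example.

```shell
#!/bin/bash
# Sketch: apply mutations cumulatively, each step reading the previous output.

# Hypothetical wrapper: mutate_one IN_PDB POSITION RESIDUE OUT_PDB
# Placeholder body only; swap in the real mutation command here.
mutate_one() {
    cp "$1" "$4" && echo "REMARK mutated $2 to $3" >> "$4"
}

# Apply every "position, residue" line of file $2 in order, starting from
# PDB $1. Prints the name of the final PDB on success.
chain_mutations() {
    local current="$1" list="$2" pos res next
    while IFS=', ' read -r pos res; do
        [ -n "$pos" ] || continue            # skip empty lines
        next="${pos}${res}.pdb"
        mutate_one "$current" "$pos" "$res" "$next" || return 1
        current="$next"                      # next step starts from this mutant
    done < "$list"
    echo "$current"
}
```

For example, chain_mutations BASE.pdb mutations.dat with the lines "1, LEU"
and "2, ASN" would write 1LEU.pdb from BASE.pdb and then 2ASN.pdb from
1LEU.pdb.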
> > ________________________________________
> > From: James Starlight [jmsstarlight.gmail.com]
> > Sent: Monday, June 23, 2014 7:48 AM
> > To: AMBER Mailing List
> > Subject: Re: [AMBER] bash scripting for MD tasks
> >
> > Thanks for suggestions!
> >
> > In fact this script uses the input WT pdb as the template + the list of
> > mutations in mutations.dat for de-novo modelling of the mutated protein
> > using modeller, doesn't it? If yes, my questions: 1) How will the
> > conformation of flexible parts of the mutated protein (e.g. loops) be
> > perturbed (in comparison to the WT model) when mutations are
> > introduced in these regions by such modelling?
> > 2) A trivial question, but it should be clarified: could such a python
> > script be used with multiple mutations.dat files to produce several
> > mutants from 1 WT model at once (e.g. by means of a new looping script
> > using python mutate_model.py, a dat file for each mutant and 1 WT pdb as
> > the inputs)?
> >
> > TFH,
> >
> >
> > James
> >
> >
> > 2014-06-22 18:07 GMT+04:00 Parker de Waal <Parker.deWaal.vai.org>:
> >
> > > while IFS=', ' read -r mutLoc aminoAcid ; do
> > > python mutate_model.py PDB_NAME "$mutLoc" "$aminoAcid" A > "$mutLoc.log"
> > > done <mutations.dat
> > >
> > > Fixed line break.
> > > -----Original Message-----
> > > From: Parker de Waal [mailto:Parker.deWaal.vai.org]
> > > Sent: Sunday, June 22, 2014 10:04 AM
> > > To: 'AMBER Mailing List'
> > > Subject: Re: [AMBER] bash scripting for MD tasks
> > >
> > > Hi James,
> > >
> > > I would highly recommend a quick google search for 'mutate pdb residue'.
> > > You'll find the second or third link provides a nice python script
> > > to interface with Modeller which you could easily automate to
> > > introduce hundreds of mutations.
> > >
> > > Example:
> > > #!/bin/bash
> > > cat > mutations.dat <<'EOF'
> > > 1, LEU
> > > 2, ASP
> > > 3, ASN
> > > 4, LYS
> > > EOF
> > >
> > > # Split each "position, residue" line on the comma and the following space.
> > > while IFS=', ' read -r mutLoc aminoAcid ; do
> > > python mutate_model.py PDB_NAME "$mutLoc" "$aminoAcid" A > "$mutLoc.log"
> > > done <mutations.dat
> > >
> > > Please note I did not test this script, however it should work.
> > >
> > > Parker
> > >
> > > -----Original Message-----
> > > From: James Starlight [mailto:jmsstarlight.gmail.com]
> > > Sent: Sunday, June 22, 2014 9:25 AM
> > > To: AMBER Mailing List
> > > Subject: Re: [AMBER] bash scripting for MD tasks
> > >
> > > Thanks Dan,
> > >
> > > One of my tasks consists of the quick introduction of point mutations
> > > into a given PDB of a membrane receptor and the subsequent creation of
> > > models of the mutated protein with tleap. I can easily make a script
> > > for the second part of this workflow, taking as input the mutated pdb
> > > and the solvated membrane in a merged pdb, but I have no idea how to
> > > introduce the mutations in the receptor quickly without using
> > > GUI programs like CHIMERA.
> > >
> > > I'll be thankful for any proposed solutions.
> > >
> > > James
> > >
> > >
> > > 2014-06-20 18:47 GMT+04:00 Daniel Roe <daniel.r.roe.gmail.com>:
> > >
> > > > See the supporting info in this publication:
> > > > http://pubs.acs.org/doi/abs/10.1021/jp4125099
> > > >
> > > > -Dan
> > > >
> > > >
> > > > On Fri, Jun 20, 2014 at 8:25 AM, James Starlight
> > > > <jmsstarlight.gmail.com> wrote:
> > > >
> > > > > One question about a possible algorithm for quickly checking the
> > > > > convergence of my trajectory (ies).
> > > > > For instance, I'd like to write a simple script that takes as
> > > > > input several trajectories of the same system with different
> > > > > lengths and checks (for instance by means of principal mode analysis
> > > > > or another not very expensive method) in which case my system has
> > > > > fully converged. Are there any examples of such scripts or ready
> > > > > workflows?
> > > > >
> > > > > James
> > > > >
> > > > >
> > > > > 2014-06-19 17:52 GMT+04:00 James Starlight <jmsstarlight.gmail.com>:
> > > > >
> > > > > > Thanks, Jason!
> > > > > >
> > > > > > This is very useful advice and you've made a great script
> > > > > > library! I'll try to follow your basic ideas in my own studies.
> > > > > >
> > > > > > James
> > > > > >
> > > > > >
> > > > > > 2014-06-17 23:25 GMT+04:00 Jason Swails <jason.swails.gmail.com>:
> > > > > >
> > > > > > On Tue, 2014-06-17 at 22:16 +0400, James Starlight wrote:
> > > > > >> > Hi Dan,
> > > > > >> >
> > > > > >> >
> > > > > >> > many thanks for the bash guide, I've found it very useful. In
> > > > > >> > general I'd like to look at some basic bash script examples
> > > > > >> > suitable for typical md jobs dealing with running many
> > > > > >> > simulations on clusters, because the most complicated examples,
> > > > > >> > like replica exchange simulation, are already present in the
> > > > > >> > amber tutorials.
> > > > > >>
> > > > > >> You've gotten the most helpful responses you can possibly get
> > > > > >> about your question so far, so I won't belabor the points others
> > > > > >> have made. I'll relay my own opinions on the topic, though.
> > > > > >>
> > > > > >> Scripting is, at its core, simply a tool we (as computational
> > > > > >> scientists) use to increase our efficiency and productivity.
> > > > > >> For example, high-throughput work like screening a database
> > > > > >> of millions of compounds cannot be done unless "scripted."
> > > > > >>
> > > > > >> When you are designing an experiment or calculation you want
> > > > > >> to perform, you have a list of tasks you need to get done.
> > > > > >> Designing a script to carry out these tasks requires you to
> > > > > >> divide your problem up into simpler chunks that can be easily
> > > > > >> represented with common logic structures in
> > > > > >> programming/scripting, like loops and simple conditionals.
> > > > > >> Writing the script is easy -- if you don't know the syntax of
> > > > > >> doing something like looping over a list, you can google your
> > > > > >> question and see that it has most likely been asked and
> > > > > >> answered several times on StackOverflow before.
> > > > > >>
> > > > > >> _Designing_ the script is the real challenge (it is an art).
> > > > > >> It is not something easily taught in a tutorial (nor is there
> > > > > >> any one "right" way to do it). You can use the existing
> > > > > >> tutorials, and the scripts written therein, to try and
> > > > > >> reverse-engineer the design and try to understand the thought
> > > > > >> process that led the tutorial authors to write it that way.
> > > > > >> Then if you're ambitious, try improving it.
> > > > > >>
> > > > > >> When you are doing your own project, focus on carrying out
> > > > > >> your experiment. If you come up to a part that is
> > > > > >> particularly repetitive or something that fits conceptually
> > > > > >> into a scripting or programming paradigm, write a script to
> > > > > >> handle that part (Googling your question when you don't know
> > > > > >> how to do something). The more you do this, the better you
> > > > > >> will get at scripting and the more you will be able to
> > > > > >> automate your workflows.
> > > > > >>
> > > > > >> If you find yourself doing the same thing over and over for
> > > > > >> different projects (like imaging a trajectory or RMS-fitting
> > > > > >> your system with cpptraj or computing a distance and plotting
> > > > > >> the result), try to write a script to automate that task. As
> > > > > >> your experience in the field grows, so too will your library
> > > > > >> of scripts you find useful and your scripting ability overall.
> > > > > >> Mine is here: https://github.com/swails/jmsscripts/ and a
> > > > > >> trained eye can clearly see which ones I wrote when I was
> > > > > >> experienced and which I didn't.
> > > > > >>
> > > > > >> 6 years ago, I had never used Unix before. I was decent at
> > > > > >> scripting within a few months and quite strong within a year
> > > > > >> or two -- all following the above advice. That which is
> > > > > >> self-learned is learned the best (and is remembered the
> > > > > >> longest).
> > > > > >>
> > > > > >> Always rambling,
> > > > > >> Jason
> > > > > >>
> > > > > >> --
> > > > > >> Jason M. Swails
> > > > > >> BioMaPS,
> > > > > >> Rutgers University
> > > > > >> Postdoctoral Researcher
> > > > > >>
> > > > > >>
> > > > > >> _______________________________________________
> > > > > >> AMBER mailing list
> > > > > >> AMBER.ambermd.org
> > > > > >> http://lists.ambermd.org/mailman/listinfo/amber
> > > > > >>
> > > > > >
> > > > > >
> > > > >
> > > >
> > > >
> > > >
> > > > --
> > > > -------------------------
> > > > Daniel R. Roe, PhD
> > > > Department of Medicinal Chemistry
> > > > University of Utah
> > > > 30 South 2000 East, Room 201
> > > > Salt Lake City, UT 84112-5820
> > > > http://home.chpc.utah.edu/~cheatham/
> > > > (801) 587-9652
> > > > (801) 585-6208 (Fax)
> > > >
_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber
Received on Mon Sep 08 2014 - 12:30:03 PDT