Thank you very much, guys!
Very useful! I'll experiment with those combinations of the commands
and let you know how it goes. :)
James
2014-09-24 15:27 GMT+02:00 Hannes Loeffler <Hannes.Loeffler.stfc.ac.uk>:
> There is nothing wrong with a one-off script, but there is a problem
> with a, well, less than optimal suggestion made to someone who doesn't
> understand what a given piece of code is actually doing, and which
> breaks on the second example it is thrown at. As I tried to point out,
> your example is not doing what it is expected to do. A real solution
> should not just work coincidentally, and assumptions about the input
> should be as minimal as possible. PDB parsing is so common in our field
> that it is most likely worthwhile to learn how to use a "proper" parser
> (ideally one that can handle all those PDB variants out there).
>
> The 4th column in a PDB file is actually the fourth character of the
> record name. I suggest having a close look at the documentation at
> http://www.wwpdb.org/documentation/format33/v3.3.html . The residue
> name is really in columns 18-20, but some "extensions" may use 18-21
> unless column 21 is "abused" to extend the chain ID. So is, starting
> from column 18, 'ATOMA' a residue named 'ATOM' and chain 'A', or a
> residue named 'ATO' and chain 'MA', or does 'M' have no meaning
> whatsoever? What about, starting from column 17(!), 'ATOM' = alternate
> locator 'A' and residue name 'TOM' for a standard PDB, etc., etc.?
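
The column layout described above can be illustrated with a minimal Python
sketch. It assumes the standard ATOM/HETATM layout from the v3.3 format
document linked above (record name in columns 1-6, atom name in 13-16,
alternate locator in 17, residue name in 18-20, chain ID in 22, residue
number in 23-26, insertion code in 27). The function name parse_atom_fields
is hypothetical, and the 4-character residue-name "extensions" are
deliberately not handled.

```python
def parse_atom_fields(line):
    """Slice an ATOM/HETATM record by fixed columns; return None otherwise."""
    line = line.ljust(27)  # pad short lines so every slice below is valid
    record = line[0:6].strip()
    if record not in ("ATOM", "HETATM"):
        return None
    return {
        "record": record,
        "atom_name": line[12:16].strip(),  # columns 13-16
        "alt_loc": line[16].strip(),       # column 17
        "res_name": line[17:20].strip(),   # columns 18-20
        "chain_id": line[21].strip(),      # column 22
        "res_seq": line[22:26].strip(),    # columns 23-26
        "i_code": line[26].strip(),        # column 27
    }


# Hypothetical records, assembled field by field to make the columns explicit.
example = "ATOM  " + "    2" + " " + " CA " + " " + "CYX" + " " + "A" + "  22" + " "
crowded = "HETATM" + "12345" + " " + " SG " + "A" + "CYX" + " " + "B" + "1234" + " "

print(parse_atom_fields(example)["res_name"])  # CYX
print(parse_atom_fields(crowded)["res_name"])  # CYX
print(crowded.split())  # ['HETATM12345', 'SG', 'ACYX', 'B1234']
```

In the second record the fields run into each other, so splitting on
whitespace no longer yields the residue name as a separate item, while
the column slices still do.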
>
> Cheers,
> Hannes.
>
>
> On Wed, 24 Sep 2014 14:45:11 +0200
> Anselm Horn <Anselm.Horn.biochem.uni-erlangen.de> wrote:
>
> > I totally agree that my proposal is far from perfect.
> >
> > However, it should serve as an example of a potential solution to the
> > problem posed (the number of CYX residues), with all its limitations
> > in mind (for many PDB files the fourth column actually holds the
> > residue name). Of course, there is a difference between a script for
> > a specialized task and a more general one.
> >
> > And I further agree that this is not an AMBER-related topic.
> >
> > Regards,
> >
> > Anselm
> >
> >
> > On 24.09.2014 14:18, Hannes Loeffler wrote:
> > > Ouch.
> > >
> > > Just a few problems I spotted with this:
> > > 1) grep ATOM: will return _any_ line with the string 'ATOM' occurring
> > > _anywhere_ on the line; some codes may also decide to use 'HETATM'
> > > instead because CYX is non-standard
> > > 2) grep CA: same as above; it could be a calcium atom, part of a
> > > residue name or segment name, possibly some other abuse of the
> > > format, or simply any occurrence on a non-ATOM/HETATM record
> > > 3) awk: the ancient PDB format is a _fixed-column_ format, which also
> > > implies that columns can "run" into each other, while awk splits (by
> > > default) on whitespace, which may not be there; also, the residue
> > > name is not the 4th datum in an ATOM/HETATM record.
> > >
> > > The two-line code doubles this, but why would one need to parse the
> > > input twice anyway?
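
As a rough illustration of the points above, a single pass over the file
can count residues by fixed columns instead of chaining grep and awk.
This is only a sketch under stated assumptions: the names count_residues
and make_line are hypothetical, the residue name is taken from columns
18-20 only, and a residue is identified by chain ID, residue number and
insertion code (so identical residues repeated across several MODEL
blocks are counted once).

```python
def count_residues(lines, res_name="CYX"):
    """Count distinct residues named res_name in one pass over PDB lines."""
    seen = set()
    for line in lines:
        # Only the record name in columns 1-6 decides whether this is an atom.
        if line[0:6].strip() not in ("ATOM", "HETATM"):
            continue
        line = line.ljust(27)  # pad short lines so the slices are valid
        if line[17:20].strip() != res_name:  # residue name, columns 18-20
            continue
        # Chain ID (22), residue number (23-26) and insertion code (27).
        seen.add((line[21], line[22:26], line[26]))
    return len(seen)


def make_line(record, serial, atom, res, chain, resseq):
    """Hypothetical helper: build a truncated ATOM/HETATM line for the demo."""
    return "{:<6}{:>5} {:<4} {:>3} {}{:>4} ".format(
        record, serial, atom, res, chain, resseq
    )


sample = [
    "REMARK   1 ATOM CA CYX mentioned in free text",
    make_line("ATOM", 1, " N", "CYX", "A", 22),
    make_line("ATOM", 2, " CA", "CYX", "A", 22),
    make_line("HETATM", 3, " N", "CYX", "A", 45),
    make_line("HETATM", 4, " CA", "CYX", "A", 45),
    make_line("ATOM", 5, " CA", "CYS", "A", 50),
    make_line("HETATM", 6, "CA", "CA", "B", 101),  # a calcium ion
]

print(count_residues(sample))  # 2
```

For a real file one would pass an open file object instead of the sample
list, since the function only iterates over lines.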
> > >
> > > Cheers,
> > > Hannes.
> >
> >
> >
> > _______________________________________________
> > AMBER mailing list
> > AMBER.ambermd.org
> > http://lists.ambermd.org/mailman/listinfo/amber
Received on Wed Sep 24 2014 - 08:00:03 PDT