Re: [AMBER] Cluster results

From: Valentina Romano <valentina.romano.unibas.ch>
Date: Mon, 11 Aug 2014 10:13:24 +0000

Hi

I read the details of the cluster command in cpptraj, but it was not clear to me how to specify which reference structure to use.
In addition, what is the criterion for choosing a reference structure when an MD trajectory is clustered?

Thank you
Best,
valentina
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Valentina Romano | PhD Student | Biozentrum, University of Basel & SIB Swiss Institute of Bioinformatics
Klingelbergstrasse 61 | CH-4056 Basel |

Phone: +41 61 267 15 80


________________________________________
From: Daniel Roe [daniel.r.roe.gmail.com]
Sent: Friday, August 08, 2014 4:35 PM
To: AMBER Mailing List
Subject: Re: [AMBER] Cluster results

OK, I think the issue is that you are trying to load your stripped
cluster trajectory (i.e. trajectory without water) with your original
topology. What you want to do is add something like 'outprefix nowat'
to your 'strip' command. This will write out a stripped topology named
nowat.<original topology name> that will correspond to your cluster
trajectories/representatives. Also, unless you're using VMD on Windows
I really recommend using the NetCDF format, 'clusterfmt netcdf', as
this format is superior to the Amber ASCII trajectory format in every
way.
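
A sketch of what the modified input might look like, based on the input
quoted below with only 'outprefix nowat' and 'clusterfmt netcdf' added.
It is untested, and it does not show the reference that the
distance-based mask still needs:

```
trajin ../PknGAde_restr30ns.mdcrd
# 'outprefix nowat' writes a stripped topology named nowat.<original topology name>
strip :WAT outprefix nowat
# A reference must also be loaded for the distance-based mask (not shown here)
cluster out cnumvtime.dat repout PknGAde_restr30ns.pdb repfmt pdb \
  clusterout PknGAde_restr30ns.nc clusterfmt netcdf \
  averagelinkage clusters 5 rms sieve 10 :247<:5.0
```

The cluster trajectories would then be loaded in VMD together with the
nowat.<original topology name> file rather than the original topology.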

-Dan

On Fri, Aug 8, 2014 at 8:30 AM, Valentina Romano
<valentina.romano.unibas.ch> wrote:
> The input was:
>
> trajin ../PknGAde_restr30ns.mdcrd
> strip :WAT
> cluster out cnumvtime.dat repout PknGAde_restr30ns.pdb repfmt pdb clusterout PknGAde_restr30ns.nc
> averagelinkage clusters 5 rms sieve 10 :247<:5.0
>
> I tried both (Amber coordinates and Amber coordinates with periodic box) and neither worked.
>
> Vale
>
> ________________________________________
> From: Daniel Roe [daniel.r.roe.gmail.com]
> Sent: Friday, August 08, 2014 4:25 PM
> To: AMBER Mailing List
> Subject: Re: [AMBER] Cluster results
>
> Hi,
>
> Without at least the input you gave to cpptraj (and ideally the
> output) I can only guess, but perhaps you have box coordinates in your
> amber trajectory and you are loading them as 'Amber coordinates'
> instead of 'Amber coordinates with periodic box' (or vice versa).
>
> -Dan
>
> On Fri, Aug 8, 2014 at 8:10 AM, Valentina Romano
> <valentina.romano.unibas.ch> wrote:
>> Hi,
>>
>> Thank you.
>>
>> I have an additional question.
>> I used cpptraj, as you suggested before, and I got 5 <trajfileprefix>.cx files. If I load them in VMD as Amber coordinates they look very strange.
>> Do you know why?
>>
>> Cheers,
>> Valentina
>>
>> ________________________________________
>> From: Daniel Roe [daniel.r.roe.gmail.com]
>> Sent: Friday, August 08, 2014 3:56 PM
>> To: AMBER Mailing List
>> Subject: Re: [AMBER] Cluster results
>>
>> Hi,
>>
>> On Fri, Aug 8, 2014 at 5:42 AM, Valentina Romano
>> <valentina.romano.unibas.ch> wrote:
>>> About the mask I used ( the correct one: :247<:5.0):
>>> I would like to compare frames using the ligand (residue number 247) and all residues within 5 A from it (<:5.0).
>>> Is the ':247<:5.0' correct for this purpose?
>>> If I do not specify any reference structure, is the first frame used as the reference by default? And if so, is that correct?
>>
>> The mask is right for what you want. You need to specify a reference
>> in cpptraj for distance-based masks to work, and I believe the same is
>> true in ptraj. It will "work" in the sense that the residues initially
>> selected by that mask based on the reference structure will be the
>> residues used for calculating the RMS metric throughout the
>> trajectory. Whether it will work the way you want or not depends on
>> how much the structure is changing in that region. For example, if
>> there is some conformational shift near the beginning of the
>> trajectory where one of the residues initially close to 247 moves a
>> distance away, you will probably get consistently high RMSDs between
>> structures and your clustering will be affected accordingly. The best
>> way to ascertain the effect is to try it out and see. Pay close
>> attention to what cluster sizes you get, what the representatives look
>> like, etc. Clustering is usually more of an art form than a science,
>> and it takes some playing around with it to get it right.
>>
>> Also, if you haven't already, check out this article on clustering:
>> http://pubs.acs.org/doi/abs/10.1021/ct700119m
>>
>> Very in-depth but worth the time.
>>
>> -Dan
>>
>>>
>>> Thank you.
>>>
>>> Best,
>>> Vale
>>>
>>> ________________________________________
>>> From: Daniel Roe [daniel.r.roe.gmail.com]
>>> Sent: Thursday, August 07, 2014 4:18 PM
>>> To: AMBER Mailing List
>>> Subject: Re: [AMBER] Cluster results
>>>
>>> Hi,
>>>
>>> A few comments based on your ptraj output. First, as Tom mentioned, in
>>> ptraj if the 'all' cluster trajectory reached a certain file size it
>>> was split up into several chunks to prevent the file size from
>>> exceeding the file system's size limit, which years ago was commonly 4
>>> GB (or even 2 GB). I think the limit encoded in ptraj is even smaller
>>> than that though.
>>>
>>> The reason your cluster trajectories are so large is it appears you
>>> are saving all atoms, solvent included. Unless you really need the
>>> solvent for some reason I recommend that you strip the water prior to
>>> clustering with 'strip :WAT'.
>>>
>>> Also, the mask you are using for RMS fitting appears malformed - it
>>> looks like you tried to have a distance-dependent mask. What you have
>>> is:
>>>
>>> :247 <:5.0
>>>
>>> The space means ':247' and '<:5.0' are treated as separate tokens. You
>>> can see this in your output:
>>>
>>> MASK = :247
>>> Mask [:247] represents 15 atoms
>>>
>>> To get this mask processed correctly you either need to remove the
>>> space or enclose it in quotes. Also, this mask may not do what you
>>> expect. Distance-dependent masks in ptraj and cpptraj are set up once
>>> based on reference structures (except for the 'mask' action in
>>> cpptraj). In your case since no atoms were stripped this might include
>>> whatever water molecules are present in your reference, which will
>>> likely have drifted far away in some frames leading to crazy RMSD
>>> values. If you can elaborate on what your purpose was for using a
>>> distance-dependent mask I may be able to make a better recommendation.
>>>
>>> Finally, I recommend you use cpptraj, which is faster, more stable,
>>> and better-supported. The syntax for coordinate output in clustering
>>> is slightly different, so you might use the following input:
>>>
>>> strip :WAT
>>> cluster out cnumvtime.dat repout PknGAde_restr30ns.pdb repfmt pdb \
>>> clusterout PknGAde_restr30ns.nc clusterfmt netcdf \
>>> averagelinkage clusters 5 rms sieve 10 :247<:5.0
>>>
>>> Remember, for the distance dependent mask to work you will need to
>>> load a reference. See the Amber 14 manual for full details on the
>>> keywords.
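>>>
>>> One way the reference might be loaded, as a sketch only: the topology
>>> name is a placeholder, and using frame 1 of the input trajectory as the
>>> reference is an assumption, not a recommendation. These lines would go
>>> before the 'strip' and 'cluster' lines above:
>>>
>>> ```
>>> # Placeholder topology name; substitute your own file
>>> parm PknGAde.prmtop
>>> # Assumed choice: frame 1 of the input trajectory as the reference
>>> reference ../PknGAde_md_rest20ns_reimage.mdcrd 1
>>> trajin ../PknGAde_md_rest20ns_reimage.mdcrd
>>> ```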
>>>
>>> Hope this helps,
>>>
>>> -Dan
>>>
>>>
>>> On Thu, Aug 7, 2014 at 2:34 AM, Valentina Romano
>>> <valentina.romano.unibas.ch> wrote:
>>>> Hi
>>>>
>>>> I got both <filename>.rep.c<#> and <filename>.avg.c<#>. In detail, I got 5 representative structures and 5 average structures (one per cluster).
>>>> Please find attached the ptraj output file I got.
>>>> I hope this helps.
>>>>
>>>> Best,
>>>> Vale
>>>>
>>>> ________________________________________
>>>> From: Daniel Roe [daniel.r.roe.gmail.com]
>>>> Sent: Wednesday, August 06, 2014 5:09 PM
>>>> To: AMBER Mailing List
>>>> Subject: Re: [AMBER] Cluster results
>>>>
>>>> Hi,
>>>>
>>>> Just to check, did you also get files <filename>.rep.c<#> and
>>>> <filename>.avg.c<#> (for representative and average structures
>>>> respectively)? The only way I can think of you getting files named
>>>> <filename>.c<#>.X is if you chose a file format that writes multiple
>>>> frames (like Amber restart or PDB etc). I think we would need to see
>>>> your entire ptraj output to debug further.
>>>>
>>>> As Brian suggested, if you're just using averagelinkage you may want
>>>> to try using cpptraj instead; it's faster and better-supported
>>>> overall.
>>>>
>>>> -Dan
>>>>
>>>> On Wed, Aug 6, 2014 at 8:27 AM, Valentina Romano
>>>> <valentina.romano.unibas.ch> wrote:
>>>>> This is the cluster command I used:
>>>>>
>>>>> trajin ../PknGAde_md_rest20ns_reimage.mdcrd
>>>>> cluster out PknGAde_restr20ns representative pdb average pdb all amber averagelinkage clusters 5 rms sieve 10 :247 <:5.0
>>>>>
>>>>> Suggestions?
>>>>>
>>>>> vale
>>>>>
>>>>> ________________________________________
>>>>> From: Brian Radak [radak004.umn.edu]
>>>>> Sent: Wednesday, August 06, 2014 3:37 PM
>>>>> To: AMBER Mailing List
>>>>> Subject: Re: [AMBER] Cluster results
>>>>>
>>>>> I think you'll have to provide the specific cluster command you used if you
>>>>> want help on this.
>>>>>
>>>>> Also, if I remember correctly, the cluster implementations in ptraj and
>>>>> cpptraj are a bit different, but the latter is still recommended.
>>>>>
>>>>> Regards,
>>>>> Brian
>>>>>
>>>>>
>>>>> On Wed, Aug 6, 2014 at 8:22 AM, Valentina Romano <valentina.romano.unibas.ch> wrote:
>>>>>
>>>>>> Dear Amber users
>>>>>>
>>>>>> I clustered an MD trajectory using the cluster command in ptraj. I used 5
>>>>>> as the number of clusters.
>>>>>>
>>>>>> In addition to 5 filename.ci files (e.g. filename.c0, filename.c1 etc.), I
>>>>>> obtained additional files for each .ci file.
>>>>>> For instance, in addition to filename.c0 I also obtained filename.c0.1,
>>>>>> filename.c0.2 and so on.
>>>>>>
>>>>>> Can anyone explain what these files are?
>>>>>>
>>>>>> Best,
>>>>>> Valentina
>>>>>>
>>>>>> _______________________________________________
>>>>>> AMBER mailing list
>>>>>> AMBER.ambermd.org
>>>>>> http://lists.ambermd.org/mailman/listinfo/amber
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> ================================ Current Address =======================
>>>>> Brian Radak                            : BioMaPS Institute for Quantitative Biology
>>>>> PhD candidate - York Research Group    : Rutgers, The State University of New Jersey
>>>>> University of Minnesota - Twin Cities  : Center for Integrative Proteomics Room 308
>>>>> Graduate Program in Chemical Physics   : 174 Frelinghuysen Road,
>>>>> Department of Chemistry                : Piscataway, NJ 08854-8066
>>>>> radak004.umn.edu                       :
>>>>> ====================================================================
>>>>
>>>>
>>>>
>>>> --
>>>> -------------------------
>>>> Daniel R. Roe, PhD
>>>> Department of Medicinal Chemistry
>>>> University of Utah
>>>> 30 South 2000 East, Room 307
>>>> Salt Lake City, UT 84112-5820
>>>> http://home.chpc.utah.edu/~cheatham/
>>>> (801) 587-9652
>>>> (801) 585-6208 (Fax)
Received on Mon Aug 11 2014 - 03:30:03 PDT