Hi,
As in ptraj, reference structures can be specified with the
'reference' command in cpptraj. In general, reference structures are
not needed for clustering. In your case you need one because you are
using a distance-based mask.
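A minimal, untested sketch of how that might look at the top of your
input. The file name 'ref_structure.rst7' is only a placeholder, and
this assumes it is a coordinate file that matches the topology you
already load (e.g. via -p on the command line):
  # hypothetical reference file name
  reference ref_structure.rst7
  # ...followed by your existing trajin, strip and cluster commands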
-Dan
On Mon, Aug 11, 2014 at 4:13 AM, Valentina Romano
<valentina.romano.unibas.ch> wrote:
> Hi
>
> I read the details about the cluster command in cpptraj and it was not clear to me how to specify which reference structure to use.
> In addition, what is the criterion for choosing a reference structure when an MD trajectory is clustered?
>
> Thank you
> Best,
> valentina
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> Valentina Romano | PhD Student | Biozentrum, University of Basel & SIB Swiss Institute of Bioinformatics
> Klingelbergstrasse 61 | CH-4056 Basel |
>
> Phone: +41 61 267 15 80
>
>
> ________________________________________
> From: Daniel Roe [daniel.r.roe.gmail.com]
> Sent: Friday, August 08, 2014 4:35 PM
> To: AMBER Mailing List
> Subject: Re: [AMBER] Cluster results
>
> OK, I think the issue is that you are trying to load your stripped
> cluster trajectory (i.e. trajectory without water) with your original
> topology. What you want to do is add something like 'outprefix nowat'
> to your 'strip' command. This will write out a stripped topology named
> nowat.<original topology name> that will correspond to your cluster
> trajectories/representatives. Also, unless you're using VMD on Windows,
> I really recommend using the NetCDF format, 'clusterfmt netcdf', as
> this format is superior to the Amber ASCII trajectory format in every
> way.
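>
> As a sketch of the two changes applied to your input (the prefix
> 'nowat' is arbitrary; everything else is taken from what you posted):
>
> trajin ../PknGAde_restr30ns.mdcrd
> # write a stripped topology, nowat.<original topology name>
> strip :WAT outprefix nowat
> cluster out cnumvtime.dat repout PknGAde_restr30ns.pdb repfmt pdb \
>   clusterout PknGAde_restr30ns.nc clusterfmt netcdf \
>   averagelinkage clusters 5 rms sieve 10 :247<:5.0
>
> The cluster trajectories would then be loaded in VMD together with the
> stripped topology rather than the original one.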
>
> -Dan
>
> On Fri, Aug 8, 2014 at 8:30 AM, Valentina Romano
> <valentina.romano.unibas.ch> wrote:
>> The input was:
>>
>> trajin ../PknGAde_restr30ns.mdcrd
>> strip :WAT
>> cluster out cnumvtime.dat repout PknGAde_restr30ns.pdb repfmt pdb clusterout PknGAde_restr30ns.nc
>> averagelinkage clusters 5 rms sieve 10 :247<:5.0
>>
>> I tried both ('Amber coordinates' and 'Amber coordinates with periodic box') and neither worked.
>>
>> Vale
>>
>> ________________________________________
>> From: Daniel Roe [daniel.r.roe.gmail.com]
>> Sent: Friday, August 08, 2014 4:25 PM
>> To: AMBER Mailing List
>> Subject: Re: [AMBER] Cluster results
>>
>> Hi,
>>
>> Without at least the input you gave to cpptraj (and ideally the
>> output) I can only guess, but perhaps you have box coordinates in your
>> amber trajectory and you are loading them as 'Amber coordinates'
>> instead of 'Amber coordinates with periodic box' (or vice versa).
>>
>> -Dan
>>
>> On Fri, Aug 8, 2014 at 8:10 AM, Valentina Romano
>> <valentina.romano.unibas.ch> wrote:
>>> Hi,
>>>
>>> Thank you.
>>>
>>> I have an additional question.
>>> I used cpptraj, as you suggested before, and I got 5 <trajfileprefix>.cx files. If I load them in VMD as Amber coordinates they look very strange.
>>> Do you know why?
>>>
>>> Cheers,
>>> Valentina
>>>
>>>
>>> ________________________________________
>>> From: Daniel Roe [daniel.r.roe.gmail.com]
>>> Sent: Friday, August 08, 2014 3:56 PM
>>> To: AMBER Mailing List
>>> Subject: Re: [AMBER] Cluster results
>>>
>>> Hi,
>>>
>>> On Fri, Aug 8, 2014 at 5:42 AM, Valentina Romano
>>> <valentina.romano.unibas.ch> wrote:
>>>> About the mask I used (the correct one: :247<:5.0):
>>>> I would like to compare frames using the ligand (residue number 247) and all residues within 5 A of it (<:5.0).
>>>> Is ':247<:5.0' correct for this purpose?
>>>> If I do not specify any reference structure, is the first frame used as the reference by default? And, if so, is that correct?
>>>
>>> The mask is right for what you want. You need to specify a reference
>>> in cpptraj for distance-based masks to work, and I believe the same is
>>> true in ptraj. It will "work" in the sense that the residues initially
>>> selected by that mask based on the reference structure will be the
>>> residues used for calculating the RMS metric throughout the
>>> trajectory. Whether it will work the way you want or not depends on
>>> how much the structure is changing in that region. For example, if
>>> there is some conformational shift near the beginning of the
>>> trajectory where one of the residues initially close to 247 moves a
>>> distance away, you will probably get consistently high RMSDs between
>>> structures and your clustering will be affected accordingly. The best
>>> way to ascertain the effect is to try it out and see. Pay close
>>> attention to what cluster sizes you get, what the representatives look
>>> like, etc. Clustering is usually more of an art form than a science,
>>> and it takes some playing around with it to get it right.
>>>
>>> Also, if you haven't already, check out this article on clustering:
>>> http://pubs.acs.org/doi/abs/10.1021/ct700119m
>>>
>>> Very in-depth but worth the time.
>>>
>>> -Dan
>>>
>>>>
>>>> Thank you.
>>>>
>>>> Best,
>>>> Vale
>>>>
>>>>
>>>> ________________________________________
>>>> From: Daniel Roe [daniel.r.roe.gmail.com]
>>>> Sent: Thursday, August 07, 2014 4:18 PM
>>>> To: AMBER Mailing List
>>>> Subject: Re: [AMBER] Cluster results
>>>>
>>>> Hi,
>>>>
>>>> A few comments based on your ptraj output. First, as Tom mentioned, in
>>>> ptraj if the 'all' cluster trajectory reached a certain file size it
>>>> was split up into several chunks to prevent the file size from
>>>> exceeding the file system's size limit, which years ago was commonly 4
>>>> GB (or even 2 GB). I think the limit encoded in ptraj is even smaller
>>>> than that though.
>>>>
>>>> The reason your cluster trajectories are so large is it appears you
>>>> are saving all atoms, solvent included. Unless you really need the
>>>> solvent for some reason I recommend that you strip the water prior to
>>>> clustering with 'strip :WAT'.
>>>>
>>>> Also, the mask you are using for RMS fitting appears malformed - it
>>>> looks like you tried to have a distance-dependent mask. What you have
>>>> is:
>>>>
>>>> :247 <:5.0
>>>>
>>>> The space means ':247' and '<:5.0' are treated as separate tokens. You
>>>> can see this in your output:
>>>>
>>>> MASK = :247
>>>> Mask [:247] represents 15 atoms
>>>>
>>>> To get this mask processed correctly you either need to remove the
>>>> space or enclose it in quotes. Also, this mask may not do what you
>>>> expect. Distance-dependent masks in ptraj and cpptraj are set up once
>>>> based on reference structures (except for the 'mask' action in
>>>> cpptraj). In your case since no atoms were stripped this might include
>>>> whatever water molecules are present in your reference, which will
>>>> likely have drifted far away in some frames leading to crazy RMSD
>>>> values. If you can elaborate on what your purpose was for using a
>>>> distance-dependent mask I may be able to make a better recommendation.
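>>>>
>>>> As an illustration of the two fixes mentioned above (only the end of
>>>> the cluster command is shown; the preceding keywords stay as they
>>>> are):
>>>>
>>>> rms sieve 10 :247<:5.0
>>>> rms sieve 10 ":247 <:5.0"
>>>>
>>>> In either form the output should then report the full
>>>> distance-based mask rather than just [:247].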
>>>>
>>>> Finally, I recommend you use cpptraj, which is faster, more stable,
>>>> and better-supported. The syntax for coordinate output in clustering
>>>> is slightly different, so you might use the following input:
>>>>
>>>> strip :WAT
>>>> cluster out cnumvtime.dat repout PknGAde_restr30ns.pdb repfmt pdb \
>>>> clusterout PknGAde_restr30ns.nc clusterfmt netcdf \
>>>> averagelinkage clusters 5 rms sieve 10 :247<:5.0
>>>>
>>>> Remember, for the distance dependent mask to work you will need to
>>>> load a reference. See the Amber 14 manual for full details on the
>>>> keywords.
>>>>
>>>> Hope this helps,
>>>>
>>>> -Dan
>>>>
>>>>
>>>> On Thu, Aug 7, 2014 at 2:34 AM, Valentina Romano
>>>> <valentina.romano.unibas.ch> wrote:
>>>>> Hi
>>>>>
>>>>> I got both <filename>.rep.c<#> and <filename>.avg.c<#>. In detail, I got 5 representative structures and 5 average structures (1 per cluster).
>>>>> Please find attached the ptraj output file I got.
>>>>> I hope this helps.
>>>>>
>>>>> Best,
>>>>> Vale
>>>>>
>>>>> ________________________________________
>>>>> From: Daniel Roe [daniel.r.roe.gmail.com]
>>>>> Sent: Wednesday, August 06, 2014 5:09 PM
>>>>> To: AMBER Mailing List
>>>>> Subject: Re: [AMBER] Cluster results
>>>>>
>>>>> Hi,
>>>>>
>>>>> Just to check, did you also get files <filename>.rep.c<#> and
>>>>> <filename>.avg.c<#> (for representative and average structures
>>>>> respectively)? The only way I can think of you getting files named
>>>>> <filename>.c<#>.X is if you chose a file format that writes multiple
>>>>> frames (like Amber restart or PDB etc). I think we would need to see
>>>>> your entire ptraj output to debug further.
>>>>>
>>>>> As Brian suggested, if you're just using averagelinkage you may want
>>>>> to try using cpptraj instead; it's faster and better-supported
>>>>> overall.
>>>>>
>>>>> -Dan
>>>>>
>>>>> On Wed, Aug 6, 2014 at 8:27 AM, Valentina Romano
>>>>> <valentina.romano.unibas.ch> wrote:
>>>>>> This is the cluster command I used:
>>>>>>
>>>>>> trajin ../PknGAde_md_rest20ns_reimage.mdcrd
>>>>>> cluster out PknGAde_restr20ns representative pdb average pdb all amber averagelinkage clusters 5 rms sieve 10 :247 <:5.0
>>>>>>
>>>>>> Suggestions?
>>>>>>
>>>>>> vale
>>>>>>
>>>>>>
>>>>>> ________________________________________
>>>>>> From: Brian Radak [radak004.umn.edu]
>>>>>> Sent: Wednesday, August 06, 2014 3:37 PM
>>>>>> To: AMBER Mailing List
>>>>>> Subject: Re: [AMBER] Cluster results
>>>>>>
>>>>>> I think you'll have to provide the specific cluster command you used if you
>>>>>> want help on this.
>>>>>>
>>>>>> Also, if I remember correctly, the cluster implementations in ptraj and
>>>>>> cpptraj are a bit different, but the latter is still recommended.
>>>>>>
>>>>>> Regards,
>>>>>> Brian
>>>>>>
>>>>>>
>>>>>> On Wed, Aug 6, 2014 at 8:22 AM, Valentina Romano <valentina.romano.unibas.ch> wrote:
>>>>>>
>>>>>>> Dear Amber users
>>>>>>>
>>>>>>> I clustered an MD trajectory using the cluster command in ptraj. I used 5
>>>>>>> as the number of clusters.
>>>>>>>
>>>>>>> In addition to 5 filename.ci files (e.g. filename.c0, filename.c1, etc.), I
>>>>>>> obtained additional files for each .ci file.
>>>>>>> For instance, in addition to filename.c0 I also obtained filename.c0.1,
>>>>>>> filename.c0.2 and so on.
>>>>>>>
>>>>>>> Can anyone explain to me what these files are?
>>>>>>>
>>>>>>> Best,
>>>>>>> Valentina
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> AMBER mailing list
>>>>>>> AMBER.ambermd.org
>>>>>>> http://lists.ambermd.org/mailman/listinfo/amber
>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> ================================ Current Address =======================
>>>>>> Brian Radak : BioMaPS
>>>>>> Institute for Quantitative Biology
>>>>>> PhD candidate - York Research Group : Rutgers, The State
>>>>>> University of New Jersey
>>>>>> University of Minnesota - Twin Cities : Center for
>>>>>> Integrative Proteomics Room 308
>>>>>> Graduate Program in Chemical Physics : 174 Frelinghuysen Road,
>>>>>> Department of Chemistry : Piscataway, NJ
>>>>>> 08854-8066
>>>>>> radak004.umn.edu :
>>>>>> ====================================================================
>>>>> --
>>>>> -------------------------
>>>>> Daniel R. Roe, PhD
>>>>> Department of Medicinal Chemistry
>>>>> University of Utah
>>>>> 30 South 2000 East, Room 307
>>>>> Salt Lake City, UT 84112-5820
>>>>> http://home.chpc.utah.edu/~cheatham/
>>>>> (801) 587-9652
>>>>> (801) 585-6208 (Fax)
--
-------------------------
Daniel R. Roe, PhD
Department of Medicinal Chemistry
University of Utah
30 South 2000 East, Room 307
Salt Lake City, UT 84112-5820
http://home.chpc.utah.edu/~cheatham/
(801) 587-9652
(801) 585-6208 (Fax)
_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber
Received on Mon Aug 11 2014 - 09:30:03 PDT