Re: [AMBER] Large trajectory for cluster analysis in ptraj

From: Daniel Roe <daniel.r.roe.gmail.com>
Date: Thu, 27 Sep 2012 21:34:20 -0600

Hi,

On Thu, Sep 27, 2012 at 9:24 PM, Ganesh Kamath <gkamath9173.gmail.com> wrote:
> Hi Dan,
> Out of curiosity: I have never tested ptraj on systems with a million
> atoms or more. Will ptraj be able to process such a trajectory, provided
> the nodes have enough memory?

As long as the memory is available, ptraj/cpptraj should be OK; other
than that hard limit in the ptraj clustering code, I am not aware of
any other size restrictions. Of course, I have not yet had an
opportunity to test the code on such massive systems, so I can't
guarantee anything. It should probably go on my 'to-do' list...
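
For a rough sense of what "as long as the memory is available" means,
here is a back-of-the-envelope sketch in Python. The 8 bytes per
coordinate and 4 bytes per stored distance are assumptions made for
illustration, not statements about how ptraj/cpptraj lay out memory, and
the function names are hypothetical.

```python
def coords_bytes(n_atoms, n_frames, bytes_per_value=8):
    """Bytes needed to hold x, y, z for every atom of every frame at once.

    bytes_per_value=8 assumes double precision (an assumption).
    """
    return n_atoms * n_frames * 3 * bytes_per_value


def pair_matrix_bytes(n_frames, bytes_per_value=4):
    """Bytes for the upper triangle of a frame-by-frame distance matrix.

    bytes_per_value=4 assumes single-precision distances (an assumption).
    """
    return n_frames * (n_frames - 1) // 2 * bytes_per_value


# One frame of a million-atom system.
print(f"{coords_bytes(1_000_000, 1) / 1e6:.1f} MB per frame")

# All-pairs distances between 60,000 frames.
print(f"{pair_matrix_bytes(60_000) / 1e9:.1f} GB for the pair matrix")
```

The coordinate cost grows linearly with the number of frames, while an
all-pairs distance matrix grows quadratically, so for clustering the
frame count tends to matter more than the atom count.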

> Most of the time I would like to analyze trajectories holding in excess of
> 100 GB of data, of course using submission scripts on nodes with 32 cores.
> Again, I have

Be aware that ptraj.MPI is currently deprecated. Cpptraj will compile
with OpenMP, but not all actions make use of the parallelization yet
(currently only closest, distance-based mask parsing, radial,
secstruct, and surf). The next generation of cpptraj will have further
OpenMP parallelization, and maybe even MPI...

-Dan

> not tested it on the NERSC nodes. Maybe I can compile ptraj on these
> nodes.
> Thanks,
> Ganesh
>
> On Sep 27, 2012 3:25 PM, "Daniel Roe" <daniel.r.roe.gmail.com> wrote:
>
>> Hi,
>>
>> On Thu, Sep 27, 2012 at 12:29 PM, Kira Armacost
>> <kza0004.tigermail.auburn.edu> wrote:
>> > I have a large trajectory (60,000 frames) and am trying to perform a
>> > cluster analysis on it. I know that ptraj only has the capability of using
>> > 32,000 frames for each analysis
>>
>> Where did you come up with this limit? AFAIK ptraj should only be
>> limited by the size of available memory.
>>
>> > , so I've done 4 cluster analyses for frames 1-15000, 15000-30000,
>> > 30000-45000, and 45000-60000. Is this the right way to go about it?
>>
>> It depends on what you are looking for. Cluster analyses of different
>> parts of a trajectory may or may not match each other, depending on how well
>> converged the simulation is. If the simulation converges within the
>> first 15000 frames then you might expect to get similar results from
>> clustering 1-15000 and 15000-30000 (but still might not since
>> ostensibly the system is still equilibrating during the first 15k
>> frames). In fact, one way to measure convergence is to cluster the
>> first and last halves of your simulation, then compare the resulting
>> clusters from those to clustering of the entire simulation.
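
A minimal Python sketch of a related, simpler check than the one quoted
above: instead of clustering each half separately, it takes the
per-frame cluster labels from a single clustering of the whole
trajectory and compares how populated each cluster is in the first and
second halves. The function name and the example labels are made up for
illustration; how you extract per-frame labels from your clustering
output is left open.

```python
from collections import Counter


def half_populations(labels):
    """Return {cluster: (fraction in first half, fraction in second half)}.

    `labels` is a sequence with one cluster id per frame, in time order.
    """
    if len(labels) < 2:
        raise ValueError("need at least two frames to form two halves")
    mid = len(labels) // 2
    first, second = labels[:mid], labels[mid:]
    count_first, count_second = Counter(first), Counter(second)
    return {
        cluster: (count_first[cluster] / len(first),
                  count_second[cluster] / len(second))
        for cluster in sorted(set(labels))
    }


# Made-up labels: cluster 0 dominates early, cluster 1 dominates late.
labels = [0, 0, 0, 1, 0, 1, 1, 1]
for cluster, (early, late) in half_populations(labels).items():
    print(f"cluster {cluster}: first half {early:.2f}, second half {late:.2f}")
```

Populations that differ strongly between the halves, as in this toy
input, suggest the two halves are sampling different structures; similar
populations are consistent with convergence but do not prove it.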
>>
>> You can compare representative structures from clusters and compare
>> populations to get a rough idea of the structures your system is
>> sampling during each time course. However, if you want to look at
>> overall behavior you should still cluster on all frames (which should
>> be possible). You can speed this process up by using the 'sieve'
>> keyword.
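
To illustrate why sieving helps, here is a toy Python sketch of the
general idea as I understand it: cluster only every Nth frame, then
place each skipped frame in the cluster whose representative is
nearest. This is an assumption-laden illustration, not the ptraj
implementation; the function names are hypothetical and the "frames"
are plain numbers with a 1-D distance standing in for RMSD.

```python
def sieve_indices(n_frames, sieve):
    """Indices of the frames kept when taking every `sieve`-th frame."""
    return list(range(0, n_frames, sieve))


def n_pairs(n):
    """Number of distinct frame pairs among n frames."""
    return n * (n - 1) // 2


def assign_to_nearest(value, representatives):
    """Index of the representative closest to `value` (toy 1-D distance)."""
    return min(range(len(representatives)),
               key=lambda i: abs(value - representatives[i]))


kept = sieve_indices(60_000, 10)
print(len(kept), "frames kept out of 60000")
print(n_pairs(60_000) // n_pairs(len(kept)), "times fewer pairwise distances")

# A skipped frame with toy coordinate 4.2 goes to the nearest representative.
print(assign_to_nearest(4.2, [1.0, 5.0, 9.0]))
```

Because the number of pairwise distances scales with the square of the
frame count, a sieve of 10 cuts that part of the work by roughly a
factor of 100, at the cost of building the clusters from a subsample.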
>>
>> Hope this is helpful.
>>
>> -Dan
>>
>> --
>> -------------------------
>> Daniel R. Roe, PhD
>> Department of Medicinal Chemistry
>> University of Utah
>> 30 South 2000 East, Room 201
>> Salt Lake City, UT 84112-5820
>> http://home.chpc.utah.edu/~cheatham/
>> (801) 587-9652
>> (801) 585-9119 (Fax)
>>
>> _______________________________________________
>> AMBER mailing list
>> AMBER.ambermd.org
>> http://lists.ambermd.org/mailman/listinfo/amber
>>



-- 
-------------------------
Daniel R. Roe, PhD
Department of Medicinal Chemistry
University of Utah
30 South 2000 East, Room 201
Salt Lake City, UT 84112-5820
http://home.chpc.utah.edu/~cheatham/
(801) 587-9652
(801) 585-9119 (Fax)
Received on Thu Sep 27 2012 - 21:00:02 PDT