Re: [AMBER] CPPTRAJ cluster analysis: Segmentation fault (core dumped) from Daniel Roe on 2016-11-08 (Amber Archive Nov 2016)

From: Daniel Roe <daniel.r.roe.gmail.com>
Date: Tue, 8 Nov 2016 11:17:35 -0500

On Tue, Nov 8, 2016 at 10:38 AM, Michael Shokhen
<michael.shokhen.biu.ac.il> wrote:
> Unfortunately, I did something wrong, so the system reported errors.
> Would you please send me my file with the sieve option added correctly?

Since I don't have access to your files and I only have a vague notion
of what you want to do, this would not be a productive use of my time.
It would be better if you report *exactly* what errors you received,
as well as the complete input you tried. If you haven't already, I urge
you to read the complete entry for the 'cluster' command in the Amber
16 manual.

-Dan

>
>
> Many thanks,
>
> Michael
>
>
>
> *****************************
> Michael Shokhen, PhD
> Associate Professor
> Department of Chemistry
> Bar Ilan University,
> Ramat Gan, 52900
> Israel
> email: shokhen.mail.biu.ac.il
> ________________________________
> From: Daniel Roe <daniel.r.roe.gmail.com>
> Sent: Tuesday, November 8, 2016 4:56:51 PM
> To: AMBER Mailing List
> Subject: Re: [AMBER] CPPTRAJ cluster analysis: Segmentation fault (core dumped)
>
> Hi,
>
> As Samuel stated, this is almost certainly due to not having enough
> memory to store the pairwise matrix and coordinates. You have
> several options in addition to what Samuel mentioned:
>
> 1) Use the 'sieve' cluster option instead of reducing the number of
> frames used. The advantage of this is that frames that are not
> clustered in the first pass will be added back in. Importantly, this
> will reduce the size of the pairwise distance matrix in memory which
> is typically what needs the most space. See the manual for full
> details. This is your best option in my opinion.
>
> 2) Reduce the size of the coordinates data set by using TRAJ data sets
> (on-disk) instead of COORDS (in-memory). To do this you'll have to
> pre-process the trajectory first (i.e. perform all imaging, stripping,
> etc) before reading it in as a TRAJ data set. Again, see the manual
> for full details. My feeling is that this will not be as helpful since
> the pairwise matrix tends to take up far more memory.
>
> 3) (Not really recommended.) You can use the GitHub beta version of
> cpptraj and prevent caching of the pairwise matrix with 'pairwisecache
> none'. While this will save you memory, it will probably be very
> slow since all distances must be recalculated every time they are
> needed.
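
To put rough numbers on option 1, here is a small sketch in Python (not cpptraj code). It assumes the pairwise matrix stores the N*(N-1)/2 unique frame pairs as 4-byte floats; that layout is inferred from the log quoted further down, where it reproduces the reported 8174.1348 MB for 65465 frames. The function names are made up for illustration, and the number of frames kept by 'sieve <#>' is assumed to be the total divided by the sieve value, rounded up.

```python
import math

BYTES_PER_ELEMENT = 4  # assumption: distances stored as single-precision floats


def pairwise_matrix_mb(n_frames: int) -> float:
    """Estimated size in MB (1 MB = 1024**2 bytes) of a triangular pairwise matrix."""
    n_pairs = n_frames * (n_frames - 1) // 2  # unique frame pairs, no diagonal
    return n_pairs * BYTES_PER_ELEMENT / 1024**2


def sieved_frames(n_frames: int, sieve: int) -> int:
    """Assumed number of frames kept in the initial pass when using 'sieve <sieve>'."""
    return math.ceil(n_frames / sieve)


if __name__ == "__main__":
    total = 65465  # frame count from the cpptraj output quoted in this thread
    for sieve in (1, 5, 10, 20):
        kept = sieved_frames(total, sieve)
        print(f"sieve {sieve:>2}: {kept:>6} frames, ~{pairwise_matrix_mb(kept):.1f} MB")
```

Under these assumptions, a sieve of 10 brings the estimate down from about 8174 MB to about 82 MB, because the matrix size grows with the square of the number of frames clustered in the first pass.
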
>
> I also recommend at least updating to the AmberTools 16 version of
> cpptraj, and also using the OpenMP version since various parts of the
> clustering algorithm are OpenMP-parallelized.
>
> Hope this helps,
>
> -Dan
>
>
> On Tue, Nov 8, 2016 at 8:28 AM, Michael Shokhen
> <michael.shokhen.biu.ac.il> wrote:
>> Dear Amber experts,
>>
>>
>> I encountered an error while running a cluster analysis with CPPTRAJ
>>
>> (I am currently using AMBER 14 and AmberTools 15).
>>
>> See below for the cluster.analysis.in file
>>
>> and, at the end of this message, the error reported by the system.
>>
>> What changes must be made to the command lines?
>>
>> Thank you for your help,
>> Michael
>>
>> cluster.analysis.in file
>>
>> # Cluster analysis with cpptraj.
>>
>> # Load topology and trajectories
>> parm ../*.prmtop
>> trajin ../19_/prod3.mdcrd
>> trajin ../20_/prod4.mdcrd
>> trajin ../21_/prod5.mdcrd
>> trajin ../22_/prod6.mdcrd
>> trajin ../23_/prod7.mdcrd
>> trajin ../24_/prod8.mdcrd
>> trajin ../25_/prod9.mdcrd
>> trajin ../26_/prod10.mdcrd
>> trajin ../27_/prod11.mdcrd
>> trajin ../28_/prod12.mdcrd
>> trajin ../29_/prod13.mdcrd
>> trajin ../30_/prod14.mdcrd
>> trajin ../31_/prod15.mdcrd
>> trajin ../32_/prod16.mdcrd
>> trajin ../33_/prod17.mdcrd
>> center :1-11340 mass origin
>> autoimage origin
>> # Remove water, ions, and lipid residues so they do not appear in output structures.
>> strip :WAT,Cl-,Na+,OL,PA,PE
>> # Cluster analysis command:
>> # C0: Cluster output data set(s) name.
>> # CLUSTERING OPTIONS:
>> # dbscan: Use the DBSCAN (density based) clustering algorithm.
>> # minpoints: Minimum # of points to form a cluster.
>> # epsilon: Distance cutoff for forming cluster.
>> # sievetoframe: Restore sieved frames by comparing to all cluster frames,
>> # not just centroid.
>> # DISTANCE METRIC OPTIONS:
>> # rms <mask>: Use RMSD of atoms in <mask> as distance metric.
>> # sieve 10 : Use <total> / 10 initial frames for clustering.
>> # OUTPUT OPTIONS:
>> # out <file>: Write cluster number versus time to file.
>> # summary <file>: Write overall clustering summary to file.
>> # info <file>: Write detailed cluster results (including DBI, pSF etc) to file.
>> # cpopvtime <file> normframe: Write cluster population vs time to <file>,
>> # normalized by # frames.
>> # COORDINATE OUTPUT OPTIONS:
>> # repout <file prefix> repfmt pdb: Write cluster representatives to files with
>> # PDB format.
>> # singlerepout <file> singlerepfmt netcdf: Write cluster representatives to
>> # single file with NetCDF format.
>> # avgout <file> avgfmt restart: Write average over all frames in each cluster
>> # to separate files with Amber restart file
>> # format.
>> distance C0 :111.OG :164.NE2
>> cluster data C0 clusters 10 epsilon 3.0 summary summary.dat info info.dat
>>
>>
>> Final fragment of the system error output:
>>
>>
>> ----- prod17.mdcrd (1-5000, 1) -----
>> 0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100% Complete.
>>
>> Read 65465 frames and processed 65465 frames.
>> TIME: Trajectory processing: 156.4525 s
>> TIME: Avg. throughput= 418.4336 frames / second.
>>
>> ACTION OUTPUT:
>>
>> ANALYSIS: Performing 1 analyses:
>> 0: [cluster data C0 clusters 10 epsilon 3.0 summary summary.dat info info.dat]
>> Starting clustering.
>> Calculating pair-wise distances.
>> Estimated pair-wise matrix memory usage: > 8174.1348 MB
>> Pair-wise matrix set up, 65465 frames
>> Segmentation fault (core dumped)
>>
>>
>> _______________________________________________
>> AMBER mailing list
>> AMBER.ambermd.org
>> http://lists.ambermd.org/mailman/listinfo/amber
>
>
>
> --
> -------------------------
> Daniel R. Roe
> Laboratory of Computational Biology
> National Institutes of Health, NHLBI
> 5635 Fishers Ln, Rm T900
> Rockville MD, 20852
> https://www.lobos.nih.gov/lcb
>

-- 
-------------------------
Daniel R. Roe
Laboratory of Computational Biology
National Institutes of Health, NHLBI
5635 Fishers Ln, Rm T900
Rockville MD, 20852
https://www.lobos.nih.gov/lcb

Received on Tue Nov 08 2016 - 08:30:03 PST