Re: [AMBER] CPPTRAJ cluster analysis: Segmentation fault (core dumped)

From: Daniel Roe <daniel.r.roe.gmail.com>
Date: Tue, 8 Nov 2016 09:56:51 -0500

Hi,

As Samuel stated, this is almost certainly due to not having enough
memory to hold both the pairwise distance matrix and the coordinates.
You have several options in addition to what Samuel mentioned:

1) Use the 'sieve' cluster option instead of reducing the number of
frames used. With e.g. 'sieve 10', roughly one tenth of the frames are
clustered in the first pass, and the frames that were left out are
then added back in to the resulting clusters. Importantly, this
reduces the size of the pairwise distance matrix, which is typically
what needs the most memory. See the manual for full details. This is
your best option in my opinion.

2) Reduce the size of the coordinates data set by using TRAJ data sets
(on-disk) instead of COORDS (in-memory). To do this you'll have to
pre-process the trajectory first (i.e. perform all imaging, stripping,
etc) before reading it in as a TRAJ data set. Again, see the manual
for full details. My feeling is that this will not be as helpful since
the pairwise matrix tends to take up far more memory.

3) (Not really recommended.) You can use the GitHub beta version of
cpptraj and prevent caching of the pairwise matrix with 'pairwisecache
none'. While this will save you memory, it will probably be very slow
since every distance must be recalculated each time it is needed.
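
To put rough numbers on option 1, here is a minimal Python sketch (not
part of cpptraj) that estimates the pairwise matrix size with and
without sieving. It assumes a triangular matrix of N*(N-1)/2 elements
at 4 bytes each; that is an assumption, although it does reproduce the
8174.1348 MB estimate in your output. It also assumes that 'sieve 10'
keeps about one frame in ten. The function and variable names are made
up for illustration.

```python
def pairwise_matrix_mb(n_frames, bytes_per_element=4):
    """Approximate size in MB of a triangular pairwise distance matrix.

    Assumption: N*(N-1)/2 stored elements, 4 bytes (single precision) each.
    """
    n_elements = n_frames * (n_frames - 1) // 2
    return n_elements * bytes_per_element / 1024**2


total_frames = 65465   # "Read 65465 frames" in the quoted output
sieve = 10             # hypothetical sieve value

# Assumption: sieving keeps roughly every 10th frame (rounded up).
sieved_frames = -(-total_frames // sieve)

full_mb = pairwise_matrix_mb(total_frames)
sieved_mb = pairwise_matrix_mb(sieved_frames)

print(f"All frames:    {total_frames:6d} frames, {full_mb:10.4f} MB")
print(f"With sieve {sieve}: {sieved_frames:6d} frames, {sieved_mb:10.4f} MB")
```

Under these assumptions the full matrix comes out at about 8174 MB,
while the sieved matrix is on the order of 80 MB, since the matrix
size scales with the square of the number of frames. The coordinates
themselves are not included in either figure.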

I also recommend at least updating to the AmberTools 16 version of
cpptraj, and also using the OpenMP version since various parts of the
clustering algorithm are OpenMP-parallelized.

Hope this helps,

-Dan


On Tue, Nov 8, 2016 at 8:28 AM, Michael Shokhen
<michael.shokhen.biu.ac.il> wrote:
> Dear Amber experts,
>
>
> I got an error while running a cluster analysis with CPPTRAJ
> (I am currently using AMBER 14 and AmberTools 15).
>
> The cluster.analysis.in file is given below, followed by the end of
> the output containing the error.
>
> What changes must be made to the commands?
>
> Thank you for your help,
> Michael
>
> cluster.analysis.in file
>
> # Cluster analysis with cpptraj.
>
> # Load topology trajectory
> parm ../*.prmtop
> trajin ../19_/prod3.mdcrd
> trajin ../20_/prod4.mdcrd
> trajin ../21_/prod5.mdcrd
> trajin ../22_/prod6.mdcrd
> trajin ../23_/prod7.mdcrd
> trajin ../24_/prod8.mdcrd
> trajin ../25_/prod9.mdcrd
> trajin ../26_/prod10.mdcrd
> trajin ../27_/prod11.mdcrd
> trajin ../28_/prod12.mdcrd
> trajin ../29_/prod13.mdcrd
> trajin ../30_/prod14.mdcrd
> trajin ../31_/prod15.mdcrd
> trajin ../32_/prod16.mdcrd
> trajin ../33_/prod17.mdcrd
> center :1-11340 mass origin
> autoimage origin
> # Remove water, ions, and lipid residues so they do not appear in output structures.
> strip :WAT,Cl-,Na+,OL,PA,PE
> # Cluster analysis command:
> # C0: Cluster output data set(s) name.
> # CLUSTERING OPTIONS:
> # dbscan: Use the DBSCAN (density based) clustering algorithm.
> # minpoints: Minimum # of points to form a cluster.
> # epsilon: Distance cutoff for forming cluster.
> # sievetoframe: Restore sieved frames by comparing to all cluster frames,
> # not just centroid.
> # DISTANCE METRIC OPTIONS:
> # rms <mask>: Use RMSD of atoms in <mask> as distance metric.
> # sieve 10 : Use <total> / 10 initial frames for clustering.
> # OUTPUT OPTIONS:
> # out <file>: Write cluster number versus time to file.
> # summary <file>: Write overall clustering summary to file.
> # info <file>: Write detailed cluster results (including DBI, pSF etc) to file.
> # cpopvtime <file> normframe: Write cluster population vs time to <file>,
> # normalized by # frames.
> # COORDINATE OUTPUT OPTIONS:
> # repout <file prefix> repfmt pdb: Write cluster representatives to files with
> # PDB format.
> # singlerepout <file> singlerepfmt netcdf: Write cluster representatives to
> # single file with NetCDF format.
> # avgout <file> avgfmt restart: Write average over all frames in each cluster
> # to separate files with Amber restart file
> # format.
> distance C0 :111.OG :164.NE2
> cluster data C0 clusters 10 epsilon 3.0 summary summary.dat info info.dat
>
>
> The final fragment of the output, including the error:
>
>
> ----- prod17.mdcrd (1-5000, 1) -----
> 0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100% Complete.
>
> Read 65465 frames and processed 65465 frames.
> TIME: Trajectory processing: 156.4525 s
> TIME: Avg. throughput= 418.4336 frames / second.
>
> ACTION OUTPUT:
>
> ANALYSIS: Performing 1 analyses:
> 0: [cluster data C0 clusters 10 epsilon 3.0 summary summary.dat info info.dat]
> Starting clustering.
> Calculating pair-wise distances.
> Estimated pair-wise matrix memory usage: > 8174.1348 MB
> Pair-wise matrix set up, 65465 frames
> Segmentation fault (core dumped)
>
>
> _______________________________________________
> AMBER mailing list
> AMBER.ambermd.org
> http://lists.ambermd.org/mailman/listinfo/amber



-- 
-------------------------
Daniel R. Roe
Laboratory of Computational Biology
National Institutes of Health, NHLBI
5635 Fishers Ln, Rm T900
Rockville MD, 20852
https://www.lobos.nih.gov/lcb
Received on Tue Nov 08 2016 - 07:00:02 PST