Re: [AMBER] clustering problem in ambertool14

From: Daniel Roe <daniel.r.roe.gmail.com>
Date: Tue, 16 Dec 2014 13:19:51 -0700

Hi,

Usually when you get this error message during a command that uses a
COORDS data set (cluster, 2drms, crdfluct etc) it's because you ran
out of memory. Here is a formula to estimate the amount of memory you
will need to hold a COORDS data set:

memory_in_bytes = (F * A * 3) * 4

where F is the number of frames, A is the number of atoms (after
stripping in this case), 3 is the number of coordinates per atom, and 4
is the number of bytes per value (COORDS are single precision). Divide
by 1048576 to get the result in MB. If you have box coordinates, add 6
values per frame (i.e. F * 6 on top of F * A * 3); double it if you
have velocities as well.
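
As a rough illustration of the formula above, here is a small Python
sketch. The function names and the example numbers (50000 frames, 4500
atoms) are made up for illustration and are not part of cpptraj. It
assumes box information costs 6 values per frame and that velocities
double the per-atom values; treat the result as an estimate only.

```python
BYTES_PER_VALUE = 4  # COORDS data sets are stored in single precision


def coords_memory_bytes(frames, atoms, has_box=False, has_velocities=False):
    """Estimate the bytes needed to hold a COORDS data set in memory."""
    values_per_frame = atoms * 3          # x, y, z for every atom
    if has_velocities:
        values_per_frame *= 2             # assumption: velocities double the per-atom data
    if has_box:
        values_per_frame += 6             # assumption: 6 box values stored per frame
    return frames * values_per_frame * BYTES_PER_VALUE


def to_mb(n_bytes):
    """Convert bytes to MB using the 1048576 divisor from the formula."""
    return n_bytes / 1048576


if __name__ == "__main__":
    # Hypothetical system: 50000 frames of 4500 atoms after stripping.
    estimate = coords_memory_bytes(frames=50000, atoms=4500, has_box=True)
    print(f"Estimated COORDS memory: {to_mb(estimate):.1f} MB")
```

Compare the printed value with the free memory on your machine before
deciding between a COORDS and a TRAJ data set.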

However, in place of a COORDS data set, cpptraj also lets you use what
is called a TRAJ data set, which leaves the data on disk. The one catch
is that, because the data stays on disk, you cannot modify a TRAJ data
set, so you will have to pre-process your trajectory (i.e. strip/image)
first. This is a good idea in general, since it makes subsequent
analyses faster. Here is some example input.

# Step 1 - Preprocess
parm myparm.parm7
trajin mytraj.nc
strip :Na+,WAT nobox outprefix strip
autoimage
rms first mass @C,CA,N
trajout strip.mytraj.nc nobox

A few things to note here. First, I put the 'strip' command before
everything else; this way subsequent commands will be faster because
there are fewer atoms to deal with. Second, in my 'strip' command I'm
writing out a stripped topology (via 'outprefix') for use with my
stripped trajectory. Finally, and most importantly, because you are
rms-fitting you will no longer be able to image afterwards anyway, so
I'm getting rid of any box coordinates.

# Step 2 - Cluster
parm strip.myparm.parm7
trajin strip.mytraj.nc
loadtraj name MYTRAJ
cluster crdset MYTRAJ :1-291@CA,N,C,O mass clusters 10 out cluster_out \
  nofit averagelinkage \
  summary summary_out info Cluster_info repout box2.rep repfmt pdb \
  clusterout cluster.nc clusterfmt netcdf

The 'loadtraj' command in this case is taking all loaded trajectories
from 'trajin' statements and putting them into a TRAJ data set named
MYTRAJ, which stays on-disk and can subsequently be used by the
'cluster' command.

One more thing to keep in mind is that even though the coordinates
will be kept on disk, you will still need enough memory to hold the
pairwise distance matrix:

memory_in_bytes = ((F * (F-1)) / 2) * 4

If you don't have enough memory to hold the pairwise distance matrix,
try using the 'sieve' keyword to reduce the number of frames being
clustered in the first pass. This will also speed up the actual
clustering a bit. Last, and just as important, make sure you are using
the most up-to-date version of cpptraj (14.22).
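
To get a feel for how much 'sieve' can save, here is a small Python
sketch of the pairwise-matrix formula above. The function names and the
50000-frame example are hypothetical, and it assumes that a sieve value
of N leaves roughly every Nth frame (ceil(F / N) frames) for the first
pass; it is an estimate, not what cpptraj reports.

```python
import math

BYTES_PER_VALUE = 4  # matrix elements are counted as 4 bytes, as in the formula


def frames_after_sieve(frames, sieve=1):
    """Frames left for the first clustering pass (assumes every sieve-th frame is kept)."""
    return math.ceil(frames / sieve)


def pairwise_matrix_bytes(frames, sieve=1):
    """Bytes for the pairwise distance matrix: ((F * (F - 1)) / 2) * 4."""
    f = frames_after_sieve(frames, sieve)
    return (f * (f - 1) // 2) * BYTES_PER_VALUE


if __name__ == "__main__":
    total_frames = 50000  # hypothetical trajectory length
    for sieve in (1, 5, 10):
        mb = pairwise_matrix_bytes(total_frames, sieve) / 1048576
        print(f"sieve {sieve:2d}: about {mb:.1f} MB for the pairwise matrix")
```

Because the matrix grows with the square of the number of frames, even
a modest sieve value reduces the memory requirement considerably.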

Hope this helps,

-Dan

On Tue, Dec 16, 2014 at 11:28 AM, Mahendra B Thapa <thapamb.mail.uc.edu> wrote:
> Dear Amber users
> I used following command for clustering 50ns all-atom simulated data.
> cpptraj -i input_file -p para_top
> where 'input_file' consists of
>
> trajin mdcrd_files
> autoimage
> rms first mass @C,CA,N
> strip :Na+,WAT
> cluster :1-291@CA,N,C,O mass clusters 10 out cluster_out nofit
> averagelinkage \
> summary summary_out info Cluster_info repout box2.rep repfmt pdb
> clusterout cluster.nc clusterfmt netcdf
> go
>
> After running the command, I got following message without any output files:
>
> 1]terminate called after throwing an instance of 'std::bad_alloc'
> what(): std::bad_alloc
> Aborted
>
> 2] Warning: One or more analyses requested creation of default COORDS
> DataSet.
> CREATECRD: Saving coordinates from Top to file to "_DEFAULTCRD_"
>
>
> 3]Warning: Coordinates are being rotated and box coordinates are present.
> Warning: Unit cell vectors are NOT rotated; imaging will not be possible
> Warning: after the RMS-fit is performed.
>
> Any comments and suggestion will be very useful.
>
> Thank you,
> Mahendra Thapa
> University of Cincinnati
> _______________________________________________
> AMBER mailing list
> AMBER.ambermd.org
> http://lists.ambermd.org/mailman/listinfo/amber



-- 
-------------------------
Daniel R. Roe, PhD
Department of Medicinal Chemistry
University of Utah
30 South 2000 East, Room 307
Salt Lake City, UT 84112-5820
http://home.chpc.utah.edu/~cheatham/
(801) 587-9652
(801) 585-6208 (Fax)
Received on Tue Dec 16 2014 - 12:30:02 PST