Re: [AMBER] clustering problem in ambertool14

From: Daniel Roe <daniel.r.roe.gmail.com>
Date: Fri, 2 Jan 2015 14:35:17 -0700

Hi,

I suspect that if you are running out of memory even after using
'loadtraj', the issue may be with the pairwise distance matrix.

To reduce the amount of memory needed by the pairwise distance matrix use
the 'sieve' keyword. Try 'sieve 10' to start. Increase the sieve value as
necessary.
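
To see why the pairwise distance matrix is the likely culprit, here is a
minimal Python sketch that plugs the numbers from this thread (250000
frames, 7584 atoms) into the two memory formulas quoted below. The
function names are made up for illustration, and treating 'sieve N' as
keeping roughly every Nth frame for the initial clustering pass is a
simplifying assumption, not a statement about cpptraj internals.

```python
def coords_bytes(frames, atoms, bytes_per_value=4):
    """COORDS data set estimate: (F * A * 3) * 4 bytes (no box, no velocities)."""
    return frames * atoms * 3 * bytes_per_value


def pairwise_bytes(frames, bytes_per_value=4):
    """Pairwise distance matrix estimate: ((F * (F - 1)) / 2) * 4 bytes."""
    return frames * (frames - 1) // 2 * bytes_per_value


def sieved_frames(frames, sieve):
    """Assumption: 'sieve N' keeps roughly every Nth frame in the first pass."""
    return -(-frames // sieve)  # ceiling division


if __name__ == "__main__":
    frames, atoms = 250_000, 7_584  # values quoted in this thread

    print(f"COORDS set:         {coords_bytes(frames, atoms) / 1e9:.1f} GB")
    print(f"Pairwise, no sieve: {pairwise_bytes(frames) / 1e9:.1f} GB")
    for sieve in (10, 20, 50):
        kept = sieved_frames(frames, sieve)
        size_gb = pairwise_bytes(kept) / 1e9
        print(f"Pairwise, sieve {sieve}: {size_gb:.2f} GB ({kept} frames)")
```

With these inputs the COORDS estimate is roughly 23 GB, while the full
pairwise matrix needs about 125 GB; 'sieve 10' brings the matrix down to
about 1.25 GB. These figures refer to RAM, not free disk space.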

-Dan

On Friday, January 2, 2015, Mahendra B Thapa <thapamb.mail.uc.edu> wrote:

> Dear Dr. Daniel
>
> Thank you for the suggestion; I successfully updated to 'CPPTRAJ:
> Trajectory Analysis. V14.22'
>
> However, the earlier error message (the one you addressed in your first
> reply) appeared again:
> terminate called after throwing an instance of 'std::bad_alloc'
> what(): std::bad_alloc
>
> Using the formula you gave me, the space requirement is 23 GB, but I have
> enough memory (400 GB) on the external drive where I run the 'cpptraj'
> command.
>
> Thank you for your help,
> Mahendra Thapa
> University of Cincinnati, OH
>
> On Fri, Jan 2, 2015 at 2:01 PM, Thapa, Mahendra (thapamb)
> <thapamb.mail.uc.edu> wrote:
>
> >
> >
> >
> > ________________________________________
> > From: Daniel Roe
> > Sent: Friday, January 2, 2015 1:00:45 PM (UTC-06:00) Central America
> > To: AMBER Mailing List
> > Subject: Re: [AMBER] FW: clustering problem in ambertool14
> >
> > Hi,
> >
> > According to your log output you haven't applied all updates. The very
> > first line of output is:
> >
> > CPPTRAJ: Trajectory Analysis. V14.00
> >
> > You need at least 14.17 for clustering with the traj data set to work
> > properly (and really you should have 14.22). After updates are applied,
> > the code must be recompiled. Also, if you are not using the full path
> > to cpptraj when executing, make sure that the cpptraj you are actually
> > using is the up-to-date one (e.g. with the command 'which cpptraj').
> >
> > Hope this helps,
> >
> > -Dan
> >
> >
> > On Fri, Jan 2, 2015 at 9:08 AM, Mahendra B Thapa <thapamb.mail.uc.edu>
> > wrote:
> > > Dear Dr. Daniel,
> > > Memory issues were solved when I followed the steps you suggested;
> > > thank you for that.
> > >
> > > A new problem appeared, as shown in the output below:
> > >
> > > Internal Error: Metric is COORDS base but data set is not.
> > > Error: in Analysis # 0
> > > 1 errors encountered reading input.
> > >
> > > Note: I have already applied the bug fixes for AmberTools 14
> > > (http://ambermd.org/bugfixes/AmberTools/14.0/update.17).
> > >
> > > DATAFILES:
> > > cluster_out (Standard Data File): Cnum_00001
> > > Warning: Set 'Cnum_00001' contains no data.
> > > Warning: File 'cluster_out' has no sets containing data.
> > >
> > > Are these errors due to the large number of frames (250000) and the
> > > number of atoms (7584)?
> > >
> > > There is some discussion in a previous post
> > > (http://archive.ambermd.org/201408/0214.html), but I am assuming that
> > > I have been using the stripped topology file to run cpptraj. I have
> > > attached the screen output (text file: TEST_LOG) to this email.
> > >
> > > Thank you for your help,
> > > Mahendra Thapa
> > >
> > >
> > > On Tue, Dec 16, 2014 at 3:20 PM, Thapa, Mahendra (thapamb)
> > > <thapamb.mail.uc.edu> wrote:
> > >
> > >>
> > >>
> > >>
> > >> ________________________________________
> > >> From: Daniel Roe
> > >> Sent: Tuesday, December 16, 2014 2:19:51 PM (UTC-06:00) Central America
> > >> To: AMBER Mailing List
> > >> Subject: Re: [AMBER] clustering problem in ambertool14
> > >>
> > >> Hi,
> > >>
> > >> Usually when you get this error message during a command that uses a
> > >> COORDS data set (cluster, 2drms, crdfluct etc) it's because you ran
> > >> out of memory. Here is a formula to estimate the amount of memory you
> > >> will need to hold a COORDS data set:
> > >>
> > >> memory_in_bytes = (F * A * 3) * 4
> > >>
> > >> where F is the number of frames, A is the number of atoms (after
> > >> stripping in this case), 3 is the number of coordinates per atom, and
> > >> 4 is the number of bytes per coordinate (COORDS are single precision).
> > >> Divide by 1048576 to get the result in MB. Add 6 to (F * A * 3) if you
> > >> have box coordinates; double the total if you have velocities as well.
> > >>
> > >> However, in place of a COORDS data set cpptraj also lets you use what
> > >> is called a TRAJ data set (which leaves data on-disk). The only issue
> > >> is that, because the data remains on disk, you cannot modify a TRAJ
> > >> data set, so you will have to pre-process your trajectory (i.e.
> > >> strip/image) first. This is a good idea in general since it will
> > >> make subsequent analyses faster. Here is some example input.
> > >>
> > >> # Step 1 - Preprocess
> > >> parm myparm.parm7
> > >> trajin mytraj.nc
> > >> strip :Na+,WAT nobox outprefix strip
> > >> autoimage
> > >> rms first mass @C,CA,N
> > >> trajout strip.mytraj.nc nobox
> > >>
> > >> A few things to note here. First is that I put the 'strip' command
> > >> before everything else; this way subsequent commands will be faster
> > >> because there are fewer atoms to deal with. Also note in my 'strip'
> > >> command I'm writing out a stripped topology for use with my stripped
> > >> trajectory. Finally and most importantly, because you are rms-fitting
> > >> you will no longer be able to image anyway, so I'm getting rid of any
> > >> box coordinates.
> > >>
> > >> # Step 2 - Cluster
> > >> parm strip.myparm.parm7
> > >> trajin strip.mytraj.nc
> > >> loadtraj name MYTRAJ
> > >> cluster crdset MYTRAJ :1-291@CA,N,C,O mass clusters 10 \
> > >>   out cluster_out nofit averagelinkage \
> > >>   summary summary_out info Cluster_info \
> > >>   repout box2.rep repfmt pdb \
> > >>   clusterout cluster.nc clusterfmt netcdf
> > >>
> > >> The 'loadtraj' command in this case is taking all loaded trajectories
> > >> from 'trajin' statements and putting them into a TRAJ data set named
> > >> MYTRAJ, which stays on-disk and can subsequently be used by the
> > >> 'cluster' command.
> > >>
> > >> One more thing to keep in mind is that even though the coordinates
> > >> will be kept on disk, you will still need enough memory to hold the
> > >> pairwise distance matrix:
> > >>
> > >> memory_in_bytes = ((F * (F-1)) / 2) * 4
> > >>
> > >> If you don't have enough memory to hold the pairwise distance matrix
> > >> try using the 'sieve' keyword to reduce the number of frames being
> > >> clustered in the first pass. This will also speed up the actual
> > >> clustering a bit. Last and most importantly, make sure you are using
> > >> the most up-to-date version of cpptraj (14.22).
> > >>
> > >> Hope this helps,
> > >>
> > >> -Dan
> > >>
> > >> On Tue, Dec 16, 2014 at 11:28 AM, Mahendra B Thapa <thapamb.mail.uc.edu>
> > >> wrote:
> > >> > Dear Amber users,
> > >> > I used the following command to cluster 50 ns of all-atom simulation
> > >> > data:
> > >> > cpptraj -i input_file -p para_top
> > >> > where 'input_file' consists of
> > >> >
> > >> > trajin mdcrd_files
> > >> > autoimage
> > >> > rms first mass @C,CA,N
> > >> > strip :Na+,WAT
> > >> > cluster :1-291@CA,N,C,O mass clusters 10 out cluster_out nofit \
> > >> > averagelinkage \
> > >> > summary summary_out info Cluster_info repout box2.rep repfmt pdb \
> > >> > clusterout cluster.nc clusterfmt netcdf
> > >> > go
> > >> >
> > >> > After running the command, I got the following messages and no
> > >> > output files:
> > >> >
> > >> > 1]terminate called after throwing an instance of 'std::bad_alloc'
> > >> > what(): std::bad_alloc
> > >> > Aborted
> > >> >
> > >> > 2] Warning: One or more analyses requested creation of default COORDS
> > >> > DataSet.
> > >> > CREATECRD: Saving coordinates from Top to file to "_DEFAULTCRD_"
> > >> >
> > >> >
> > >> > 3] Warning: Coordinates are being rotated and box coordinates are present.
> > >> > Warning: Unit cell vectors are NOT rotated; imaging will not be possible
> > >> > Warning: after the RMS-fit is performed.
> > >> >
> > >> > Any comments and suggestions would be very useful.
> > >> >
> > >> > Thank you,
> > >> > Mahendra Thapa
> > >> > University of Cincinnati
> > >> > _______________________________________________
> > >> > AMBER mailing list
> > >> > AMBER.ambermd.org
> > >> > http://lists.ambermd.org/mailman/listinfo/amber
> > >>
> > >>
> > >>
> > >> --
> > >> -------------------------
> > >> Daniel R. Roe, PhD
> > >> Department of Medicinal Chemistry
> > >> University of Utah
> > >> 30 South 2000 East, Room 307
> > >> Salt Lake City, UT 84112-5820
> > >> http://home.chpc.utah.edu/~cheatham/
> > >> (801) 587-9652
> > >> (801) 585-6208 (Fax)
> > >>

Received on Fri Jan 02 2015 - 14:00:02 PST