Re: [AMBER] FW: FW: clustering problem in ambertool14

From: Chris Moth <cmoth08.gmail.com>
Date: Fri, 2 Jan 2015 15:03:50 -0600

I believe his formula was for RAM, not disk space, so the 400GB on your
external drive does not solve the problem. You may need to find a machine
with more RAM, or strip out more atoms. One way to check whether more RAM
would solve the problem is to run with 50% fewer frames. If that "works",
you can be reasonably confident that you are indeed running out of RAM as
the calculation proceeds.
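
To compare an estimate like this against installed RAM rather than free disk space, here is a minimal Python sketch. It assumes a Unix-like machine (e.g. Linux) where os.sysconf knows SC_PAGE_SIZE and SC_PHYS_PAGES. The function names are made up for illustration, and the 23 GB figure is simply the estimate mentioned in this thread.

```python
import os


def physical_ram_bytes():
    """Return total physical RAM in bytes, or None if it cannot be determined.

    Assumption: a Unix-like system where os.sysconf supports
    SC_PAGE_SIZE and SC_PHYS_PAGES (true on typical Linux machines).
    """
    try:
        page_size = os.sysconf("SC_PAGE_SIZE")
        n_pages = os.sysconf("SC_PHYS_PAGES")
    except (AttributeError, ValueError, OSError):
        return None
    if page_size <= 0 or n_pages <= 0:
        return None
    return page_size * n_pages


def fits_in_ram(required_bytes, ram_bytes):
    """True if the estimated requirement is no larger than the given RAM."""
    return required_bytes <= ram_bytes


if __name__ == "__main__":
    required = 23 * 10**9  # the ~23 GB estimate quoted in this thread
    ram = physical_ram_bytes()
    if ram is None:
        print("Could not determine physical RAM on this system.")
    else:
        print(f"Physical RAM: {ram / 1e9:.1f} GB")
        print("Estimate fits in RAM:", fits_in_ram(required, ram))
```

Free space on an external drive never enters this comparison; only installed RAM does.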

On Fri, Jan 2, 2015 at 2:48 PM, Mahendra B Thapa <thapamb.mail.uc.edu>
wrote:

> Dear Dr. Daniel
>
> Thank you for the suggestion; I successfully updated to 'CPPTRAJ:
> Trajectory Analysis. V14.22'.
>
> However, the earlier error message (the one you addressed in your first
> reply) appeared again:
> terminate called after throwing an instance of 'std::bad_alloc'
> what(): std::bad_alloc
>
> Using the formula you gave me, the space requirement is 23GB, but I have
> enough memory (400GB) on the external drive where I run the 'cpptraj'
> command.
>
> Thank you for your help,
> Mahendra Thapa
> University of Cincinnati,OH
>
> On Fri, Jan 2, 2015 at 2:01 PM, Thapa, Mahendra (thapamb) <
> thapamb.mail.uc.edu> wrote:
>
> >
> >
> >
> > ________________________________________
> > From: Daniel Roe
> > Sent: Friday, January 2, 2015 1:00:45 PM (UTC-06:00) Central America
> > To: AMBER Mailing List
> > Subject: Re: [AMBER] FW: clustering problem in ambertool14
> >
> > Hi,
> >
> > According to your log output you haven't applied all updates. The very
> > first line of output is:
> >
> > CPPTRAJ: Trajectory Analysis. V14.00
> >
> > You need at least 14.17 for clustering with the traj data set to work
> > properly (and really you should have 14.22). After updates are applied
> > the code must be recompiled. Also if you are not using the full path
> > to cpptraj when executing make sure that the cpptraj you are actually
> > using is the up-to-date one (with e.g. the command 'which cpptraj').
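
A minimal Python sketch of that version check, assuming the banner line always has the form shown in this message ('CPPTRAJ: Trajectory Analysis. V14.00'). Other cpptraj releases may print it differently, and the helper names here are hypothetical. shutil.which plays the role of the 'which' command.

```python
import re
import shutil

MIN_VERSION = (14, 17)  # minimum stated in this message for TRAJ clustering


def parse_cpptraj_version(banner):
    """Extract (major, minor) from a banner line such as
    'CPPTRAJ: Trajectory Analysis. V14.00'. Returns None if not found."""
    match = re.search(r"CPPTRAJ: Trajectory Analysis\. V(\d+)\.(\d+)", banner)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def is_new_enough(banner, minimum=MIN_VERSION):
    """True if the banner reports a version at or above the minimum."""
    version = parse_cpptraj_version(banner)
    return version is not None and version >= minimum


if __name__ == "__main__":
    # Shows which cpptraj executable would be picked up from PATH (or None).
    print("cpptraj on PATH:", shutil.which("cpptraj"))
    print(is_new_enough("CPPTRAJ: Trajectory Analysis. V14.00"))
    print(is_new_enough("CPPTRAJ: Trajectory Analysis. V14.22"))
```

Comparing (major, minor) as integer tuples avoids floating-point comparisons on the version number.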
> >
> > Hope this helps,
> >
> > -Dan
> >
> >
> > On Fri, Jan 2, 2015 at 9:08 AM, Mahendra B Thapa <thapamb.mail.uc.edu>
> > wrote:
> > > Dear Dr.Daniel
> > > Memory issues were solved when I followed the steps you suggested;
> > > thank you for that.
> > >
> > > A new problem appeared as seen in the screen:
> > >
> > > Internal Error: Metric is COORDS base but data set is not.
> > > Error: in Analysis # 0
> > > 1 errors encountered reading input.
> > >
> > > {{ Note: I have already applied the bug fixes for AmberTools 14
> > > http://ambermd.org/bugfixes/AmberTools/14.0/update.17
> > > }}
> > >
> > > DATAFILES:
> > > cluster_out (Standard Data File): Cnum_00001
> > > Warning: Set 'Cnum_00001' contains no data.
> > > Warning: File 'cluster_out' has no sets containing data.
> > >
> > > Are these errors due to the large number of frames (250000) and the
> > > number of atoms (7584 atoms)?
> > >
> > > There is some discussion in a previous post
> > > (http://archive.ambermd.org/201408/0214.html), but I am assuming that
> > > I have been using a stripped topology file to run cpptraj. I have
> > > attached the screen output (text file: TEST_LOG) to this email.
> > >
> > > Thank you for help,
> > > Mahendra Thapa
> > >
> > >
> > > On Tue, Dec 16, 2014 at 3:20 PM, Thapa, Mahendra (thapamb) <
> > > thapamb.mail.uc.edu> wrote:
> > >
> > >>
> > >>
> > >>
> > >> ________________________________________
> > >> From: Daniel Roe
> > >> Sent: Tuesday, December 16, 2014 2:19:51 PM (UTC-06:00) Central America
> > >> To: AMBER Mailing List
> > >> Subject: Re: [AMBER] clustering problem in ambertool14
> > >>
> > >> Hi,
> > >>
> > >> Usually when you get this error message during a command that uses a
> > >> COORDS data set (cluster, 2drms, crdfluct etc) it's because you ran
> > >> out of memory. Here is a formula to estimate the amount of memory you
> > >> will need to hold a COORDS data set:
> > >>
> > >> memory_in_bytes = (F * A * 3) * 4
> > >>
> > >> where F is the number of frames, A is the number of atoms (after
> > >> stripping in this case), the 3 is from # of coords per atom and 4 is
> > >> bytes (COORDS are single precision). Divide by 1048576 to get the
> > >> result in MB. Add 6 to (F * A *3) if you have box coordinates, double
> > >> if you have velocities as well.
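
A minimal Python sketch of this estimate. The function name is made up for illustration. Two assumptions go beyond the formula as written: the 6 box values are counted once per frame, and velocities double the coordinate values but not the box. The example uses the 250000 frames and 7584 atoms mentioned elsewhere in this thread.

```python
BYTES_PER_FLOAT = 4      # COORDS are single precision
BYTES_PER_MB = 1048576


def coords_memory_bytes(frames, atoms, has_box=False, has_velocities=False):
    """Estimate the memory needed to hold a COORDS data set, in bytes."""
    values_per_frame = atoms * 3          # x, y, z per atom
    if has_velocities:
        values_per_frame *= 2             # "double if you have velocities"
    if has_box:
        values_per_frame += 6             # assumption: 6 box values per frame
    return frames * values_per_frame * BYTES_PER_FLOAT


if __name__ == "__main__":
    n_bytes = coords_memory_bytes(frames=250000, atoms=7584)
    print(f"{n_bytes} bytes = {n_bytes / BYTES_PER_MB:.0f} MB")
```

For these figures the estimate is 22,752,000,000 bytes, in line with the "23GB" quoted in the thread.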
> > >>
> > >> However, in place of a COORDS data set cpptraj also lets you use what
> > >> is called a TRAJ data set (which leaves data on-disk). The only issue
> > >> is that, because the data remains on disk, you cannot modify a TRAJ
> > >> data set, so you will have to pre-process your trajectory (i.e.
> > >> strip/image) first. This is a good idea in general since it will
> > >> make subsequent analyses faster. Here is some input as an example.
> > >>
> > >> # Step 1 - Preprocess
> > >> parm myparm.parm7
> > >> trajin mytraj.nc
> > >> strip :Na+,WAT nobox outprefix strip
> > >> autoimage
> > >> rms first mass @C,CA,N
> > >> trajout strip.mytraj.nc nobox
> > >>
> > >> A few things to note here. First is that I put the 'strip' command
> > >> before everything else; this way subsequent commands will be faster
> > >> because there are fewer atoms to deal with. Also note in my 'strip'
> > >> command I'm writing out a stripped topology for use with my stripped
> > >> trajectory. Finally and most importantly, because you are rms-fitting
> > >> you will no longer be able to image anyway, so I'm getting rid of any
> > >> box coordinates.
> > >>
> > >> # Step 2 - Cluster
> > >> parm strip.myparm.parm7
> > >> trajin strip.mytraj.nc
> > >> loadtraj name MYTRAJ
> > >> cluster crdset MYTRAJ :1-291@CA,N,C,O mass clusters 10 out cluster_out \
> > >> nofit averagelinkage \
> > >> summary summary_out info Cluster_info repout box2.rep repfmt pdb \
> > >> clusterout cluster.nc clusterfmt netcdf
> > >>
> > >> The 'loadtraj' command in this case is taking all loaded trajectories
> > >> from 'trajin' statements and putting them into a TRAJ data set named
> > >> MYTRAJ, which stays on-disk and can subsequently be used by the
> > >> 'cluster' command.
> > >>
> > >> One more thing to keep in mind is that even though the coordinates
> > >> will be kept on disk, you will still need enough memory to hold the
> > >> pairwise distance matrix:
> > >>
> > >> memory_in_bytes = ((F * (F-1)) / 2) * 4
> > >>
> > >> If you don't have enough memory to hold the pairwise distance matrix
> > >> try using the 'sieve' keyword to reduce the number of frames being
> > >> clustered in the first pass. This will also speed up the actual
> > >> clustering a bit. Last and most importantly make sure you are using
> > >> the most up-to-date version of cpptraj (14.22).
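
A minimal Python sketch of the pairwise-matrix estimate and of how sieving shrinks it. The function names are hypothetical, and the sketch assumes that 'sieve N' leaves roughly every Nth frame for the first pass, which is a simplification of what cpptraj actually does. The frame count (250000) is the one mentioned elsewhere in this thread.

```python
import math

BYTES_PER_FLOAT = 4  # matrix elements assumed single precision, as in the formula


def pairwise_matrix_bytes(frames):
    """Memory for the upper triangle of an F x F distance matrix, in bytes."""
    return (frames * (frames - 1) // 2) * BYTES_PER_FLOAT


def sieved_frames(frames, sieve):
    """Assumption: 'sieve N' keeps about one frame in every N for the first pass."""
    return math.ceil(frames / sieve)


if __name__ == "__main__":
    total = 250000
    for sieve in (1, 2, 10):
        kept = sieved_frames(total, sieve)
        gb = pairwise_matrix_bytes(kept) / 1e9
        print(f"sieve {sieve:>2}: {kept} frames, about {gb:.2f} GB for the matrix")
```

Because the matrix grows with the square of the frame count, halving the frames cuts the requirement to about a quarter. At 250000 frames the matrix alone is far larger than the COORDS data set.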
> > >>
> > >> Hope this helps,
> > >>
> > >> -Dan
> > >>
> > >> On Tue, Dec 16, 2014 at 11:28 AM, Mahendra B Thapa <thapamb.mail.uc.edu>
> > >> wrote:
> > >> > Dear Amber users
> > >> > I used the following command to cluster 50 ns of all-atom simulation
> > >> > data:
> > >> > cpptraj -i input_file -p para_top
> > >> > where 'input_file' consists of
> > >> >
> > >> > trajin mdcrd_files
> > >> > autoimage
> > >> > rms first mass @C,CA,N
> > >> > strip :Na+,WAT
> > >> > cluster :1-291@CA,N,C,O mass clusters 10 out cluster_out nofit \
> > >> > averagelinkage \
> > >> > summary summary_out info Cluster_info repout box2.rep repfmt pdb \
> > >> > clusterout cluster.nc clusterfmt netcdf
> > >> > go
> > >> >
> > >> > After running the command, I got the following messages and no output
> > >> > files:
> > >> >
> > >> > 1] terminate called after throwing an instance of 'std::bad_alloc'
> > >> > what(): std::bad_alloc
> > >> > Aborted
> > >> >
> > >> > 2] Warning: One or more analyses requested creation of default COORDS
> > >> > DataSet.
> > >> > CREATECRD: Saving coordinates from Top to file to "_DEFAULTCRD_"
> > >> >
> > >> > 3] Warning: Coordinates are being rotated and box coordinates are present.
> > >> > Warning: Unit cell vectors are NOT rotated; imaging will not be possible
> > >> > Warning: after the RMS-fit is performed.
> > >> >
> > >> > Any comments and suggestions would be very useful.
> > >> >
> > >> > Thank you,
> > >> > Mahendra Thapa
> > >> > University of Cincinnati
> > >> > _______________________________________________
> > >> > AMBER mailing list
> > >> > AMBER.ambermd.org
> > >> > http://lists.ambermd.org/mailman/listinfo/amber
> > >>
> > >>
> > >>
> > >> --
> > >> -------------------------
> > >> Daniel R. Roe, PhD
> > >> Department of Medicinal Chemistry
> > >> University of Utah
> > >> 30 South 2000 East, Room 307
> > >> Salt Lake City, UT 84112-5820
> > >> http://home.chpc.utah.edu/~cheatham/
> > >> (801) 587-9652
> > >> (801) 585-6208 (Fax)
> > >>
> > >>
> > >
> > >
> >
> >
> >
> > --
> > -------------------------
> > Daniel R. Roe, PhD
> > Department of Medicinal Chemistry
> > University of Utah
> > 30 South 2000 East, Room 307
> > Salt Lake City, UT 84112-5820
> > http://home.chpc.utah.edu/~cheatham/
> > (801) 587-9652
> > (801) 585-6208 (Fax)
> >
> >
>
Received on Fri Jan 02 2015 - 13:30:03 PST