Re: [AMBER] FW: FW: FW: clustering problem in ambertool14

From: Mahendra B Thapa <thapamb.mail.uc.edu>
Date: Sun, 18 Jan 2015 11:01:46 -0500

Dear Dr. Daniel

When I ran the following commands in the $AMBERHOME directory, cpptraj.OMP
was installed in the $AMBERHOME/bin directory, as you said.
./configure -openmp gnu
make install

When I tested 'cpptraj.OMP' by clustering 5000 frames, I got the following
error message:
*Floating point exception*
I had enough space (500GB) while running it, and I did not get this error
when running 'cpptraj' in serial, even with 250000 frames.

Thank you for your help,
Mahendra Thapa
University of Cincinnati,OH


On Sat, Jan 17, 2015 at 7:00 PM, Thapa, Mahendra (thapamb) <
thapamb.mail.uc.edu> wrote:

>
>
>
> ________________________________________
> From: Daniel Roe
> Sent: Saturday, January 17, 2015 5:58:55 PM (UTC-06:00) Central America
> To: AMBER Mailing List
> Subject: Re: [AMBER] FW: FW: clustering problem in ambertool14
>
> Hi,
>
> The information in that post is unfortunately several years out of
> date. The OpenMP-enabled cpptraj is built with the usual OpenMP
> AmberTools build that you configure with the '-openmp' flag. This will
> install cpptraj.OMP in the $AMBERHOME/bin directory. When you run it
> you will see this at the top of the output:
>
> CPPTRAJ: Trajectory Analysis. V14.22 OpenMP
>
> -Dan
>
>
> On Sat, Jan 17, 2015 at 9:01 AM, Mahendra B Thapa <thapamb.mail.uc.edu>
> wrote:
> > Dear Dr. Daniel,
> >
> > After compiling cpptraj in parallel, as described in the Amber 14 manual
> > and the previous post "http://dev-archive.ambermd.org/201107/0005.html",
> > cpptraj appears to be working without any error message (though it has
> > not finished yet for my system). How do I know that cpptraj is running in
> > parallel mode rather than in serial? I issued the same command as before:
> > cpptraj -i input_file -p test.top
> >
> > Thank you for help,
> > Mahendra
> >
> >
> >
> > On Tue, Jan 6, 2015 at 6:25 PM, Thapa, Mahendra (thapamb) <
> > thapamb.mail.uc.edu> wrote:
> >
> >>
> >>
> >>
> >> ________________________________________
> >> From: Daniel Roe
> >> Sent: Tuesday, January 6, 2015 5:24:28 PM (UTC-06:00) Central America
> >> To: AMBER Mailing List
> >> Subject: Re: [AMBER] FW: clustering problem in ambertool14
> >>
> >> Hi,
> >>
> >> On Tue, Jan 6, 2015 at 3:47 PM, Mahendra B Thapa <thapamb.mail.uc.edu>
> >> wrote:
> >> > With 'sieve 10', the 'cpptraj' command has been running without any
> >> > complaint, but the analysis is very slow in my case (50000 frames,
> >> > 511 residues with 7584 atoms in each frame). A section of the
> >> > screen output *after 24 hours* is as follows:
> >>
> >> This is an inherently time-consuming process, since you need to run
> >> N*(N-1) / 2 calculations (roughly 12.5 M). Depending on your processor
> >> speed this can take a long time, particularly if you have a big
> >> system. I typically use OpenMP-compiled cpptraj for this, since the
> >> pairwise calc is one of the things that is parallelized. To give you
> >> an idea of what timings I see, for 22084 sieved frames with 8 threads
> >> I can complete the pairwise portion of the calculation (RMSD selecting
> >> 67 atoms) in 97 seconds (CPU is 2x Xeon X5660 @ 2.8 GHz).
> >>
> >> Your best bet is to first use a very small number of frames (as a
> >> test) to get an idea of how long things will take, and also to make
> >> sure that when your clustering completes you get all the output you
> >> are expecting. It's pretty awful when you do an expensive clustering
> >> calc and realize you forgot you wanted cluster numbers vs time etc.
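
A rough sketch of that kind of extrapolation, in Python. It assumes the
pairwise-distance stage dominates the run time and scales with the number
of frame pairs, N*(N-1)/2. The function names and the 500-frame, 60-second
test timing are hypothetical, not measurements from this thread.

```python
def n_pairs(frames):
    """Number of unique frame pairs: N*(N-1)/2."""
    return frames * (frames - 1) // 2


def extrapolate_seconds(test_frames, test_seconds, target_frames):
    """Scale a measured pairwise time by the ratio of pair counts.

    Assumption: run time is proportional to the number of pairs.
    """
    return test_seconds * n_pairs(target_frames) / n_pairs(test_frames)


if __name__ == "__main__":
    # 5000 sieved frames, as in the log output quoted in this thread
    print(n_pairs(5000))  # 12497500, i.e. roughly 12.5 M
    # Hypothetical test run: 500 frames took 60 s
    hours = extrapolate_seconds(500, 60.0, 5000) / 3600
    print(f"estimated pairwise time: {hours:.1f} h")
```

Because the scaling is quadratic, ten times as many frames means roughly a
hundred times as many pair calculations.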
> >>
> >> One thing that can help speed up subsequent clustering calcs is to use
> >> the loadpairdist/savepairdist keywords to re-use calculated pairwise
> >> distances. Some care must be taken when doing this though - everything
> >> pertaining to the distance metric (sieve, mask, etc) MUST remain the
> >> same or you will get bad results. Cpptraj does some checking for this
> >> but can't always catch everything.
> >>
> >> Hope this helps,
> >>
> >> -Dan
> >>
> >> >
> >> > ANALYSIS: Performing 1 analyses:
> >> > 0: [cluster crdset MYTRAJ :1-511@CA,N,C,O mass clusters 10 out
> >> > cluster_out nofit averagelinkage summary summary_out info Cluster_info
> >> > sieve 10 repout box2.rep repfmt pdb clusterout cluster.nc clusterfmt netcdf]
> >> > Starting clustering.
> >> > Mask [:1-511@CA,N,C,O] corresponds to 2044 atoms.
> >> > Calculating pair-wise distances.
> >> > Pair-wise matrix set up with sieve, 50000 frames, 5000 sieved frames.
> >> > 0%
> >> >
> >> > Thank you for help,
> >> > Mahendra Thapa
> >> > University of Cincinnati,OH
> >> >
> >> >
> >> > On Fri, Jan 2, 2015 at 4:35 PM, Thapa, Mahendra (thapamb) <
> >> > thapamb.mail.uc.edu> wrote:
> >> >
> >> >>
> >> >>
> >> >>
> >> >> ________________________________________
> >> >> From: Daniel Roe
> >> >> Sent: Friday, January 2, 2015 3:35:17 PM (UTC-06:00) Central America
> >> >> To: AMBER Mailing List
> >> >> Subject: Re: [AMBER] clustering problem in ambertool14
> >> >>
> >> >> Hi,
> >> >>
> >> >> I suspect that if you are running out of memory even after using
> >> >> 'loadtraj', the issue may be with the pairwise distance matrix.
> >> >>
> >> >> To reduce the amount of memory needed by the pairwise distance matrix,
> >> >> use the 'sieve' keyword. Try 'sieve 10' to start. Increase the sieve
> >> >> value as necessary.
> >> >>
> >> >> -Dan
> >> >>
> >> >> On Friday, January 2, 2015, Mahendra B Thapa <thapamb.mail.uc.edu>
> >> wrote:
> >> >>
> >> >> > Dear Dr. Daniel
> >> >> >
> >> >> > Thank you for the suggestion; I successfully updated to 'CPPTRAJ:
> >> >> > Trajectory Analysis. V14.22'
> >> >> >
> >> >> > But the previous error message (the one you addressed in your first
> >> >> > reply) appeared again:
> >> >> > terminate called after throwing an instance of 'std::bad_alloc'
> >> >> > what(): std::bad_alloc
> >> >> >
> >> >> > Using the formula you gave me, the space requirement is 23GB, but I
> >> >> > have enough memory (400GB) on my external drive, where I run the
> >> >> > 'cpptraj' command.
> >> >> >
> >> >> > Thank you for help,
> >> >> > Mahendra Thapa
> >> >> > University of Cincinnati,OH
> >> >> >
> >> >> > On Fri, Jan 2, 2015 at 2:01 PM, Thapa, Mahendra (thapamb) <
> >> >> > thapamb.mail.uc.edu> wrote:
> >> >> >
> >> >> > >
> >> >> > >
> >> >> > >
> >> >> > > ________________________________________
> >> >> > > From: Daniel Roe
> >> >> > > Sent: Friday, January 2, 2015 1:00:45 PM (UTC-06:00) Central America
> >> >> > > To: AMBER Mailing List
> >> >> > > Subject: Re: [AMBER] FW: clustering problem in ambertool14
> >> >> > >
> >> >> > > Hi,
> >> >> > >
> >> >> > > According to your log output you haven't applied all updates. The
> >> >> > > very first line of output is:
> >> >> > >
> >> >> > > CPPTRAJ: Trajectory Analysis. V14.00
> >> >> > >
> >> >> > > You need at least 14.17 for clustering with the traj data set to
> >> >> > > work properly (and really you should have 14.22). After updates are
> >> >> > > applied the code must be recompiled. Also, if you are not using the
> >> >> > > full path to cpptraj when executing, make sure that the cpptraj you
> >> >> > > are actually using is the up-to-date one (with e.g. the command
> >> >> > > 'which cpptraj').
> >> >> > >
> >> >> > > Hope this helps,
> >> >> > >
> >> >> > > -Dan
> >> >> > >
> >> >> > >
> >> >> > > On Fri, Jan 2, 2015 at 9:08 AM, Mahendra B Thapa <thapamb.mail.uc.edu>
> >> >> > > wrote:
> >> >> > > > Dear Dr. Daniel,
> >> >> > > > The memory issues were solved when I followed the steps you
> >> >> > > > suggested; thank you for that.
> >> >> > > >
> >> >> > > > A new problem appeared on the screen:
> >> >> > > >
> >> >> > > > Internal Error: Metric is COORDS base but data set is not.
> >> >> > > > Error: in Analysis # 0
> >> >> > > > 1 errors encountered reading input.
> >> >> > > >
> >> >> > > > {{ Note: I have already fixed bugs for ambertool 14
> >> >> > > > http://ambermd.org/bugfixes/AmberTools/14.0/update.17
> >> >> > > > }}
> >> >> > > >
> >> >> > > > DATAFILES:
> >> >> > > > cluster_out (Standard Data File): Cnum_00001
> >> >> > > > Warning: Set 'Cnum_00001' contains no data.
> >> >> > > > Warning: File 'cluster_out' has no sets containing data.
> >> >> > > >
> >> >> > > > Are these errors due to the large number of frames (250000) and
> >> >> > > > the number of atoms (7584)?
> >> >> > > >
> >> >> > > > There is some discussion in a previous post
> >> >> > > > (http://archive.ambermd.org/201408/0214.html), but I am assuming
> >> >> > > > that I have been using the stripped topology file to run cpptraj.
> >> >> > > > I have attached the screen output (text file: TEST_LOG) to this
> >> >> > > > email.
> >> >> > > >
> >> >> > > > Thank you for help,
> >> >> > > > Mahendra Thapa
> >> >> > > >
> >> >> > > >
> >> >> > > > On Tue, Dec 16, 2014 at 3:20 PM, Thapa, Mahendra (thapamb) <
> >> >> > > > thapamb.mail.uc.edu> wrote:
> >> >> > > >
> >> >> > > >>
> >> >> > > >>
> >> >> > > >>
> >> >> > > >> ________________________________________
> >> >> > > >> From: Daniel Roe
> >> >> > > >> Sent: Tuesday, December 16, 2014 2:19:51 PM (UTC-06:00) Central America
> >> >> > > >> To: AMBER Mailing List
> >> >> > > >> Subject: Re: [AMBER] clustering problem in ambertool14
> >> >> > > >>
> >> >> > > >> Hi,
> >> >> > > >>
> >> >> > > >> Usually when you get this error message during a command that
> >> >> > > >> uses a COORDS data set (cluster, 2drms, crdfluct etc) it's because
> >> >> > > >> you ran out of memory. Here is a formula to estimate the amount of
> >> >> > > >> memory you will need to hold a COORDS data set:
> >> >> > > >>
> >> >> > > >> memory_in_bytes = (F * A * 3) * 4
> >> >> > > >>
> >> >> > > >> where F is the number of frames, A is the number of atoms (after
> >> >> > > >> stripping in this case), the 3 is from # of coords per atom and
> >> >> > > >> 4 is bytes (COORDS are single precision). Divide by 1048576 to get
> >> >> > > >> the result in MB. Add 6 to (F * A * 3) if you have box coordinates,
> >> >> > > >> double if you have velocities as well.
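
A minimal Python sketch of the COORDS estimate above. The function name is
made up for illustration. Treating the box term as 6 values per frame and
doubling the whole total for velocities are assumptions about how to read
the formula, not something confirmed against the cpptraj source.

```python
def coords_memory_bytes(frames, atoms, has_box=False, has_velocities=False):
    """Estimate memory for a COORDS data set: (F * A * 3) * 4 bytes."""
    values = frames * atoms * 3      # x, y, z for every atom in every frame
    if has_box:
        values += 6 * frames         # assumption: 6 box values per frame
    if has_velocities:
        values *= 2                  # assumption: velocities double the total
    return values * 4                # single precision, 4 bytes per value


if __name__ == "__main__":
    # 250000 frames of 7584 atoms, the sizes mentioned in this thread
    n_bytes = coords_memory_bytes(250000, 7584)
    print(f"{n_bytes} bytes = {n_bytes / 1048576:.0f} MB")
```

For 250000 frames of 7584 atoms this gives about 22.75e9 bytes, in line
with the roughly 23GB figure mentioned elsewhere in this thread. Note that
this is RAM, not disk space.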
> >> >> > > >>
> >> >> > > >> However, in place of a COORDS data set cpptraj also lets you use
> >> >> > > >> what is called a TRAJ data set (which leaves data on-disk). The
> >> >> > > >> only issue with this is that, because it remains on disk, you
> >> >> > > >> cannot modify a TRAJ data set, so you will have to pre-process your
> >> >> > > >> trajectory (i.e. strip/image) first. This is a good idea to do in
> >> >> > > >> general since it will make subsequent analyses faster. Here is some
> >> >> > > >> input as an example.
> >> >> > > >>
> >> >> > > >> # Step 1 - Preprocess
> >> >> > > >> parm myparm.parm7
> >> >> > > >> trajin mytraj.nc
> >> >> > > >> strip :Na+,WAT nobox outprefix strip
> >> >> > > >> autoimage
> >> >> > > >> rms first mass @C,CA,N
> >> >> > > >> trajout strip.mytraj.nc nobox
> >> >> > > >>
> >> >> > > >> A few things to note here. First is that I put the 'strip'
> >> >> > > >> command before everything else; this way subsequent commands will
> >> >> > > >> be faster because there are fewer atoms to deal with. Also note in
> >> >> > > >> my 'strip' command I'm writing out a stripped topology for use with
> >> >> > > >> my stripped trajectory. Finally and most importantly, because you
> >> >> > > >> are rms-fitting you will no longer be able to image anyway, so I'm
> >> >> > > >> getting rid of any box coordinates.
> >> >> > > >>
> >> >> > > >> # Step 2 - Cluster
> >> >> > > >> parm strip.myparm.parm7
> >> >> > > >> trajin strip.mytraj.nc
> >> >> > > >> loadtraj name MYTRAJ
> >> >> > > >> cluster crdset MYTRAJ :1-291@CA,N,C,O mass clusters 10 out cluster_out \
> >> >> > > >> nofit averagelinkage \
> >> >> > > >> summary summary_out info Cluster_info repout box2.rep repfmt pdb \
> >> >> > > >> clusterout cluster.nc clusterfmt netcdf
> >> >> > > >>
> >> >> > > >> The 'loadtraj' command in this case is taking all loaded
> >> >> > > >> trajectories from 'trajin' statements and putting them into a TRAJ
> >> >> > > >> data set named MYTRAJ, which stays on-disk and can subsequently be
> >> >> > > >> used by the 'cluster' command.
> >> >> > > >>
> >> >> > > >> One more thing to keep in mind is that even though the
> >> >> > > >> coordinates will be kept on disk, you will still need enough memory
> >> >> > > >> to hold the pairwise distance matrix:
> >> >> > > >>
> >> >> > > >> memory_in_bytes = ((F * (F-1)) / 2) * 4
> >> >> > > >>
> >> >> > > >> If you don't have enough memory to hold the pairwise distance
> >> >> > > >> matrix try using the 'sieve' keyword to reduce the number of frames
> >> >> > > >> being clustered in the first pass. This will also speed up the
> >> >> > > >> actual clustering a bit. Last and most importantly make sure you
> >> >> > > >> are using the most up-to-date version of cpptraj (14.22).
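
To see how much 'sieve' reduces the pairwise matrix, here is a small Python
sketch of the formula above. The function name is hypothetical, and it
assumes that 'sieve N' leaves about F/N frames for the first pass, which
matches the "50000 frames, 5000 sieved frames" log line quoted in this
thread.

```python
def pairwise_matrix_bytes(frames, sieve=1):
    """Memory for the pairwise distance matrix: ((F * (F-1)) / 2) * 4 bytes."""
    # Assumption: 'sieve N' keeps roughly every Nth frame for the first pass.
    sieved = frames // sieve
    return (sieved * (sieved - 1) // 2) * 4


if __name__ == "__main__":
    for sieve in (1, 10):
        gb = pairwise_matrix_bytes(250000, sieve) / 1e9
        print(f"250000 frames, sieve {sieve}: {gb:.2f} GB")
```

For 250000 frames the full matrix needs about 125 GB of RAM, while
'sieve 10' brings it down to about 1.25 GB, because the size scales with
the square of the number of frames.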
> >> >> > > >>
> >> >> > > >> Hope this helps,
> >> >> > > >>
> >> >> > > >> -Dan
> >> >> > > >>
> >> >> > > >> On Tue, Dec 16, 2014 at 11:28 AM, Mahendra B Thapa
> >> >> > > >> <thapamb.mail.uc.edu> wrote:
> >> >> > > >> > Dear Amber users,
> >> >> > > >> > I used the following command to cluster 50 ns of all-atom
> >> >> > > >> > simulation data:
> >> >> > > >> > cpptraj -i input_file -p para_top
> >> >> > > >> > where 'input_file' consists of
> >> >> > > >> >
> >> >> > > >> > trajin mdcrd_files
> >> >> > > >> > autoimage
> >> >> > > >> > rms first mass @C,CA,N
> >> >> > > >> > strip :Na+,WAT
> >> >> > > >> > cluster :1-291@CA,N,C,O mass clusters 10 out cluster_out nofit averagelinkage \
> >> >> > > >> > summary summary_out info Cluster_info repout box2.rep repfmt pdb \
> >> >> > > >> > clusterout cluster.nc clusterfmt netcdf
> >> >> > > >> > go
> >> >> > > >> >
> >> >> > > >> > After running the command, I got the following messages and no
> >> >> > > >> > output files:
> >> >> > > >> >
> >> >> > > >> > 1] terminate called after throwing an instance of 'std::bad_alloc'
> >> >> > > >> > what(): std::bad_alloc
> >> >> > > >> > Aborted
> >> >> > > >> >
> >> >> > > >> > 2] Warning: One or more analyses requested creation of default COORDS DataSet.
> >> >> > > >> > CREATECRD: Saving coordinates from Top to file to "_DEFAULTCRD_"
> >> >> > > >> >
> >> >> > > >> >
> >> >> > > >> > 3] Warning: Coordinates are being rotated and box coordinates are present.
> >> >> > > >> > Warning: Unit cell vectors are NOT rotated; imaging will not be possible
> >> >> > > >> > Warning: after the RMS-fit is performed.
> >> >> > > >> >
> >> >> > > >> > Any comments and suggestions would be very useful.
> >> >> > > >> >
> >> >> > > >> > Thank you,
> >> >> > > >> > Mahendra Thapa
> >> >> > > >> > University of Cincinnati
> >> >> > > >> > _______________________________________________
> >> >> > > >> > AMBER mailing list
> >> >> > > >> > AMBER.ambermd.org
> >> >> > > >> > http://lists.ambermd.org/mailman/listinfo/amber
> >> >> > > >>
> >> >> > > >>
> >> >> > > >>
> >> >> > > >> --
> >> >> > > >> -------------------------
> >> >> > > >> Daniel R. Roe, PhD
> >> >> > > >> Department of Medicinal Chemistry
> >> >> > > >> University of Utah
> >> >> > > >> 30 South 2000 East, Room 307
> >> >> > > >> Salt Lake City, UT 84112-5820
> >> >> > > >> http://home.chpc.utah.edu/~cheatham/
> >> >> > > >> (801) 587-9652
> >> >> > > >> (801) 585-6208 (Fax)
_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber
Received on Sun Jan 18 2015 - 08:30:03 PST