Re: [AMBER] FW: FW: FW: FW: FW: clustering problem in ambertool14

From: Jason Swails <jason.swails.gmail.com>
Date: Mon, 26 Jan 2015 11:21:30 -0500

> On Jan 26, 2015, at 10:42 AM, Mahendra B Thapa <thapamb.mail.uc.edu> wrote:
>
> Dear Dr. Daniel
>
> "loadcrd" worked nicely with small data set that you suggested in the last
> email. But, the process (i.e., cpptraj.OMP) killed without any error
> message for very large data set (250000 frames which occupied 144 GB space
> and each frame was without water molecules) in dual core machine with
> nvidia graphics.The message seen for these large frames was as

If you are using NetCDF trajectories, the file size is a good estimate of how much RAM will be needed. If you are using a different format (here it looks like PDB files), the file will be *much* larger than the amount of RAM that’s needed. NetCDF is typically a much better format for storing trajectories than PDB or Amber ASCII mdcrd files.

But depending on how many atoms you have, it’s quite possible that your computer only has enough memory for 20-30% of the frames in your trajectory. A good way of estimating how much RAM is needed to store a trajectory completely in memory is to calculate the space needed to hold all of the coordinates.

3 * # of atoms * # of frames * 4 bytes per float

That gives the number of bytes needed. Divide that by 1024 for KB (and again for MB and again for GB). For 250,000 frames and about 5000 atoms, you get 3*5000*250000*4/(1024^3) = ~14 GB of RAM that you need just for the atomic coordinates. Of course cpptraj needs a bit more space to store things like the system topology and whatever data sets you generate, but the big one is the coordinate sets.
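As a rough sketch of that arithmetic in Python (the function names are made up for illustration and are not part of cpptraj; the 4 GB of RAM in the example is a hypothetical value, not something stated in this thread):

```python
BYTES_PER_FLOAT = 4  # single precision, as in the formula above


def coordinate_bytes(n_atoms, n_frames, bytes_per_float=BYTES_PER_FLOAT):
    """Bytes needed to hold x, y, z for every atom in every frame."""
    return 3 * n_atoms * n_frames * bytes_per_float


def to_gib(n_bytes):
    """Convert bytes to GB by dividing by 1024 three times."""
    return n_bytes / 1024**3


def frames_that_fit(n_atoms, ram_gib, bytes_per_float=BYTES_PER_FLOAT):
    """Whole frames whose coordinates fit in ram_gib (coordinates only)."""
    bytes_per_frame = 3 * n_atoms * bytes_per_float
    return int(ram_gib * 1024**3 // bytes_per_frame)


if __name__ == "__main__":
    needed = coordinate_bytes(n_atoms=5000, n_frames=250000)
    print(f"coordinates alone: {to_gib(needed):.1f} GB")
    # Hypothetical machine with 4 GB of RAM available to cpptraj.
    print(f"frames that fit in 4 GB: {frames_that_fit(5000, 4)}")
```

This counts only the coordinates, so the real requirement is somewhat higher once the topology and any generated data sets are included.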

HTH,
Jason

P.S. -- to give you an idea of the compression you get with NetCDF (or any binary file) -- each PDB file has 80-character lines (each char is 1 byte in the standard ASCII encoding, or 1/4 the size of a floating point number and 1/8 the size of a double precision floating point number). Each line in the PDB file contains 3 floating point numbers (in low precision) in those 80 characters (among other information that is not used by cpptraj). So while the PDB file uses ~80 bytes for each atom, NetCDF uses 3*4 = 12 bytes... about 6.7 times smaller, or almost an order of magnitude.
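The same comparison as a small Python sketch. The two per-atom constants come from the paragraph above; the helper names are hypothetical, and the size conversion assumes the file consists entirely of 80-byte atom lines, which real PDB files only approximate:

```python
PDB_BYTES_PER_ATOM = 80        # one 80-character ASCII line per atom
BINARY_BYTES_PER_ATOM = 3 * 4  # x, y, z stored as 4-byte floats


def compression_ratio():
    """How many times larger the PDB text is than the raw coordinates."""
    return PDB_BYTES_PER_ATOM / BINARY_BYTES_PER_ATOM


def in_memory_size_from_pdb(pdb_size):
    """Rough coordinate size, in the same unit as pdb_size.

    Assumption: every byte of the file belongs to an 80-byte atom line.
    """
    return pdb_size * BINARY_BYTES_PER_ATOM / PDB_BYTES_PER_ATOM


if __name__ == "__main__":
    print(f"PDB / binary size ratio: {compression_ratio():.1f}")
    print(f"144 GB of PDB text -> ~{in_memory_size_from_pdb(144):.1f} GB of coordinates")
```

Under that assumption, 144 GB of PDB text would correspond to roughly 21.6 GB of coordinates. The ~14 GB figure above used a guess of about 5000 atoms instead, so treat both numbers as order-of-magnitude estimates.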

--
Jason M. Swails
BioMaPS,
Rutgers University
Postdoctoral Researcher
_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber
Received on Mon Jan 26 2015 - 08:30:04 PST