Re: [AMBER] Memory footprint goes on increasing: pytraj traj.iterchunk used, distances between bonds calculation

From: Hai Nguyen <nhai.qn.gmail.com>
Date: Sun, 12 Mar 2017 17:30:15 -0400

Hi,

so I looked further at your profile.log and saw:

 153 455.2 MiB 15.0 MiB bonds_val = pt.distance(chunk, bnd_list, dtype='ndarray')

Isn't that increase expected, since you store the distance values in
bonds_val?

Can you try replacing
bonds_val = pt.distance(...)
with
bonds_val = some_equal_size_numpy_array
and run the profiling again?
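
A minimal sketch of that control experiment, under stated assumptions: it
uses the standard-library tracemalloc module rather than the profiler that
produced profile.log, the sizes and the helper names placeholder_distances
and traced_memory_per_iteration are hypothetical, and the (n_bonds, n_frames)
float64 shape is only an assumption about what pt.distance returns here.

```python
import tracemalloc

import numpy as np

# Hypothetical sizes: substitute the chunk size and bond count of the real run.
N_FRAMES_PER_CHUNK = 1000
N_BONDS = 50
N_CHUNKS = 10


def placeholder_distances(n_bonds, n_frames):
    """Stand-in for pt.distance(chunk, bnd_list, dtype='ndarray')."""
    # Assumed result layout: one row per bond, one column per frame.
    return np.zeros((n_bonds, n_frames), dtype=np.float64)


def traced_memory_per_iteration(n_chunks, n_bonds, n_frames):
    """Return the traced memory in bytes after each loop iteration."""
    samples = []
    tracemalloc.start()
    try:
        for _ in range(n_chunks):
            # Rebinding bonds_val releases the array from the previous pass.
            bonds_val = placeholder_distances(n_bonds, n_frames)
            current, _peak = tracemalloc.get_traced_memory()
            samples.append(current)
    finally:
        tracemalloc.stop()
    return samples


samples = traced_memory_per_iteration(N_CHUNKS, N_BONDS, N_FRAMES_PER_CHUNK)
for i, nbytes in enumerate(samples):
    print(f"iteration {i}: {nbytes / 1024:.1f} KiB traced")
```

If the traced memory stays flat with the placeholder but keeps growing once
pt.distance is put back, the growth comes from the pytraj call and not from
simply holding the result array.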

Hai

On Sun, Mar 12, 2017 at 5:13 PM, SHAILESH KUMAR <shaile27_sit.jnu.ac.in>
wrote:

> Hi,
>
> Thank you for the reply.
>
>
> The problem does not seem to be with iterchunk: earlier I emulated
> chunking by calling pytraj.iterload inside a loop with frame slicing, and
> I had similar problems.
>
> Calling pytraj.iterload inside a loop with a varying frame_slice, combined
> with range-based extraction of frames, should not have caused a problem,
> but it did.
>
> I suspect pytraj.distance, pytraj.angle and pytraj.dihedral are causing
> memory leaks. Apart from this, iterating frame by frame, as you suggested,
> would be too costly in terms of performance, because the trajectory has
> millions of frames. What I am actually doing is converting the molecule's
> coordinates from Cartesian to internal coordinates. I intend to write
> these sets of bonds, angles and dihedrals as NetCDF4/HDF5 files with
> chunking support, so that they can be read efficiently dimension-wise
> (individual bonds/angles/torsions) for all the frames.
>
> Additionally, I am using the netCDF4 package to write the trajectory in
> internal coordinates as NetCDF4 (currently disabled while I look for the
> memory leak). I would prefer to use pytraj itself, if it can write NetCDF4
> with chunking, and so reduce the list of dependencies.
>
> On Sun, Mar 12, 2017 at 9:41 PM, Nhai <nhai.qn.gmail.com> wrote:
>
> > Hi
> >
> > iterchunk is not well written and I don't find it very useful.
> >
> > can you try:
> >
> > for frame in traj:
> >     dosomething(...)
> >
> > But I will have a look at the iterchunk stuff. Thanks.
> >
> > Hai
> >
> > > On Mar 12, 2017, at 4:17 PM, SHAILESH KUMAR <shaile27_sit.jnu.ac.in> wrote:
> > >
> > > Dear all,
> > >
> > > > I am trying to process a big trajectory that cannot fit in my
> > > > computer's memory. So I tried iterating over the full trajectory
> > > > using the iterchunk method in pytraj. In each iteration, bond lengths
> > > > are calculated for the frames in the chunk and can be written to a
> > > > file for further analysis (writing is currently disabled for memory
> > > > profiling, because the memory footprint of the process keeps
> > > > growing).
> > >
> > > > The pseudocode is as follows:
> > > >
> > > > import gc
> > > > import pytraj as pt
> > > >
> > > > traj = pt.iterload(trajfile, prmtopfile, frame_slice=slice_info)
> > > >
> > > > for chunk in traj.iterchunk(chunk_size, start=0, stop=-1):
> > > >     bnd_vals = pt.distance(chunk, bnd_list, dtype='ndarray')
> > > >     # process bnd_vals here (currently disabled)
> > > >     gc.collect()
> > >
> > > > But memory profiling showed that memory keeps increasing in every
> > > > iteration over the chunks, which suggests a memory leak in
> > > > bnd_vals = pt.distance(chunk, bnd_list, dtype='ndarray')
> > > > but it is not clear to me why. Maybe I am doing something wrong and
> > > > cannot spot it, or there is a memory leak somewhere (maybe in the
> > > > API).
> > >
> > >
> > > > I would be very grateful if anyone could help me sort this out. For
> > > > reproducibility I am attaching simplified code with the necessary
> > > > input files, except the trajectory, which I can share via Dropbox
> > > > when needed. This test dataset corresponds to a small molecule; the
> > > > actual problem is to do the same for real protein molecules.
> > > <profile.log>
> > > <dummy-code.py>
> > > <INR.DFS.tree>
> > > <INR.lig.gas.leap.prmtop>
> > > <INR.pdb>
> > > _______________________________________________
> > > AMBER mailing list
> > > AMBER.ambermd.org
> > > http://lists.ambermd.org/mailman/listinfo/amber
Received on Sun Mar 12 2017 - 15:00:02 PDT