Re: [AMBER] [cpptraj] Changes in RMSF functionality?

From: Daniel Roe <daniel.r.roe.gmail.com>
Date: Mon, 19 Oct 2015 11:10:36 -0600

Hi,

I think I know what is going on here. The issue is that a TRAJ data
set and a COORDS data set differ in a very important way. COORDS data
sets are loaded into memory, so they can be modified. TRAJ data sets
on the other hand remain on-disk, so they can not be modified
(otherwise every time you ran e.g. an 'rms' command you would
irrevocably change your trajectory). When you use this input:

> 1. parm top.prmtop
> 2. trajin 05_Production_*.nc
> 3. loadtraj name loaded_trajs
> 4. crdaction loaded_trajs average crdset average_structure .CA,C,O,N,H
> 5. crdaction loaded_trajs rms ref average_structure .CA,C,O,N,H
> 6. crdaction loaded_trajs atomicfluct out rmsf_OPT1.dat .CA,C,O,N,H byres
> 7. run

In line 3 you are loading your trajectories as a TRAJ data set named
loaded_trajs. Then in line 5 you run an 'rms' command, but since you
are operating on a TRAJ data set the coordinates are not modified, so
when you run your 'atomicfluct' command on line 6 you are running it
on the original non-rms-fit frames. (Note that 'run' on line 7 is
unnecessary here because you have run everything with 'crdaction'; you
have not queued up any actions or analyses).

Given this input:

> 1. parm top.prmtop
> 2. loadcrd full_lenght.nc
> 3. crdaction full_lenght.nc average avg.pdb .CA,C,O,N,H
> 4. parm avg.pdb
> 5. reference avg.pdb parm avg.pdb
> 6. crdaction full_lenght.nc rms reference .CA,C,O,N,H
> 7. crdaction full_lenght.nc atomicfluct out rmsf_OPT2.dat .CA,C,O,N,H byres
> 8. run

Here in line 2 you load the entire trajectory as a COORDS data set; it
is all in-memory and can be manipulated. So when you run 'rms' in line
6 the coordinates are modified, so when you run your 'atomicfluct'
command on line 7 you are running it on the rms-fit frames, which is
why the results look better.

I think in this case, if you want everything to remain on-disk anyway,
instead of using a TRAJ data set just use the "normal"
'trajin'/Actions workflow and break it up into two runs. (Also note
you probably want to rms-fit before creating your average structure as
well). For example:

# Step 1 - create average structure, rms-fit to first frame
parm top.prmtop
trajin 05_Production_*.nc
rms first .CA,C,O,N,H
average crdset average_structure .CA,C,O,N,H
run
# Step 2 - rms fit to average and calculate atomic flucts
rms ref average_structure .CA,C,O,N,H
atomicfluct out rmsf_OPT1.dat .CA,C,O,N,H byres
run

The TRAJ data set is really around as a convenience for certain
Analyses that require COORDS data sets but where memory might be a
problem (e.g. '2drms', 'cluster', etc). You shouldn't have to use them
normally. However, the manual and output from cpptraj should make it
very clear that coordinates in TRAJ data sets cannot be manipulated.
I'll make sure to update both, thanks for the report on that.

Hope this helps. Let me know if you have any more questions.

-Dan


On Mon, Oct 19, 2015 at 10:30 AM, Juan Eiros Zamora
<j.eiros-zamora14.imperial.ac.uk> wrote:
> Dear Amber experts,
>
> I have been using for some time now a cpptraj script that calculates the
> per-residue RMSF of my protein of interest. It's a trimeric protein with
> standard amino acids, 419 in total. It has some regions that are flexible
> and I usually take a look at the RMSF profile to quantify that. I switched
> to AmberTools 15 soon after it was released.
>
> The issue is that some days ago I ran the same cpptraj script I always use
> to some new trajectories and saw a big increase in the RMSF profile. These
> new trajectories were run with ff14SB forcefield, while the old ones were
> run with ff99SB. I've been trying to narrow down what could possibly be the
> source of change, but I've been unable to spot it. Visualizing the new
> trajectories with Chimera or VMD showed no signs of such a drastic increase
> in the RMSF values (similar kind of movements when compared to the old
> ff99SB trajectories). What is more worrying, I reran the same cpptraj
> commands on the old trajectories and got similar bigger values as well (?!)
> This is what makes me think that there could have been a change in the way
> cpptraj behaves.
>
> Here's the commands I've been using to do this analysis. My idea is to get
> an average structure of the backbone atoms of the full trajectory, rms fit
> to that and then calculate the per-residue fluctuations:
>
>> parm top.prmtop
>> trajin 05_Production_*.nc
>> loadtraj name loaded_trajs
>> crdaction loaded_trajs average crdset average_structure .CA,C,O,N,H
>> crdaction loaded_trajs rms ref average_structure .CA,C,O,N,H
>> crdaction loaded_trajs atomicfluct out rmsf_OPT1.dat .CA,C,O,N,H byres
>> run
>
>
> Now, I've been playing around with cpptraj trying to solve this and found
> that if I concatenate the trajectories together into one (I usually have the
> runs split into chunks of 50 ns) and modify the commands, I get an RMSF
> profile that is similar to what I used to obtain (with the commands above,
> mind you) in terms of values and profile. This second set of commands
> effectively do the same thing as the ones above, only loading the full
> trajectory as a coords set and then outputting the average structure.
>
>> parm top.prmtop
>> loadcrd full_lenght.nc
>> crdaction full_lenght.nc average avg.pdb .CA,C,O,N,H
>> parm avg.pdb
>> reference avg.pdb parm avg.pdb
>> crdaction full_lenght.nc rms reference .CA,C,O,N,H
>> crdaction full_lenght.nc atomicfluct out rmsf_OPT2.dat .CA,C,O,N,H
>> byres
>> run
>
> I'm attaching a plot showing the differences between the cpptraj outputs.
> The red line (opt1) corresponds to the first cpptraj commands on the new
> trajectories. The green line (opt2) corresponds to the second commands on
> the same new data. The blue line is from another independent old run.
>
> I expect changes in the RMSF values and profiles between independent runs,
> but not so big when the only thing I've done is changing the forcefield.
> Also, running the same commands on old trajectories is also generating
> different RMSF values than the ones I had previously done.
>
> Could somebody give me their opinion on this?
>
> All the best,
>
> Juan
>
>
> _______________________________________________
> AMBER mailing list
> AMBER.ambermd.org
> http://lists.ambermd.org/mailman/listinfo/amber
>



-- 
-------------------------
Daniel R. Roe, PhD
Department of Medicinal Chemistry
University of Utah
30 South 2000 East, Room 307
Salt Lake City, UT 84112-5820
http://home.chpc.utah.edu/~cheatham/
(801) 587-9652
(801) 585-6208 (Fax)
_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber
Received on Mon Oct 19 2015 - 10:30:03 PDT
Custom Search