# Re: [AMBER] Analysis of percentage contact in residue-ligand interaction with CPPTRAJ

From: Wesley Michael Botello-Smith <wmsmith.uci.edu>
Date: Mon, 20 Nov 2017 17:25:44 -0800

I think you can probably use the cpptraj 'nativecontacts' to accomplish
this task (or at least generate the needed data to do so).

eg:
```----
cpptraj systemTopology.parm7
trajin trajectory.nc
nativecontacts ':ligandResidNumbers' ':receptorResidNumbers' writecontacts
residueContactInfo.dat
go
---
The file 'residueContactInfo.dat' would then be a 6 column file of the form:
#   Contact   Nframes  Frac.   Avg   Stdev
You could then use the 'Frac' column to figure out what you need.
This column would contain an entry for each atom in the first mask
(:ligandResidNumbers) that was in contact with an atom in the second mask
(:receptorResidNumbers) which corresponds to the percentage of frames for
which the two atoms were in contact.
If you need this by residue instead of by atom, you would need to use the
'series' and 'seriesout' options (see the amber manual) this gets you the
breakdown of each atom for each frame.
You would then need to summarize over both atoms and frames to get the per
residue contact fraction.
I've had to do something similar to this quite a few times recently so I
wrote an R script to analyze the results.
Assuming you name the series data file 'contactData.dat' then to do so in
R, you could use the following:
------
require(dplyr)
require(tidyr)
require(tibble)
require(reshape2)
require(Matrix)
#load the data into an R data.frame
#format the data to be a 4 column format "FRAME X.RESID Y.RESID
CONTACT_COUNT"
#and compute the contact sums over each atom in a given X or Y RESID
parsedData<-testDat %>%
rename(FRAME=X.FRAME) %>%
melt(id.vars=c("FRAME"),variable.name="CONTACT_PAIR",value.name="CONTACT_COUNT")
%>%
separate(CONTACT_PAIR,c("X","Y"),sep="_")%>%
mutate(X=gsub("X[.]","",X),Y=gsub("^[.]","",Y))%>%
separate(X,c("X.RESID","X.ATOM")) %>%
separate(Y,c("Y.RESID","Y.ATOM")) %>%
group_by(FRAME,X.RESID) %>%
summarise(CONTACT_COUNT=sum(CONTACT_COUNT))
#summarise over frames and ligand resids, assuming you used the second mask
#to be your receptor ids when generating your series in cpptraj, this means
that
#Y.RESID will contain the residue id for each receptor residue of a given
contact
#you then need to take the average over all ligand RESIDs with which it had
contact
#this yields the 2 column format table "RESID   CONTACT_FREQ"
testSummary <- parsedData %>%
group_by(Y.RESID) %>%
summarise(CONTACT_FREQ=mean(CONTACT_COUNT>0))
#write the data to an output file
testSummary %>% write.table("ReceptorContactData.dat")
----
hope that helps.
On Wed, Nov 15, 2017 at 9:36 PM, anu chandra <anu80125.gmail.com> wrote:
> Dear Amber users,
>
> I am working with protein-ligand system and there are ten residues lining
> the binding pocket which have potential to interact with the ligand. Here,
> I would like to calculate the percentage/fraction each of the ten residues
> interact with ligand  during the course of simulation (for e.g., how many
> frames out of ,say, 100 frames have residue-1 interact with ligand). I
> wonder if there is way I can do such calculation in CPPTRAJ.
>
> Any suggestion would be highly appreciated
>
>