Re: [AMBER] CPPTRAJ, Counting Per Residue Pair H-Bonding

From: Kenneth Huang <khuang8.student.gsu.edu>
Date: Wed, 16 Mar 2022 16:30:26 +0000

If I run the commands you mentioned, and sort the output from the last command by the first column, I only see for residue 85:
85 66 0.9946
85 87 0.9598

Whoops, that's from the first awk portion actually-

awk 'NR>1 && NR<279 {print $1,$3,$5}' hbond_avg.*.out | awk '$3 > .05 {print ;}' > test

Where I had it drop anything that was under 5% (0.05) after pulling out columns 1, 3, and 5, assuming it wasn't worth looking at. So the version you'd want would just be

awk 'NR>1 && NR<279 {print $1,$3,$5}' hbond_avg.*.out > test

Which won't apply a cutoff unless you add it back in.

the list from your commands is only ~273 lines long, whereas the original out file is >3000 for the entire protein

That might be from the first part, which specifies to only look at lines between 1 and 279-

awk 'NR>1 && NR<279
Since in my case, there were solvent-solute bonds also listed (starting at line 279)- if you don't have solvent-solute bonds as part of the hbond command, you can just drop it down to

awk 'NR>1 {print $1,$3,$5}' hbond_avg.*.out > test

Best,
Kenneth
________________________________
From: Matthew Guberman-Pfeffer <matthew.guberman-pfeffer.uconn.edu>
Sent: Wednesday, March 16, 2022 11:00 AM
To: AMBER Mailing List <amber.ambermd.org>
Subject: Re: [AMBER] CPPTRAJ, Counting Per Residue Pair H-Bonding

Hi Kenneth,

Thanks for the awk commands. I have a question about the second command.

In my avg.out file, I have:
ASP_85.OD2 THR_66.HG1 THR_66.OG1 371 0.9946 2.6259 161.8941
ASP_85.OD1 GLY_87.H GLY_87.N 358 0.9598 2.8142 157.8910
ASP_85.OD2 GLN_79.H GLN_79.N 203 0.5442 2.8921 162.1535
ASP_85.OD1 THR_66.H THR_66.N 66 0.1769 2.8812 146.1349
ASP_85.O TRP_88.H TRP_88.N 25 0.0670 2.9352 167.3955
ASP_85.OD2 GLN_79.HE22 GLN_79.NE2 21 0.0563 2.8330 153.7837
ASP_85.OD2 LEU_78.H LEU_78.N 4 0.0107 2.8733 148.6609
ASP_85.OD1 THR_66.HG1 THR_66.OG1 1 0.0027 2.9728 142.6676

If I run the commands you mentioned, and sort the output from the last command by the first column, I only see for residue 85:
85 66 0.9946
85 87 0.9598

Why does it seem like H-bonding interactions are missing? (In Fact, the list from your commands is only ~273 lines long, whereas the original out file is >3000 for the entire protein.) What am I missing or not understanding about the second awk command?

Best,
Matthew







> On Mar 16, 2022, at 10:37 AM, Kenneth Huang <khuang8.student.gsu.edu> wrote:
>
> *Message sent from a system outside of UConn.*
>
>
> Hi,
>
> I think something like this might do the trick of what you're looking for if I'm understanding your question correctly?
>
> hbond DNA out hbond.DNA.dat :1-46 angle 135 dist 3.0 avgout hbond_avg.DNA.out bridgeout hbond_brid.DNA.dat solventdonor :WAT solventacceptor :WAT.O nointramol
>
> Should automatically get all the hbonds in the residue mask (1-46) so you don't have to do a pairwise matrix by loop, and will dump the breakdown fractions of each combination into the avg.out file, which you can make a lot more readable with
>
> #print out from input in range- defined to avoid solute solvent hbond; solvent starts at line 279 here
> awk 'NR>1 && NR<279 {print $1,$3,$5}' hbond_avg.*.out | awk '$3 > .05 {print ;}' > test
>
> #make the output to residue# residue# and Hbond%
> awk -F"_" '$1=$1' OFS=" " test | awk -F"." '$1=$1' OFS="\t" | awk '{print $2,$5,$7}' | awk '!_[$1,$2]++' > test2a
> Which will get you the fraction of the trajectory the hbond is present- not quite a 'count' per pair, but since the pairs are all unique, you can just count the number of entries to get total, and do some more awk/sort manipulation of the residue numbers to see what residue is involved in multiple hbonds, etc, which I think is what you want?
>
> Best,
> Kenneth
> ________________________________
> From: Matthew Guberman-Pfeffer <matthew.guberman-pfeffer.uconn.edu>
> Sent: Wednesday, March 16, 2022 9:45 AM
> To: AMBER Mailing List <amber.ambermd.org>
> Subject: [AMBER] CPPTRAJ, Counting Per Residue Pair H-Bonding
>
> Dear Amber Community,
>
> I need to quantify the total number of H-bonds for each pair of residues in my protein. That would mean a 1321 x 1321 matrix. I am having trouble in writing a bash script to process the CPPTRAJ hbond output (e.g. All.UU.avg.dat) to compute each matrix element. I’ve thought of running CPPTRAJ for every possible pair of donormask and acceptormask, but this is truly an ugly brute-force approach. Does anyone have suggestions on how I can compute the total number of H-bonds between each pair of residues?
>
> Best,
> Matthew
>
>
>
>
> _______________________________________________
> AMBER mailing list
> AMBER.ambermd.org
> https://nam12.safelinks.protection.outlook.com/?url=http%3A%2F%2Flists.ambermd.org%2Fmailman%2Flistinfo%2Famber&amp;data=04%7C01%7Ckhuang8%40student.gsu.edu%7C43a6d3f043654bad4e7d08da075dc95c%7C704d822c358a47849a1649e20b75f941%7C0%7C1%7C637830396658248071%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C2000&amp;sdata=ruK7JPidUkIJQzwHZeqcf8AnlPnDiDzlBcb4PlAC29I%3D&amp;reserved=0
> CAUTION: This email was sent from someone outside of the university. Do not click links or open attachments unless you recognize the sender and know the content is safe.
>
> _______________________________________________
> AMBER mailing list
> AMBER.ambermd.org
> https://nam12.safelinks.protection.outlook.com/?url=http%3A%2F%2Flists.ambermd.org%2Fmailman%2Flistinfo%2Famber&amp;data=04%7C01%7Ckhuang8%40student.gsu.edu%7C43a6d3f043654bad4e7d08da075dc95c%7C704d822c358a47849a1649e20b75f941%7C0%7C1%7C637830396658248071%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C2000&amp;sdata=ruK7JPidUkIJQzwHZeqcf8AnlPnDiDzlBcb4PlAC29I%3D&amp;reserved=0

_______________________________________________
AMBER mailing list
AMBER.ambermd.org
https://nam12.safelinks.protection.outlook.com/?url=http%3A%2F%2Flists.ambermd.org%2Fmailman%2Flistinfo%2Famber&amp;data=04%7C01%7Ckhuang8%40student.gsu.edu%7C43a6d3f043654bad4e7d08da075dc95c%7C704d822c358a47849a1649e20b75f941%7C0%7C1%7C637830396658248071%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C2000&amp;sdata=ruK7JPidUkIJQzwHZeqcf8AnlPnDiDzlBcb4PlAC29I%3D&amp;reserved=0
CAUTION: This email was sent from someone outside of the university. Do not click links or open attachments unless you recognize the sender and know the content is safe.

_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber
Received on Wed Mar 16 2022 - 10:00:02 PDT
Custom Search