Updated Extra MD Analysis notebook
- Formatted header
- Added TOC and sections
- added some explanatory text for each section
- added reference to MDAnalysis manual
DEGIACOMI committed Sep 6, 2024
1 parent fecb90f commit 29532e7
Showing 1 changed file with 74 additions and 103 deletions.
177 changes: 74 additions & 103 deletions 5_Analysis_MDAnalysis/5_Extra_p24_analysis.ipynb
@@ -16,12 +16,40 @@
"**Author**: Dr Matteo Degiacomi ([email protected])"
]
},
{
"cell_type": "markdown",
"id": "3d12646a-486f-43b8-aaee-76157acd66cf",
"metadata": {},
"source": [
"**Jupyter cheat sheet:**\n",
"- to run the currently highlighted cell, hold <kbd>&#x21E7; Shift</kbd> and press <kbd>&#x23ce; Enter</kbd>;\n",
"- to get help for a specific function, place the cursor within the function's brackets, hold <kbd>&#x21E7; Shift</kbd>, and press <kbd>&#x21E5; Tab</kbd>;\n",
"\n",
"<div class=\"alert alert-info\"><b> Remember: variables persist between cells</b> \n",
" \n",
"Be aware that it is the order of execution of cells that is important in a Jupyter notebook, not the <em>order</em> in which they appear. Python will remember <em>all</em> the code that was run previously, including any variables you have defined, irrespective of the order in the notebook. Therefore if you define variables lower down the notebook and then (re)run cells further up, those defined further down will still be present. </div> "
]
},
{
"cell_type": "markdown",
"id": "8e969ae8-2e99-48d4-aa1d-c0c090c274e9",
"metadata": {},
"source": [
"## Table of Contents\n",
"\n",
"[1. Introduction](#intro) \n",
"[2. Root Mean Square Deviations (RMSDs)](#rmsd) \n",
"[3. Pairwise RMSD](#p_rmsd) \n",
"[4. Root Mean Square Fluctuation (RMSF)](#rmsf) \n",
"[5. Radius of gyration and end-to-end distance](#rgyr) "
]
},
{
"cell_type": "markdown",
"id": "ea2fe146-eea8-46f7-8696-ff5fa5cb823d",
"metadata": {},
"source": [
"## Google Colab setup\n",
"## 0. Google Colab setup\n",
"<div class=\"alert alert-warning\">\n",
"<b>Attention:</b> Please only run the following cells if you are using Colab! These cells install necessary packages and download data.</div>"
]
@@ -65,33 +93,13 @@
"os.chdir(f\"CCP5_Simulation_of_BioMolecules{os.sep}4_Analysis_MDAnalysis\")"
]
},
{
"cell_type": "markdown",
"id": "3d12646a-486f-43b8-aaee-76157acd66cf",
"metadata": {},
"source": [
"## Jupyter cheat sheet\n",
"\n",
"- to run the currently highlighted cell, hold <kbd>&#x21E7; Shift</kbd> and press <kbd>&#x23ce; Enter</kbd>;\n",
"- to get help for a specific function, place the cursor within the function's brackets, hold <kbd>&#x21E7; Shift</kbd>, and press <kbd>&#x21E5; Tab</kbd>;"
]
},
{
"cell_type": "markdown",
"id": "1d143168-1643-4caa-907e-15c87cdfb52d",
"metadata": {},
"source": [
"<div class=\"alert alert-warning\"><b> REMEMBER: variables persist between cells</b> \n",
" \n",
"Be aware that it is the order of execution of cells that is important in a Jupyter notebook, not the <em>order</em> in which they appear. Python will remember <em>all</em> the code that was run previously, including any variables you have defined, irrespective of the order in the notebook. Therefore if you define variables lower down the notebook and then (re)run cells further up, those defined further down will still be present. </div> "
]
},
{
"cell_type": "markdown",
"id": "fe8671d2-93ec-4fb3-b733-a6d47f818a2b",
"metadata": {},
"source": [
"## Introduction"
"## 1. Introduction\n",
"<a id='intro'></a>"
]
},
{
@@ -141,15 +149,16 @@
"id": "30b29017-8118-4ea9-9476-b0471cfb10c0",
"metadata": {},
"source": [
"## Root Mean Square Deviations (RMSD)"
"## 2. Root Mean Square Deviations (RMSDs)\n",
"<a id='rmsd'></a>"
]
},
{
"cell_type": "markdown",
"id": "47d163d5-4f5c-4397-9610-e26a38e3bae5",
"metadata": {},
"source": [
"Let's demonstrate how the time evolution of RMSD with respect to the first frame changes with different alignments and atoms of interest."
"Let's demonstrate how the time evolution of [RMSD](https://docs.mdanalysis.org/1.1.1/documentation_pages/analysis/rms.html) with respect to the first frame changes with different alignments and atoms of interest."
]
},
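As a reminder of what the RMSD measures, here is a minimal sketch in plain Python, assuming no superposition step at all (the MDAnalysis analysis used in the notebook aligns structures before computing deviations). The function name `rmsd` and the toy coordinates are hypothetical and only illustrate the formula: the square root of the mean squared displacement of the atoms with respect to a reference.

```python
import math


def rmsd(coords, ref):
    """RMSD between two equally sized lists of (x, y, z) positions, without superposition."""
    if len(coords) != len(ref):
        raise ValueError("both coordinate sets must contain the same number of atoms")
    # Sum of squared displacements over all atoms and all three Cartesian components
    squared = sum((a - b) ** 2 for p, q in zip(coords, ref) for a, b in zip(p, q))
    return math.sqrt(squared / len(ref))


# Hypothetical toy data: three atoms, each shifted by 1 Å along x
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
frame = [(1.0, 0.0, 0.0), (2.0, 0.0, 0.0), (3.0, 0.0, 0.0)]

print(rmsd(frame, reference))
```

Because every atom moves by the same rigid translation here, a prior alignment would bring this RMSD to zero, which is why the choice of alignment matters in the comparison above.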
{
@@ -172,6 +181,14 @@
"R_D2.run()"
]
},
{
"cell_type": "markdown",
"id": "2147a729-4b24-42e6-b289-4e51eda09c5f",
"metadata": {},
"source": [
"Now, let's plot everything!"
]
},
{
"cell_type": "code",
"execution_count": null,
@@ -229,7 +246,7 @@
"id": "eb99eb0f-fc8a-4a2a-b74b-f575b26bbc07",
"metadata": {},
"source": [
"For the first slide on RMSD, let's also plot only a single RMSD profile"
"For the first lecture slide featuring RMSD, let's also plot a single RMSD profile on its own."
]
},
{
@@ -257,7 +274,16 @@
"id": "408d030b-a5c6-446f-9840-bf67820e6443",
"metadata": {},
"source": [
"## pairwise RMSD"
"## 3. Pairwise RMSD\n",
"<a id='p_rmsd'></a>"
]
},
{
"cell_type": "markdown",
"id": "ee67f243-2f0c-495b-a310-ea141b7b6f3a",
"metadata": {},
"source": [
"Now, let's generate a [pairwise RMSD](https://userguide.mdanalysis.org/stable/examples/analysis/alignment_and_rms/pairwise_rmsd.html) plot, i.e., a surface plot reporting on the RMSD of each conformation vs each other conformation."
]
},
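To make the idea of a pairwise RMSD matrix concrete, the following is a minimal sketch in plain Python. It assumes the frames are already aligned and stored as lists of (x, y, z) tuples; the helper names `rmsd` and `pairwise_rmsd` and the toy frames are hypothetical, not part of MDAnalysis.

```python
import math


def rmsd(a, b):
    """RMSD between two equally sized lists of (x, y, z) positions, without superposition."""
    squared = sum((x - y) ** 2 for p, q in zip(a, b) for x, y in zip(p, q))
    return math.sqrt(squared / len(a))


def pairwise_rmsd(frames):
    """Symmetric matrix whose entry [i][j] is the RMSD between frame i and frame j."""
    n = len(frames)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        # Only the upper triangle is computed; the matrix is symmetric with a zero diagonal
        for j in range(i + 1, n):
            matrix[i][j] = matrix[j][i] = rmsd(frames[i], frames[j])
    return matrix


# Hypothetical toy trajectory: three frames of two atoms, translated along x
frames = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(1.0, 0.0, 0.0), (2.0, 0.0, 0.0)],
    [(3.0, 0.0, 0.0), (4.0, 0.0, 0.0)],
]

for row in pairwise_rmsd(frames):
    print(row)
```

Blocks of low values along the diagonal of such a matrix indicate groups of consecutive frames with similar conformations.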
{
@@ -292,15 +318,16 @@
"id": "9d30c163-f7a7-489a-b9fe-06608d9cd245",
"metadata": {},
"source": [
"## Root Mean Square Fluctuations (RMSF)"
"## 4. Root Mean Square Fluctuations (RMSF)\n",
"<a id='rmsf'></a>"
]
},
{
"cell_type": "markdown",
"id": "e4e4c72a-4255-4fd2-8ecf-502d212c697b",
"metadata": {},
"source": [
"We start by defining a function that aligns the trajectory and calculates the RMSF of a selection of interest."
"The Root Mean Square Fluctuation ([RMSF](https://userguide.mdanalysis.org/stable/examples/analysis/alignment_and_rms/rmsf.html)) reports on the amount of displacement of an amino acid w.r.t. its mean position during the simulation. We start by defining a function that aligns the trajectory and calculates the RMSF of a selection of interest."
]
},
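The definition of the RMSF can be written out as a minimal sketch in plain Python. It assumes the trajectory has already been aligned and is given as a list of frames, each a list of (x, y, z) tuples; the function name `rmsf` and the toy trajectory are hypothetical and serve only to illustrate the formula.

```python
import math


def rmsf(frames):
    """Per-atom RMSF: root mean square deviation of each atom from its own mean position."""
    n_frames = len(frames)
    n_atoms = len(frames[0])
    result = []
    for i in range(n_atoms):
        # Mean position of atom i over the whole trajectory
        mean = [sum(frame[i][k] for frame in frames) / n_frames for k in range(3)]
        # Mean squared distance of atom i from that mean position
        msd = sum(
            sum((frame[i][k] - mean[k]) ** 2 for k in range(3)) for frame in frames
        ) / n_frames
        result.append(math.sqrt(msd))
    return result


# Hypothetical toy trajectory: atom 0 is static, atom 1 oscillates by 1 Å around x = 6
toy_trajectory = [
    [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (7.0, 0.0, 0.0)],
]

print(rmsf(toy_trajectory))
```

Unlike the RMSD, which gives one value per frame, the RMSF gives one value per atom (or residue), so it highlights which regions of the protein are flexible.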
{
@@ -395,7 +422,16 @@
"id": "453510f6-2bd4-4856-a871-3c28bafd4f49",
"metadata": {},
"source": [
"## Radius of gyration and end-to-end distance"
"## 5. Radius of gyration and end-to-end distance\n",
"<a id='rgyr'></a>"
]
},
{
"cell_type": "markdown",
"id": "d9e0f916-9497-42e6-a782-cbd51e7e12ec",
"metadata": {},
"source": [
"To calculate the radius of gyration (Rg) and the end-to-end distance of a protein, we will create a few <code>AtomGroup</code>s. The radius of gyration can be extracted directly from any <code>AtomGroup</code> (here, we select the whole protein). The N-terminus and C-terminus coordinates, needed for the end-to-end distance, are taken as the first atom of an <code>AtomGroup</code> containing the N atoms and the last atom of an <code>AtomGroup</code> containing the C atoms, respectively."
]
},
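For reference, both quantities can be sketched in plain Python for a single frame. This assumes a mass-weighted radius of gyration and uses the first and last positions as stand-ins for the terminal N and C atoms; the function names and the toy data are hypothetical.

```python
import math


def radius_of_gyration(positions, masses):
    """Mass-weighted radius of gyration of a set of (x, y, z) positions."""
    total_mass = sum(masses)
    # Centre of mass
    com = [
        sum(m * p[k] for p, m in zip(positions, masses)) / total_mass
        for k in range(3)
    ]
    # Mass-weighted mean squared distance from the centre of mass
    weighted = sum(
        m * sum((p[k] - com[k]) ** 2 for k in range(3))
        for p, m in zip(positions, masses)
    )
    return math.sqrt(weighted / total_mass)


def end_to_end_distance(positions):
    """Distance between the first and the last position (requires Python 3.8+ for math.dist)."""
    return math.dist(positions[0], positions[-1])


# Hypothetical toy data: two atoms of equal mass, 2 Å apart
toy_positions = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
toy_masses = [12.0, 12.0]

print(radius_of_gyration(toy_positions, toy_masses))
print(end_to_end_distance(toy_positions))
```

In a trajectory analysis, both functions would be evaluated once per frame and the results collected in lists for plotting.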
{
@@ -405,7 +441,6 @@
"metadata": {},
"outputs": [],
"source": [
"#u = mda.Universe(\"trajectory_formatted.pdb\")\n",
"nterm = u.select_atoms('name N')[0]\n",
"cterm = u.select_atoms('name C')[-1]\n",
"bb = u.select_atoms('protein')\n",
@@ -421,6 +456,14 @@
" rg.append(rgyr)"
]
},
{
"cell_type": "markdown",
"id": "c1dfd9f2-2d55-4ab1-a9d2-13685b853947",
"metadata": {},
"source": [
"Let's now plot the quantities we have extracted for each simulation snapshot!"
]
},
{
"cell_type": "code",
"execution_count": null,
@@ -444,78 +487,6 @@
"fig.savefig(\"rg_dist_p24.png\")"
]
},
{
"cell_type": "markdown",
"id": "86865f4e-ea12-4fda-860d-99ba2f250daf",
"metadata": {},
"source": [
"## Hydrogen bonds"
]
},
{
"cell_type": "markdown",
"id": "ecae01ea-b687-43ac-ae3c-3b86d37a4859",
"metadata": {},
"source": [
"The function below identifies the hydrogen bonds in your protein. Can you work out how to use it?"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "1657a552-0442-4c57-a25a-f1bc019c30e1",
"metadata": {},
"outputs": [],
"source": [
"def hbonds(hydrogens, acceptors):\n",
" \n",
" \"\"\" this function calculates hydrogen bonds \"\"\"\n",
" \n",
" acc_idx, hyd_idx = idx.T\n",
" \n",
" idx, dists = mda.lib.distances.capped_distance(acceptors.positions, \n",
" hydrogens.positions, \n",
" max_cutoff=3.0,\n",
" box=acceptors.dimensions) \n",
"\n",
" \n",
" acc_idx, hyd_idx = idx.T\n",
"\n",
" # select potential hydrogen bonds to check angles\n",
" potential_hbond_acceptors = acceptors[acc_idx]\n",
" potential_hbond_hydrogens = hydrogens[hyd_idx]\n",
"\n",
" # select hydrogen bond donors by looping over hydrogens and selecting the bonded oxygens\n",
" potential_hbond_donors = sum(h.bonded_atoms[0] for h in potential_hbond_hydrogens)\n",
" \n",
" angles = mda.lib.distances.calc_angles(potential_hbond_acceptors.positions,\n",
" potential_hbond_hydrogens.positions,\n",
" potential_hbond_donors.positions, \n",
" box=u.dimensions)\n",
" #convert to degrees\n",
" angles = np.rad2deg(angles)\n",
" \n",
" #check angles are larger than 130 degrees\n",
" angle_idx = np.where(angles >= 130.0)\n",
" \n",
" hbond_acceptors = potential_hbond_acceptors[angle_idx]\n",
" hbond_hydrogens = potential_hbond_hydrogens[angle_idx]\n",
" hbond_donors = potential_hbond_donors[angle_idx]\n",
" \n",
" return hbond_acceptors, hbond_hydrogens, hbond_donors"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "0f7002d1-05a0-473c-93eb-384676f12848",
"metadata": {},
"outputs": [],
"source": [
"# Try using the hbonds function here!\n",
"\n"
]
},
{
"cell_type": "markdown",
"id": "960746cd-7b96-4f92-82ed-608e23dea2d7",