T cell receptors (TCRs) play a crucial role in the adaptive immune system. They are responsible for recognising antigenic peptides presented at the surface of infected cells by major histocompatibility complex (MHC) molecules, and triggering downstream immune responses to fight off disease.
This repository contains structural data of TCRs and pMHC complexes crystallised in bound and unbound form, serving as a set of benchmark test cases for computational docking tools. This dataset contains the 30 TCR-pMHC docking cases present in the original TCR benchmark, as well as 14 new cases identified in the STCRDab and Protein Databank databases. This data was curated for work described in Peacock and Chain (2021).
Structures in the raw folder contain structural data as stored in the Protein Databank. Original names for these PDB structures can be found in the table below.
Structures in the imgt folder have been preprocessed. Solvent and heteroatoms have been removed, as have disordered atoms. Residues with non-standard names in PDB format have been updated to their standard equivalents. Raw data that contained multiple co-crystalised protein structures has been reduced such that only one appears in each PDB file. Structures with missing atoms or residues at the binding interface were repaired using Modeller using a protocol described in the Modeller-Repair repository. TCRs in both the unbound and bound files have been re-numbered according to the IMGT numbering scheme. TCR chains have been relabelled with the IDs D and E, MHC chains with IDs A and B, and peptide chains with ID C. The position and orientation of each structure has been randomised.
Each docking case is named after the PDB code of the raw bound TCR-pMHC structure. Bound TCR-pMHC structures contain the suffix _b.pdb, unbound TCRs contain the suffix _r_u.pdb (where r stands for "receptor", and u stands for "unbound") and unbound pMHCs contain the suffix _l_u.pdb (where l stands for "ligand", and u stands for "unbound"). Corresponding codes for the unbound TCR and unbound pMHC structures can be found in the table below.
| Bound Complex | Unbound TCR | Unbound pMHC | MHC Class | IRMSD | Fnon-nat | Difficulty |
|---|---|---|---|---|---|---|
| 1AO7 * | 3QH3 | 1DUZ | I | 1.25 | 0.33 | rigid |
| 1MI5 * | 1KGC | 1M05 | I | 1.25 | 0.48 | medium |
| 1MWA * | 1TCR | 1LEK | I | 1.14 | 0.3 | rigid |
| 1OGA * | 2VLM | 2VLL | I | 1.36 | 0.43 | medium |
| 2BNR * | 2BNU | 1S9W | I | 0.72 | 0.23 | rigid |
| 2CKB * ‡ | 1TCR | 1LEG | I | 1.17 | 0.45 | medium |
| 2IAM * | 2IAL | 1KLG | II | 0.87 | 0.24 | rigid |
| 2IAN * | 2IAL | 1KLU | II | 0.82 | 0.3 | rigid |
| 2NX5 * † | 2NW2 | 1ZSD | I | 1.16 | 0.38 | rigid |
| 2OI9 * | 1TCR | 3ERY | I | 1.1 | 0.41 | medium |
| 2PXY * | 2Z35 | 1K2D | II | 1.18 | 0.55 | medium |
| 2PYE * | 2PYF | 1S9W | I | 0.88 | 0.3 | rigid |
| 3DXA * † | 3DX9 | 3DX8 | I | 1.48 | 0.39 | rigid |
| 3H9S * | 3QH3 | 3H7B | I | 1.31 | 0.42 | medium |
| 3KPR * | 1KGC | 3KPQ | I | 1.37 | 0.55 | medium |
| 3KPS * | 1KGC | 3KPP | I | 1.31 | 0.48 | medium |
| 3PWP * | 3QH3 | 3PWL | I | 1.24 | 0.36 | rigid |
| 3QDG * † ‡ | 3QEU | 1JF1 | I | 0.91 | 0.31 | rigid |
| 3QDJ * ‡ | 3QEU | 2GUO | I | 0.94 | 0.28 | rigid |
| 3SJV * ‡ | 3SKN | 1M05 | I | 0.96 | 0.41 | medium |
| 3UTT * | 3UTP | 3UTQ | I | 0.75 | 0.4 | rigid |
| 3VXR | 3VXQ | 3VXN | I | 0.82 | 0.38 | rigid |
| 3VXS * † | 3VXQ | 3VXP | I | 0.89 | 0.35 | rigid |
| 3W0W * † ‡ | 3VXT | 3VXO | I | 0.94 | 0.42 | medium |
| 4JFD * † | 4JFH | 4JFP | I | 1.51 | 0.51 | medium |
| 4JFF | 4JFH | 1JF1 | I | 1.54 | 0.52 | medium |
| 5C07 | 3UTP | 5C0E | I | 0.57 | 0.15 | rigid |
| 5C08 | 3UTP | 5C0F | I | 0.65 | 0.43 | medium |
| 5C09 | 3UTP | 5C0G | I | 0.59 | 0.24 | rigid |
| 5C0A | 3UTP | 5N1Y | I | 0.5 | 0.3 | rigid |
| 5C0B | 3UTP | 5C0I | I | 0.59 | 0.25 | rigid |
| 5C0C | 3UTP | 5C0J | I | 0.64 | 0.35 | rigid |
| 5HHM | 2VLM | 5HHN | I | 1.42 | 0.51 | medium |
| 5HYJ | 3UTP | 5C0D | I | 0.55 | 0.34 | rigid |
| 5IVX | 5IW1 | 3ECB | I | 1.29 | 0.38 | rigid |
| 5NME | 5NMD | 2V2W | I | 1.07 | 0.34 | rigid |
| 5NMF * † | 5NMD | 5NMH | I | 1.05 | 0.38 | rigid |
| 5NMG * † ‡ | 5NMD | 5NMK | I | 1.07 | 0.42 | medium |
| 6AMU | 3QEU | 6AMT | I | 1.16 | 0.41 | medium |
| 6AVF * † | 6AT6 | 6AT5 | I | 1.95 | 0.72 | medium |
| 6CQL * † | 6CPH | 6CPN | II | 0.78 | 0.23 | rigid |
| 6CQQ * † | 6CPH | 6CPO | II | 0.83 | 0.21 | rigid |
| 6CQR * † | 6CPH | 6CQJ | II | 0.85 | 0.26 | rigid |
| 6EQB | 4JFH | 2GUO | I | 1.62 | 0.55 | medium |
* TCR docking cases that feature in the TCR3d database
† Cases that differ in I-RSMD score to those in the TCR3d database
‡ Cases that differ in docking difficulty category in the TCR3d database
- A structure for 2NX5_r_u with improved modelling of the missing atoms and residues in one of its CDR3 loops has been uploaded to the
imgtfolder, since the publication of Peacock and Chain (2021). Minor differences in the IRMSD and Fnon-nat values have been updated accordingly in the table above.