Al-Lazikani, B. &
Workman, P.
(2013)
Unpicking the Combination Lock for Mutant BRAF and RAS Melanomas. Cancer Discov, Vol.3(1),
pp.14-19,
Summary: Large-scale, unbiased combinatorial drug screening has been used to identify effective genotype-selective therapeutic combinations that show promising activity in preclinical models of mutant BRAF and RAS melanoma that are resistant to the clinical BRAF inhibitor vemurafenib. Cancer Discov; 3(1); 14-9. ©2012 AACR.
Gonzalez de Castro, D.,
Clarke, PA.,
Al-Lazikani, B. &
Workman, P.
(2012)
Personalized Cancer Medicine: Molecular Diagnostics, Predictive biomarkers, and Drug Resistance. Clin Pharmacol Ther.
The progressive elucidation of the molecular pathogenesis of cancer has fueled the rational development of targeted drugs for patient populations stratified by genetic characteristics. Here we discuss general challenges relating to molecular diagnostics and describe predictive biomarkers for personalized cancer medicine. We also highlight resistance mechanisms for epidermal growth factor receptor (EGFR) kinase inhibitors in lung cancer. We envisage a future requiring the use of longitudinal genome sequencing and other omics technologies alongside combinatorial treatment to overcome cellular and molecular heterogeneity and prevent resistance caused by clonal evolution. Clinical Pharmacology & Therapeutics (2013); advance online publication 30 January 2013. doi:10.1038/clpt.2012.237.
Gaulton, A.,
Bellis, LJ.,
Bento, AP.,
Chambers, J.,
Davies, M.,
Hersey, A.,
Light, Y.,
McGlinchey, S.,
Michalovich, D.,
Al-Lazikani, B.,
et al.
(2012)
ChEMBL: a large-scale bioactivity database for drug discovery. Nucleic Acids Res, Vol.40(Database issue),
pp.D1100-D1107,
ChEMBL is an Open Data database containing binding, functional and ADMET information for a large number of drug-like bioactive compounds. These data are manually abstracted from the primary published literature on a regular basis, then further curated and standardized to maximize their quality and utility across a wide range of chemical biology and drug-discovery research problems. Currently, the database contains 5.4 million bioactivity measurements for more than 1 million compounds and 5200 protein targets. Access is available through a web-based interface, data downloads and web services at: https://www.ebi.ac.uk/chembldb.
Halling-Brown, MD.,
Bulusu, KC.,
Patel, M.,
Tym, JE. &
Al-Lazikani, B.
(2012)
canSAR: an integrated cancer public translational research and drug discovery resource. Nucleic Acids Res, Vol.40(Database issue),
pp.D947-D956,
canSAR is a fully integrated cancer research and drug discovery resource developed to utilize the growing publicly available biological annotation, chemical screening, RNA interference screening, expression, amplification and 3D structural data. Scientists can, in a single place, rapidly identify biological annotation of a target, its structural characterization, expression levels and protein interaction data, as well as suitable cell lines for experiments, potential tool compounds and similarity to known drug targets. canSAR has, from the outset, been completely use-case driven, which has dramatically influenced the design of the back-end and the functionality provided through the interfaces. The Web interface at http://cansar.icr.ac.uk provides flexible, multipoint entry into canSAR. This allows easy access to the multidisciplinary data within, including target and compound synopses, bioactivity views and expert tools for chemogenomic, expression and protein interaction network data.
Al-Lazikani, B.,
Banerji, U. &
Workman, P.
(2012)
Combinatorial drug therapy for cancer in the post-genomic era. Nat Biotechnol, Vol.30(7),
pp.679-692,
Over the past decade, whole genome sequencing and other 'omics' technologies have defined pathogenic driver mutations to which tumor cells are addicted. Such addictions, synthetic lethalities and other tumor vulnerabilities have yielded novel targets for a new generation of cancer drugs to treat discrete, genetically defined patient subgroups. This personalized cancer medicine strategy could eventually replace the conventional one-size-fits-all cytotoxic chemotherapy approach. However, the extraordinary intratumor genetic heterogeneity in cancers revealed by deep sequencing explains why de novo and acquired resistance arise with molecularly targeted drugs and cytotoxic chemotherapy, limiting their utility. One solution to the enduring challenge of polygenic cancer drug resistance is rational combinatorial targeted therapy.
Patel, MN.,
Halling-Brown, MD.,
Tym, JE.,
Workman, P. &
Al-Lazikani, B.
(2012)
Objective assessment of cancer genes for drug discovery. Nat Rev Drug Discov, Vol.12(1),
pp.35-50,
Selecting the best targets is a key challenge for drug discovery, and achieving this effectively, efficiently and systematically is particularly important for prioritizing candidates from the sizeable lists of potential therapeutic targets that are now emerging from large-scale multi-omics initiatives, such as those in oncology. Here, we describe an objective, systematic, multifaceted computational assessment of biological and chemical space that can be applied to any human gene set to prioritize targets for therapeutic exploration. We use this approach to evaluate an exemplar set of 479 cancer-associated genes, reveal the tension between biological relevance and chemical tractability, and describe major gaps in available knowledge that could be addressed to aid objective decision-making. We also propose drug repurposing opportunities and identify potentially druggable cancer-associated proteins that have been poorly explored with regard to the discovery of small-molecule modulators, despite their biological relevance.
Orchard, S.,
Al-Lazikani, B.,
Bryant, S.,
Clark, D.,
Calder, E.,
Dix, I.,
Engkvist, O.,
Forster, M.,
Gaulton, A.,
Gilson, M.,
et al.
(2012)
Shouldn't enantiomeric purity be included in the 'minimum information about a bioactive entity'? Response from the MIABE group. Nat Rev Drug Discov, Vol.11(9),
pp.730,
Workman, P.,
Clarke, PA. &
Al-Lazikani, B.
(2012)
Personalized medicine: patient-predictive panel power. Cancer Cell, Vol.21(4),
pp.455-458,
Two recent papers published in Nature demonstrate the power of systematic high-throughput pharmacologic profiling of very large, diverse, molecularly-characterized human cancer cell line panels to reveal linkages between genetic profile and targeted-drug sensitivity. Known oncogene addictions are confirmed while surprising complexities and biomarker relationships with clinical potential are revealed.
Bellis, LJ.,
Akhtar, R.,
Al-Lazikani, B.,
Atkinson, F.,
Bento, AP.,
Chambers, J.,
Davies, M.,
Gaulton, A.,
Hersey, A.,
Ikeda, K.,
et al.
(2011)
Collation and data-mining of literature bioactivity data for drug discovery. Biochem Soc Trans, Vol.39(5),
pp.1365-1370,
Translating the huge amount of genomic and biochemical data into new drugs is a costly and challenging task. Historically, there has been comparatively little focus on linking the biochemical and chemical worlds. To address this need, we have developed ChEMBL, an online resource of small-molecule SAR (structure-activity relationship) data, which can be used to support chemical biology, lead discovery and target selection in drug discovery. The database contains the abstracted structures, properties and biological activities for over 700000 distinct compounds and more than 3 million bioactivity records abstracted from over 40000 publications. Additional public domain resources can be readily integrated into the same data model (e.g. PubChem BioAssay data). The compounds in ChEMBL are largely extracted from the primary medicinal chemistry literature, and are therefore usually 'drug-like' or 'lead-like' small molecules with full experimental context. The data cover a significant fraction of the discovery of modern drugs, and are useful in a wide range of drug design and discovery tasks. In addition to the compound data, ChEMBL also contains information for over 8000 protein, cell line and whole-organism 'targets', with over 4000 of those being proteins linked to their underlying genes. The database is searchable both chemically, using an interactive compound sketch tool, protein sequences, family hierarchies, SMILES strings, compound research codes and key words, and biologically, using a variety of gene identifiers, protein sequence similarity and protein families. The information retrieved can then be readily filtered and downloaded into various formats. ChEMBL can be accessed online at https://www.ebi.ac.uk/chembldb.
Orchard, S.,
Al-Lazikani, B.,
Bryant, S.,
Clark, D.,
Calder, E.,
Dix, I.,
Engkvist, O.,
Forster, M.,
Gaulton, A.,
Gilson, M.,
et al.
(2011)
Minimum information about a bioactive entity (MIABE). Nat Rev Drug Discov, Vol.10(9),
pp.661-669,
Bioactive molecules such as drugs, pesticides and food additives are produced in large numbers by many commercial and academic groups around the world. Enormous quantities of data are generated on the biological properties and quality of these molecules. Access to such data - both on licensed and commercially available compounds, and also on those that fail during development - is crucial for understanding how improved molecules could be developed. For example, computational analysis of aggregated data on molecules that are investigated in drug discovery programmes has led to a greater understanding of the properties of successful drugs. However, the information required to perform these analyses is rarely published, and when it is made available it is often missing crucial data or is in a format that is inappropriate for efficient data-mining. Here, we propose a solution: the definition of reporting guidelines for bioactive entities - the Minimum Information About a Bioactive Entity (MIABE) - which has been developed by representatives of pharmaceutical companies, data resource providers and academic groups.
Suwaki, N.,
Vanhecke, E.,
Atkins, KM.,
Graf, M.,
Swabey, K.,
Huang, P.,
Schraml, P.,
Moch, H.,
Cassidy, AM.,
Brewer, D.,
et al.
(2011)
A HIF-regulated VHL-PTP1B-Src signaling axis identifies a therapeutic target in renal cell carcinoma. Sci Transl Med, Vol.3(85),
pp.85ra47,
Metastatic renal cell carcinoma (RCC) is a molecularly heterogeneous disease that is intrinsically resistant to chemotherapy and radiotherapy. Although therapies targeted to the molecules vascular endothelial growth factor and mammalian target of rapamycin have shown clinical effectiveness, their effects are variable and short-lived, underscoring the need for improved treatment strategies for RCC. Here, we used quantitative phosphoproteomics and immunohistochemical profiling of 346 RCC specimens and determined that Src kinase signaling is elevated in RCC cells that retain wild-type von Hippel-Lindau (VHL) protein expression. RCC cell lines and xenografts with wild-type VHL exhibited sensitivity to the Src inhibitor dasatinib, in contrast to cell lines that lacked the VHL protein, which were resistant. Forced expression of hypoxia-inducible factor (HIF) in RCC cells with wild-type VHL diminished Src signaling output by repressing transcription of the Src activator protein tyrosine phosphatase 1B (PTP1B), conferring resistance to dasatinib. Our results suggest that a HIF-regulated VHL-PTP1B-Src signaling pathway determines the sensitivity of RCC to Src inhibitors and that stratification of RCC patients with antibody-based profiling may identify patients likely to respond to Src inhibitors in RCC clinical trials.
Abad-Zapatero, C.,
Perišić, O.,
Wass, J.,
Bento, AP.,
Overington, J.,
Al-Lazikani, B. &
Johnson, ME.
(2010)
Ligand efficiency indices for an effective mapping of chemico-biological space: the concept of an atlas-like representation. Drug Discov Today, Vol.15(19-20),
pp.804-811,
We propose a numerical framework that permits an effective atlas-like representation of chemico-biological space based on a series of Cartesian planes mapping the ligands with the corresponding targets connected by an affinity parameter (K(i) or related). The numerical framework is derived from the concept of ligand efficiency indices, which provide a natural coordinate system combining the potency toward the target (biological space) with the physicochemical properties of the ligand (chemical space). This framework facilitates navigation in the multidimensional drug discovery space using map-like representations based on pairs of combined variables related to the efficiency of the ligands per Dalton (molecular weight or number of non-hydrogen atoms) and per unit of polar surface area (or number of polar atoms).
Berriman, M.,
Haas, BJ.,
LoVerde, PT.,
Wilson, RA.,
Dillon, GP.,
Cerqueira, GC.,
Mashiyama, ST.,
Al-Lazikani, B.,
Andrade, LF.,
Ashton, PD.,
et al.
(2009)
The genome of the blood fluke Schistosoma mansoni. Nature, Vol.460(7253),
pp.352-358,
Schistosoma mansoni is responsible for the neglected tropical disease schistosomiasis that affects 210 million people in 76 countries. Here we present analysis of the 363 megabase nuclear genome of the blood fluke. It encodes at least 11,809 genes, with an unusual intron size distribution, and new families of micro-exon genes that undergo frequent alternative splicing. As the first sequenced flatworm, and a representative of the Lophotrochozoa, it offers insights into early events in the evolution of the animals, including the development of a body pattern with bilateral symmetry, and the development of tissues into organs. Our analysis has been informed by the need to find new drug targets. The deficits in lipid metabolism that make schistosomes dependent on the host are revealed, and the identification of membrane receptors, ion channels and more than 300 proteases provide new insights into the biology of the life cycle and new targets. Bioinformatics approaches have identified metabolic chokepoints, and a chemogenomic screen has pinpointed schistosome proteins for which existing drugs may be active. The information generated provides an invaluable resource for the research community to develop much needed new control tools for the treatment and eradication of this important and neglected disease.
Agüero, F.,
Al-Lazikani, B.,
Aslett, M.,
Berriman, M.,
Buckner, FS.,
Campbell, RK.,
Carmona, S.,
Carruthers, IM.,
Chan, AW.,
Chen, F.,
et al.
(2008)
Genomic-scale prioritization of drug targets: the TDR Targets database. Nat Rev Drug Discov, Vol.7(11),
pp.900-907,
The increasing availability of genomic data for pathogens that cause tropical diseases has created new opportunities for drug discovery and development. However, if the potential of such data is to be fully exploited, the data must be effectively integrated and be easy to interrogate. Here, we discuss the development of the TDR Targets database (http://tdrtargets.org), which encompasses extensive genetic, biochemical and pharmacological data related to tropical disease pathogens, as well as computationally predicted druggability for potential targets and compound desirability information. By allowing the integration and weighting of this information, this database aims to facilitate the identification and prioritization of candidate drug targets for pathogens.
Al-Lazikani, B.,
Hill, EE. &
Morea, V.
(2008)
Protein structure prediction. Methods Mol Biol, Vol.453,
pp.33-85,
ISSN: 1064-3745,
Protein structure prediction has matured over the past few years to the point that even fully automated methods can provide reasonably accurate three-dimensional models of protein structures. However, until now it has not been possible to develop programs able to perform as well as human experts, who are still capable of systematically producing better models than automated servers. Although the precise details of protein structure prediction procedures are different for virtually every protein, this chapter describes a generic procedure to obtain a three-dimensional protein model starting from the amino acid sequence. This procedure takes advantage both of programs and servers that have been shown to perform best in blind tests and of the current knowledge about evolutionary relationships between proteins, gained from detailed analyses of protein sequence, structure, and functional data.
Overington, JP.,
Al-Lazikani, B. &
Hopkins, AL.
(2006)
How many drug targets are there? Nat Rev Drug Discov, Vol.5(12),
pp.993-996,
ISSN: 1474-1776,
For the past decade, the number of molecular targets for approved drugs has been debated. Here, we reconcile apparently contradictory previous reports into a comprehensive survey, and propose a consensus number of current drug targets for all classes of approved therapeutic drugs. One striking feature is the relatively constant historical rate of target innovation (the rate at which drugs against new targets are launched); however, the rate of developing drugs against new families is significantly lower. The recent approval of drugs that target protein kinases highlights two additional trends: an emerging realization of the importance of polypharmacology, and also the power of a gene-family-led approach in generating novel and important therapies.
Marks, DJ.,
Harbord, MW.,
MacAllister, R.,
Rahman, FZ.,
Young, J.,
Al-Lazikani, B.,
Lees, W.,
Novelli, M.,
Bloom, S. &
Segal, AW.
(2006)
Defective acute inflammation in Crohn's disease: a clinical investigation. Lancet, Vol.367(9511),
pp.668-678,
The cause of Crohn's disease has not been mechanistically proven. We tested the hypothesis that the disease is a form of immunodeficiency caused by impaired innate immunity.
Freilich, S.,
Spriggs, RV.,
George, RA.,
Al-Lazikani, B.,
Swindells, M. &
Thornton, JM.
(2005)
The complement of enzymatic sets in different species. J Mol Biol, Vol.349(4),
pp.745-763,
ISSN: 0022-2836,
We present here a comprehensive analysis of the complement of enzymes in a large variety of species. As enzymes are a relatively conserved group there are several classification systems available that are common to all species and link a protein sequence to an enzymatic function. Enzymes are therefore an ideal functional group to study the relationship between sequence expansion, functional divergence and phenotypic changes. By using information retrieved from the well annotated SWISS-PROT database together with sequence information from a variety of fully sequenced genomes and information from the EC functional scheme we have aimed here to estimate the fraction of enzymes in genomes, to determine the extent of their functional redundancy in different domains of life and to identify functional innovations and lineage specific expansions in the metazoa lineage. We found that prokaryote and eukaryote species differ both in the fraction of enzymes in their genomes and in the pattern of expansion of their enzymatic sets. We observe an increase in functional redundancy accompanying an increase in species complexity. A quantitative assessment was performed in order to determine the degree of functional redundancy in different species. Finally, we report a massive expansion in the number of mammalian enzymes involved in signalling and degradation.
George, RA.,
Spriggs, RV.,
Bartlett, GJ.,
Gutteridge, A.,
MacArthur, MW.,
Porter, CT.,
Al-Lazikani, B.,
Thornton, JM. &
Swindells, MB.
(2005)
Effective function annotation through catalytic residue conservation. Proc Natl Acad Sci U S A, Vol.102(35),
pp.12299-12304,
ISSN: 0027-8424,
Because of the extreme impact of genome sequencing projects, protein sequences without accompanying experimental data now dominate public databases. Homology searches, by providing an opportunity to transfer functional information between related proteins, have become the de facto way to address this. Although a single, well annotated, close relationship will often facilitate sufficient annotation, this situation is not always the case, particularly if mutations are present in important functional residues. When only distant relationships are available, the transfer of function information is more tenuous, and the likelihood of encountering several well annotated proteins with different functions is increased. The consequence for a researcher is a range of candidate functions with little way of knowing which, if any, are correct. Here, we address the problem directly by introducing a computational approach to accurately identify and segregate related proteins into those with a functional similarity and those where function differs. This approach should find a wide range of applications, including the interpretation of genomics/proteomics data and the prioritization of targets for high-throughput structure determination. The method is generic, but here we concentrate on enzymes and apply high-quality catalytic site data. In addition to providing a series of comprehensive benchmarks to show the overall performance of our approach, we illustrate its utility with specific examples that include the correct identification of haptoglobin as a nonenzymatic relative of trypsin, discrimination of acid-d-amino acid ligases from a much larger ligase pool, and the successful annotation of BioH, a structural genomics target.
George, RA.,
Spriggs, RV.,
Thornton, JM.,
Al-Lazikani, B. &
Swindells, MB.
(2004)
SCOPEC: a database of protein catalytic domains. Bioinformatics, Vol.20 Suppl 1,
pp.i130-i136,
Domains are the units of protein structure, function and evolution. It is therefore essential to utilize knowledge of domains when studying the evolution of function, or when assigning function to genome sequence data. For this purpose, we have developed a database of catalytic domains, SCOPEC, by combining structural domain information from SCOP, full-length sequence information from Swiss-Prot, and verified functional information from the Enzyme Classification (EC) database. Two major problems need to be overcome to create a database of domain-function relationships; (1) for sequences, EC numbers are typically assigned to whole sequences rather than the functional unit, and (2) The Protein Data Bank (PDB) structures elucidated from a larger multi-domain protein will often have EC annotation although the relevant catalytic domain may lie elsewhere.
Sheinerman, FB.,
Al-Lazikani, B. &
Honig, B.
(2003)
Sequence, structure and energetic determinants of phosphopeptide selectivity of SH2 domains. J Mol Biol, Vol.334(4),
pp.823-841,
ISSN: 0022-2836,
Here, we present an approach for the prediction of binding preferences of members of a large protein family for which structural information for a number of family members bound to a substrate is available. The approach involves a number of steps. First, an accurate multiple alignment of sequences of all members of a protein family is constructed on the basis of a multiple structural superposition of family members with known structure. Second, the methods of continuum electrostatics are used to characterize the energetic contribution of each residue in a protein to the binding of its substrate. Residues that make a significant contribution are mapped onto the protein sequence and are used to define a "binding site signature" for the complex being considered. Third, sequences whose structures have not been determined are checked to see if they have binding-site signatures similar to one of the known complexes. Predictions of binding affinity to a given substrate are based on similarities in binding-site signature. An important component of the approach is the introduction of a context-specific substitution matrix suitable for comparison of binding-site residues. The methods are applied to the prediction of phosphopeptide selectivity of SH2 domains. To this end, the energetic roles of all protein residues in 17 different complexes of SH2 domains with their cognate targets are analyzed. The total number of residues that make significant contributions to binding is found to vary from nine to 19 in different complexes. These energetically important residues are found to contribute to binding through a variety of mechanisms, involving both electrostatic and hydrophobic interactions. Binding-site signatures are found to involve residues in different positions in SH2 sequences, some of them as far as 9 Å away from a bound peptide. Surprisingly, similarities in the signatures of different domains do not correlate with whole-domain sequence identities unless the latter is greater than 50%. An extensive comparison with the optimal binding motifs determined by peptide library experiments, as well as other experimental data indicate that the similarity in binding preferences of different SH2 domains can be deduced on the basis of their binding-site signatures. The analysis provides a rationale for the empirically derived classification of SH2 domains described by Songyang & Cantley, in that proteins in the same group are found to have similar residues at positions important for binding. Confident predictions of binding preference can be made for about 85% of SH2 domain sequences found in SWISSPROT. The approach described in this work is quite general and can, in principle, be used to analyze binding preferences of members of large protein families for which structural information for a number of family members is available. It also offers a strategy for predicting cross-reactivity of compounds designed to bind to a particular target, for example in structure-based drug design.
Al-Lazikani, B.,
Jung, J.,
Xiang, Z. &
Honig, B.
(2001)
Protein structure prediction. Curr Opin Chem Biol, Vol.5(1),
pp.51-56,
ISSN: 1367-5931,
The prediction of protein structure, based primarily on sequence and structure homology, has become an increasingly important activity. Homology models have become more accurate and their range of applicability has increased. Progress has come, in part, from the flood of sequence and structure information that has appeared over the past few years, and also from improvements in analysis tools. These include profile methods for sequence searches, the use of three-dimensional structure information in sequence alignment and new homology modeling tools, specifically in the prediction of loop and side-chain conformations. There have also been important advances in understanding the physical chemical basis of protein stability and the corresponding use of physical chemical potential functions to identify correctly folded from incorrectly folded protein conformations.
Al-Lazikani, B.,
Sheinerman, FB. &
Honig, B.
(2001)
Combining multiple structure and sequence alignments to improve sequence detection and alignment: application to the SH2 domains of Janus kinases. Proc Natl Acad Sci U S A, Vol.98(26),
pp.14796-14801,
ISSN: 0027-8424,
In this paper, an approach is described that combines multiple structure alignments and multiple sequence alignments to generate sequence profiles for protein families. First, multiple sequence alignments are generated from sequences that are closely related to each sequence of known three-dimensional structure. These alignments then are merged through a multiple structure alignment of family members of known structure. The merged alignment is used to generate a Hidden Markov Model for the family in question. The Hidden Markov Model can be used to search for new family members or to improve alignments for distantly related family members that already have been identified. Application of a profile generated for SH2 domains indicates that the Janus family of nonreceptor protein tyrosine kinases contains SH2 domains. This conclusion is strongly supported by the results of secondary structure-prediction programs, threading calculations, and the analysis of comparative models generated for these domains. One of the Janus kinases, human TYK2, has an SH2 domain that contains a histidine instead of the conserved arginine at the key phosphotyrosine-binding position, betaB5. Calculations of the pK(a) values of the betaB5 arginines in a number of SH2 domains and of the betaB5 histidine in a homology model of TYK2 suggest that this histidine is likely to be neutral around pH 7, thus indicating that it may have lost the ability to bind phosphotyrosine. If this indeed is the case, TYK2 may contain a domain with an SH2 fold that has a modified binding specificity.
Sheinerman, FB.,
Al-Lazikani, B. &
Honig, B.
(2001)
Predicting phosphopeptide selectivity of SH2 domains. Biophys J, Vol.80(1),
pp.333A-333A,
ISSN: 0006-3495,
Al-Lazikani, B.,
Lesk, AM. &
Chothia, C.
(2000)
Canonical structures for the hypervariable regions of T cell alphabeta receptors. J Mol Biol, Vol.295(4),
pp.979-995,
ISSN: 0022-2836,
T cell alphabeta receptors have binding sites for peptide-MHC complexes formed by six hypervariable regions. Analysis of the six atomic structures known for Valpha and for Vbeta domains shows that their first and second hypervariable regions have one of three or four different main-chain conformations (canonical structures). Six of these canonical structures have the same conformation in complexes with peptide-MHC complexes, the free receptor and/or in an isolated V domain. Thus, for at least the first and second hypervariable regions in the currently known structures, the conformation of the canonical structures is well defined in the free state and is conserved on formation of complexes with peptide-MHC. We identified the key residues that are mainly responsible for the conformation of each canonical structure. The first and second hypervariable regions of Valpha and Vbeta domains are encoded by the germline V segments. Humans have 37 functional Valpha segments and 47 Vbeta segments, and mice have 20 Vbeta segments. Inspection of the size of their hypervariable regions, and of sites that contain key residues, indicates that close to 70 % of Valpha segments and 90 % of Vbeta segments have hypervariable regions with a conformation of one of the known canonical structures. The alpha and beta V gene segments in both humans and mice have only a few combinations of different canonical structure in their first and second hypervariable regions. In human Vbeta domains, the number of different sequences with these canonical structure combinations is larger than in mice, whilst for Valpha domains it is probably smaller.
Al-Lazikani, B.,
Lesk, AM. &
Chothia, C.
(1997)
Standard conformations for the canonical structures of immunoglobulins. J Mol Biol, Vol.273(4),
pp.927-948,
ISSN: 0022-2836,
A comparative analysis of the main-chain conformation of the L1, L2, L3, H1 and H2 hypervariable regions in 17 immunoglobulin structures that have been accurately determined at high resolution is described. This involves 79 hypervariable regions in all. We also analysed a part of the H3 region in 12 of the 15 VH domains considered here. On the basis of the residues at key sites the 79 hypervariable regions can be assigned to one of 18 different canonical structures. We show that 71 of these hypervariable regions have a conformation that is very close to what can be defined as a "standard" conformation of each canonical structure. These standard conformations are described in detail. The other eight hypervariable regions have small deviations from the standard conformations that, in six cases, involve only the rotation of a single peptide group. Most H3 hypervariable regions have the same conformation in the part that is close to the framework and the details of this conformation are also described here.