Lan, C., Li, J., Huang, X., Heindl, A., Wang, Y., Yan, S. & Yuan, Y.
(2019). Stromal cell ratio based on automated image analysis as a predictor for platinum-resistant recurrent ovarian cancer. BMC cancer,
Barry, P., Vatsiou, A., Spiteri, I., Nichol, D., Cresswell, G.D., Acar, A., Trahearn, N., Hrebien, S., Garcia-Murillas, I., Chkhaidze, K., et al.
(2018). The Spatiotemporal Evolution of Lymph Node Spread in Early Breast Cancer. Clinical cancer research,
Zhang, A.W., McPherson, A., Milne, K., Kroeger, D.R., Hamilton, P.T., Miranda, A., Funnell, T., Little, N., de Souza, C.P., Laan, S., et al.
(2018). Interfaces of Malignant and Immunologic Clonal Dynamics in Ovarian Cancer. Cell,
Heindl, A., Khan, A.M., Rodrigues, D.N., Eason, K., Sadanandam, A., Orbegoso, C., Punta, M., Sottoriva, A., Lise, S., Banerjee, S., et al.
(2018). Microenvironmental niche divergence shapes BRCA1-dysregulated ovarian cancer morphological plasticity. Nature communications,
Maley, C.C., Aktipis, A., Graham, T.A., Sottoriva, A., Boddy, A.M., Janiszewska, M., Silva, A.S., Gerlinger, M., Yuan, Y., Pienta, K.J., et al.
(2017). Classifying the evolutionary and ecological features of neoplasms. Nature reviews cancer,
Heindl, A., Lan, C., Rodrigues, D.N., Koelble, K. & Yuan, Y.
(2016). Similarity and diversity of the tumor microenvironment in multiple metastases: critical implications for overall and progression-free survival of high-grade serous ovarian cancer. Oncotarget,
Savage, R.S. & Yuan, Y.
(2016). Predicting chemoinsensitivity in breast cancer with ’omics/digital pathology data fusion. Royal society open science,
Hill, D.K., Kim, E., Teruel, J.R., Jamin, Y., Widerøe, M., Søgaard, C.D., Størkersen, Ø., Rodrigues, D.N., Heindl, A., Yuan, Y., et al.
(2016). Diffusion-weighted MRI for early detection and characterization of prostate cancer in the transgenic adenocarcinoma of the mouse prostate model. Journal of magnetic resonance imaging,
(2016). Spatial Heterogeneity in the Tumor Microenvironment. Cold spring harbor perspectives in medicine,
Nawaz, S. & Yuan, Y.
(2016). Computational pathology: Exploring the spatial dimension of tumor ecology. Cancer letters,
Locard-Paulet, M., Lim, L., Veluscek, G., McMahon, K., Sinclair, J., van Weverwijk, A., Worboys, J.D., Yuan, Y., Isacke, C.M. & Jørgensen, C., et al.
(2016). Phosphoproteomic analysis of interacting tumor and endothelial cells identifies regulatory mechanisms of transendothelial migration. Science signaling,
Khan, A.M. & Yuan, Y.
(2016). Biopsy variability of lymphocytic infiltration in breast cancer subtypes and the ImmunoSkew score. Scientific reports,
Vollan, H.K., Rueda, O.M., Chin, S.-., Curtis, C., Turashvili, G., Shah, S., Lingjærde, O.C., Yuan, Y., Ng, C.K., Dunning, M.J., et al.
(2015). A tumor DNA complex aberration index is an independent predictor of survival in breast and ovarian cancer. Mol oncol,
Complex focal chromosomal rearrangements in cancer genomes, also called "firestorms", can be scored from DNA copy number data. The complex arm-wise aberration index (CAAI) is a score that captures DNA copy number alterations that appear as focal complex events in tumors, and has potential prognostic value in breast cancer. This study aimed to validate this DNA-based prognostic index in breast cancer and test for the first time its potential prognostic value in ovarian cancer. Copy number alteration (CNA) data from 1950 breast carcinomas (METABRIC cohort) and 508 high-grade serous ovarian carcinomas (TCGA dataset) were analyzed. Cases were classified as CAAI positive if at least one complex focal event was scored. Complex alterations were frequently localized on chromosome 8p (n = 159), 17q (n = 176) and 11q (n = 251). CAAI events on 11q were most frequent in estrogen receptor positive (ER+) cases and on 17q in estrogen receptor negative (ER-) cases. We found only a modest correlation between CAAI and the overall rate of genomic instability (GII) and number of breakpoints (r = 0.27 and r = 0.42, p < 0.001). Breast cancer specific survival (BCSS), overall survival (OS) and ovarian cancer progression free survival (PFS) were used as clinical end points in Cox proportional hazard model survival analyses. CAAI positive breast cancers (43%) had higher mortality: hazard ratio (HR) of 1.94 (95%CI, 1.62-2.32) for BCSS, and of 1.49 (95%CI, 1.30-1.71) for OS. Representations of the 70-gene and the 21-gene predictors were compared with CAAI in multivariable models and CAAI was independently significant with a Cox adjusted HR of 1.56 (95%CI, 1.23-1.99) for ER+ and 1.55 (95%CI, 1.11-2.18) for ER- disease. None of the expression-based predictors were prognostic in the ER- subset. We found that a model including CAAI and the two expression-based prognostic signatures outperformed a model including the 21-gene and 70-gene signatures but excluding CAAI. 
Inclusion of CAAI in the clinical prognostication tool PREDICT significantly improved its performance. CAAI positive ovarian cancers (52%) also had worse prognosis: HRs of 1.3 (95%CI, 1.1-1.7) for PFS and 1.3 (95%CI, 1.1-1.6) for OS. This study validates CAAI as an independent predictor of survival in both ER+ and ER- breast cancer and reveals a significant prognostic value for CAAI in high-grade serous ovarian cancer.
Heindl, A., Nawaz, S. & Yuan, Y.
(2015). Mapping spatial heterogeneity in the tumor microenvironment: a new era for digital pathology. Laboratory investigation,
(2015). Modelling the spatial heterogeneity and molecular correlates of lymphocytic infiltration in triple-negative breast cancer. Journal of the royal society interface,
Nawaz, S., Heindl, A., Koelble, K. & Yuan, Y.
(2015). Beyond immune density: critical role of spatial heterogeneity in estrogen receptor-negative breast cancer. Modern pathology,
Mardakheh, F.K., Paul, A., Kümper, S., Sadok, A., Paterson, H., McCarthy, A., Yuan, Y. & Marshall, C.J.
(2015). Global Analysis of mRNA, Translation, and Protein Localization: Local Translation Is a Key Regulator of Cell Protrusions. Developmental cell,
Maley, C.C., Koelble, K., Natrajan, R., Aktipis, A. & Yuan, Y.
(2015). An ecological measure of immune-cancer colocalization as a prognostic factor for breast cancer. Breast cancer research,
Lan, C., Heindl, A., Huang, X., Xi, S., Banerjee, S., Liu, J. & Yuan, Y.
(2015). Quantitative histology analysis of the ovarian tumour microenvironment. Scientific reports,
Jäger, R., Migliorini, G., Henrion, M., Kandaswamy, R., Speedy, H.E., Heindl, A., Whiffin, N., Carnicer, M.J., Broome, L., Dryden, N., et al.
(2015). Capture Hi-C identifies the chromatin interactome of colorectal cancer risk loci. Nat commun,
Multiple regulatory elements distant from their targets on the linear genome can influence the expression of a single gene through chromatin looping. Chromosome conformation capture implemented in Hi-C allows for genome-wide agnostic characterization of chromatin contacts. However, detection of functional enhancer-promoter interactions is precluded by its effective resolution that is determined by both restriction fragmentation and sensitivity of the experiment. Here we develop a capture Hi-C (cHi-C) approach to allow an agnostic characterization of these physical interactions on a genome-wide scale. Single-nucleotide polymorphisms associated with complex diseases often reside within regulatory elements and exert effects through long-range regulation of gene expression. Applying this cHi-C approach to 14 colorectal cancer risk loci allows us to identify key long-range chromatin interactions in cis and trans involving these loci.
Worboys, J.D., Sinclair, J., Yuan, Y. & Jorgensen, C.
(2014). Systematic evaluation of quantotypic peptides for targeted analysis of the human kinome. Nature methods,
Yuan, Y., Curtis, C., Caldas, C. & Markowetz, F.
(2012). A sparse regulatory network of copy-number driven gene expression reveals putative breast cancer oncogenes. IEEE/ACM trans comput biol bioinform,
Copy number aberrations are recognized to be important in cancer as they may localize to regions harboring oncogenes or tumor suppressors. Such genomic alterations mediate phenotypic changes through their impact on expression. Both cis- and trans-acting alterations are important since they may help to elucidate putative cancer genes. However, amidst numerous passenger genes, trans-effects are less well studied due to the computational difficulty in detecting weak and sparse signals in the data, and yet may influence multiple genes on a global scale. We propose an integrative approach to learn a sparse interaction network of DNA copy-number regions with their downstream transcriptional targets in breast cancer. With respect to goodness of fit on both simulated and real data, the performance of sparse network inference is no worse than other state-of-the-art models but with the advantage of simultaneous feature selection and efficiency. The DNA-RNA interaction network helps to distinguish copy-number driven expression alterations from those that are copy-number independent. Further, our approach yields a quantitative copy-number dependency score, which distinguishes cis- versus trans-effects. When applied to a breast cancer data set, numerous expression profiles were impacted by cis-acting copy-number alterations, including several known oncogenes such as GRB7, ERBB2, and LSM1. Several trans-acting alterations were also identified, impacting genes such as ADAM2 and BAGE, which warrant further investigation. AVAILABILITY: An R package named lol is available from www.markowetzlab.org/software/lol.html.
Curtis, C., Shah, S.P., Chin, S.-., Turashvili, G., Rueda, O.M., Dunning, M.J., Speed, D., Lynch, A.G., Samarajiwa, S., Yuan, Y., et al.
(2012). The genomic and transcriptomic architecture of 2,000 breast tumours reveals novel subgroups. Nature,
The elucidation of breast cancer subgroups and their molecular drivers requires integrated views of the genome and transcriptome from representative numbers of patients. We present an integrated analysis of copy number and gene expression in a discovery and validation set of 997 and 995 primary breast tumours, respectively, with long-term clinical follow-up. Inherited variants (copy number variants and single nucleotide polymorphisms) and acquired somatic copy number aberrations (CNAs) were associated with expression in ~40% of genes, with the landscape dominated by cis- and trans-acting CNAs. By delineating expression outlier genes driven in cis by CNAs, we identified putative cancer genes, including deletions in PPP2R2A, MTAP and MAP2K4. Unsupervised analysis of paired DNA–RNA profiles revealed novel subgroups with distinct clinical outcomes, which reproduced in the validation cohort. These include a high-risk, oestrogen-receptor-positive 11q13/14 cis-acting subgroup and a favourable prognosis subgroup devoid of CNAs. Trans-acting aberration hotspots were found to modulate subgroup-specific gene networks, including a TCR deletion-mediated adaptive immune response in the ‘CNA-devoid’ subgroup and a basal-specific chromosome 5 deletion-associated mitotic network. Our results provide a novel molecular stratification of the breast cancer population, derived from the impact of somatic CNAs on the transcriptome.
Yuan, Y., Failmezger, H., Rueda, O.M., Ali, H.R., Gräf, S., Chin, S.-., Schwarz, R.F., Curtis, C., Dunning, M.J., Bardwell, H., et al.
(2012). Quantitative image analysis of cellular heterogeneity in breast tumors complements genomic profiling. Sci transl med,
Solid tumors are heterogeneous tissues composed of a mixture of cancer and normal cells, which complicates the interpretation of their molecular profiles. Furthermore, tissue architecture is generally not reflected in molecular assays, rendering this rich information underused. To address these challenges, we developed a computational approach based on standard hematoxylin and eosin-stained tissue sections and demonstrated its power in a discovery and validation cohort of 323 and 241 breast tumors, respectively. To deconvolute cellular heterogeneity and detect subtle genomic aberrations, we introduced an algorithm based on tumor cellularity to increase the comparability of copy number profiles between samples. We next devised a predictor for survival in estrogen receptor-negative breast cancer that integrated both image-based and gene expression analyses and significantly outperformed classifiers that use single data types, such as microarray expression signatures. Image processing also allowed us to describe and validate an independent prognostic factor based on quantitative analysis of spatial patterns between stromal cells, which are not detectable by molecular assays. Our quantitative, image-based method could benefit any large-scale cancer study by refining and complementing molecular assays of tumor samples.
Yuan, Y., Rueda, O.M., Curtis, C. & Markowetz, F.
(2011). Penalized regression elucidates aberration hotspots mediating subtype-specific transcriptional responses in breast cancer. Bioinformatics,
MOTIVATION: Copy number alterations (CNAs) associated with cancer are known to contribute to genomic instability and gene deregulation. Integrating CNAs with gene expression helps to elucidate the mechanisms by which CNAs act and to identify the transcriptional downstream targets of CNAs. Such analyses can help to sort functional driver events from the many accompanying passenger alterations. However, the way CNAs affect gene expression can vary in different cellular contexts, for example between different subtypes of the same cancer. Thus, it is important to develop computational approaches capable of inferring differential connectivity of regulatory networks in different cellular contexts. RESULTS: We propose a statistical deregulation model that integrates copy number and expression data of different disease subtypes to jointly model common and differential regulatory relationships. Our model not only identifies CNAs driving gene expression changes, but at the same time also predicts differences in regulation that distinguish one cancer subtype from the other. We implement our model in a penalized regression framework and demonstrate in a simulation study the feasibility and accuracy of our approach. Subsequently, we show that this model can identify both known and novel aspects of cross-talk between the ER and NOTCH pathways in ER-negative-specific deregulations, when compared with ER-positive breast cancer. This flexible model can be applied on other modalities such as methylation or microRNA and expression to disentangle cancer signaling pathways. AVAILABILITY: The Bioconductor-compliant R package DANCE is available from www.markowetzlab.org/software/
Yuan, Y., Li, C.-. & Windram, O.
(2011). Directed Partial Correlation: Inferring Large-Scale Gene Regulatory Network through Induced Topology Disruptions. PLOS one,
Yuan, Y., Savage, R.S. & Markowetz, F.
(2011). Patient-specific data fusion defines prognostic cancer subtypes. PLOS comput biol,
Different data types can offer complementary perspectives on the same biological phenomenon. In cancer studies, for example, data on copy number alterations indicate losses and amplifications of genomic regions in tumours, while transcriptomic data point to the impact of genomic and environmental events on the internal wiring of the cell. Fusing different data provides a more comprehensive model of the cancer cell than that offered by any single type. However, biological signals in different patients exhibit diverse degrees of concordance due to cancer heterogeneity and inherent noise in the measurements. This is a particularly important issue in cancer subtype discovery, where personalised strategies to guide therapy are of vital importance. We present a nonparametric Bayesian model for discovering prognostic cancer subtypes by integrating gene expression and copy number variation data. Our model is constructed from a hierarchy of Dirichlet Processes and addresses three key challenges in data fusion: (i) To separate concordant from discordant signals, (ii) to select informative features, (iii) to estimate the number of disease subtypes. Concordance of signals is assessed individually for each patient, giving us an additional level of insight into the underlying disease structure. We exemplify the power of our model in prostate cancer and breast cancer and show that it outperforms competing methods. In the prostate cancer data, we identify an entirely new subtype with extremely poor survival outcome and show how other analyses fail to detect it. In the breast cancer data, we find subtypes with superior prognostic value by using the concordant results. These discoveries were crucially dependent on our model's ability to distinguish concordant and discordant signals within each patient sample, and would otherwise have been missed. We therefore demonstrate the importance of taking a patient-specific approach, using highly-flexible nonparametric Bayesian methods.
Li, C.-., Yuan, Y. & Wilson, R.
(2008). An unsupervised conditional random fields approach for clustering gene expression time series. Bioinformatics,
MOTIVATION: There is a growing interest in extracting statistical patterns from gene expression time-series data, in which a key challenge is the development of stable and accurate probabilistic models. Currently popular models, however, would be computationally prohibitive unless some independence assumptions are made to describe large-scale data. We propose an unsupervised conditional random fields (CRF) model to overcome this problem by progressively infusing information into the labelling process through a small variable voting pool. RESULTS: An unsupervised CRF model is proposed for efficient analysis of gene expression time series and is successfully applied to gene class discovery and class prediction. The proposed model treats each time series as a random field and assigns an optimal cluster label to each time series, so as to partition the time series into clusters without a priori knowledge about the number of clusters and the initial centroids. Another advantage of the proposed method is the relaxation of independence assumptions.
Yuan, Y., Li, C.-. & Wilson, R.
(2008). Partial mixture model for tight clustering of gene expression time-course. BMC bioinformatics,
BACKGROUND: Tight clustering arose recently from a desire to obtain tighter and potentially more informative clusters in gene expression studies. Scattered genes with relatively loose correlations should be excluded from the clusters. However, in the literature there is little work dedicated to this area of research. On the other hand, there has been extensive use of maximum likelihood techniques for model parameter estimation. By contrast, the minimum distance estimator has been largely ignored. RESULTS: In this paper we show the inherent robustness of the minimum distance estimator that makes it a powerful tool for parameter estimation in model-based time-course clustering. To apply minimum distance estimation, a partial mixture model that can naturally incorporate replicate information and allow scattered genes is formulated. We provide experimental results of simulated data fitting, where the minimum distance estimator demonstrates superior performance to the maximum likelihood estimator. Both biological and statistical validations are conducted on a simulated dataset and two real gene expression datasets. Our proposed partial regression clustering algorithm scores top in Gene Ontology driven evaluation, in comparison with four other popular clustering algorithms. CONCLUSION: For the first time partial mixture model is successfully extended to time-course data analysis. The robustness of our partial regression clustering algorithm proves the suitability of the combination of both partial mixture model and minimum distance estimator in this field. We show that tight clustering not only is capable to generate more profound understanding of the dataset under study well in accordance to established biological knowledge, but also presents interesting new hypotheses during interpretation of clustering results. In particular, we provide biological evidences that scattered genes can be relevant and are interesting subjects for study, in contrast to prevailing opinion.
Yuan, Y. & Li, C.-.
(2008). A Bayes random field approach for integrative large-scale regulatory network analysis. J integr bioinform,
We present a Bayes-Random Fields framework which is capable of integrating unlimited data sources for discovering relevant network architecture of large-scale networks. The random field potential function is designed to impose a cluster constraint, teamed with a full Bayesian approach for incorporating heterogeneous data sets. The probabilistic nature of our framework facilitates robust analysis in order to minimize the influence of noise inherent in the data on the inferred structure in a seamless and coherent manner. This is later proved in its applications to both large-scale synthetic data sets and Saccharomyces cerevisiae data sets. The analytical and experimental results reveal the varied characteristic of different types of data and reflect their discriminative ability in terms of identifying direct gene interactions.
Li, C.-. & Yuan, Y.
(2006). Digital watermarking scheme exploiting nondeterministic dependence for image authentication. Optical engineering,
Natrajan, R., Sailem, H., Mardakheh, F.K., Arias Garcia, M., Tape, C.J., Dowsett, M., Bakal, C. & Yuan, Y.
Microenvironmental Heterogeneity Parallels Breast Cancer Progression: A Histology–Genomic Integration Analysis. PLOS medicine,
Heindl, A., Sestak, I., Naidoo, K., Cuzick, J., Dowsett, M. & Yuan, Y.
Relevance of spatial heterogeneity of immune infiltration for predicting risk of recurrence after endocrine therapy of ER+ breast cancer. Journal of the national cancer institute,
Naidoo, K., Wai, P., Maguire, S., Daley, F., Haider, S., Kriplani, D., Campbell, J., Mirza, H., Grigoriadis, A., Tutt, A., et al.
Evaluation of CDK12 Protein Expression as a Potential Novel Biomarker for DNA Damage Response Targeted Therapies in Breast Cancer. Molecular cancer therapeutics,
Hill, D.K., Heindl, A., Zormpas-Petridis, K., Collins, D.J., Euceda, L.R., Rodrigues, D.N., Moestue, S.A., Jamin, Y., Koh, D.-., Yuan, Y., et al.
Non-Invasive Prostate Cancer Characterization with Diffusion-Weighted MRI: Insight from In silico Studies of a Transgenic Mouse Model. Frontiers in oncology,
Barry, P., Vatsiou, A., Spiteri Sagastume, I., Nichol, D., Cresswell, G., Acar, A., Trahearn, N., Hrebien, S., Garcia-Murillas, I., Chkhaidze, K., et al.
The spatio-temporal evolution of lymph node spread in early breast cancer. Clinical cancer research,
MRI imaging of the hemodynamic vasculature of neuroblastoma predicts response to anti-angiogenic treatment. Cancer research,
Li, J., Zormpas-Petridis, K., Boult, J., Reeves, E., Chesler, L., Jones, C., Bamber, J., Yuan, Y., Jamin, Y. & Robinson, S., et al.
Investigating the Contribution of Collagen to the Tumor Biomechanical Phenotype with Non-invasive Magnetic Resonance Elastography. Cancer research,
Zormpas-Petridis, K., Failmezger, H., Raza, S.E., Roxanis, I., Jamin, Y. & Yuan, Y.
Superpixel-Based Conditional Random Fields (SuperCRF): Incorporating Global and Local Context for Enhanced Deep Learning in Melanoma Histopathology. Frontiers in oncology,
Booth, T.C., Larkin, T.J., Yuan, Y., Kettunen, M.I., Dawson, S.N., Scoffings, D., Canuto, H.C., Vowler, S.L., Kirschenlohr, H., Hobson, M.P., et al.
Analysis of heterogeneity in T2-weighted MR images can differentiate pseudoprogression from progression in glioblastoma. PLOS one,