Jill P. Mesirov

DOI: 10.1073/pnas.0506580102
2005
Cited 39,052 times
Gene set enrichment analysis: A knowledge-based approach for interpreting genome-wide expression profiles
Although genomewide RNA expression analysis has become a routine tool in biomedical research, extracting biological insight from such information remains a major challenge. Here, we describe a powerful analytical method called Gene Set Enrichment Analysis (GSEA) for interpreting gene expression data. The method derives its power by focusing on gene sets, that is, groups of genes that share common biological function, chromosomal location, or regulation. We demonstrate how GSEA yields insights into several cancer-related data sets, including leukemia and lung cancer. Notably, where single-gene analysis finds little similarity between two independent studies of patient survival in lung cancer, GSEA reveals many biological pathways in common. The GSEA method is embodied in a freely available software package, together with an initial database of 1,325 biologically defined gene sets.
DOI: 10.1038/35057062
2001
Cited 21,735 times
Initial sequencing and analysis of the human genome
The human genome holds an extraordinary trove of information about human development, physiology, medicine and evolution. Here we report the results of an international collaboration to produce and make freely available a draft sequence of the human genome. We also present an initial analysis of the data, describing some of the insights that can be gleaned from the sequence.
DOI: 10.1038/nbt.1754
2011
Cited 11,601 times
Integrative genomics viewer
Rapid improvements in sequencing and array-based platforms are resulting in a flood of diverse genome-wide data, including data from exome and whole-genome sequencing, epigenetic surveys, expression profiling of coding and noncoding RNAs, single nucleotide polymorphism (SNP) and copy number profiling, and functional assays. Analysis of these large, diverse data sets holds the promise of a more comprehensive understanding of the genome and its relation to human disease. Experienced and knowledgeable human review is an essential component of this process, complementing computational approaches. This calls for efficient and intuitive visualization tools able to scale to very large data sets and to flexibly integrate multiple data types, including clinical data. However, the sheer volume and scope of data pose a significant challenge to the development of such tools.
DOI: 10.1126/science.286.5439.531
1999
Cited 10,930 times
Molecular Classification of Cancer: Class Discovery and Class Prediction by Gene Expression Monitoring
Although cancer classification has improved over the past 30 years, there has been no general approach for identifying new cancer classes (class discovery) or for assigning tumors to known classes (class prediction). Here, a generic approach to cancer classification based on gene expression monitoring by DNA microarrays is described and applied to human acute leukemias as a test case. A class discovery procedure automatically discovered the distinction between acute myeloid leukemia (AML) and acute lymphoblastic leukemia (ALL) without previous knowledge of these classes. An automatically derived class predictor was able to determine the class of new leukemia cases. The results demonstrate the feasibility of cancer classification based solely on gene expression monitoring and suggest a general strategy for discovering and predicting cancer classes for other types of cancer, independent of previous biological knowledge.
DOI: 10.1038/ng1180
2003
Cited 8,289 times
PGC-1α-responsive genes involved in oxidative phosphorylation are coordinately downregulated in human diabetes
DOI: 10.1016/j.cels.2015.12.004
2015
Cited 7,793 times
The Molecular Signatures Database Hallmark Gene Set Collection
The Molecular Signatures Database (MSigDB) is one of the most widely used and comprehensive databases of gene sets for performing gene set enrichment analysis. Since its creation, MSigDB has grown beyond its roots in metabolic disease and cancer to include >10,000 gene sets. These better represent a wider range of biological processes and diseases, but the utility of the database is reduced by increased redundancy across, and heterogeneity within, gene sets. To address this challenge, here we use a combination of automated approaches and expert curation to develop a collection of "hallmark" gene sets as part of MSigDB. Each hallmark in this collection consists of a "refined" gene set, derived from multiple "founder" sets, that conveys a specific biological state or process and displays coherent expression. The hallmarks effectively summarize most of the relevant information of the original founder sets and, by reducing both variation and redundancy, provide more refined and concise inputs for gene set enrichment analysis.
DOI: 10.1093/bib/bbs017
2012
Cited 7,009 times
Integrative Genomics Viewer (IGV): high-performance genomics data visualization and exploration
Data visualization is an essential component of genomic data analysis. However, the size and diversity of the data sets produced by today's sequencing and array-based profiling methods present major challenges to visualization tools. The Integrative Genomics Viewer (IGV) is a high-performance viewer that efficiently handles large heterogeneous data sets, while providing a smooth and intuitive user experience at all levels of genome resolution. A key characteristic of IGV is its focus on the integrative nature of genomic studies, with support for both array-based and next-generation sequencing data, and the integration of clinical and phenotypic data. Although IGV is often used to view genomic data from public sources, its primary emphasis is to support researchers who wish to visualize and explore their own data sets or those from colleagues. To that end, IGV supports flexible loading of local and remote data sets, and is optimized to provide high-performance data visualization and exploration on standard desktop systems. IGV is freely available for download from http://www.broadinstitute.org/igv, under a GNU LGPL open-source license.
DOI: 10.1038/nature11003
2012
Cited 6,591 times
The Cancer Cell Line Encyclopedia enables predictive modelling of anticancer drug sensitivity
The Cancer Cell Line Encyclopedia presents the first results from a large-scale screen of some 947 cancer cell lines with 24 anticancer drugs, with the aim of identifying specific genomic alterations and gene expression profiles associated with selective sensitivity or resistance to potential therapeutic agents. Cancer cell lines are widely used as preclinical models to gain mechanistic and therapeutic insight. Two manuscripts in this issue describe the large-scale genetic and pharmacological characterization of human cancer cell lines. Each group characterized collections of several-hundred cell lines using different platforms and analytical methods. Their results are complementary, and confirm that many human cell lines capture the genomic diversity of their respective cancers. Initial findings include the identification of a number of potential markers of drug sensitivity and resistance. For example, Garnett et al. report an association between EWS-FLI1 gene translocations, frequently found in Ewing's sarcoma, and sensitivity to PARP inhibitors, a class of drug currently in clinical trials for other cancer types. Barretina et al. report a possible association between SLFN11 expression and sensitivity to topoisomerase inhibitors. The systematic translation of cancer genomic data into knowledge of tumour biology and therapeutic possibilities remains challenging. Such efforts should be greatly aided by robust preclinical model systems that reflect the genomic diversity of human cancers and for which detailed genetic and pharmacological annotation is available. Here we describe the Cancer Cell Line Encyclopedia (CCLE): a compilation of gene expression, chromosomal copy number and massively parallel sequencing data from 947 human cancer cell lines. When coupled with pharmacological profiles for 24 anticancer drugs across 479 of the cell lines, this collection allowed identification of genetic, lineage, and gene-expression-based predictors of drug sensitivity.
In addition to known predictors, we found that plasma cell lineage correlated with sensitivity to IGF1 receptor inhibitors; AHR expression was associated with MEK inhibitor efficacy in NRAS-mutant lines; and SLFN11 expression predicted sensitivity to topoisomerase inhibitors. Together, our results indicate that large, annotated cell-line collections may help to enable preclinical stratification schemata for anticancer agents. The generation of genetic predictions of drug response in the preclinical setting and their incorporation into cancer clinical trial design could speed the emergence of ‘personalized’ therapeutic regimens.
DOI: 10.1016/j.ccr.2009.12.020
2010
Cited 6,223 times
Integrated Genomic Analysis Identifies Clinically Relevant Subtypes of Glioblastoma Characterized by Abnormalities in PDGFRA, IDH1, EGFR, and NF1
The Cancer Genome Atlas Network recently cataloged recurrent genomic abnormalities in glioblastoma multiforme (GBM). We describe a robust gene expression-based molecular classification of GBM into Proneural, Neural, Classical, and Mesenchymal subtypes and integrate multidimensional genomic data to establish patterns of somatic mutations and DNA copy number. Aberrations and gene expression of EGFR, NF1, and PDGFRA/IDH1 each define the Classical, Mesenchymal, and Proneural subtypes, respectively. Gene signatures of normal brain cell types show a strong relationship between subtypes and different neural lineages. Additionally, response to aggressive therapy differs by subtype, with the greatest benefit in the Classical subtype and no benefit in the Proneural subtype. We provide a framework that unifies transcriptomic and genomic dimensions for GBM molecular stratification with important implications for future studies.
DOI: 10.1093/bioinformatics/btr260
2011
Cited 4,732 times
Molecular signatures database (MSigDB) 3.0
Motivation: Well-annotated gene sets representing the universe of the biological processes are critical for meaningful and insightful interpretation of large-scale genomic data. The Molecular Signatures Database (MSigDB) is one of the most widely used repositories of such sets. Results: We report the availability of a new version of the database, MSigDB 3.0, with over 6700 gene sets, a complete revision of the collection of canonical pathways and experimental signatures from publications, enhanced annotations and upgrades to the web site. Availability and Implementation: MSigDB is freely available for non-commercial use at http://www.broadinstitute.org/msigdb. Contact: gsea@broadinstitute.org
DOI: 10.1073/pnas.96.6.2907
1999
Cited 2,840 times
Interpreting patterns of gene expression with self-organizing maps: Methods and application to hematopoietic differentiation
Array technologies have made it straightforward to monitor simultaneously the expression pattern of thousands of genes. The challenge now is to interpret such massive data sets. The first step is to extract the fundamental patterns of gene expression inherent in the data. This paper describes the application of self-organizing maps, a type of mathematical cluster analysis that is particularly well suited for recognizing and classifying features in complex, multidimensional data. The method has been implemented in a publicly available computer package, GENECLUSTER, that performs the analytical calculations and provides easy data visualization. To illustrate the value of such analysis, the approach is applied to hematopoietic differentiation in four well studied models (HL-60, U937, Jurkat, and NB4 cells). Expression patterns of some 6,000 human genes were assayed, and an online database was created. GENECLUSTER was used to organize the genes into biologically relevant clusters that suggest novel hypotheses about hematopoietic differentiation, for example, highlighting certain genes and pathways involved in "differentiation therapy" used in the treatment of acute promyelocytic leukemia.
DOI: 10.1038/nature08460
2009
Cited 2,716 times
Systematic RNA interference reveals that oncogenic KRAS-driven cancers require TBK1
The proto-oncogene KRAS is mutated in a wide array of human cancers, most of which are aggressive and respond poorly to standard therapies. Although the identification of specific oncogenes has led to the development of clinically effective, molecularly targeted therapies in some cases, KRAS has remained refractory to this approach. A complementary strategy for targeting KRAS is to identify gene products that, when inhibited, result in cell death only in the presence of an oncogenic allele. Here we have used systematic RNA interference to detect synthetic lethal partners of oncogenic KRAS and found that the non-canonical IkappaB kinase TBK1 was selectively essential in cells that contain mutant KRAS. Suppression of TBK1 induced apoptosis specifically in human cancer cell lines that depend on oncogenic KRAS expression. In these cells, TBK1 activated NF-kappaB anti-apoptotic signals involving c-Rel and BCL-XL (also known as BCL2L1) that were essential for survival, providing mechanistic insights into this synthetic lethal interaction. These observations indicate that TBK1 and NF-kappaB signalling are essential in KRAS mutant tumours, and establish a general approach for the rational identification of co-dependent pathways in cancer.
DOI: 10.1038/nm0102-68
2002
Cited 2,297 times
Diffuse large B-cell lymphoma outcome prediction by gene-expression profiling and supervised machine learning
DOI: 10.1038/415436a
2002
Cited 2,242 times
Prediction of central nervous system embryonal tumour outcome based on gene expression
DOI: 10.1073/pnas.211566398
2001
Cited 1,882 times
Multiclass cancer diagnosis using tumor gene expression signatures
The optimal treatment of patients with cancer depends on establishing accurate diagnoses by using a complex combination of clinical and histopathological data. In some instances, this task is difficult or impossible because of atypical clinical presentation or histopathology. To determine whether the diagnosis of multiple common adult malignancies could be achieved purely by molecular classification, we subjected 218 tumor samples, spanning 14 common tumor types, and 90 normal tissue samples to oligonucleotide microarray gene expression analysis. The expression levels of 16,063 genes and expressed sequence tags were used to evaluate the accuracy of a multiclass classifier based on a support vector machine algorithm. Overall classification accuracy was 78%, far exceeding the accuracy of random classification (9%). Poorly differentiated cancers resulted in low-confidence predictions and could not be accurately classified according to their tissue of origin, indicating that they are molecularly distinct entities with dramatically different gene expression patterns compared with their well differentiated counterparts. Taken together, these results demonstrate the feasibility of accurate, multiclass molecular cancer classification and suggest a strategy for future clinical implementation of molecular cancer diagnostics.
DOI: 10.1038/nature03025
2004
Cited 1,824 times
Genome duplication in the teleost fish Tetraodon nigroviridis reveals the early vertebrate proto-karyotype
Tetraodon nigroviridis is a freshwater puffer fish with the smallest known vertebrate genome. Here, we report a draft genome sequence with long-range linkage and substantial anchoring to the 21 Tetraodon chromosomes. Genome analysis provides a greatly improved fish gene catalogue, including identifying key genes previously thought to be absent in fish. Comparison with other vertebrates and a urochordate indicates that fish proteins have diverged markedly faster than their mammalian homologues. Comparison with the human genome suggests approximately 900 previously unannotated human genes. Analysis of the Tetraodon and human genomes shows that whole-genome duplication occurred in the teleost fish lineage, subsequent to its divergence from mammals. The analysis also makes it possible to infer the basic structure of the ancestral bony vertebrate genome, which was composed of 12 chromosomes, and to reconstruct much of the evolutionary history of ancient and recent chromosome rearrangements leading to the modern human karyotype.
DOI: 10.1038/ng0506-500
2006
Cited 1,804 times
GenePattern 2.0
DOI: 10.1073/pnas.0308531101
2004
Cited 1,738 times
Metagenes and molecular pattern discovery using matrix factorization
We describe here the use of nonnegative matrix factorization (NMF), an algorithm based on decomposition by parts that can reduce the dimension of expression data from thousands of genes to a handful of metagenes. Coupled with a model selection mechanism, adapted to work for any stochastic clustering algorithm, NMF is an efficient method for identification of distinct molecular patterns and provides a powerful method for class discovery. We demonstrate the ability of NMF to recover meaningful biological information from cancer-related microarray data. NMF appears to have advantages over other methods such as hierarchical clustering or self-organizing maps. We found it less sensitive to a priori selection of genes or initial conditions and able to detect alternative or context-dependent patterns of gene expression in complex biological systems. This ability, similar to semantic polysemy in text, provides a general method for robust molecular pattern discovery.
DOI: 10.1038/nature23007
2017
Cited 1,238 times
Dependency of a therapy-resistant state of cancer cells on a lipid peroxidase pathway
Plasticity of the cell state has been proposed to drive resistance to multiple classes of cancer therapies, thereby limiting their effectiveness. A high-mesenchymal cell state observed in human tumours and cancer cell lines has been associated with resistance to multiple treatment modalities across diverse cancer lineages, but the mechanistic underpinning for this state has remained incompletely understood. Here we molecularly characterize this therapy-resistant high-mesenchymal cell state in human cancer cell lines and organoids and show that it depends on a druggable lipid-peroxidase pathway that protects against ferroptosis, a non-apoptotic form of cell death induced by the build-up of toxic lipid peroxides. We show that this cell state is characterized by activity of enzymes that promote the synthesis of polyunsaturated lipids. These lipids are the substrates for lipid peroxidation by lipoxygenase enzymes. This lipid metabolism creates a dependency on pathways converging on the phospholipid glutathione peroxidase (GPX4), a selenocysteine-containing enzyme that dissipates lipid peroxides and thereby prevents the iron-mediated reactions of peroxides that induce ferroptotic cell death. Dependency on GPX4 was found to exist across diverse therapy-resistant states characterized by high expression of ZEB1, including epithelial-mesenchymal transition in epithelial-derived carcinomas, TGFβ-mediated therapy-resistance in melanoma, treatment-induced neuroendocrine transdifferentiation in prostate cancer, and sarcomas, which are fixed in a mesenchymal state owing to their cells of origin. We identify vulnerability to ferroptic cell death induced by inhibition of a lipid peroxidase pathway as a feature of therapy-resistant cancer cells across diverse mesenchymal cell-state contexts.
DOI: 10.1016/j.cels.2015.07.012
2016
Cited 1,237 times
Juicebox Provides a Visualization System for Hi-C Contact Maps with Unlimited Zoom
Hi-C experiments study how genomes fold in 3D, generating contact maps containing features as small as 20 bp and as large as 200 Mb. Here we introduce Juicebox, a tool for exploring Hi-C and other contact map data. Juicebox allows users to zoom in and out of Hi-C maps interactively, just as a user of Google Earth might zoom in and out of a geographic map. Maps can be compared to one another, or to 1D tracks or 2D feature sets.
DOI: 10.1093/bioinformatics/btm369
2007
Cited 1,155 times
GSEA-P: a desktop application for Gene Set Enrichment Analysis
Gene Set Enrichment Analysis (GSEA) is a computational method that assesses whether an a priori defined set of genes shows statistically significant, concordant differences between two biological states. We report the availability of a new version of the Java based software (GSEA-P 2.0) that represents a major improvement on the previous release through the addition of a leading edge analysis component, seamless integration with the Molecular Signature Database (MSigDB) and an embedded browser that allows users to search for gene sets and map them to a variety of microarray platform formats. This functionality makes it possible for users to directly import gene sets from MSigDB for analysis with GSEA. We have also improved the visualizations in GSEA-P 2.0 and added links to a new form of concise gene set annotations called Gene Set Cards. These additions, as well as other improvements suggested by over 3500 users who have downloaded the software over the past year have been incorporated into this new release of the GSEA-P Java desktop program. GSEA-P 2.0 is freely available for academic and commercial users and can be downloaded from http://www.broad.mit.edu/GSEA
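To make the idea of "concordant differences" concrete, the following is a minimal sketch of an unweighted running-sum enrichment score over a ranked gene list. It is an illustration only, not the GSEA-P implementation: the function name `enrichment_score` and the toy gene labels are hypothetical, the score steps by a constant amount rather than by a correlation-weighted amount, and no permutation test for significance is included.

```python
def enrichment_score(ranked_genes, gene_set):
    """Unweighted running-sum statistic (illustrative sketch only).

    Walk down the ranked list, stepping up at genes in the set ("hits")
    and down at genes outside it ("misses"), and return the running-sum
    value with the largest absolute deviation from zero.
    """
    hits = [gene in gene_set for gene in ranked_genes]
    n_hits = sum(hits)
    n_misses = len(ranked_genes) - n_hits
    if n_hits == 0 or n_misses == 0:
        raise ValueError("need at least one hit and one miss in the ranked list")

    step_up = 1.0 / n_hits      # hit increments sum to +1 over the list
    step_down = 1.0 / n_misses  # miss decrements sum to -1 over the list

    running = 0.0
    extreme = 0.0
    for is_hit in hits:
        running += step_up if is_hit else -step_down
        if abs(running) > abs(extreme):
            extreme = running
    return extreme


# Hypothetical toy ranking: genes ordered from most up- to most down-regulated.
ranking = ["a", "b", "c", "d", "e", "f"]
print(enrichment_score(ranking, {"a", "b"}))  # set concentrated at the top
print(enrichment_score(ranking, {"e", "f"}))  # set concentrated at the bottom
```

A set clustered at the top of the ranking yields a positive score and a set clustered at the bottom yields a negative one; a set scattered evenly through the list stays near zero.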
DOI: 10.1158/0008-5472.can-17-0337
2017
Cited 826 times
Variant Review with the Integrative Genomics Viewer
Manual review of aligned reads for confirmation and interpretation of variant calls is an important step in many variant calling pipelines for next-generation sequencing (NGS) data. Visual inspection can greatly increase the confidence in calls, reduce the risk of false positives, and help characterize complex events. The Integrative Genomics Viewer (IGV) was one of the first tools to provide NGS data visualization, and it currently provides a rich set of tools for inspection, validation, and interpretation of NGS datasets, as well as other types of genomic data. Here, we present a short overview of IGV's variant review features for both single-nucleotide variants and structural variants, with examples from both cancer and germline datasets. IGV is freely available at https://www.igv.org. Cancer Res; 77(21); e31–34. ©2017 AACR.
DOI: 10.1158/0008-5472.can-08-0943
2008
Cited 795 times
Carcinoma-Associated Fibroblast–Like Differentiation of Human Mesenchymal Stem Cells
Carcinoma-associated fibroblasts (CAF) have recently been implicated in important aspects of epithelial solid tumor biology, such as neoplastic progression, tumor growth, angiogenesis, and metastasis. However, neither the source of CAFs nor the differences between CAFs and fibroblasts from nonneoplastic tissue have been well defined. In this study, we show that human bone marrow-derived mesenchymal stem cells (hMSCs) exposed to tumor-conditioned medium (TCM) over a prolonged period of time assume a CAF-like myofibroblastic phenotype. More importantly, these cells exhibit functional properties of CAFs, including sustained expression of stromal-derived factor-1 (SDF-1) and the ability to promote tumor cell growth both in vitro and in an in vivo coimplantation model, and expression of myofibroblast markers, including alpha-smooth muscle actin and fibroblast surface protein. hMSCs induced to differentiate to a myofibroblast-like phenotype using 5-azacytidine do not promote tumor cell growth as efficiently as hMSCs cultured in TCM nor do they show increased SDF-1 expression. Furthermore, gene expression profiling revealed similarities between TCM-exposed hMSCs and CAFs. Taken together, these data suggest that hMSCs are a source of CAFs and can be used in the modeling of tumor-stroma interactions. To our knowledge, this is the first report showing that hMSCs become activated and resemble carcinoma-associated myofibroblasts on prolonged exposure to conditioned medium from MDAMB231 human breast cancer cells.
DOI: 10.1038/nature11329
2012
Cited 676 times
Medulloblastoma exome sequencing uncovers subtype-specific somatic mutations
Medulloblastoma is the most common brain tumour in children; using exome sequencing of tumour samples the authors show that these cancers have low mutation rates and identify 12 significantly mutated genes, among them the gene encoding RNA helicase DDX3X. Medulloblastoma is the most common malignant brain tumour in children. Four papers published in the 2 August 2012 issue of Nature use whole-genome and other sequencing techniques to produce a detailed picture of the genetics and genomics of this condition. Notable findings include the identification of recurrent mutations in genes not previously implicated in medulloblastoma, with significant genetic differences associated with the four biologically distinct subgroups and clinical outcomes in each. Potential avenues for therapy are suggested by the identification of targetable somatic copy-number alterations, including recurrent events targeting TGFβ signalling in Group 3, and NF-κB signalling in Group 4 medulloblastomas. Medulloblastomas are the most common malignant brain tumours in children. Identifying and understanding the genetic events that drive these tumours is critical for the development of more effective diagnostic, prognostic and therapeutic strategies. Recently, our group and others described distinct molecular subtypes of medulloblastoma on the basis of transcriptional and copy number profiles. Here we use whole-exome hybrid capture and deep sequencing to identify somatic mutations across the coding regions of 92 primary medulloblastoma/normal pairs. Overall, medulloblastomas have low mutation rates consistent with other paediatric tumours, with a median of 0.35 non-silent mutations per megabase. We identified twelve genes mutated at statistically significant frequencies, including previously known mutated genes in medulloblastoma such as CTNNB1, PTCH1, MLL2, SMARCA4 and TP53.
Recurrent somatic mutations were newly identified in an RNA helicase gene, DDX3X, often concurrent with CTNNB1 mutations, and in the nuclear co-repressor (N-CoR) complex genes GPS2, BCOR and LDB1. We show that mutant DDX3X potentiates transactivation of a TCF promoter and enhances cell viability in combination with mutant, but not wild-type, β-catenin. Together, our study reveals the alteration of WNT, hedgehog, histone methyltransferase and now N-CoR pathways across medulloblastomas and within specific subtypes of this disease, and nominates the RNA helicase DDX3X as a component of pathogenic β-catenin signalling in medulloblastoma.
DOI: 10.1016/j.cell.2012.11.026
2012
Cited 649 times
β-Catenin-Driven Cancers Require a YAP1 Transcriptional Complex for Survival and Tumorigenesis
Wnt/β-catenin signaling plays a key role in the pathogenesis of colon and other cancers; emerging evidence indicates that oncogenic β-catenin regulates several biological processes essential for cancer initiation and progression. To decipher the role of β-catenin in transformation, we classified β-catenin activity in 85 cancer cell lines in which we performed genome-scale loss-of-function screens and found that β-catenin active cancers are dependent on a signaling pathway involving the transcriptional regulator YAP1. Specifically, we found that YAP1 and the transcription factor TBX5 form a complex with β-catenin. Phosphorylation of YAP1 by the tyrosine kinase YES1 leads to localization of this complex to the promoters of antiapoptotic genes, including BCL2L1 and BIRC5. A small-molecule inhibitor of YES1 impeded the proliferation of β-catenin-dependent cancers in both cell lines and animal models. These observations define a β-catenin-YAP1-TBX5 complex essential to the transformation and survival of β-catenin-driven cancers.
DOI: 10.1200/jco.2010.28.5148
2011
Cited 632 times
Integrative Genomic Analysis of Medulloblastoma Identifies a Molecular Subgroup That Drives Poor Clinical Outcome
Purpose: Medulloblastomas are heterogeneous tumors that collectively represent the most common malignant brain tumor in children. To understand the molecular characteristics underlying their heterogeneity and to identify whether such characteristics represent risk factors for patients with this disease, we performed an integrated genomic analysis of a large series of primary tumors. Patients and Methods: We profiled the mRNA transcriptome of 194 medulloblastomas and performed high-density single nucleotide polymorphism array and miRNA analysis on 115 and 98 of these, respectively. Non-negative matrix factorization–based clustering of mRNA expression data was used to identify molecular subgroups of medulloblastoma; DNA copy number, miRNA profiles, and clinical outcomes were analyzed for each. We additionally validated our findings in three previously published independent medulloblastoma data sets. Results: Identified are six molecular subgroups of medulloblastoma, each with a unique combination of numerical and structural chromosomal aberrations that globally influence mRNA and miRNA expression. We reveal the relative contribution of each subgroup to clinical outcome as a whole and show that a previously unidentified molecular subgroup, characterized genetically by c-MYC copy number gains and transcriptionally by enrichment of photoreceptor pathways and increased miR-183∼96∼182 expression, is associated with significantly lower rates of event-free and overall survivals. Conclusion: Our results detail the complex genomic heterogeneity of medulloblastomas and identify a previously unrecognized molecular subgroup with poor clinical outcome for which more effective therapeutic strategies should be developed.
DOI: 10.1073/pnas.191368598
2001
Cited 626 times
Chemosensitivity prediction by transcriptional profiling
In an effort to develop a genomics-based approach to the prediction of drug response, we have developed an algorithm for classification of cell line chemosensitivity based on gene expression profiles alone. Using oligonucleotide microarrays, the expression levels of 6,817 genes were measured in a panel of 60 human cancer cell lines (the NCI-60) for which the chemosensitivity profiles of thousands of chemical compounds have been determined. We sought to determine whether the gene expression signatures of untreated cells were sufficient for the prediction of chemosensitivity. Gene expression-based classifiers of sensitivity or resistance for 232 compounds were generated and then evaluated on independent sets of data. The classifiers were designed to be independent of the cells' tissue of origin. The accuracy of chemosensitivity prediction was considerably better than would be expected by chance. Eighty-eight of 232 expression-based classifiers performed accurately (with P < 0.05) on an independent test set, whereas only 12 of the 232 would be expected to do so by chance. These results suggest that at least for a subset of compounds genomic approaches to chemosensitivity prediction are feasible.
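The chance baseline quoted above (about 12 of 232 classifiers at P < 0.05) is simple arithmetic, which the short sketch below reproduces. It also computes a binomial tail probability for seeing 88 or more successes, under the added assumption that the 232 tests are independent; that assumption and the function names are illustrative and do not come from the paper.

```python
from math import comb


def expected_by_chance(n_tests, alpha):
    """Expected number of tests passing at level alpha if none is truly predictive."""
    return n_tests * alpha


def binomial_tail(n_tests, alpha, k_min):
    """P(at least k_min of n_tests pass by chance), assuming independent tests."""
    return sum(
        comb(n_tests, k) * alpha**k * (1 - alpha) ** (n_tests - k)
        for k in range(k_min, n_tests + 1)
    )


print(expected_by_chance(232, 0.05))  # roughly 11.6, which rounds to the quoted 12
print(binomial_tail(232, 0.05, 88))   # vanishingly small under independence
```

Classifiers built on overlapping cell lines and compounds are unlikely to be fully independent, so the tail probability is best read as a rough indication rather than an exact significance level.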
DOI: 10.1101/gr.208902
2002
Cited 578 times
ARACHNE: A Whole-Genome Shotgun Assembler
We describe a new computer system, called ARACHNE, for assembling genome sequence using paired-end whole-genome shotgun reads. ARACHNE has several key features, including an efficient and sensitive procedure for finding read overlaps, a procedure for scoring overlaps that achieves high accuracy by correcting errors before assembly, read merger based on forward-reverse links, and detection of repeat contigs by forward-reverse link inconsistency. To test ARACHNE, we created simulated reads providing approximately 10-fold coverage of the genomes of H. influenzae, S. cerevisiae, and D. melanogaster, as well as human chromosomes 21 and 22. The assemblies of these simulated reads yielded nearly complete coverage of the respective genomes, with a small number of contigs joined into a smaller number of supercontigs (or scaffolds). For example, analysis of the D. melanogaster genome yielded approximately 98% coverage with an N50 contig length of 324 kb and an N50 supercontig length of 5143 kb. The assembly accuracy was high, although not perfect: small errors occurred at a frequency of roughly 1 per 1 Mb (typically, deletion of approximately 1 kb in size), with a very small number of other misassemblies. The assembly was rapid: the Drosophila assembly required only 21 hours on a single 667 MHz processor and used 8.4 Gb of memory.
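The N50 figures quoted above summarize how contiguous an assembly is. As a minimal sketch, assuming the common definition (the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the total assembled length), N50 can be computed as follows. The function name `n50` and the toy lengths are hypothetical, and this is not code from ARACHNE.

```python
def n50(lengths):
    """Return the N50 of a collection of contig (or scaffold) lengths.

    Contigs are visited from longest to shortest; the N50 is the length
    of the contig at which the cumulative sum first reaches half of the
    total assembled length.
    """
    if not lengths:
        raise ValueError("need at least one contig length")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:  # integer-safe test for running >= total / 2
            return length


# Hypothetical toy assembly with contig lengths in kb.
print(n50([80, 70, 50, 40, 30, 20]))
```

Because N50 depends only on the length distribution, it says nothing about correctness, which is why the abstract reports misassembly rates separately.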
DOI: 10.1016/j.cell.2016.04.028
2016
Cited 526 times
RNA Duplex Map in Living Cells Reveals Higher-Order Transcriptome Structure
RNA has the intrinsic property to base pair, forming complex structures fundamental to its diverse functions. Here, we develop PARIS, a method based on reversible psoralen crosslinking for global mapping of RNA duplexes with near base-pair resolution in living cells. PARIS analysis in three human and mouse cell types reveals frequent long-range structures, higher-order architectures, and RNA-RNA interactions in trans across the transcriptome. PARIS determines base-pairing interactions on an individual-molecule level, revealing pervasive alternative conformations. We used PARIS-determined helices to guide phylogenetic analysis of RNA structures and discovered conserved long-range and alternative structures. XIST, a long noncoding RNA (lncRNA) essential for X chromosome inactivation, folds into evolutionarily conserved RNA structural domains that span many kilobases. XIST A-repeat forms complex inter-repeat duplexes that nucleate higher-order assembly of the key epigenetic silencing protein SPEN. PARIS is a generally applicable and versatile method that provides novel insights into the RNA structurome and interactome.
DOI: 10.1038/ng.2007.10
2007
Cited 505 times
Efficient mapping of mendelian traits in dogs through genome-wide association
DOI: 10.1158/2159-8290.cd-13-0424
2014
Cited 445 times
A Melanoma Cell State Distinction Influences Sensitivity to MAPK Pathway Inhibitors
Most melanomas harbor oncogenic BRAF(V600) mutations, which constitutively activate the MAPK pathway. Although MAPK pathway inhibitors show clinical benefit in BRAF(V600)-mutant melanoma, it remains incompletely understood why 10% to 20% of patients fail to respond. Here, we show that RAF inhibitor-sensitive and inhibitor-resistant BRAF(V600)-mutant melanomas display distinct transcriptional profiles. Whereas most drug-sensitive cell lines and patient biopsies showed high expression and activity of the melanocytic lineage transcription factor MITF, intrinsically resistant cell lines and biopsies displayed low MITF expression but higher levels of NF-κB signaling and the receptor tyrosine kinase AXL. In vitro, these MITF-low/NF-κB-high melanomas were resistant to inhibition of RAF and MEK, singly or in combination, and ERK. Moreover, in cell lines, NF-κB activation antagonized MITF expression and induced both resistance marker genes and drug resistance. Thus, distinct cell states characterized by MITF or NF-κB activity may influence intrinsic resistance to MAPK pathway inhibitors in BRAF(V600)-mutant melanoma. Although most BRAF(V600)-mutant melanomas are sensitive to RAF and/or MEK inhibitors, a subset fails to respond to such treatment. This study characterizes a transcriptional cell state distinction linked to MITF and NF-κB that may modulate intrinsic sensitivity of melanomas to MAPK pathway inhibitors.
DOI: 10.1371/journal.pone.0001195
2007
Cited 430 times
Subclass Mapping: Identifying Common Subtypes in Independent Disease Data Sets
Whole genome expression profiles are widely used to discover molecular subtypes of diseases. A remaining challenge is to identify the correspondence or commonality of subtypes found in multiple, independent data sets generated on various platforms. While model-based supervised learning is often used to make these connections, the models can be biased to the training data set and thus miss inherent, relevant substructure in the test data. Here we describe an unsupervised subclass mapping method (SubMap), which reveals common subtypes between independent data sets. The subtypes within a data set can be determined by unsupervised clustering or given by predetermined phenotypes before applying SubMap. We define a measure of correspondence for subtypes and evaluate its significance building on our previous work on gene set enrichment analysis. The strength of the SubMap method is that it does not impose the structure of one data set upon another, but rather uses a bi-directional approach to highlight the common substructures in both. We show how this method can reveal the correspondence between several cancer-related data sets. Notably, it identifies common subtypes of breast cancer associated with estrogen receptor status, and a subgroup of lymphoma patients who share similar survival patterns, thus improving the accuracy of a clinical outcome predictor.
DOI: 10.1073/pnas.1109363108
2011
Cited 395 times
Systematic investigation of genetic vulnerabilities across cancer cell lines reveals lineage-specific dependencies in ovarian cancer
A comprehensive understanding of the molecular vulnerabilities of every type of cancer will provide a powerful roadmap to guide therapeutic approaches. Efforts such as The Cancer Genome Atlas Project will identify genes with aberrant copy number, sequence, or expression in various cancer types, providing a survey of the genes that may have a causal role in cancer. A complementary approach is to perform systematic loss-of-function studies to identify essential genes in particular cancer cell types. We have begun a systematic effort, termed Project Achilles, aimed at identifying genetic vulnerabilities across large numbers of cancer cell lines. Here, we report the assessment of the essentiality of 11,194 genes in 102 human cancer cell lines. We show that the integration of these functional data with information derived from surveying cancer genomes pinpoints known and previously undescribed lineage-specific dependencies across a wide spectrum of cancers. In particular, we found 54 genes that are specifically essential for the proliferation and viability of ovarian cancer cells and also amplified in primary tumors or differentially overexpressed in ovarian cancer cell lines. One such gene, PAX8, is focally amplified in 16% of high-grade serous ovarian cancers and expressed at higher levels in ovarian tumors. Suppression of PAX8 selectively induces apoptotic cell death of ovarian cancer cells. These results identify PAX8 as an ovarian lineage-specific dependency. More generally, these observations demonstrate that the integration of genome-scale functional and structural studies provides an efficient path to identify dependencies of specific cancer types on particular genes and pathways.
DOI: 10.1038/ng1490
2004
Cited 395 times
An oncogenic KRAS2 expression signature identified by cross-species gene-expression analysis
DOI: 10.1172/jci65833
2012
Cited 391 times
Prognostically relevant gene signatures of high-grade serous ovarian carcinoma
Because of the high risk of recurrence in high-grade serous ovarian carcinoma (HGS-OvCa), the development of outcome predictors could be valuable for patient stratification. Using the catalog of The Cancer Genome Atlas (TCGA), we developed subtype and survival gene expression signatures, which, when combined, provide a prognostic model of HGS-OvCa classification, named "Classification of Ovarian Cancer" (CLOVAR). We validated CLOVAR on an independent dataset consisting of 879 HGS-OvCa expression profiles. The worst outcome group, accounting for 23% of all cases, was associated with a median survival of 23 months and a platinum resistance rate of 63%, versus a median survival of 46 months and platinum resistance rate of 23% in other cases. Associating the outcome prediction model with BRCA1/BRCA2 mutation status, residual disease after surgery, and disease stage further optimized outcome classification. Ovarian cancer is a disease in urgent need of more effective therapies. The spectrum of outcomes observed here and their association with CLOVAR signatures suggests variations in underlying tumor biology. Prospective validation of the CLOVAR model in the context of additional prognostic variables may provide a rationale for optimal combination of patient and treatment regimens.
DOI: 10.1101/gr.10.7.950
2000
Cited 354 times
Human and Mouse Gene Structure: Comparative Analysis and Application to Exon Prediction
We describe a novel analytical approach to gene recognition based on cross-species comparison. We first undertook a comparison of orthologous genomic loci from human and mouse, studying the extent of similarity in the number, size and sequence of exons and introns. We then developed an approach for recognizing genes within such orthologous regions by first aligning the regions using an iterative global alignment system and then identifying genes based on conservation of exonic features at aligned positions in both species. The alignment and gene recognition are performed by new programs called GLASS and ROSETTA, respectively. ROSETTA performed well at exact identification of coding exons in 117 orthologous pairs tested.
DOI: 10.1038/sdata.2014.35
2014
Cited 351 times
Parallel genome-scale loss of function screens in 216 cancer cell lines for the identification of context-specific genetic dependencies
Using a genome-scale, lentivirally delivered shRNA library, we performed massively parallel pooled shRNA screens in 216 cancer cell lines to identify genes that are required for cell proliferation and/or viability. Cell line dependencies on 11,000 genes were interrogated by 5 shRNAs per gene. The proliferation effect of each shRNA in each cell line was assessed by transducing a population of 11M cells with one shRNA-virus per cell and determining the relative enrichment or depletion of each of the 54,000 shRNAs after 16 population doublings using Next Generation Sequencing. All the cell lines were screened using standardized conditions to best assess differential genetic dependencies across cell lines. When combined with genomic characterization of these cell lines, this dataset facilitates the linkage of genetic dependencies with specific cellular contexts (e.g., gene mutations or cell lineage). To enable such comparisons, we developed and provided a bioinformatics tool to identify linear and nonlinear correlations between these features.
DOI: 10.1073/pnas.0903028106
2009
Cited 326 times
Automated high-dimensional flow cytometric data analysis
Flow cytometric analysis allows rapid single cell interrogation of surface and intracellular determinants by measuring fluorescence intensity of fluorophore-conjugated reagents. The availability of new platforms, allowing detection of increasing numbers of cell surface markers, has challenged the traditional technique of identifying cell populations by manual gating and resulted in a growing need for the development of automated, high-dimensional analytical methods. We present a direct multivariate finite mixture modeling approach, using skew and heavy-tailed distributions, to address the complexities of flow cytometric analysis and to deal with high-dimensional cytometric data without the need for projection or transformation. We demonstrate its ability to detect rare populations, to model robustly in the presence of outliers and skew, and to perform the critical task of matching cell populations across samples that enables downstream analysis. This advance will facilitate the application of flow cytometry to new, complex biological and clinical problems.
DOI: 10.1016/j.cels.2018.01.001
2018
Cited 277 times
Juicebox.js Provides a Cloud-Based Visualization System for Hi-C Data
Contact mapping experiments such as Hi-C explore how genomes fold in 3D. Here, we introduce Juicebox.js, a cloud-based web application for exploring the resulting datasets. Like the original Juicebox application, Juicebox.js allows users to zoom in and out of such datasets using an interface similar to Google Earth. Juicebox.js also has many features designed to facilitate data reproducibility and sharing. Furthermore, Juicebox.js encodes the exact state of the browser in a shareable URL. Creating a public browser for a new Hi-C dataset does not require coding and can be accomplished in under a minute. The web app also makes it possible to create interactive figures online that can complement or replace ordinary journal figures. When combined with Juicer, this makes the entire process of data analysis transparent, insofar as every step from raw reads to published figure is publicly available as open source code.
DOI: 10.1038/nbt.1524
2009
Cited 273 times
Prediction of high-responding peptides for targeted protein assays by mass spectrometry
Development of sensitive mass spectrometry–based assays for complex biofluids depends on the ability to identify signature peptides that produce the strongest signals. Fusaro et al. use protein physicochemical properties to predict high-responding peptides in data obtained from complex samples such as plasma. Protein biomarker discovery produces lengthy lists of candidates that must subsequently be verified in blood or other accessible biofluids. Use of targeted mass spectrometry (MS) to verify disease- or therapy-related changes in protein levels requires the selection of peptides that are quantifiable surrogates for proteins of interest. Peptides that produce the highest ion-current response (high-responding peptides) are likely to provide the best detection sensitivity. Identification of the most effective signature peptides, particularly in the absence of experimental data, remains a major resource constraint in developing targeted MS–based assays. Here we describe a computational method that uses protein physicochemical properties to select high-responding peptides and demonstrate its utility in identifying signature peptides in plasma, a complex proteome with a wide range of protein concentrations. Our method, which employs a Random Forest classifier, facilitates the development of targeted MS–based assays for biomarker verification or any application where protein levels need to be measured.
DOI: 10.1126/science.1179653
2010
Cited 246 times
Accessible Reproducible Research
As use of computation in research grows, new tools are needed to expand recording, reporting, and reproduction of methods and data.
DOI: 10.1016/j.immuni.2015.12.006
2016
Cited 231 times
Compendium of Immune Signatures Identifies Conserved and Species-Specific Biology in Response to Inflammation
Gene-expression profiling has become a mainstay in immunology, but subtle changes in gene networks related to biological processes are hard to discern when comparing various datasets. For instance, conservation of the transcriptional response to sepsis in mouse models and human disease remains controversial. To improve transcriptional analysis in immunology, we created ImmuneSigDB: a manually annotated compendium of ∼5,000 gene-sets from diverse cell states, experimental manipulations, and genetic perturbations in immunology. Analysis using ImmuneSigDB identified signatures induced in activated myeloid cells and differentiating lymphocytes that were highly conserved between humans and mice. Sepsis triggered conserved patterns of gene expression in humans and mouse models. However, we also identified species-specific biological processes in the sepsis transcriptional response: although both species upregulated phagocytosis-related genes, a mitosis signature was specific to humans. ImmuneSigDB enables granular analysis of transcriptomic data to improve biological understanding of immune processes of the human and mouse immune systems.
DOI: 10.1038/nm.2251
2010
Cited 227 times
Loss of the tumor suppressor Snf5 leads to aberrant activation of the Hedgehog-Gli pathway
Aberrant activation of the Hedgehog (Hh) pathway can drive tumorigenesis. To investigate the mechanism by which glioma-associated oncogene family zinc finger-1 (GLI1), a crucial effector of Hh signaling, regulates Hh pathway activation, we searched for GLI1-interacting proteins. We report that the chromatin remodeling protein SNF5 (encoded by SMARCB1, hereafter called SNF5), which is inactivated in human malignant rhabdoid tumors (MRTs), interacts with GLI1. We show that Snf5 localizes to Gli1-regulated promoters and that loss of Snf5 leads to activation of the Hh-Gli pathway. Conversely, re-expression of SNF5 in MRT cells represses GLI1. Consistent with this, we show the presence of a Hh-Gli-activated gene expression profile in primary MRTs and show that GLI1 drives the growth of SNF5-deficient MRT cells in vitro and in vivo. Therefore, our studies reveal that SNF5 is a key mediator of Hh signaling and that aberrant activation of GLI1 is a previously undescribed targetable mechanism contributing to the growth of MRT cells.
DOI: 10.1016/j.cell.2012.07.023
2012
Cited 216 times
Cancer Vulnerabilities Unveiled by Genomic Loss
Due to genome instability, most cancers exhibit loss of regions containing tumor suppressor genes and collateral loss of other genes. To identify cancer-specific vulnerabilities that are the result of copy number losses, we performed integrated analyses of genome-wide copy number and RNAi profiles and identified 56 genes for which gene suppression specifically inhibited the proliferation of cells harboring partial copy number loss of that gene. These CYCLOPS (copy number alterations yielding cancer liabilities owing to partial loss) genes are enriched for spliceosome, proteasome, and ribosome components. One CYCLOPS gene, PSMC2, encodes an essential member of the 19S proteasome. Normal cells express excess PSMC2, which resides in a complex with PSMC1, PSMD2, and PSMD5 and acts as a reservoir protecting cells from PSMC2 suppression. Cells harboring partial PSMC2 copy number loss lack this complex and die after PSMC2 suppression. These observations define a distinct class of cancer-specific liabilities resulting from genome instability.
DOI: 10.1038/nature12564
2013
Cited 206 times
Criteria for the use of omics-based predictors in clinical trials
The US National Cancer Institute (NCI), in collaboration with scientists representing multiple areas of expertise relevant to 'omics'-based test development, has developed a checklist of criteria that can be used to determine the readiness of omics-based tests for guiding patient care in clinical trials. The checklist criteria cover issues relating to specimens, assays, mathematical modelling, clinical trial design, and ethical, legal and regulatory aspects. Funding bodies and journals are encouraged to consider the checklist, which they may find useful for assessing study quality and evidence strength. The checklist will be used to evaluate proposals for NCI-sponsored clinical trials in which omics tests will be used to guide therapy.
DOI: 10.1074/mcp.tir118.000943
2019
Cited 206 times
A Curated Resource for Phosphosite-specific Signature Analysis
Signaling pathways are orchestrated by post-translational modifications (PTMs) such as phosphorylation. However, pathway analysis of PTM data sets generated by mass spectrometry (MS)-based proteomics is typically performed at a gene-centric level because of the lack of appropriately curated PTM signature databases and bioinformatic tools that leverage PTM site-specific information. Here we present the first version of PTMsigDB, a database of modification site-specific signatures of perturbations, kinase activities and signaling pathways curated from more than 2,500 publications. We adapted the widely used single sample Gene Set Enrichment Analysis approach to utilize PTMsigDB, enabling PTM Signature Enrichment Analysis (PTM-SEA) of quantitative MS data. We used a well-characterized data set of epidermal growth factor (EGF)-perturbed cancer cells to evaluate our approach and demonstrated better representation of signaling events compared with gene-centric methods. We then applied PTM-SEA to analyze the phosphoproteomes of cancer cells treated with cell-cycle inhibitors and detected mechanism-of-action specific signatures of cell cycle kinases. We also applied our methods to analyze the phosphoproteomes of PI3K-inhibited human breast cancer cells and detected signatures of compounds inhibiting PI3K as well as targets downstream of PI3K (AKT, MAPK/ERK) covering a substantial fraction of the PI3K pathway. PTMsigDB and PTM-SEA can be freely accessed at https://github.com/broadinstitute/ssGSEA2.0.
DOI: 10.1158/2159-8290.cd-13-0646
2014
Cited 170 times
Inhibition of KRAS-Driven Tumorigenicity by Interruption of an Autocrine Cytokine Circuit
Although the roles of mitogen-activated protein kinase (MAPK) and phosphoinositide 3-kinase (PI3K) signaling in KRAS-driven tumorigenesis are well established, KRAS activates additional pathways required for tumor maintenance, the inhibition of which is likely to be necessary for effective KRAS-directed therapy. Here, we show that the IκB kinase (IKK)–related kinases Tank-binding kinase-1 (TBK1) and IKKϵ promote KRAS-driven tumorigenesis by regulating autocrine CCL5 and interleukin (IL)-6 and identify CYT387 as a potent JAK/TBK1/IKKϵ inhibitor. CYT387 treatment ablates RAS-associated cytokine signaling and impairs Kras-driven murine lung cancer growth. Combined CYT387 treatment and MAPK pathway inhibition induces regression of aggressive murine lung adenocarcinomas driven by Kras mutation and p53 loss. These observations reveal that TBK1/IKKϵ promote tumor survival by activating CCL5 and IL-6 and identify concurrent inhibition of TBK1/IKKϵ, Janus-activated kinase (JAK), and MEK signaling as an effective approach to inhibit the actions of oncogenic KRAS. Significance: In addition to activating MAPK and PI3K, oncogenic KRAS engages cytokine signaling to promote tumorigenesis. CYT387, originally described as a selective JAK inhibitor, is also a potent TBK/IKKϵ inhibitor that uniquely disrupts a cytokine circuit involving CCL5, IL-6, and STAT3. The efficacy of CYT387-based treatment in murine Kras-driven lung cancer models uncovers a novel therapeutic approach for these refractory tumors with immediate translational implications. Cancer Discov; 4(4); 452–65. ©2014 AACR.
DOI: 10.12688/f1000research.4492.2
2014
Cited 161 times
Cytoscape: the network visualization tool for GenomeSpace workflows
Modern genomic analysis often requires workflows incorporating multiple best-of-breed tools. GenomeSpace is a web-based visual workbench that combines a selection of these tools with mechanisms that create data flows between them. One such tool is Cytoscape 3, a popular application that enables analysis and visualization of graph-oriented genomic networks. As Cytoscape runs on the desktop, and not in a web browser, integrating it into GenomeSpace required special care in creating a seamless user experience and enabling appropriate data flows. In this paper, we present the design and operation of the Cytoscape GenomeSpace app, which accomplishes this integration, thereby providing critical analysis and visualization functionality for GenomeSpace users. It has been downloaded over 850 times since the release of its first version in September, 2013.
DOI: 10.1016/j.ccell.2015.02.005
2015
Cited 154 times
A Functional Landscape of Resistance to ALK Inhibition in Lung Cancer
We conducted a large-scale functional genetic study to characterize mechanisms of resistance to ALK inhibition in ALK-dependent lung cancer cells. We identify members of known resistance pathways and additional putative resistance drivers. Among the latter were members of the P2Y purinergic receptor family of G-protein-coupled receptors (P2Y1, P2Y2, and P2Y6). P2Y receptors mediated resistance in part through a protein-kinase-C (PKC)-dependent mechanism. Moreover, PKC activation alone was sufficient to confer resistance to ALK inhibitors, whereas combined ALK and PKC inhibition restored sensitivity. We observed enrichment of gene signatures associated with several resistance drivers (including P2Y receptors) in crizotinib-resistant ALK-rearranged lung tumors compared to treatment-naive controls, supporting a role for these identified mechanisms in clinical ALK inhibitor resistance.
DOI: 10.1016/j.ccell.2018.08.004
2018
Cited 152 times
Proteomics, Post-translational Modifications, and Integrative Analyses Reveal Molecular Heterogeneity within Medulloblastoma Subgroups
There is a pressing need to identify therapeutic targets in tumors with low mutation rates such as the malignant pediatric brain tumor medulloblastoma. To address this challenge, we quantitatively profiled global proteomes and phospho-proteomes of 45 medulloblastoma samples. Integrated analyses revealed that tumors with similar RNA expression vary extensively at the post-transcriptional and post-translational levels. We identified distinct pathways associated with two subsets of SHH tumors, and found post-translational modifications of MYC that are associated with poor outcomes in group 3 tumors. We found kinases associated with subtypes and showed that inhibiting PRKDC sensitizes MYC-driven cells to radiation. Our study shows that proteomics enables a more comprehensive, functional readout, providing a foundation for future therapeutic strategies.
DOI: 10.1093/bioinformatics/btac830
2022
Cited 134 times
igv.js: an embeddable JavaScript implementation of the Integrative Genomics Viewer (IGV)
igv.js is an embeddable JavaScript implementation of the Integrative Genomics Viewer (IGV). It can be easily dropped into any web page with a single line of code and has no external dependencies. The viewer runs completely in the web browser, with no backend server and no data pre-processing required. The igv.js JavaScript component can be installed from NPM at https://www.npmjs.com/package/igv. The source code is available at https://github.com/igvteam/igv.js under the MIT open-source license. IGV-Web, the end-user application built around igv.js, is available at https://igv.org/app. The source code is available at https://github.com/igvteam/igv-webapp under the MIT open-source license. Supplementary information is available at Bioinformatics online.
DOI: 10.1038/s41467-022-31941-w
2022
Cited 46 times
Lymphatic-preserving treatment sequencing with immune checkpoint inhibition unleashes cDC1-dependent antitumor immunity in HNSCC
Despite the promise of immune checkpoint inhibition (ICI), therapeutic responses remain limited. This raises the possibility that standard of care treatments delivered in concert may compromise the tumor response. To address this, we employ tobacco-signature head and neck squamous cell carcinoma murine models in which we map tumor-draining lymphatics and develop models for regional lymphablation with surgery or radiation. We find that lymphablation eliminates the tumor ICI response, worsening overall survival and repolarizing the tumor- and peripheral-immune compartments. Mechanistically, within tumor-draining lymphatics, we observe an upregulation of conventional type I dendritic cells and type I interferon signaling and show that both are necessary for the ICI response and lost with lymphablation. Ultimately, we provide a mechanistic understanding of how standard oncologic therapies targeting regional lymphatics impact the tumor response to immune-oncology therapy in order to define rational, lymphatic-preserving treatment sequences that mobilize systemic antitumor immunity, achieve optimal tumor responses, control regional metastatic disease, and confer durable antitumor immunity.
DOI: 10.1038/s41592-023-02014-7
2023
Cited 20 times
Extending support for mouse data in the Molecular Signatures Database (MSigDB)
DOI: 10.1101/gr.828403
2003
Cited 294 times
Whole-Genome Sequence Assembly for Mammalian Genomes: Arachne 2
We previously described the whole-genome assembly program Arachne, presenting assemblies of simulated data for small to mid-sized genomes. Here we describe algorithmic adaptations to the program, allowing for assembly of mammalian-size genomes, and also improving the assembly of smaller genomes. Three principal changes were simultaneously made and applied to the assembly of the mouse genome, during a six-month period of development: (1) Supercontigs (scaffolds) were iteratively broken and rejoined using several criteria, yielding a 64-fold increase in length (N50), and apparent elimination of all global misjoins; (2) gaps between contigs in supercontigs were filled (partially or completely) by insertion of reads, as suggested by pairing within the supercontig, increasing the N50 contig length by 50%; (3) memory usage was reduced fourfold. The outcome of this mouse assembly and its analysis are described in (Mouse Genome Sequencing Consortium 2002).
DOI: 10.1089/106652703321825928
2003
Cited 254 times
Estimating Dataset Size Requirements for Classifying DNA Microarray Data
A statistical methodology for estimating dataset size requirements for classifying microarray data using learning curves is introduced. The goal is to use existing classification results to estimate dataset size requirements for future classification experiments and to evaluate the gain in accuracy and significance of classifiers built with additional data. The method is based on fitting inverse power-law models to construct empirical learning curves. It also includes a permutation test procedure to assess the statistical significance of classification performance for a given dataset size. This procedure is applied to several molecular classification problems representing a broad spectrum of levels of complexity.
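The inverse power-law learning curve mentioned in the abstract above can be illustrated with a small fit. This is a minimal sketch, assuming the common form error(n) = a * n^(-alpha) + b, where b is the asymptotic error; the abstract does not give the exact parameterization or fitting procedure, so the grid scan over b, the function name, and the synthetic data are all assumptions made for illustration.

```python
import math

def fit_inverse_power_law(sizes, errors, b_steps=200):
    """Fit error(n) = a * n**(-alpha) + b.

    Scans candidate values of b below min(errors) and, for each one,
    solves a least-squares line in log-log space for a and alpha.
    Requires strictly positive error rates.
    """
    best = None
    b_cap = min(errors)
    xs = [math.log(n) for n in sizes]
    mx = sum(xs) / len(xs)
    sxx = sum((x - mx) ** 2 for x in xs)
    for i in range(b_steps):
        b = b_cap * i / b_steps          # candidate asymptote, always < min(errors)
        ys = [math.log(e - b) for e in errors]
        my = sum(ys) / len(ys)
        slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
        a, alpha = math.exp(my - slope * mx), -slope
        sse = sum((a * n ** (-alpha) + b - e) ** 2 for n, e in zip(sizes, errors))
        if best is None or sse < best[0]:
            best = (sse, a, alpha, b)
    return best[1:]

# Synthetic, noise-free error rates generated from known parameters (not real data).
sizes = [10, 20, 40, 80, 160]
errors = [0.8 * n ** -0.5 + 0.1 for n in sizes]
a, alpha, b = fit_inverse_power_law(sizes, errors)

# Extrapolate the fitted curve to a larger, hypothetical dataset size.
predicted_error_at_500 = a * 500 ** (-alpha) + b
print(a, alpha, b, predicted_error_at_500)
```

Once the curve is fitted on existing results, evaluating it at a larger n gives a rough estimate of the error rate a bigger dataset might achieve.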
DOI: 10.1093/bioinformatics/17.suppl_1.s316
2001
Cited 245 times
Molecular classification of multiple tumor types
Using gene expression data to classify tumor types is a very promising tool in cancer diagnosis. Previous works show several pairs of tumor types can be successfully distinguished by their gene expression patterns (Golub et al. 1999, Ben-Dor et al. 2000, Alizadeh et al. 2000). However, the simultaneous classification across a heterogeneous set of tumor types has not been well studied yet. We obtained 190 samples from 14 tumor classes and generated a combined expression dataset containing 16063 genes for each of those samples. We performed multi-class classification by combining the outputs of binary classifiers. Three binary classifiers (k-nearest neighbors, weighted voting, and support vector machines) were applied in conjunction with three combination scenarios (one-vs-all, all-pairs, hierarchical partitioning). We achieved the best cross validation error rate of 18.75% and the best test error rate of 21.74% by using the one-vs-all support vector machine algorithm. The results demonstrate the feasibility of performing clinically useful classification from samples of multiple tumor types.
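The one-vs-all scheme described in the abstract above, in which one binary classifier is trained per class and the most confident one determines the label, can be sketched as follows. A simple centroid-distance scorer stands in for the binary classifier here; it is a placeholder chosen to keep the example dependency-free, not the support vector machine used in the paper, and the two-dimensional samples are made up.

```python
def centroid(rows):
    """Component-wise mean of a list of equal-length vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]

def sq_dist(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))

def train_one_vs_all(samples, labels):
    """Train one binary scorer per class: that class versus all the others."""
    models = {}
    for cls in sorted(set(labels)):
        pos = [x for x, y in zip(samples, labels) if y == cls]
        neg = [x for x, y in zip(samples, labels) if y != cls]
        models[cls] = (centroid(pos), centroid(neg))
    return models

def predict(models, x):
    """Return the class whose binary scorer is most confident about x."""
    def score(cls):
        pos_c, neg_c = models[cls]
        # Positive when x is closer to the class centroid than to the rest.
        return sq_dist(x, neg_c) - sq_dist(x, pos_c)
    return max(models, key=score)

# Hypothetical two-feature "expression profiles" for three tumor classes.
samples = [(0, 0), (1, 0), (10, 0), (9, 1), (0, 10), (1, 9)]
labels = ["A", "A", "B", "B", "C", "C"]
models = train_one_vs_all(samples, labels)
print(predict(models, (0.5, 0.5)), predict(models, (9, 0)), predict(models, (1, 10)))
```

With k classes, one-vs-all needs k binary classifiers, whereas the all-pairs scenario named in the abstract needs k(k-1)/2.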
DOI: 10.1038/nature06311
2007
Cited 241 times
Distinct physiological states of Plasmodium falciparum in malaria-infected patients
A major puzzle in understanding malaria is the wide range of clinical conditions seen in infected children — from mild flu-like symptoms to coma and death. A large-scale transcriptional analysis of malaria parasites isolated from human patients has uncovered a possible clue to this variation: Plasmodium falciparum exists in its human host in three different physiological states. These can be described as active growth, a response to starvation, and an environmental stress response. This finding has important implications both for treatment with current drugs and for future drug and vaccine development. This study presents the first large scale transcriptional analysis of malaria parasites isolated from human patients, and defines three distinct transcriptional patterns that can be described as active growth, response to starvation and environmental stress response. Infection with the malaria parasite Plasmodium falciparum leads to widely different clinical conditions in children, ranging from mild flu-like symptoms to coma and death. Despite the immense medical implications, the genetic and molecular basis of this diversity remains largely unknown. Studies of in vitro gene expression have found few transcriptional differences between different parasite strains. Here we present a large study of in vivo expression profiles of parasites derived directly from blood samples from infected patients. The in vivo expression profiles define three distinct transcriptional states. The biological basis of these states can be interpreted by comparison with an extensive compendium of expression data in the yeast Saccharomyces cerevisiae. The three states in vivo closely resemble, first, active growth based on glycolytic metabolism, second, a starvation response accompanied by metabolism of alternative carbon sources, and third, an environmental stress response. The glycolytic state is highly similar to the known profile of the ring stage in vitro, but the other states have not been observed in vitro. The results reveal a previously unknown physiological diversity in the in vivo biology of the malaria parasite, in particular evidence for a functional mitochondrion in the asexual-stage parasite, and indicate in vivo and in vitro studies to determine how this variation may affect disease manifestations and treatment.
DOI: 10.1158/1541-7786.mcr-07-0344
2008
Cited 219 times
Gene Expression Changes in an Animal Melanoma Model Correlate with Aggressiveness of Human Melanoma Metastases
Metastasis is the deadliest phase of cancer progression. Experimental models using immunodeficient mice have been used to gain insights into the mechanisms of metastasis. We report here the identification of a “metastasis aggressiveness gene expression signature” derived using human melanoma cells selected based on their metastatic potentials in a xenotransplant metastasis model. Comparison with expression data from human melanoma patients shows that this metastasis gene signature correlates with the aggressiveness of melanoma metastases in human patients. Many genes encoding secreted and membrane proteins are included in the signature, suggesting the importance of tumor-microenvironment interactions during metastasis. (Mol Cancer Res 2008;6(5):760–9)
DOI: 10.1016/0022-2836(92)90104-r
1992
Cited 211 times
Hybrid system for protein secondary structure prediction
We have developed a hybrid system to predict the secondary structures (α-helix, β-sheet and coil) of proteins and achieved 66.4% accuracy, with correlation coefficients of Ccoil = 0.429, Cα = 0.470 and Cβ = 0.387. This system contains three subsystems (“experts”): a neural network module, a statistical module and a memory-based reasoning module. First, the three experts independently learn the mapping between amino acid sequences and secondary structures from the known protein structures, then a Combiner learns to combine automatically the outputs of the experts to make final predictions. The hybrid system was tested with 107 protein structures through k-way cross-validation. Its performance was better than each expert and all previously reported methods with greater than 0.99 statistical significance. It was observed that for 20% of the residues, all three experts produced the same but wrong predictions. This may suggest an upper bound on the accuracy of secondary structure predictions based on local information from the currently available protein structures, and indicate places where non-local interactions may play a dominant role in conformation. For 64% of the residues, at least two experts were the same and correct, which shows that the Combiner performed better than majority vote. For 77% of the residues, at least one expert was correct, thus there may still be room for improvement in this hybrid approach. Rigorous evaluation procedures were used in testing the hybrid system, and statistical significance measures were developed in analyzing the differences among different methods. When measured in terms of the number of secondary structures (rather than the number of residues) that were predicted correctly, the prediction produced by the hybrid system was also better than those of individual experts.
DOI: 10.1073/pnas.0509014102
2005
Cited 195 times
Inactivation of the Snf5 tumor suppressor stimulates cell cycle progression and cooperates with p53 loss in oncogenic transformation
Snf5 (Ini1/Baf47/Smarcb1), a core member of the Swi/Snf chromatin remodeling complex, is a potent tumor suppressor whose mechanism of action is largely unknown. Biallelic loss of Snf5 leads to the onset of aggressive cancers in both humans and mice. We have developed an innovative and widely applicable analytical technique for cross-species validation of cancer models and show that the gene expression profiles of our Snf5 murine models closely resemble those of human Snf5-deficient rhabdoid tumors. We exploit this system to produce what we believe to be the first report documenting the effects on gene expression of inactivating a Swi/Snf subunit in normal mammalian cells and to identify the transcriptional pathways regulated by Snf5. We demonstrate that the tumor suppressor activity of Snf5 depends on its regulation of cell cycle progression; Snf5 inactivation leads to aberrant up-regulation of E2F targets and increased levels of p53 that are accompanied by apoptosis, polyploidy, and growth arrest. Further, conditional mouse models demonstrate that inactivation of p16Ink4a or Rb (retinoblastoma) does not accelerate tumor formation in Snf5 conditional mice, whereas mutation of p53 leads to a dramatic acceleration of tumor formation.
DOI: 10.1158/0008-5472.can-07-0539
2007
Cited 191 times
High Expression of Lymphocyte-Associated Genes in Node-Negative HER2+ Breast Cancers Correlates with Lower Recurrence Rates
Gene expression analysis has identified biologically relevant subclasses of breast cancer. However, most classification schemes do not robustly cluster all HER2+ breast cancers, in part due to limitations and bias of clustering techniques used. In this article, we propose an alternative approach that first separates the HER2+ tumors using a gene amplification signal for Her2/neu amplicon genes and then applies consensus ensemble clustering separately to the HER2+ and HER2- clusters to look for further substructure. We applied this procedure to a microarray data set of 286 early-stage breast cancers treated only with surgery and radiation and identified two basal and four luminal subtypes in the HER2- tumors, as well as two novel and robust HER2+ subtypes. HER2+ subtypes had median distant metastasis-free survival of 99 months [95% confidence interval (95% CI), 83-118 months] and 33 months (95% CI, 11-54 months), respectively, and recurrence rates of 11% and 58%, respectively. The low recurrence subtype had a strong relative overexpression of lymphocyte-associated genes and was also associated with a prominent lymphocytic infiltration on histologic analysis. These data suggest that early-stage HER2+ cancers associated with lymphocytic infiltration are a biologically distinct subtype with an improved natural history.
DOI: 10.1101/gr.3722605
2005
Cited 183 times
Assembly of polymorphic genomes: Algorithms and application to Ciona savignyi
Whole-genome assembly is now used routinely to obtain high-quality draft sequence for the genomes of species with low levels of polymorphism. However, genome assembly remains extremely challenging for highly polymorphic species. The difficulty arises because two divergent haplotypes are sequenced together, making it difficult to distinguish alleles at the same locus from paralogs at different loci. We present here a method for assembling highly polymorphic diploid genomes that involves assembling the two haplotypes separately and then merging them to obtain a reference sequence. Our method was developed to assemble the genome of the sea squirt Ciona savignyi, which was sequenced to a depth of 12.7× from a single wild individual. By comparing finished clones of the two haplotypes we determined that the sequenced individual had an extremely high heterozygosity rate, averaging 4.6% with significant regional variation and rearrangements at all physical scales. Applied to these data, our method produced a reference assembly covering 157 Mb, with N50 contig and scaffold sizes of 47 kb and 989 kb, respectively. Alignment of ESTs indicates that 88% of loci are present at least once and 81% exactly once in the reference assembly. Our method represented loci in a single copy more reliably and achieved greater contiguity than a conventional whole-genome assembly method.
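The N50 contig and scaffold sizes quoted in the abstract above are a standard contiguity summary. As a minimal sketch (not the authors' code), N50 can be computed as the largest length L such that sequences of length at least L contain at least half of all assembled bases. The function name `n50` and the contig lengths below are hypothetical and chosen only for illustration.

```python
def n50(lengths):
    """Return the N50 of a collection of contig (or scaffold) lengths.

    N50 is the largest length L such that sequences of length >= L
    together contain at least half of the total assembled bases.
    """
    if not lengths:
        raise ValueError("need at least one length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first length at which the running sum reaches half the total is N50.
        if running >= half_total:
            return length


# Hypothetical contig lengths in kb (illustrative values, not from the paper).
contigs_kb = [80, 70, 50, 40, 30, 20, 10]
print(n50(contigs_kb))
```

Because the measure is weighted by length, a few long contigs raise N50 far more than many short ones, which is why it is usually reported alongside total assembly size.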
DOI: 10.1073/pnas.0914203107
2010
Cited 163 times
MYC regulation of a “poor-prognosis” metastatic cancer cell state
Gene expression signatures are used in the clinic as prognostic tools to determine the risk of individual patients with localized breast tumors developing distant metastasis. We lack a clear understanding, however, of whether these correlative biomarkers link to a common biological network that regulates metastasis. We find that the c-MYC oncoprotein coordinately regulates the expression of 13 different "poor-outcome" cancer signatures. In addition, functional inactivation of MYC in human breast cancer cells specifically inhibits distant metastasis in vivo and invasive behavior in vitro of these cells. These results suggest that MYC oncogene activity (as marked by "poor-prognosis" signature expression) may be necessary for the translocation of poor-outcome human breast tumors to distant sites.
DOI: 10.1126/scitranslmed.3003778
2012
Cited 158 times
Targeted Tumor-Penetrating siRNA Nanocomplexes for Credentialing the Ovarian Cancer Oncogene ID4
Tumor-penetrating siRNA nanocomplexes credential ID4 as a therapeutic oncogene target in human ovarian cancer.
DOI: 10.1093/bioinformatics/btv034
2015
Cited 143 times
Quantitative visualization of alternative exon expression from RNA-seq data
Analysis of RNA sequencing (RNA-Seq) data revealed that the vast majority of human genes express multiple mRNA isoforms, produced by alternative pre-mRNA splicing and other mechanisms, and that most alternative isoforms vary in expression between human tissues. As RNA-Seq datasets grow in size, it remains challenging to visualize isoform expression across multiple samples. To help address this problem, we present Sashimi plots, a quantitative visualization of aligned RNA-Seq reads that enables quantitative comparison of exon usage across samples or experimental conditions. Sashimi plots can be made using the Broad Integrated Genome Viewer or with a stand-alone command line program. Software code and documentation are freely available here: http://miso.readthedocs.org/en/fastmiso/sashimi.html
DOI: 10.1073/pnas.0701068104
2007
Cited 137 times
Metagene projection for cross-platform, cross-species characterization of global transcriptional states
The high dimensionality of global transcription profiles, the expression level of 20,000 genes in a much smaller number of samples, presents challenges that affect the sensitivity and general applicability of analysis results. In principle, it would be better to describe the data in terms of a small number of metagenes, positive linear combinations of genes, which could reduce noise while still capturing the invariant biological features of the data. Here, we describe how to accomplish such a reduction in dimension by a metagene projection methodology, which can greatly reduce the number of features used to characterize microarray data. We show, in applications to the analysis of leukemia and lung cancer data sets, how this approach can help assess and interpret similarities and differences between independent data sets, enable cross-platform and cross-species analysis, improve clustering and class prediction, and provide a computational means to detect and remove sample contamination.
DOI: 10.1172/jci75661
2014
Cited 131 times
Targeting an IKBKE cytokine network impairs triple-negative breast cancer growth
Triple-negative breast cancers (TNBCs) are a heterogeneous set of cancers that are defined by the absence of hormone receptor expression and HER2 amplification. Here, we found that inducible IκB kinase-related (IKK-related) kinase IKBKE expression and JAK/STAT pathway activation compose a cytokine signaling network in the immune-activated subset of TNBC. We found that treatment of cultured IKBKE-driven breast cancer cells with CYT387, a potent inhibitor of TBK1/IKBKE and JAK signaling, impairs proliferation, while inhibition of JAK alone does not. CYT387 treatment inhibited activation of both NF-κB and STAT and disrupted expression of the protumorigenic cytokines CCL5 and IL-6 in these IKBKE-driven breast cancer cells. Moreover, in 3D culture models, the addition of CCL5 and IL-6 to the media not only promoted tumor spheroid dispersal but also stimulated proliferation and migration of endothelial cells. Interruption of cytokine signaling by CYT387 in vivo impaired the growth of an IKBKE-driven TNBC cell line and patient-derived xenografts (PDXs). A combination of CYT387 therapy with a MEK inhibitor was particularly effective, abrogating tumor growth and angiogenesis in an aggressive PDX model of TNBC. Together, these findings reveal that IKBKE-associated cytokine signaling promotes tumorigenicity of immune-driven TNBC and identify a potential therapeutic strategy using clinically available compounds.
DOI: 10.1101/gr.143586.112
2012
Cited 116 times
ATARiS: Computational quantification of gene suppression phenotypes from multisample RNAi screens
Genome-scale RNAi libraries enable the systematic interrogation of gene function. However, the interpretation of RNAi screens is complicated by the observation that RNAi reagents designed to suppress the mRNA transcripts of the same gene often produce a spectrum of phenotypic outcomes due to differential on-target gene suppression or perturbation of off-target transcripts. Here we present a computational method, Analytic Technique for Assessment of RNAi by Similarity (ATARiS), that takes advantage of patterns in RNAi data across multiple samples in order to enrich for RNAi reagents whose phenotypic effects relate to suppression of their intended targets. By summarizing only such reagent effects for each gene, ATARiS produces quantitative, gene-level phenotype values, which provide an intuitive measure of the effect of gene suppression in each sample. This method is robust for data sets that contain as few as 10 samples and can be used to analyze screens of any number of targeted genes. We used this analytic approach to interrogate RNAi data derived from screening more than 100 human cancer cell lines and identified HNF1B as a transforming oncogene required for the survival of cancer cells that harbor HNF1B amplifications. ATARiS is publicly available at http://broadinstitute.org/ataris.
DOI: 10.1158/0008-5472.can-13-1616
2013
Cited 115 times
Integrative Radiogenomic Profiling of Squamous Cell Lung Cancer
Radiotherapy is one of the mainstays of anticancer treatment, but the relationship between the radiosensitivity of cancer cells and their genomic characteristics is still not well defined. Here, we report the development of a high-throughput platform for measuring radiation survival in vitro and its validation in comparison with conventional clonogenic radiation survival analysis. We combined results from this high-throughput assay with genomic parameters in cell lines from squamous cell lung carcinoma, which is standardly treated by radiotherapy, to identify parameters that predict radiation sensitivity. We showed that activation of NFE2L2, a frequent event in lung squamous cancers, confers radiation resistance. An expression-based, in silico screen nominated inhibitors of phosphoinositide 3-kinase (PI3K) as NFE2L2 antagonists. We showed that the selective PI3K inhibitor, NVP-BKM120, both decreased NRF2 protein levels and sensitized NFE2L2 or KEAP1-mutant cells to radiation. We then combined results from this high-throughput assay with single-sample gene set enrichment analysis of gene expression data. The resulting analysis identified pathways implicated in cell survival, genotoxic stress, detoxification, and innate and adaptive immunity as key correlates of radiation sensitivity. The integrative and high-throughput methods shown here for large-scale profiling of radiation survival and genomic features of solid-tumor–derived cell lines should facilitate tumor radiogenomics and the discovery of genotype-selective radiation sensitizers and protective agents. Cancer Res; 73(20); 6289–98. ©2013 AACR.
DOI: 10.1186/1741-7015-11-220
2013
Cited 111 times
Criteria for the use of omics-based predictors in clinical trials: explanation and elaboration
High-throughput 'omics' technologies that generate molecular profiles for biospecimens have been extensively used in preclinical studies to reveal molecular subtypes and elucidate the biological mechanisms of disease, and in retrospective studies on clinical specimens to develop mathematical models to predict clinical endpoints. Nevertheless, the translation of these technologies into clinical tests that are useful for guiding management decisions for patients has been relatively slow. It can be difficult to determine when the body of evidence for an omics-based test is sufficiently comprehensive and reliable to support claims that it is ready for clinical use, or even that it is ready for definitive evaluation in a clinical trial in which it may be used to direct patient therapy. Reasons for this difficulty include the exploratory and retrospective nature of many of these studies, the complexity of these assays and their application to clinical specimens, and the many potential pitfalls inherent in the development of mathematical predictor models from the very high-dimensional data generated by these omics technologies. Here we present a checklist of criteria to consider when evaluating the body of evidence supporting the clinical use of a predictor to guide patient therapy. Included are issues pertaining to specimen and assay requirements, the soundness of the process for developing predictor models, expectations regarding clinical study design and conduct, and attention to regulatory, ethical, and legal issues. The proposed checklist should serve as a useful guide to investigators preparing proposals for studies involving the use of omics-based tests. The US National Cancer Institute plans to refer to these guidelines for review of proposals for studies involving omics tests, and it is hoped that other sponsors will adopt the checklist as well.
DOI: 10.1007/978-3-642-12683-3_41
2010
Cited 110 times
Automated High-Dimensional Flow Cytometric Data Analysis
Flow cytometry is widely used for single cell interrogation of surface and intracellular protein expression by measuring fluorescence intensity of fluorophore-conjugated reagents. We focus on the recently developed procedure of Pyne et al. (2009, Proceedings of the National Academy of Sciences USA 106, 8519-8524) for automated high-dimensional flow cytometric analysis called FLAME (FLow analysis with Automated Multivariate Estimation). It introduced novel finite mixture models of heavy-tailed and asymmetric distributions to identify and model cell populations in a flow cytometric sample. This approach robustly addresses the complexities of flow data without the need for transformation or projection to lower dimensions. It also addresses the critical task of matching cell populations across samples that enables downstream analysis. It thus facilitates application of flow cytometry to new biological and clinical problems. To facilitate pipelining with standard bioinformatic applications such as high-dimensional visualization, subject classification or outcome prediction, FLAME has been incorporated with the GenePattern package of the Broad Institute. Thereby analysis of flow data can be approached similarly to that of other genomic platforms. We also consider some new work that proposes a rigorous and robust solution to the registration problem by a multi-level approach that allows us to model and register cell populations simultaneously across a cohort of high-dimensional flow samples. This new approach is called JCM (Joint Clustering and Matching). It enables direct and rigorous comparisons across different time points or phenotypes in a complex biological study as well as for classification of new patient samples in a more clinical setting.
DOI: 10.1158/2159-8290.cd-16-0960
2017
Cited 96 times
Exome Sequencing of African-American Prostate Cancer Reveals Loss-of-Function ERF Mutations
African-American men have the highest incidence of and mortality from prostate cancer. Whether a biological basis exists for this disparity remains unclear. Exome sequencing (n = 102) and targeted validation (n = 90) of localized primary hormone-naïve prostate cancer in African-American men identified several gene mutations not previously observed in this context, including recurrent loss-of-function mutations in ERF, an ETS transcriptional repressor, in 5% of cases. Analysis of existing prostate cancer cohorts revealed ERF deletions in 3% of primary prostate cancers and mutations or deletions in ERF in 3% to 5% of lethal castration-resistant prostate cancers. Knockdown of ERF confers increased anchorage-independent growth and generates a gene expression signature associated with oncogenic ETS activation and androgen signaling. Together, these results suggest that ERF is a prostate cancer tumor-suppressor gene. More generally, our findings support the application of systematic cancer genomic characterization in settings of broader ancestral diversity to enhance discovery and, eventually, therapeutic applications. Significance: Systematic genomic sequencing of prostate cancer in African-American men revealed new insights into prostate cancer, including the identification of ERF as a prostate cancer gene; somatic copy-number alteration differences; and uncommon PIK3CA and PTEN alterations. This study highlights the importance of inclusion of underrepresented minorities in cancer sequencing studies. Cancer Discov; 7(9); 973-83. ©2017 AACR. This article is highlighted in the In This Issue feature, p. 920.
DOI: 10.1073/pnas.1401819111
2014
Cited 95 times
Molecular adaptations of striatal spiny projection neurons during levodopa-induced dyskinesia
Levodopa treatment is the major pharmacotherapy for Parkinson's disease. However, almost all patients receiving levodopa eventually develop debilitating involuntary movements (dyskinesia). Although it is known that striatal spiny projection neurons (SPNs) are involved in the genesis of this movement disorder, the molecular basis of dyskinesia is not understood. In this study, we identify distinct cell-type-specific gene-expression changes that occur in subclasses of SPNs upon induction of a parkinsonian lesion followed by chronic levodopa treatment. We identify several hundred genes, the expression of which is correlated with levodopa dose, many of which are under the control of activator protein-1 and ERK signaling. Despite homeostatic adaptations involving several signaling modulators, activator protein-1-dependent gene expression remains highly dysregulated in direct pathway SPNs upon chronic levodopa treatment. We also discuss which molecular pathways are most likely to dampen abnormal dopaminoceptive signaling in spiny projection neurons, hence providing potential targets for antidyskinetic treatments in Parkinson's disease.
DOI: 10.1038/nbt.3527
2016
Cited 80 times
Characterizing genomic alterations in cancer by complementary functional associations
Complementary genomic features associated with pathway activation, gene dependency and drug sensitivity are uncovered using REVEALER. Systematic efforts to sequence the cancer genome have identified large numbers of mutations and copy number alterations in human cancers. However, elucidating the functional consequences of these variants, and their interactions to drive or maintain oncogenic states, remains a challenge in cancer research. We developed REVEALER, a computational method that identifies combinations of mutually exclusive genomic alterations correlated with functional phenotypes, such as the activation or gene dependency of oncogenic pathways or sensitivity to a drug treatment. We used REVEALER to uncover complementary genomic alterations associated with the transcriptional activation of β-catenin and NRF2, MEK-inhibitor sensitivity, and KRAS dependency. REVEALER successfully identified both known and new associations, demonstrating the power of combining functional profiles with extensive characterization of genomic alterations in cancer genomes.
DOI: 10.1016/j.neuron.2015.11.030
2016
Cited 72 times
Role of Tet1/3 Genes and Chromatin Remodeling Genes in Cerebellar Circuit Formation
Although mechanisms underlying early steps in cerebellar development are known, evidence is lacking on genetic and epigenetic changes during the establishment of the synaptic circuitry. Using metagene analysis, we report pivotal changes in multiple reactomes of epigenetic pathway genes in cerebellar granule cells (GCs) during circuit formation. During this stage, Tet genes are upregulated and vitamin C activation of Tet enzymes increases the levels of 5-hydroxymethylcytosine (5hmC) at exon start sites of upregulated genes, notably axon guidance genes and ion channel genes. Knockdown of Tet1 and Tet3 by RNAi in ex vivo cerebellar slice cultures inhibits dendritic arborization of developing GCs, a critical step in circuit formation. These findings demonstrate a role for Tet genes and chromatin remodeling genes in the formation of cerebellar circuitry.
DOI: 10.1101/2020.05.03.075499
2020
Cited 54 times
igv.js: an embeddable JavaScript implementation of the Integrative Genomics Viewer (IGV)
igv.js is an embeddable JavaScript implementation of the Integrative Genomics Viewer (IGV). It can be easily dropped into any web page with a single line of code and has no external dependencies. The viewer runs completely in the web browser, with no backend server and no data pre-processing required.
DOI: 10.1200/jco.22.02208
2023
Cited 14 times
The Childhood Cancer Data Initiative: Using the Power of Data to Learn From and Improve Outcomes for Every Child and Young Adult With Pediatric Cancer
Data-driven basic, translational, and clinical research has resulted in improved outcomes for children, adolescents, and young adults (AYAs) with pediatric cancers. However, challenges in sharing data between institutions, particularly in research, prevent addressing substantial unmet needs in children and AYA patients diagnosed with certain pediatric cancers. Systematically collecting and sharing data from every child and AYA can enable greater understanding of pediatric cancers, improve survivorship, and accelerate development of new and more effective therapies. To accomplish this goal, the Childhood Cancer Data Initiative (CCDI) was launched in 2019 at the National Cancer Institute. CCDI is a collaborative community endeavor supported by a 10-year, $50-million (in US dollars) annual federal investment. CCDI aims to learn from every patient diagnosed with a pediatric cancer by designing and building a data ecosystem that facilitates data collection, sharing, and analysis for researchers, clinicians, and patients across the cancer community. For example, CCDI's Molecular Characterization Initiative provides comprehensive clinical molecular characterization for children and AYAs with newly diagnosed cancers. Through these efforts, the CCDI strives to provide clinical benefit to patients and improvements in diagnosis and care through data-focused research support and to build expandable, sustainable data resources and workflows to advance research well past the planned 10 years of the initiative. Importantly, if CCDI demonstrates the success of this model for pediatric cancers, similar approaches can be applied to adults, transforming both clinical research and treatment to improve outcomes for all patients with cancer.
DOI: 10.1038/s41590-023-01529-7
2023
Cited 11 times
The GPCR–Gαs–PKA signaling axis promotes T cell dysfunction and cancer immunotherapy failure
DOI: 10.1145/332306.332564
2000
Cited 166 times
Class prediction and discovery using gene expression data
Classification of patient samples is a crucial aspect of cancer diagnosis and treatment. We present a method for classifying samples by computational analysis of gene expression data. We consider the classification problem in two parts: class discovery and class prediction. Class discovery refers to the process of dividing samples into reproducible classes that have similar behavior or properties, while class prediction places new samples into already known classes. We describe a method for performing class prediction and illustrate its strength by correctly classifying bone marrow and blood samples from acute leukemia patients. We also describe how to use our predictor to validate newly discovered classes, and we demonstrate how this technique could have discovered the key distinctions among leukemias if they were not already known. This proof-of-concept experiment paves the way for a wealth of future work on the molecular classification and understanding of disease.
DOI: 10.1126/science.1138764
2007
Cited 120 times
Comment on "The Consensus Coding Sequences of Human Breast and Colorectal Cancers"
Sjöblom et al. (Research Article, 13 October 2006, p. 268) reported nearly 200 novel cancer genes said to have a 90% probability of being involved in colon or breast cancer. However, their analysis raises two statistical concerns. When these concerns are addressed, few genes with significantly elevated mutation rates remain. Although the biological methodology in Sjöblom et al. is sound, more samples are needed to achieve sufficient power.
DOI: 10.1093/bioinformatics/btl196
2006
Cited 119 times
Comparative gene marker selection suite
An important step in analyzing expression profiles from microarray data is to identify genes that can discriminate between distinct classes of samples. Many statistical approaches for assigning significance values to genes have been developed. The Comparative Marker Selection suite consists of three modules that allow users to apply and compare different methods of computing significance for each marker gene, a viewer to assess the results, and a tool to create derivative datasets and marker lists based on user-defined significance criteria. The Comparative Marker Selection application suite is freely available as a GenePattern module. The GenePattern analysis environment is freely available at http://www.broad.mit.edu/genepattern.
DOI: 10.1177/0962280212460441
2012
Cited 87 times
The limitations of simple gene set enrichment analysis assuming gene independence
Since its first publication in 2003, the Gene Set Enrichment Analysis method, based on the Kolmogorov-Smirnov statistic, has been heavily used, modified, and also questioned. Recently a simplified approach using a one-sample t-test score to assess enrichment and ignoring gene-gene correlations was proposed by Irizarry et al. 2009 as a serious contender. The argument criticizes Gene Set Enrichment Analysis’s nonparametric nature and its use of an empirical null distribution as unnecessary and hard to compute. We refute these claims by careful consideration of the assumptions of the simplified method and its results, including a comparison with those of Gene Set Enrichment Analysis on a large benchmark set of 50 datasets. Our results provide strong empirical evidence that gene–gene correlations cannot be ignored due to the significant variance inflation they produce on the enrichment scores and should be taken into account when estimating gene set enrichment significance. In addition, we discuss the challenges that the complex correlation structure and multi-modality of gene sets pose more generally for gene set enrichment methods.
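The variance inflation mentioned in the abstract above can be illustrated with a textbook identity, offered here as a sketch rather than the paper's own derivation. Assume n gene-level scores, each with unit variance and a common pairwise correlation ρ (both simplifying assumptions). The variance of their mean is then (1/n)(1 + (n − 1)ρ), that is, the independent-gene value 1/n multiplied by the factor 1 + (n − 1)ρ. The function names and the values n = 50, ρ = 0.1 below are hypothetical.

```python
import random
import statistics


def variance_inflation_factor(n_genes, mean_corr):
    """Factor by which correlation inflates the variance of a mean score.

    Assumes unit-variance scores with a common pairwise correlation.
    """
    return 1.0 + (n_genes - 1) * mean_corr


def simulate_mean_score_variance(n_genes, corr, n_trials=5000, seed=0):
    """Estimate the variance of the mean of equicorrelated normal scores."""
    rng = random.Random(seed)
    means = []
    for _ in range(n_trials):
        # A shared component induces pairwise correlation `corr` between genes.
        shared = rng.gauss(0, 1)
        scores = [
            (corr ** 0.5) * shared + ((1 - corr) ** 0.5) * rng.gauss(0, 1)
            for _ in range(n_genes)
        ]
        means.append(sum(scores) / n_genes)
    return statistics.pvariance(means)


n, rho = 50, 0.1  # hypothetical gene-set size and average correlation
theory = variance_inflation_factor(n, rho) / n
print(f"inflation factor: {variance_inflation_factor(n, rho):.2f}")
print(f"theoretical variance of mean: {theory:.4f}")
print(f"simulated variance of mean:   {simulate_mean_score_variance(n, rho):.4f}")
```

Under these assumptions, even a modest average correlation of 0.1 in a 50-gene set makes the variance of the mean score several times larger than a gene-independence null would predict, so the resulting significance estimates would be too optimistic.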
DOI: 10.1200/jco.2010.28.1675
2011
Cited 77 times
Predicting Relapse in Patients With Medulloblastoma by Integrating Evidence From Clinical and Genomic Features
Purpose: Despite significant progress in the molecular understanding of medulloblastoma, stratification of risk in patients remains a challenge. Focus has shifted from clinical parameters to molecular markers, such as expression of specific genes and selected genomic abnormalities, to improve accuracy of treatment outcome prediction. Here, we show how integration of high-level clinical and genomic features or risk factors, including disease subtype, can yield more comprehensive, accurate, and biologically interpretable prediction models for relapse versus no-relapse classification. We also introduce a novel Bayesian nomogram indicating the amount of evidence that each feature contributes on a patient-by-patient basis. Patients and Methods: A Bayesian cumulative log-odds model of outcome was developed from a training cohort of 96 children treated for medulloblastoma, starting with the evidence provided by clinical features of metastasis and histology (model A) and incrementally adding the evidence from gene-expression–derived features representing disease subtype–independent (model B) and disease subtype–dependent (model C) pathways, and finally high-level copy-number genomic abnormalities (model D). The models were validated on an independent test cohort (n = 78). Results: On an independent multi-institutional test data set, models A to D attain an area under receiver operating characteristic (au-ROC) curve of 0.73 (95% CI, 0.60 to 0.84), 0.75 (95% CI, 0.64 to 0.86), 0.80 (95% CI, 0.70 to 0.90), and 0.78 (95% CI, 0.68 to 0.88), respectively, for predicting relapse versus no relapse. Conclusion: The proposed models C and D outperform the current clinical classification schema (au-ROC, 0.68), our previously published eight-gene outcome signature (au-ROC, 0.71), and several new schemas recently proposed in the literature for medulloblastoma risk stratification.
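The au-ROC values reported in the abstract above have a simple probabilistic reading: the chance that a randomly chosen relapse case receives a higher predicted score than a randomly chosen no-relapse case, with ties counted as one half. The following minimal sketch computes that quantity directly from its definition. The function name `au_roc` and the scores and labels are hypothetical and are not data from the study.

```python
def au_roc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    `labels` uses 1 for the positive class (e.g. relapse) and 0 otherwise.
    Equals the fraction of positive/negative pairs ranked correctly,
    counting tied scores as half a correct ranking.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical predicted relapse scores and true outcomes (illustration only).
scores = [0.9, 0.8, 0.4, 0.35, 0.1]
labels = [1, 0, 1, 0, 0]
print(round(au_roc(scores, labels), 3))
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect separation. The quadratic pairwise loop is chosen for clarity; rank-based formulas give the same result more efficiently on large cohorts.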
DOI: 10.1371/journal.pone.0054873
2013
Cited 73 times
Integrated Genomic Analysis of the 8q24 Amplification in Endometrial Cancers Identifies ATAD2 as Essential to MYC-Dependent Cancers
Chromosome 8q24 is the most commonly amplified region across multiple cancer types, and the typical length of the amplification suggests that it may target genes in addition to MYC. To explore the roles of the genes most frequently included in 8q24 amplifications, we analyzed the relation between copy number alterations and gene expression in three sets of endometrial cancers (N = 252) and in glioblastoma, ovarian, and breast cancers profiled by TCGA. Among the genes neighbouring MYC, expression of the bromodomain-containing gene ATAD2 was the most associated with amplification. Bromodomain-containing genes have been implicated as mediators of MYC transcriptional function, and indeed ATAD2 expression was more closely associated with expression of genes known to be upregulated by MYC than was MYC itself. Amplifications of 8q24, expression of genes downstream from MYC, and overexpression of ATAD2 predicted poor outcome and increased from primary to metastatic lesions. Knockdown of ATAD2 and MYC in seven endometrial and 21 breast cancer cell lines demonstrated that cell lines that were dependent on MYC also depended upon ATAD2. These same cell lines were also the most sensitive to the histone deacetylase (HDAC) inhibitor Trichostatin-A, consistent with prior studies identifying bromodomain-containing proteins as targets of inhibition by HDAC inhibitors. Our data indicate that high ATAD2 expression is a marker of aggressive endometrial cancers, and suggest that specific inhibitors of ATAD2 may have therapeutic utility in these and other MYC-dependent cancers.
DOI: 10.1158/2159-8290.cd-12-0592
2013
Cited 69 times
Systematic Interrogation of 3q26 Identifies TLOC1 and SKIL as Cancer Drivers
3q26 is frequently amplified in several cancer types with a common amplified region containing 20 genes. To identify cancer driver genes in this region, we interrogated the function of each of these genes by loss- and gain-of-function genetic screens. Specifically, we found that TLOC1 (SEC62) was selectively required for the proliferation of cell lines with 3q26 amplification. Increased TLOC1 expression induced anchorage-independent growth, and a second 3q26 gene, SKIL (SNON), facilitated cell invasion in immortalized human mammary epithelial cells. Expression of both TLOC1 and SKIL induced subcutaneous tumor growth. Proteomic studies showed that TLOC1 binds to DDX3X, which is essential for TLOC1-induced transformation and affected protein translation. SKIL induced invasion through upregulation of SLUG (SNAI2) expression. Together, these studies identify TLOC1 and SKIL as driver genes at 3q26 and more broadly suggest that cooperating genes may be coamplified in other regions with somatic copy number gain. Significance: These studies identify TLOC1 and SKIL as driver genes in 3q26. These observations provide evidence that regions of somatic copy number gain may harbor cooperating genes of different but complementary functions. Cancer Discov; 3(9); 1044–57. ©2013 AACR. This article is highlighted in the In This Issue feature, p. 953.
DOI: 10.1038/s41586-018-0722-x
2018
Cited 58 times
Addendum: The Cancer Cell Line Encyclopedia enables predictive modelling of anticancer drug sensitivity
DOI: 10.1073/pnas.1608077113
2016
Cited 56 times
Leveraging premalignant biology for immune-based cancer prevention
Prevention is an essential component of cancer eradication. Next-generation sequencing of cancer genomes and epigenomes has defined large numbers of driver mutations and molecular subgroups, leading to therapeutic advances. By comparison, there is a relative paucity of such knowledge in premalignant neoplasia, which inherently limits the potential to develop precision prevention strategies. Studies on the interplay between germ-line and somatic events have elucidated genetic processes underlying premalignant progression and preventive targets. Emerging data hint at the immune system's ability to intercept premalignancy and prevent cancer. Genetically engineered mouse models have identified mechanisms by which genetic drivers and other somatic alterations recruit inflammatory cells and induce changes in normal cells to create and interact with the premalignant tumor microenvironment to promote oncogenesis and immune evasion. These studies are currently limited to only a few lesion types and patients. In this Perspective, we advocate a large-scale collaborative effort to systematically map the biology of premalignancy and the surrounding cellular response. By bringing together scientists from diverse disciplines (e.g., biochemistry, omics, and computational biology; microbiology, immunology, and medical genetics; engineering, imaging, and synthetic chemistry; and implementation science), we can drive a concerted effort focused on cancer vaccines to reprogram the immune response to prevent, detect, and reject premalignancy. Lynch syndrome, clonal hematopoiesis, and cervical intraepithelial neoplasia, which also serve as models for inherited syndromes, blood, and viral premalignancies, are ideal scenarios in which to launch this initiative.
DOI: 10.1158/1078-0432.ccr-15-3011
2016
Cited 53 times
DiSCoVERing Innovative Therapies for Rare Tumors: Combining Genetically Accurate Disease Models with In Silico Analysis to Identify Novel Therapeutic Targets
We used human stem and progenitor cells to develop a genetically accurate novel model of MYC-driven Group 3 medulloblastoma. We also developed a new informatics method, Disease-model Signature versus Compound-Variety Enriched Response ("DiSCoVER"), to identify novel therapeutics that target this specific disease subtype. Human neural stem and progenitor cells derived from the cerebellar anlage were transduced with oncogenic elements associated with aggressive medulloblastoma. An in silico analysis method for screening drug sensitivity databases (DiSCoVER) was used in multiple drug sensitivity datasets. We validated the top hits from this analysis in vitro and in vivo. Human neural stem and progenitor cells transformed with c-MYC, dominant-negative p53, constitutively active AKT and hTERT formed tumors in mice that recapitulated Group 3 medulloblastoma in terms of pathology and expression profile. DiSCoVER analysis predicted that aggressive MYC-driven Group 3 medulloblastoma would be sensitive to cyclin-dependent kinase (CDK) inhibitors. The CDK 4/6 inhibitor palbociclib decreased proliferation, increased apoptosis, and significantly extended the survival of mice with orthotopic medulloblastoma xenografts. We present a new method to generate genetically accurate models of rare tumors, and a companion computational methodology to find therapeutic interventions that target them. We validated our human neural stem cell model of MYC-driven Group 3 medulloblastoma and showed that CDK 4/6 inhibitors are active against this subgroup. Our results suggest that palbociclib is a potentially effective treatment for poor prognosis MYC-driven Group 3 medulloblastoma tumors in carefully selected patients. Clin Cancer Res; 22(15); 3903-14. ©2016 AACR.
DOI: 10.1038/s41592-018-0039-6
2018
Cited 53 times
GeNets: a unified web platform for network-based genomic analyses
Functional genomics networks are widely used to identify unexpected pathway relationships in large genomic datasets. However, it is challenging to compare the signal-to-noise ratios of different networks and to identify the optimal network with which to interpret a particular genetic dataset. We present GeNets, a platform in which users can train a machine-learning model (Quack) to carry out these comparisons and execute, store, and share analyses of genetic and RNA-sequencing datasets. The GeNets web platform can identify the most informative network, as well as execute, store and share network-based analyses of RNA-seq or genomic datasets.
DOI: 10.1158/2159-8290.cd-16-0844
2017
Cited 52 times
OTX2 Activity at Distal Regulatory Elements Shapes the Chromatin Landscape of Group 3 Medulloblastoma
Medulloblastoma is the most frequent malignant pediatric brain tumor and is divided into at least four subgroups known as WNT, SHH, Group 3, and Group 4. Here, we characterized gene regulation mechanisms in the most aggressive subtype, Group 3 tumors, through genome-wide chromatin and expression profiling. Our results show that most active distal sites in these tumors are occupied by the transcription factor OTX2. Highly active OTX2-bound enhancers are often arranged as clusters of adjacent peaks and are also bound by the transcription factor NEUROD1. These sites are responsive to OTX2 and NEUROD1 knockdown and could also be generated de novo upon ectopic OTX2 expression in primary cells, showing that OTX2 cooperates with NEUROD1 and plays a major role in maintaining and possibly establishing regulatory elements as a pioneer factor. Among OTX2 target genes, we identified the kinase NEK2, whose knockdown and pharmacologic inhibition decreased cell viability. Our studies thus show that OTX2 controls the regulatory landscape of Group 3 medulloblastoma through cooperative activity at enhancer elements and contributes to the expression of critical target genes. Significance: The gene regulation mechanisms that drive medulloblastoma are not well understood. Using chromatin profiling, we find that the transcription factor OTX2 acts as a pioneer factor and, in cooperation with NEUROD1, controls the Group 3 medulloblastoma active enhancer landscape. OTX2 itself or its target genes, including the mitotic kinase NEK2, represent attractive targets for future therapies. Cancer Discov; 7(3); 288-301. ©2017 AACR. This article is highlighted in the In This Issue feature, p. 235.
DOI: 10.1038/s41467-023-38271-5
2023
Cited 9 times
Germline modifiers of the tumor immune microenvironment implicate drivers of cancer risk and immunotherapy response
With the continued promise of immunotherapy for treating cancer, understanding how host genetics contributes to the tumor immune microenvironment (TIME) is essential to tailoring cancer screening and treatment strategies. Here, we study 1084 eQTLs affecting the TIME found through analysis of The Cancer Genome Atlas and literature curation. These TIME eQTLs are enriched in areas of active transcription, and associate with gene expression in specific immune cell subsets, such as macrophages and dendritic cells. Polygenic score models built with TIME eQTLs reproducibly stratify cancer risk, survival and immune checkpoint blockade (ICB) response across independent cohorts. To assess whether an eQTL-informed approach could reveal potential cancer immunotherapy targets, we inhibit CTSS, a gene implicated by cancer risk and ICB response-associated polygenic models; CTSS inhibition results in slowed tumor growth and extended survival in vivo. These results validate the potential of integrating germline variation and TIME characteristics for uncovering potential targets for immunotherapy.
DOI: 10.1093/bioinformatics/bth138
2004
Cited 90 times
GeneCluster 2.0: an advanced toolset for bioarray analysis
Summary: GeneCluster 2.0 is a software package for analyzing gene expression and other bioarray data, giving users a variety of methods to build and evaluate class predictors, visualize marker lists, cluster data and validate results. GeneCluster 2.0 greatly expands the data analysis capabilities of GeneCluster 1.0 by adding classification, class discovery and permutation test methods. It includes algorithms for building and testing supervised models using weighted voting and k-nearest neighbor algorithms, a module for systematically finding and evaluating clustering via self-organizing maps, and modules for marker gene selection and heat map visualization that allow users to view and sort samples and genes by many criteria. GeneCluster 2.0 is a standalone Java application and runs on any platform that supports the Java Runtime Environment version 1.3.1 or greater. Availability: http://www.broad.mit.edu/cancer/software
DOI: 10.1093/brain/awn118
2008
Cited 72 times
Cytometric profiling in multiple sclerosis uncovers patient population structure and a reduction of CD8low cells
As part of a biomarker discovery effort in peripheral blood, we acquired an immunological profile of cell-surface markers from healthy control and untreated subjects with relapsing–remitting MS (RRMS). Fresh blood from each subject was screened ex vivo using a panel of 50 fluorescently labelled monoclonal antibodies distributed amongst 56 pools of four antibodies each. From these 56 pools, we derived an immunological profile consisting of 1018 'features' for each subject in our analysis using a systematic gating strategy. These profiles were interrogated in an analysis with a screening phase (23 patients) and an extension phase (15 patients) to identify cell populations in peripheral blood whose frequency is altered in untreated RRMS subjects. A population of CD8lowCD4− cells was identified as being reduced in frequency in untreated RRMS subjects (P = 0.0002), and this observation was confirmed in an independent sample of subjects from the Comprehensive Longitudinal Investigation of MS at the Brigham & Women's Hospital (P = 0.002). This reduction in the frequency of CD8lowCD4− cells is also observed in 38 untreated subjects with a clinically isolated demyelination syndrome (CIS) (P = 0.0006). We also show that these differences may be due to a reduction in the CD8lowCD56+CD3−CD4− subset of CD8low cells, which have a natural killer cell profile. Similarities between untreated CIS and RRMS subjects extend to broader immunological profiles: consensus clustering of our data suggests that there are three distinct populations of untreated RRMS subjects and that these distinct phenotypic categories are already present in our sample of untreated CIS subjects. Thus, our large-scale immunophenotyping approach has yielded robust evidence for a reduction of CD8lowCD4− cells in both CIS and RRMS in the absence of treatment as well as suggestive evidence for the existence of immunologically distinct subsets of subjects with a demyelinating disease.
DOI: 10.1002/eji.201343657
2013
Cited 60 times
Gene signatures related to B‐cell proliferation predict influenza vaccine‐induced antibody response
Vaccines are very effective at preventing infectious disease but not all recipients mount a protective immune response to vaccination. Recently, gene expression profiles of PBMC samples in vaccinated individuals have been used to predict the development of protective immunity. However, the magnitude of change in gene expression that separates vaccine responders and nonresponders is likely to be small and distributed across networks of genes, making the selection of predictive and biologically relevant genes difficult. Here we apply a new approach to predicting vaccine response based on coordinated upregulation of sets of biologically informative genes in postvaccination gene expression profiles. We found that enrichment of gene sets related to proliferation and immunoglobulin genes accurately segregated high responders to influenza vaccination from low responders and achieved a prediction accuracy of 88% in an independent clinical trial. Many of the genes in these gene sets would not have been identified using conventional, single‐gene level approaches because of their subtle upregulation in vaccine responders. Our results demonstrate that the gene set enrichment method can capture subtle transcriptional changes and may be a generally useful approach for developing and interpreting predictive models of the human immune response.
DOI: 10.1126/scitranslmed.3004186
2012
Cited 57 times
An RNA Profile Identifies Two Subsets of Multiple Sclerosis Patients Differing in Disease Activity
A peripheral blood mononuclear cell transcriptional profile differentiates two subsets of multiple sclerosis patients differing in their probability of a relapse.
DOI: 10.1371/journal.pone.0100334
2014
Cited 44 times
Joint Modeling and Registration of Cell Populations in Cohorts of High-Dimensional Flow Cytometric Data
In systems biomedicine, an experimenter encounters different potential sources of variation in data such as individual samples, multiple experimental conditions, and multi-variable network-level responses. In multiparametric cytometry, which is often used for analyzing patient samples, such issues are critical. While computational methods can identify cell populations in individual samples, without the ability to automatically match them across samples, it is difficult to compare and characterize the populations in typical experiments, such as those responding to various stimulations or distinctive of particular patients or time-points, especially when there are many samples. Joint Clustering and Matching (JCM) is a multi-level framework for simultaneous modeling and registration of populations across a cohort. JCM models every population with a robust multivariate probability distribution. Simultaneously, JCM fits a random-effects model to construct an overall batch template, which is used for registering populations across samples and classifying new samples. By tackling systems-level variation, JCM supports practical biomedical applications involving large cohorts.
DOI: 10.1038/nmeth.3732
2016
Cited 42 times
Integrative genomic analysis by interoperation of bioinformatics tools in GenomeSpace
GenomeSpace is an open-source, cloud-based interoperability platform that facilitates integrative genomic analyses, allowing users to transition seamlessly between a diverse and growing set of bioinformatics tools and data resources. Complex biomedical analyses require the use of multiple software tools in concert and remain challenging for much of the biomedical research community. We introduce GenomeSpace (http://www.genomespace.org), a cloud-based, cooperative community resource that currently supports the streamlined interaction of 20 bioinformatics tools and data resources. To facilitate integrative analysis by non-programmers, it offers a growing set of 'recipes', short workflows to guide investigators through high-utility analysis tasks.
DOI: 10.1073/pnas.1912033116
2019
Cited 42 times
Inhibition of dual-specificity tyrosine phosphorylation-regulated kinase 2 perturbs 26S proteasome-addicted neoplastic progression
Dependence on the 26S proteasome is an Achilles’ heel for triple-negative breast cancer (TNBC) and multiple myeloma (MM). The therapeutic proteasome inhibitor, bortezomib, successfully targets MM but often leads to drug-resistant disease relapse and fails in breast cancer. Here we show that a 26S proteasome-regulating kinase, DYRK2, is a therapeutic target for both MM and TNBC. Genome editing or small-molecule mediated inhibition of DYRK2 significantly reduces 26S proteasome activity, bypasses bortezomib resistance, and dramatically delays in vivo tumor growth in MM and TNBC thereby promoting survival. We further characterized the ability of LDN192960, a potent and selective DYRK2-inhibitor, to alleviate tumor burden in vivo. The drug docks into the active site of DYRK2 and partially inhibits all 3 core peptidase activities of the proteasome. Our results suggest that targeting 26S proteasome regulators will pave the way for therapeutic strategies in MM and TNBC.
DOI: 10.1164/rccm.201506-1100oc
2016
Cited 41 times
Responses to Bacteria, Virus, and Malaria Distinguish the Etiology of Pediatric Clinical Pneumonia
Plasma-detectable biomarkers that rapidly and accurately diagnose bacterial infections in children with suspected pneumonia could reduce the morbidity of respiratory disease and decrease the unnecessary use of antibiotic therapy. Using 56 markers measured in a multiplexed immunoassay, we sought to identify proteins and protein combinations that could discriminate bacterial from viral or malarial diagnoses. We selected 80 patients with clinically diagnosed pneumonia (as defined by the World Health Organization) who also met criteria for bacterial, viral, or malarial infection based on clinical, radiographic, and laboratory results. Ten healthy community control subjects were enrolled to assess marker reliability. Patients were subdivided into two sets: one for identifying potential markers and another for validating them. Three proteins (haptoglobin, tumor necrosis factor receptor 2 or IL-10, and tissue inhibitor of metalloproteinases 1) were identified that, when combined through a classification tree signature, accurately classified patients into bacterial, malarial, and viral etiologies and misclassified only one patient with bacterial pneumonia from the validation set. The overall sensitivity and specificity of this signature for the bacterial diagnosis were 96 and 86%, respectively. Alternative combinations of markers with comparable accuracy were selected by support vector machine and regression models and included haptoglobin, IL-10, and creatine kinase-MB. Combinations of plasma proteins accurately identified children with a respiratory syndrome who were likely to have bacterial infections and who would benefit from antibiotic therapy. When used in conjunction with malaria diagnostic tests, they may improve diagnostic specificity and simplify treatment decisions for clinicians.
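For readers less familiar with the metrics quoted above, the following minimal sketch shows how sensitivity and specificity are computed from confusion-matrix counts. The function name and the counts are hypothetical and are not taken from the study; they only illustrate the definitions.

```python
def sensitivity_specificity(true_pos, false_neg, true_neg, false_pos):
    """Return (sensitivity, specificity) from confusion-matrix counts.

    sensitivity = fraction of actual positives that were detected
    specificity = fraction of actual negatives that were correctly ruled out
    """
    sensitivity = true_pos / (true_pos + false_neg)
    specificity = true_neg / (true_neg + false_pos)
    return sensitivity, specificity


# Hypothetical counts for a "bacterial vs. not bacterial" classifier.
sens, spec = sensitivity_specificity(true_pos=9, false_neg=1,
                                     true_neg=8, false_pos=2)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```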
DOI: 10.1016/j.cels.2017.08.002
2017
Cited 40 times
Decomposing Oncogenic Transcriptional Signatures to Generate Maps of Divergent Cellular States
Summary: The systematic sequencing of the cancer genome has led to the identification of numerous genetic alterations in cancer. However, a deeper understanding of the functional consequences of these alterations is necessary to guide appropriate therapeutic strategies. Here, we describe Onco-GPS (OncoGenic Positioning System), a data-driven analysis framework to organize individual tumor samples with shared oncogenic alterations onto a reference map defined by their underlying cellular states. We applied the methodology to the RAS pathway and identified nine distinct components that reflect transcriptional activities downstream of RAS and defined several functional states associated with patterns of transcriptional component activation that associate with genomic hallmarks and response to genetic and pharmacological perturbations. These results show that the Onco-GPS is an effective approach to explore the complex landscape of oncogenic cellular states across cancers, and an analytic framework to summarize knowledge, establish relationships, and generate more effective disease models for research or as part of individualized precision medicine paradigms.