
Iñigo Martincorena

Here are all the papers by Iñigo Martincorena that you can download and read on OA.mg.

DOI: 10.1056/nejmoa1516192
2016
Cited 3,079 times
Genomic Classification and Prognosis in Acute Myeloid Leukemia
Recent studies have provided a detailed census of genes that are mutated in acute myeloid leukemia (AML). Our next challenge is to understand how this genetic diversity defines the pathophysiology of AML and informs clinical practice. We enrolled a total of 1540 patients in three prospective trials of intensive therapy. Combining driver mutations in 111 cancer genes with cytogenetic and clinical data, we defined AML genomic subgroups and their relevance to clinical outcomes. We identified 5234 driver mutations across 76 genes or genomic regions, with 2 or more drivers identified in 86% of the patients. Patterns of co-mutation compartmentalized the cohort into 11 classes, each with distinct diagnostic features and clinical outcomes. In addition to currently defined AML subgroups, three heterogeneous genomic categories emerged: AML with mutations in genes encoding chromatin, RNA-splicing regulators, or both (in 18% of patients); AML with TP53 mutations, chromosomal aneuploidies, or both (in 13%); and, provisionally, AML with IDH2(R172) mutations (in 1%). Patients with chromatin-spliceosome and TP53-aneuploidy AML had poor outcomes, with the various class-defining mutations contributing independently and additively to the outcome. In addition to class-defining lesions, other co-occurring driver mutations also had a substantial effect on overall survival. The prognostic effects of individual mutations were often significantly altered by the presence or absence of other driver mutations. Such gene-gene interactions were especially pronounced for NPM1-mutated AML, in which patterns of co-mutation identified groups with a favorable or adverse prognosis. These predictions require validation in prospective clinical trials. The driver landscape in AML reveals distinct molecular subgroups that reflect discrete paths in the evolution of AML, informing disease classification and prognostic stratification.
(Funded by the Wellcome Trust and others; ClinicalTrials.gov number, NCT00146120.)
DOI: 10.1038/s41586-020-1943-3
2020
Cited 2,191 times
The repertoire of mutational signatures in human cancer
Somatic mutations in cancer genomes are caused by multiple mutational processes, each of which generates a characteristic mutational signature [1]. Here, as part of the Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium [2] of the International Cancer Genome Consortium (ICGC) and The Cancer Genome Atlas (TCGA), we characterized mutational signatures using 84,729,690 somatic mutations from 4,645 whole-genome and 19,184 exome sequences that encompass most types of cancer. We identified 49 single-base-substitution, 11 doublet-base-substitution, 4 clustered-base-substitution and 17 small insertion-and-deletion signatures. The substantial size of our dataset, compared with previous analyses [3-15], enabled the discovery of new signatures, the separation of overlapping signatures and the decomposition of signatures into components that may represent associated, but distinct, DNA damage, repair and/or replication mechanisms. By estimating the contribution of each signature to the mutational catalogues of individual cancer genomes, we revealed associations of signatures to exogenous or endogenous exposures, as well as to defective DNA-maintenance processes. However, many signatures are of unknown cause. This analysis provides a systematic perspective on the repertoire of mutational processes that contribute to the development of human cancer.
DOI: 10.1038/nature17676
2016
Cited 1,773 times
Landscape of somatic mutations in 560 breast cancer whole-genome sequences
We analysed whole-genome sequences of 560 breast cancers to advance understanding of the driver mutations conferring clonal advantage and the mutational processes generating somatic mutations. We found that 93 protein-coding cancer genes carried probable driver mutations. Some non-coding regions exhibited high mutation frequencies, but most have distinctive structural features probably causing elevated mutation rates and do not contain driver mutations. Mutational signature analysis was extended to genome rearrangements and revealed twelve base substitution and six rearrangement signatures. Three rearrangement signatures, characterized by tandem duplications or deletions, appear associated with defective homologous-recombination-based DNA repair: one with deficient BRCA1 function, another with deficient BRCA1 or BRCA2 function, the cause of the third is unknown. This analysis of all classes of somatic mutation across exons, introns and intergenic regions highlights the repertoire of cancer genes and mutational processes operating, and progresses towards a comprehensive account of the somatic genetic basis of breast cancer. Whole-genome sequencing of tumours from 560 breast cancer cases provides a comprehensive genome-wide view of recurrent somatic mutations and mutation frequencies across both protein coding and non-coding regions; several mutational signatures in these cancer genomes are associated with BRCA1 or BRCA2 function and defective homologous-recombination-based DNA repair. This study reports whole-genome sequencing of tumours and normal tissue from 560 breast cancer cases, providing a comprehensive genome-wide view of recurrent somatic mutations and mutation frequencies across both protein coding and non-coding regions. 
The authors analyse mutational signatures in these cancer genomes, including a new investigation of rearrangement mutational processes, and find several that are associated with BRCA1 or BRCA2 function and defective homologous-recombination-based DNA repair. They also find mutational signatures showing distinct DNA replication strand biases.
DOI: 10.1056/nejmoa1312542
2013
Cited 1,553 times
Somatic CALR Mutations in Myeloproliferative Neoplasms with Nonmutated JAK2
Somatic mutations in the Janus kinase 2 gene (JAK2) occur in many myeloproliferative neoplasms, but the molecular pathogenesis of myeloproliferative neoplasms with nonmutated JAK2 is obscure, and the diagnosis of these neoplasms remains a challenge. We performed exome sequencing of samples obtained from 151 patients with myeloproliferative neoplasms. The mutation status of the gene encoding calreticulin (CALR) was assessed in an additional 1345 hematologic cancers, 1517 other cancers, and 550 controls. We established phylogenetic trees using hematopoietic colonies. We assessed calreticulin subcellular localization using immunofluorescence and flow cytometry. Exome sequencing identified 1498 mutations in 151 patients, with medians of 6.5, 6.5, and 13.0 mutations per patient in samples of polycythemia vera, essential thrombocythemia, and myelofibrosis, respectively. Somatic CALR mutations were found in 70 to 84% of samples of myeloproliferative neoplasms with nonmutated JAK2, in 8% of myelodysplasia samples, in occasional samples of other myeloid cancers, and in none of the other cancers. A total of 148 CALR mutations were identified with 19 distinct variants. Mutations were located in exon 9 and generated a +1 base-pair frameshift, which would result in a mutant protein with a novel C-terminal. Mutant calreticulin was observed in the endoplasmic reticulum without increased cell-surface or Golgi accumulation. Patients with myeloproliferative neoplasms carrying CALR mutations presented with higher platelet counts and lower hemoglobin levels than patients with mutated JAK2. Mutation of CALR was detected in hematopoietic stem and progenitor cells. Clonal analyses showed CALR mutations in the earliest phylogenetic node, a finding consistent with its role as an initiating mutation in some patients. Somatic mutations in the endoplasmic reticulum chaperone CALR were found in a majority of patients with myeloproliferative neoplasms with nonmutated JAK2.
(Funded by the Kay Kendall Leukaemia Fund and others.)
DOI: 10.1126/science.aaa6806
2015
Cited 1,457 times
High burden and pervasive positive selection of somatic mutations in normal human skin
Normal skin's curiously abnormal genome. Within every tumor, a battle is being waged. As individual tumor cells acquire new mutations that promote their survival and growth, they clonally expand at the expense of tumor cells that are “less fit.” Martincorena et al. sequenced 234 biopsies of sun-exposed but physiologically normal skin from four individuals (see the Perspective by Brash). They found a surprisingly high burden of mutations, higher than that of many tumors. Many of the mutations known to drive the growth of cutaneous squamous cell carcinomas were already under strong positive selection. More than a quarter of normal skin cells carried a driver mutation, and every square centimeter of skin contained hundreds of competing mutant clones. Science, this issue p. 880; see also p. 867
DOI: 10.1016/j.cell.2017.09.042
2017
Cited 1,085 times
Universal Patterns of Selection in Cancer and Somatic Tissues
Cancer develops as a result of somatic mutation and clonal selection, but quantitative measures of selection in cancer evolution are lacking. We adapted methods from molecular evolution and applied them to 7,664 tumors across 29 cancer types. Unlike species evolution, positive selection outweighs negative selection during cancer development. On average, <1 coding base substitution/tumor is lost through negative selection, with purifying selection almost absent outside homozygous loss of essential genes. This allows exome-wide enumeration of all driver coding mutations, including outside known cancer genes. On average, tumors carry ∼4 coding substitutions under positive selection, ranging from <1/tumor in thyroid and testicular cancers to >10/tumor in endometrial and colorectal cancers. Half of driver substitutions occur in yet-to-be-discovered cancer genes. With increasing mutation burden, numbers of driver mutations increase, but not linearly. We systematically catalog cancer genes and show that genes vary extensively in what proportion of mutations are drivers versus passengers.
DOI: 10.1126/science.aab4082
2015
Cited 999 times
Somatic mutation in cancer and normal cells
Spontaneously occurring mutations accumulate in somatic cells throughout a person’s lifetime. The majority of these mutations do not have a noticeable effect, but some can alter key cellular functions. Early somatic mutations can cause developmental disorders, whereas the progressive accumulation of mutations throughout life can lead to cancer and contribute to aging. Genome sequencing has revolutionized our understanding of somatic mutation in cancer, providing a detailed view of the mutational processes and genes that drive cancer. Yet, fundamental gaps remain in our knowledge of how normal cells evolve into cancer cells. We briefly summarize a number of the lessons learned over 5 years of cancer genome sequencing and discuss their implications for our understanding of cancer progression and aging.
DOI: 10.1126/science.aag0299
2016
Cited 824 times
Mutational signatures associated with tobacco smoking in human cancer
Assessing smoke damage in cancer genomes. We have known for over 60 years that smoking tobacco is one of the most avoidable risk factors for cancer. Yet the detailed mechanisms by which tobacco smoke damages the genome and creates the mutations that ultimately cause cancer are still not fully understood. Alexandrov et al. examined mutational signatures and DNA methylation changes in over 5000 genome sequences from 17 different cancer types linked to smoking (see the Perspective by Pfeifer). They found a complex pattern of mutational signatures. Only cancers originating in tissues directly exposed to smoke showed a signature characteristic of the known tobacco carcinogen benzo[a]pyrene. One mysterious signature was shared by all smoking-associated cancers but is of unknown origin. Smoking had only a modest effect on DNA methylation. Science, this issue p. 618; see also p. 549
DOI: 10.1126/science.aau3879
2018
Cited 816 times
Somatic mutant clones colonize the human esophagus with age
The mutational burden of aging. As people age, they accumulate somatic mutations in healthy cells. About 25% of cells in normal, sun-exposed skin harbor cancer driver mutations. What about tissues not exposed to powerful mutagens like ultraviolet light? Martincorena et al. performed targeted gene sequencing of normal esophageal epithelium from nine human donors of varying age (see the Perspective by Chanock). The mutation rate was lower in esophagus than in skin, but there was a strong positive selection of clones carrying mutations in 14 cancer-associated genes. By middle age, more than half of the esophageal epithelium was colonized by mutant clones. Interestingly, mutations in the cancer driver gene NOTCH1 were more common in normal esophageal epithelium than in esophageal cancer. Science, this issue p. 911; see also p. 893
DOI: 10.1038/nature19768
2016
Cited 778 times
Tissue-specific mutation accumulation in human adult stem cells during life
Stem cells of the liver, colon and small intestine gradually accumulate mutations throughout life at a similar rate even though cancer incidence varies greatly among these tissues. Accumulation of mutations in human adult stem cells in the course of a lifetime has been associated with increase in cancer risk. But the actual mutation rates and patterns in these cells are currently unknown. Edwin Cuppen and colleagues have sequenced DNA from clonal organoids in culture derived from primary multipotent cells obtained from donors aged between 3 and 87 years. They find that mutations accumulate at a similar rate of approximately 40 novel mutations per year in tissues with known variations in cancer incidence, but they also observe tissue-specific mutation spectra in the colon and small intestine compared to the liver. The gradual accumulation of genetic mutations in human adult stem cells (ASCs) during life is associated with various age-related diseases, including cancer [1,2]. Extreme variation in cancer risk across tissues was recently proposed to depend on the lifetime number of ASC divisions, owing to unavoidable random mutations that arise during DNA replication [1]. However, the rates and patterns of mutations in normal ASCs remain unknown. Here we determine genome-wide mutation patterns in ASCs of the small intestine, colon and liver of human donors with ages ranging from 3 to 87 years by sequencing clonal organoid cultures derived from primary multipotent cells [3,4,5]. Our results show that mutations accumulate steadily over time in all of the assessed tissue types, at a rate of approximately 40 novel mutations per year, despite the large variation in cancer incidence among these tissues [1]. Liver ASCs, however, have different mutation spectra compared to those of the colon and small intestine.
Mutational signature analysis reveals that this difference can be attributed to spontaneous deamination of methylated cytosine residues in the colon and small intestine, probably reflecting their high ASC division rate. In liver, a signature with an as-yet-unknown underlying mechanism is predominant. Mutation spectra of driver genes in cancer show high similarity to the tissue-specific ASC mutation spectra, suggesting that intrinsic mutational processes in ASCs can initiate tumorigenesis. Notably, the inter-individual variation in mutation rate and spectra is low, suggesting tissue-specific activity of common mutational processes throughout life.
DOI: 10.1038/ncomms3997
2014
Cited 751 times
Heterogeneity of genomic evolution and mutational profiles in multiple myeloma
Multiple myeloma is an incurable plasma cell malignancy with a complex and incompletely understood molecular pathogenesis. Here we use whole-exome sequencing, copy-number profiling and cytogenetics to analyse 84 myeloma samples. Most cases have a complex subclonal structure and show clusters of subclonal variants, including subclonal driver mutations. Serial sampling reveals diverse patterns of clonal evolution, including linear evolution, differential clonal response and branching evolution. Diverse processes contribute to the mutational repertoire, including kataegis and somatic hypermutation, and their relative contribution changes over time. We find heterogeneity of mutational spectrum across samples, with few recurrent genes. We identify new candidate genes, including truncations of SP140, LTB, ROBO1 and clustered missense mutations in EGR1. The myeloma genome is heterogeneous across the cohort, and exhibits diversity in clonal admixture and in dynamics of evolution, which may impact prognostic stratification, therapeutic approaches and assessment of disease response to treatment.
DOI: 10.1038/s41586-019-1907-7
2020
Cited 712 times
The evolutionary history of 2,658 cancers
Cancer develops through a process of somatic evolution [1,2]. Sequencing data from a single biopsy represent a snapshot of this process that can reveal the timing of specific genomic aberrations and the changing influence of mutational processes [3]. Here, by whole-genome sequencing analysis of 2,658 cancers as part of the Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium of the International Cancer Genome Consortium (ICGC) and The Cancer Genome Atlas (TCGA) [4], we reconstruct the life history and evolution of mutational processes and driver mutation sequences of 38 types of cancer. Early oncogenesis is characterized by mutations in a constrained set of driver genes, and specific copy number gains, such as trisomy 7 in glioblastoma and isochromosome 17q in medulloblastoma. The mutational spectrum changes significantly throughout tumour evolution in 40% of samples. A nearly fourfold diversification of driver genes and increased genomic instability are features of later stages. Copy number alterations often occur in mitotic crises, and lead to simultaneous gains of chromosomal segments. Timing analyses suggest that driver mutations often precede diagnosis by many years, if not decades. Together, these results determine the evolutionary trajectories of cancer, and highlight opportunities for early cancer detection.
DOI: 10.1038/s41586-018-0317-6
2018
Cited 628 times
Prediction of acute myeloid leukaemia risk in healthy individuals
The incidence of acute myeloid leukaemia (AML) increases with age and mortality exceeds 90% when diagnosed after age 65. Most cases arise without any detectable early symptoms and patients usually present with the acute complications of bone marrow failure [1]. The onset of such de novo AML cases is typically preceded by the accumulation of somatic mutations in preleukaemic haematopoietic stem and progenitor cells (HSPCs) that undergo clonal expansion [2,3]. However, recurrent AML mutations also accumulate in HSPCs during ageing of healthy individuals who do not develop AML, a phenomenon referred to as age-related clonal haematopoiesis (ARCH) [4–8]. Here we use deep sequencing to analyse genes that are recurrently mutated in AML to distinguish between individuals who have a high risk of developing AML and those with benign ARCH. We analysed peripheral blood cells from 95 individuals that were obtained on average 6.3 years before AML diagnosis (pre-AML group), together with 414 unselected age- and gender-matched individuals (control group). Pre-AML cases were distinct from controls and had more mutations per sample, higher variant allele frequencies, indicating greater clonal expansion, and showed enrichment of mutations in specific genes. Genetic parameters were used to derive a model that accurately predicted AML-free survival; this model was validated in an independent cohort of 29 pre-AML cases and 262 controls. Because AML is rare, we also developed an AML predictive model using a large electronic health record database that identified individuals at greater risk. Collectively our findings provide proof-of-concept that it is possible to discriminate ARCH from pre-AML many years before malignant transformation. This could in future enable earlier detection and monitoring, and may help to inform intervention. Individuals who are at high risk of developing acute myeloid leukaemia can be identified years before diagnosis using genetic information from blood samples.
DOI: 10.1038/s41586-019-1913-9
2020
Cited 573 times
Patterns of somatic structural variation in human cancer genomes
A key mutational process in cancer is structural variation, in which rearrangements delete, amplify or reorder genomic segments that range in size from kilobases to whole chromosomes [1–7]. Here we develop methods to group, classify and describe somatic structural variants, using data from the Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium of the International Cancer Genome Consortium (ICGC) and The Cancer Genome Atlas (TCGA), which aggregated whole-genome sequencing data from 2,658 cancers across 38 tumour types [8]. Sixteen signatures of structural variation emerged. Deletions have a multimodal size distribution, assort unevenly across tumour types and patients, are enriched in late-replicating regions and correlate with inversions. Tandem duplications also have a multimodal size distribution, but are enriched in early-replicating regions, as are unbalanced translocations. Replication-based mechanisms of rearrangement generate varied chromosomal structures with low-level copy-number gains and frequent inverted rearrangements. One prominent structure consists of 2–7 templates copied from distinct regions of the genome strung together within one locus. Such cycles of templated insertions correlate with tandem duplications, and, in liver cancer, frequently activate the telomerase gene TERT. A wide variety of rearrangement processes are active in cancer, which generate complex configurations of the genome upon which selection can act.
DOI: 10.1016/j.ccell.2017.07.005
2017
Cited 520 times
Genomic Evolution of Breast Cancer Metastasis and Relapse
Patterns of genomic evolution between primary and metastatic breast cancer have not been studied in large numbers, despite patients with metastatic breast cancer having dismal survival. We sequenced whole genomes or a panel of 365 genes on 299 samples from 170 patients with locally relapsed or metastatic breast cancer. Several lines of analysis indicate that clones seeding metastasis or relapse disseminate late from primary tumors, but continue to acquire mutations, mostly accessing the same mutational processes active in the primary tumor. Most distant metastases acquired driver mutations not seen in the primary tumor, drawing from a wider repertoire of cancer genes than early drivers. These include a number of clinically actionable alterations and mutations inactivating SWI-SNF and JAK2-STAT3 pathways.
DOI: 10.1038/s41586-019-1672-7
2019
Cited 494 times
The landscape of somatic mutation in normal colorectal epithelial cells
The colorectal adenoma–carcinoma sequence has provided a paradigmatic framework for understanding the successive somatic genetic changes and consequent clonal expansions that lead to cancer [1]. However, our understanding of the earliest phases of colorectal neoplastic changes, which may occur in morphologically normal tissue, is comparatively limited, as for most cancer types. Here we use whole-genome sequencing to analyse hundreds of normal crypts from 42 individuals. Signatures of multiple mutational processes were revealed; some of these were ubiquitous and continuous, whereas others were only found in some individuals, in some crypts or during certain periods of life. Probable driver mutations were present in around 1% of normal colorectal crypts in middle-aged individuals, indicating that adenomas and carcinomas are rare outcomes of a pervasive process of neoplastic change across morphologically normal colorectal epithelium. Colorectal cancers exhibit substantially increased mutational burdens relative to normal cells. Sequencing normal colorectal cells provides quantitative insights into the genomic and clonal evolution of cancer. Genome sequencing of hundreds of normal colonic crypts from 42 individuals sheds light on mutational processes and driver mutations in normal colorectal epithelial cells.
DOI: 10.1038/s41588-019-0576-7
2020
Cited 451 times
Comprehensive analysis of chromothripsis in 2,658 human cancers using whole-genome sequencing
Chromothripsis is a mutational phenomenon characterized by massive, clustered genomic rearrangements that occurs in cancer and other diseases. Recent studies in selected cancer types have suggested that chromothripsis may be more common than initially inferred from low-resolution copy-number data. Here, as part of the Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium of the International Cancer Genome Consortium (ICGC) and The Cancer Genome Atlas (TCGA), we analyze patterns of chromothripsis across 2,658 tumors from 38 cancer types using whole-genome sequencing data. We find that chromothripsis events are pervasive across cancers, with a frequency of more than 50% in several cancer types. Whereas canonical chromothripsis profiles display oscillations between two copy-number states, a considerable fraction of events involve multiple chromosomes and additional structural alterations. In addition to non-homologous end joining, we detect signatures of replication-associated processes and templated insertions. Chromothripsis contributes to oncogene amplification and to inactivation of genes such as mismatch-repair-related genes. These findings show that chromothripsis is a major process that drives genome evolution in human cancer.
DOI: 10.1038/s41586-018-0497-0
2018
Cited 439 times
Population dynamics of normal human blood inferred from somatic mutations
Haematopoietic stem cells drive blood production, but their population size and lifetime dynamics have not been quantified directly in humans. Here we identified 129,582 spontaneous, genome-wide somatic mutations in 140 single-cell-derived haematopoietic stem and progenitor colonies from a healthy 59-year-old man and applied population-genetics approaches to reconstruct clonal dynamics. Cell divisions from early embryogenesis were evident in the phylogenetic tree; all blood cells were derived from a common ancestor that preceded gastrulation. The size of the stem cell population grew steadily in early life, reaching a stable plateau by adolescence. We estimate the numbers of haematopoietic stem cells that are actively making white blood cells at any one time to be in the range of 50,000–200,000. We observed adult haematopoietic stem cell clones that generate multilineage outputs, including granulocytes and B lymphocytes. Harnessing naturally occurring mutations to report the clonal architecture of an organ enables the high-resolution reconstruction of somatic cell dynamics in humans. Analysis of blood from a healthy human shows that haematopoietic stem cells increase rapidly in numbers through early life, reaching a stable plateau in adulthood, and contribute to myeloid and B lymphocyte populations throughout life.
DOI: 10.1016/j.cell.2018.02.020
2018
Cited 394 times
Timing the Landmark Events in the Evolution of Clear Cell Renal Cell Cancer: TRACERx Renal
Clear cell renal cell carcinoma (ccRCC) is characterized by near-universal loss of the short arm of chromosome 3, deleting several tumor suppressor genes. We analyzed whole genomes from 95 biopsies across 33 patients with clear cell renal cell carcinoma. We find hotspots of point mutations in the 5' UTR of TERT, targeting a MYC-MAX-MAD1 repressor associated with telomere lengthening. The most common structural abnormality generates simultaneous 3p loss and 5q gain (36% patients), typically through chromothripsis. This event occurs in childhood or adolescence, generally as the initiating event that precedes emergence of the tumor's most recent common ancestor by years to decades. Similar genomic changes drive inherited ccRCC. Modeling differences in age incidence between inherited and sporadic cancers suggests that the number of cells with 3p loss capable of initiating sporadic tumors is no more than a few hundred. Early development of ccRCC follows well-defined evolutionary trajectories, offering opportunity for early intervention.
DOI: 10.1016/j.cell.2012.12.023
2013
Cited 386 times
Direct Competition between hnRNP C and U2AF65 Protects the Transcriptome from the Exonization of Alu Elements
There are ~650,000 Alu elements in transcribed regions of the human genome. These elements contain cryptic splice sites, so they are in constant danger of aberrant incorporation into mature transcripts. Despite posing a major threat to transcriptome integrity, little is known about the molecular mechanisms preventing their inclusion. Here, we present a mechanism for protecting the human transcriptome from the aberrant exonization of transposable elements. Quantitative iCLIP data show that the RNA-binding protein hnRNP C competes with the splicing factor U2AF65 at many genuine and cryptic splice sites. Loss of hnRNP C leads to formation of previously suppressed Alu exons, which severely disrupt transcript function. Minigene experiments explain disease-associated mutations in Alu elements that hamper hnRNP C binding. Thus, by preventing U2AF65 binding to Alu elements, hnRNP C plays a critical role as a genome-wide sentinel protecting the transcriptome. The findings have important implications for human evolution and disease.
DOI: 10.1371/journal.pcbi.1000808
2010
Cited 360 times
The Organization of Local and Distant Functional Connectivity in the Human Brain
Information processing in the human brain arises from both interactions between adjacent areas and from distant projections that form distributed brain systems. Here we map interactions across different spatial scales by estimating the degree of intrinsic functional connectivity for the local (≤14 mm) neighborhood directly surrounding brain regions as contrasted with distant (>14 mm) interactions. The balance between local and distant functional interactions measured at rest forms a map that separates sensorimotor cortices from heteromodal association areas and further identifies regions that possess both high local and distant cortical-cortical interactions. Map estimates of network measures demonstrate that high local connectivity is most often associated with a high clustering coefficient, long path length, and low physical cost. Task performance changed the balance between local and distant functional coupling in a subset of regions, particularly, increasing local functional coupling in regions engaged by the task. The observed properties suggest that the brain has evolved a balance that optimizes information-processing efficiency across different classes of specialized areas as well as mechanisms to modulate coupling in support of dynamically changing processing demands. We discuss the implications of these observations and applications of the present method for exploring normal and atypical brain function.
DOI: 10.1038/s41586-020-2214-z
2020
Cited 354 times
The mutational landscape of normal human endometrial epithelium
All normal somatic cells are thought to acquire mutations, but understanding of the rates, patterns, causes and consequences of somatic mutations in normal cells is limited. The uterine endometrium adopts multiple physiological states over a lifetime and is lined by a gland-forming epithelium [1,2]. Here, using whole-genome sequencing, we show that normal human endometrial glands are clonal cell populations with total mutation burdens that increase at about 29 base substitutions per year and that are many-fold lower than those of endometrial cancers. Normal endometrial glands frequently carry ‘driver’ mutations in cancer genes, the burden of which increases with age and decreases with parity. Cell clones with drivers often originate during the first decades of life and subsequently progressively colonize the epithelial lining of the endometrium. Our results show that mutational landscapes differ markedly between normal tissues, perhaps shaped by differences in their structure and physiology, and indicate that the procession of neoplastic change that leads to endometrial cancer is initiated early in life. Whole-genome sequencing of normal human endometrial glands shows that most are clonal cell populations and frequently carry cancer driver mutations that occur early in life, and that parity has a protective effect.
DOI: 10.1038/s41586-020-2832-5
2020
Cited 354 times
Evidence for 28 genetic disorders discovered by combining healthcare and research data
De novo mutations in protein-coding genes are a well-established cause of developmental disorders1. However, genes known to be associated with developmental disorders account for only a minority of the observed excess of such de novo mutations1,2. Here, to identify previously undescribed genes associated with developmental disorders, we integrate healthcare and research exome-sequence data from 31,058 parent–offspring trios of individuals with developmental disorders, and develop a simulation-based statistical test to identify gene-specific enrichment of de novo mutations. We identified 285 genes that were significantly associated with developmental disorders, including 28 that had not previously been robustly associated with developmental disorders. Although we detected more genes associated with developmental disorders, much of the excess of de novo mutations in protein-coding genes remains unaccounted for. Modelling suggests that more than 1,000 genes associated with developmental disorders have not yet been described, many of which are likely to be less penetrant than the currently known genes. Research access to clinical diagnostic datasets will be critical for completing the map of genes associated with developmental disorders. By integrating healthcare and exome-sequencing data from parent–offspring trios of patients with developmental disorders, 28 genes that had not previously been associated with developmental disorders were identified.
DOI: 10.1126/science.1251343
2014
Cited 353 times
Extensive transduction of nonrepetitive DNA mediated by L1 retrotransposition in cancer genomes
Introduction: The human genome is peppered with mobile repetitive elements called long interspersed nuclear element–1 (L1) retrotransposons. Propagating through RNA and cDNA intermediates, these molecular parasites copy and insert themselves throughout the genome, with potentially disruptive effects on neighboring genes or regulatory sequences. In the germ line, unique sequence downstream of L1 elements can also be retrotransposed if transcription continues beyond the repeat, a process known as 3′ transduction. There has been growing interest in retrotransposition and 3′ transduction as a possible source of somatic mutations during tumorigenesis.
Rationale: To explore whether 3′ transductions are frequent in cancer, we developed a bioinformatic algorithm for identifying somatically acquired retrotranspositions in cancer genomes. We applied our algorithm to 290 cancer samples from 244 patients across 12 tumor types. The unique downstream sequence mobilized with 3′ transductions effectively fingerprints the L1 source element, providing insights into the activity of individual L1 loci across the genome.
Results: Across the 290 samples, we identified 2756 somatic L1 retrotranspositions. Tumors from 53% of patients had at least one such event, with colorectal and lung cancers being most frequently affected (93% and 75% of patients, respectively). Somatic 3′ transductions comprised 24% of events, half of which represented mobilizations of unique sequence alone, without any accompanying L1 sequence. Overall, 95% of 3′ transductions identified derived from only 72 germline L1 source elements, with as few as four loci accounting for 50% of events. In a given sample, the same source element could generate 50 or more somatic transductions, scattered extensively across the genome. About 5% of somatic transductions arose from L1 source elements that were themselves somatic retrotranspositions. In three of the cases in which we sequenced more than one sample from a patient’s tumor, we were able to place 3′ transductions on the phylogenetic tree. We found that the activity of individual source elements fluctuated during tumor evolution, with different subclones exhibiting much variability in which elements were “on” and which were “off.” The ability to identify the individual L1 source elements active in a given tumor enabled us to study the promoter methylation of those elements specifically. We found that 3′ transduction activity in a patient’s tumor was always associated with hypomethylation of that element. Overall, 2.3% of transductions distributed exons or entire genes to other sites in the genome, and many more mobilized deoxyribonuclease I (DNase I) hypersensitive sites or transcription factor binding sites identified by the ENCODE project. Occasionally, somatic L1 insertions inserted near coding sequence and redistributed these exons elsewhere in the genome. However, we found no general effects of retrotranspositions on transcription levels of genes at the insertion points and no evidence for aberrant RNA species resulting from somatically acquired transposable elements. Indeed, as with germline retrotranspositions, somatic insertions exhibited a strong enrichment in heterochromatic, gene-poor regions of the genome.
Conclusion: Somatic 3′ transduction occurs frequently in human tumors, and in some cases transduction events can scatter exons, genes, and regulatory elements widely across the genome. Dissemination of these sequences appears to be due to a small number of highly active L1 elements, whose activity can wax and wane during tumor evolution. The majority of the retrotransposition events are likely to be harmless “passenger” mutations.
DOI: 10.1038/s41586-020-1961-1
2020
Cited 339 times
Tobacco smoking and somatic mutations in human bronchial epithelium
Tobacco smoking causes lung cancer1–3, a process that is driven by more than 60 carcinogens in cigarette smoke that directly damage and mutate DNA4,5. The profound effects of tobacco on the genome of lung cancer cells are well-documented6–10, but equivalent data for normal bronchial cells are lacking. Here we sequenced whole genomes of 632 colonies derived from single bronchial epithelial cells across 16 subjects. Tobacco smoking was the major influence on mutational burden, typically adding from 1,000 to 10,000 mutations per cell; massively increasing the variance both within and between subjects; and generating several distinct mutational signatures of substitutions and of insertions and deletions. A population of cells in individuals with a history of smoking had mutational burdens that were equivalent to those expected for people who had never smoked: these cells had less damage from tobacco-specific mutational processes, were fourfold more frequent in ex-smokers than current smokers and had considerably longer telomeres than their more-mutated counterparts. Driver mutations increased in frequency with age, affecting 4–14% of cells in middle-aged subjects who had never smoked. In current smokers, at least 25% of cells carried driver mutations and 0–6% of cells had two or even three drivers. Thus, tobacco smoking increases mutational burden, cell-to-cell heterogeneity and driver mutations, but quitting promotes replenishment of the bronchial epithelium from mitotically quiescent cells that have avoided tobacco mutagenesis. Whole-genome sequencing of normal bronchial epithelium from 16 individuals shows that tobacco smoking increases genomic heterogeneity, mutational burden and driver mutations, whereas stopping smoking promotes replenishment of the epithelium with near-normal cells.
DOI: 10.7554/elife.02935
2014
Cited 325 times
Origins and functional consequences of somatic mitochondrial DNA mutations in human cancer
Recent sequencing studies have extensively explored the somatic alterations present in the nuclear genomes of cancers. Although mitochondria control energy metabolism and apoptosis, the origins and impact of cancer-associated mutations in mtDNA are unclear. In this study, we analyzed somatic alterations in mtDNA from 1675 tumors. We identified 1907 somatic substitutions, which exhibited dramatic replicative strand bias, predominantly C > T and A > G on the mitochondrial heavy strand. This strand-asymmetric signature differs from those found in nuclear cancer genomes but matches the inferred germline process shaping primate mtDNA sequence content. A number of mtDNA mutations showed considerable heterogeneity across tumor types. Missense mutations were selectively neutral and often gradually drifted towards homoplasmy over time. In contrast, mutations resulting in protein truncation undergo negative selection and were almost exclusively heteroplasmic. Our findings indicate that the endogenous mutational mechanism has far greater impact than any other external mutagens in mitochondria and is fundamentally linked to mtDNA replication.
DOI: 10.1038/nature13448
2014
Cited 318 times
Genome sequencing of normal cells reveals developmental lineages and mutational processes
The somatic mutations present in the genome of a cell accumulate over the lifetime of a multicellular organism. These mutations can provide insights into the developmental lineage tree, the number of divisions that each cell has undergone and the mutational processes that have been operative. Here we describe whole genomes of clonal lines derived from multiple tissues of healthy mice. Using somatic base substitutions, we reconstructed the early cell divisions of each animal, demonstrating the contributions of embryonic cells to adult tissues. Differences were observed between tissues in the numbers and types of mutations accumulated by each cell, which likely reflect differences in the number of cell divisions they have undergone and varying contributions of different mutational processes. If somatic mutation rates are similar to those in mice, the results indicate that precise insights into development and mutagenesis of normal human cells will be possible.
DOI: 10.1038/ng.2874
2014
Cited 311 times
RAG-mediated recombination is the predominant driver of oncogenic rearrangement in ETV6-RUNX1 acute lymphoblastic leukemia
The ETV6-RUNX1 fusion gene, found in 25% of childhood acute lymphoblastic leukemia (ALL) cases, is acquired in utero but requires additional somatic mutations for overt leukemia. We used exome and low-coverage whole-genome sequencing to characterize secondary events associated with leukemic transformation. RAG-mediated deletions emerge as the dominant mutational process, characterized by recombination signal sequence motifs near breakpoints, incorporation of non-templated sequence at junctions, ∼30-fold enrichment at promoters and enhancers of genes actively transcribed in B cell development and an unexpectedly high ratio of recurrent to non-recurrent structural variants. Single-cell tracking shows that this mechanism is active throughout leukemic evolution, with evidence of localized clustering and reiterated deletions. Integration of data on point mutations and rearrangements identifies ATF7IP and MGA as two new tumor-suppressor genes in ALL. Thus, a remarkably parsimonious mutational process transforms ETV6-RUNX1-positive lymphoblasts, targeting the promoters, enhancers and first exons of genes that normally regulate B cell differentiation.
DOI: 10.1016/j.cell.2021.03.009
2021
Cited 282 times
Characterizing genetic intra-tumor heterogeneity across 2,658 human cancer genomes
Intra-tumor heterogeneity (ITH) is a mechanism of therapeutic resistance and therefore an important clinical challenge. However, the extent, origin, and drivers of ITH across cancer types are poorly understood. To address this, we extensively characterize ITH across whole-genome sequences of 2,658 cancer samples spanning 38 cancer types. Nearly all informative samples (95.1%) contain evidence of distinct subclonal expansions with frequent branching relationships between subclones. We observe positive selection of subclonal driver mutations across most cancer types and identify cancer type-specific subclonal patterns of driver gene mutations, fusions, structural variants, and copy number alterations as well as dynamic changes in mutational processes between subclonal expansions. Our results underline the importance of ITH and its drivers in tumor evolution and provide a pan-cancer resource of comprehensively annotated subclonal events from whole-genome sequencing data.
DOI: 10.1038/s41588-019-0562-0
2020
Cited 277 times
Pan-cancer analysis of whole genomes identifies driver rearrangements promoted by LINE-1 retrotransposition
About half of all cancers have somatic integrations of retrotransposons. Here, to characterize their role in oncogenesis, we analyzed the patterns and mechanisms of somatic retrotransposition in 2,954 cancer genomes from 38 histological cancer subtypes within the framework of the Pan-Cancer Analysis of Whole Genomes (PCAWG) project. We identified 19,166 somatically acquired retrotransposition events, which affected 35% of samples and spanned a range of event types. Long interspersed nuclear element (LINE-1; L1 hereafter) insertions emerged as the most frequent type of somatic structural variation in esophageal adenocarcinoma, and the second most frequent in head-and-neck and colorectal cancers. Aberrant L1 integrations can delete megabase-scale regions of a chromosome, which sometimes leads to the removal of tumor-suppressor genes, and can induce complex translocations and large-scale duplications. Somatic retrotranspositions can also initiate breakage-fusion-bridge cycles, leading to high-level amplification of oncogenes. These observations illuminate a relevant role of L1 retrotransposition in remodeling the cancer genome, with potential implications for the development of human tumors.
DOI: 10.1038/s41586-021-03477-4
2021
Cited 270 times
Somatic mutation landscapes at single-molecule resolution
Somatic mutations drive the development of cancer and may contribute to ageing and other diseases1,2. Despite their importance, the difficulty of detecting mutations that are only present in single cells or small clones has limited our knowledge of somatic mutagenesis to a minority of tissues. Here, to overcome these limitations, we developed nanorate sequencing (NanoSeq), a duplex sequencing protocol with error rates of less than five errors per billion base pairs in single DNA molecules from cell populations. This rate is two orders of magnitude lower than typical somatic mutation loads, enabling the study of somatic mutations in any tissue independently of clonality. We used this single-molecule sensitivity to study somatic mutations in non-dividing cells across several tissues, comparing stem cells to differentiated cells and studying mutagenesis in the absence of cell division. Differentiated cells in blood and colon displayed remarkably similar mutation loads and signatures to their corresponding stem cells, despite mature blood cells having undergone considerably more divisions. We then characterized the mutational landscape of post-mitotic neurons and polyclonal smooth muscle, confirming that neurons accumulate somatic mutations at a constant rate throughout life without cell division, with similar rates to mitotically active tissues. Together, our results suggest that mutational processes that are independent of cell division are important contributors to somatic mutagenesis. We anticipate that the ability to reliably detect mutations in single DNA molecules could transform our understanding of somatic mutagenesis and enable non-invasive studies on large-scale cohorts. NanoSeq is used to detect mutations in single DNA molecules and analyses show that mutational processes that are independent of cell division are important contributors to somatic mutagenesis.
DOI: 10.1038/ng.2921
2014
Cited 268 times
Recurrent PTPRB and PLCG1 mutations in angiosarcoma
Angiosarcoma is an aggressive malignancy that arises spontaneously or secondarily to ionizing radiation or chronic lymphoedema. Previous work has identified aberrant angiogenesis, including occasional somatic mutations in angiogenesis signaling genes, as a key driver of angiosarcoma. Here we employed whole-genome, whole-exome and targeted sequencing to study the somatic changes underpinning primary and secondary angiosarcoma. We identified recurrent mutations in two genes, PTPRB and PLCG1, which are intimately linked to angiogenesis. The endothelial phosphatase PTPRB, a negative regulator of vascular growth factor tyrosine kinases, harbored predominantly truncating mutations in 10 of 39 tumors (26%). PLCG1, a signal transducer of tyrosine kinases, encoded a recurrent, likely activating p.Arg707Gln missense variant in 3 of 34 cases (9%). Overall, 15 of 39 tumors (38%) harbored at least one driver mutation in angiogenesis signaling genes. Our findings inform and reinforce current therapeutic efforts to target angiogenesis signaling in angiosarcoma.
DOI: 10.1038/s41586-019-1670-9
2019
Cited 260 times
Somatic mutations and clonal dynamics in healthy and cirrhotic human liver
The most common causes of chronic liver disease are excess alcohol intake, viral hepatitis and non-alcoholic fatty liver disease, with the clinical spectrum ranging in severity from hepatic inflammation to cirrhosis, liver failure or hepatocellular carcinoma (HCC). The genome of HCC exhibits diverse mutational signatures, resulting in recurrent mutations across more than 30 cancer genes1–7. Stem cells from normal livers have a low mutational burden and limited diversity of signatures8, which suggests that the complexity of HCC arises during the progression to chronic liver disease and subsequent malignant transformation. Here, by sequencing whole genomes of 482 microdissections of 100–500 hepatocytes from 5 normal and 9 cirrhotic livers, we show that cirrhotic liver has a higher mutational burden than normal liver. Although rare in normal hepatocytes, structural variants, including chromothripsis, were prominent in cirrhosis. Driver mutations, such as point mutations and structural variants, affected 1–5% of clones. Clonal expansions of millimetres in diameter occurred in cirrhosis, with clones sequestered by the bands of fibrosis that surround regenerative nodules. Some mutational signatures were universal and equally active in both non-malignant hepatocytes and HCCs; some were substantially more active in HCCs than chronic liver disease; and others, arising from exogenous exposures, were present in a subset of patients. The activity of exogenous signatures between adjacent cirrhotic nodules varied by up to tenfold within each patient, as a result of clone-specific and microenvironmental forces. Synchronous HCCs exhibited the same mutational signatures as background cirrhotic liver, but with higher burden. Somatic mutations chronicle the exposures, toxicity, regeneration and clonal structure of liver tissue as it progresses from health to disease.
DOI: 10.1038/s41588-019-0557-x
2020
Cited 258 times
Comprehensive molecular characterization of mitochondrial genomes in human cancers
Mitochondria are essential cellular organelles that play critical roles in cancer. Here, as part of the International Cancer Genome Consortium/The Cancer Genome Atlas Pan-Cancer Analysis of Whole Genomes Consortium, which aggregated whole-genome sequencing data from 2,658 cancers across 38 tumor types, we performed a multidimensional, integrated characterization of mitochondrial genomes and related RNA sequencing data. Our analysis presents the most definitive mutational landscape of mitochondrial genomes and identifies several hypermutated cases. Truncating mutations are markedly enriched in kidney, colorectal and thyroid cancers, suggesting oncogenic effects with the activation of signaling pathways. We find frequent somatic nuclear transfers of mitochondrial DNA, some of which disrupt therapeutic target genes. Mitochondrial copy number varies greatly within and across cancers and correlates with clinical variables. Co-expression analysis highlights the function of mitochondrial genes in oxidative phosphorylation, DNA repair and the cell cycle, and shows their connections with clinically actionable genes. Our study lays a foundation for translating mitochondrial biology into clinical applications.
DOI: 10.1038/nature21703
2017
Cited 235 times
Somatic mutations reveal asymmetric cellular dynamics in the early human embryo
Somatic cells acquire mutations throughout the course of an individual's life. Mutations occurring early in embryogenesis are often present in a substantial proportion of, but not all, cells in postnatal humans and thus have particular characteristics and effects. Depending on their location in the genome and the proportion of cells they are present in, these mosaic mutations can cause a wide range of genetic disease syndromes and predispose carriers to cancer. They have a high chance of being transmitted to offspring as de novo germline mutations and, in principle, can provide insights into early human embryonic cell lineages and their contributions to adult tissues. Although it is known that gross chromosomal abnormalities are remarkably common in early human embryos, our understanding of early embryonic somatic mutations is very limited. Here we use whole-genome sequences of normal blood from 241 adults to identify 163 early embryonic mutations. We estimate that approximately three base substitution mutations occur per cell per cell-doubling event in early human embryogenesis and these are mainly attributable to two known mutational signatures. We used the mutations to reconstruct developmental lineages of adult cells and demonstrate that the two daughter cells of many early embryonic cell-doubling events contribute asymmetrically to adult blood at an approximately 2:1 ratio. This study therefore provides insights into the mutation rates, mutational processes and developmental outcomes of cell dynamics that operate during early human embryogenesis.
DOI: 10.1038/ng.3756
2017
Cited 228 times
Precision oncology for acute myeloid leukemia using a knowledge bank approach
Peter Campbell, Hartmut Döhner and colleagues present an analysis of genetic mutations and clinical information from 1,540 patients with acute myeloid leukemia, demonstrating the utility of clinical knowledge banks for personalized medicine. They show that use of their approach could reduce the number of hematopoietic cell transplants in patients with AML by up to 25% while maintaining survival rates. Underpinning the vision of precision medicine is the concept that causative mutations in a patient's cancer drive its biology and, by extension, its clinical features and treatment response. However, considerable between-patient heterogeneity in driver mutations complicates evidence-based personalization of cancer care. Here, by reanalyzing data from 1,540 patients with acute myeloid leukemia (AML), we explore how large knowledge banks of matched genomic–clinical data can support clinical decision-making. Inclusive, multistage statistical models accurately predicted likelihoods of remission, relapse and mortality, which were validated using data from independent patients in The Cancer Genome Atlas. Comparison of long-term survival probabilities under different treatments enables therapeutic decision support, which is available in exploratory form online. Personally tailored management decisions could reduce the number of hematopoietic cell transplants in patients with AML by 20–25% while maintaining overall survival rates. Power calculations show that databases require information from thousands of patients for accurate decision support. Knowledge banks facilitate personally tailored therapeutic decisions but require sustainable updating, inclusive cohorts and large sample sizes.
DOI: 10.1038/s41586-022-04618-z
2022
Cited 227 times
Somatic mutation rates scale with lifespan across mammals
The rates and patterns of somatic mutation in normal tissues are largely unknown outside of humans1–7. Comparative analyses can shed light on the diversity of mutagenesis across species, and on long-standing hypotheses about the evolution of somatic mutation rates and their role in cancer and ageing. Here we performed whole-genome sequencing of 208 intestinal crypts from 56 individuals to study the landscape of somatic mutation across 16 mammalian species. We found that somatic mutagenesis was dominated by seemingly endogenous mutational processes in all species, including 5-methylcytosine deamination and oxidative damage. With some differences, mutational signatures in other species resembled those described in humans8, although the relative contribution of each signature varied across species. Notably, the somatic mutation rate per year varied greatly across species and exhibited a strong inverse relationship with species lifespan, with no other life-history trait studied showing a comparable association. Despite widely different life histories among the species we examined, including variation of around 30-fold in lifespan and around 40,000-fold in body mass, the somatic mutation burden at the end of lifespan varied only by a factor of around 3. These data unveil common mutational processes across mammals, and suggest that somatic mutation rates are evolutionarily constrained and may be a contributing factor in ageing.
DOI: 10.1126/science.aba8347
2020
Cited 203 times
Extensive heterogeneity in somatic mutation and selection in the human bladder
Genetic profiles of the bladder: Depending on the environment of the individual, the human bladder can be exposed to carcinogens as they are flushed through the body. Lawson et al. and Li et al. examined the genetic composition of laser-dissected microbiopsies from normal and cancer cells collected from the urothelium, a specialized epithelium lining the lower urinary tract (see the Perspective by Rozen). These complementary studies identified the mutational landscape of bladder urothelium through various sequencing strategies and identified high mutational heterogeneity within and between individuals and tumors. Both studies identified mutational profiles related to specific carcinogens such as aristolochic acid and the molecules found in tobacco. These studies present a comprehensive description of the diverse mutational landscape of the human bladder in health and disease, unraveling positive selection for cancer-causing mutations, a diversity of mutational processes, and large differences across individuals. Science, this issue p. 75, p. 82; see also p. 34
DOI: 10.1038/s41591-020-1072-4
2020
Cited 201 times
Whole genome, transcriptome and methylome profiling enhances actionable target discovery in high-risk pediatric cancer
DOI: 10.1038/s41586-021-03822-7
2021
Cited 195 times
The mutational landscape of human somatic and germline cells
Over the course of an individual's lifetime, normal human cells accumulate mutations1. Here we compare the mutational landscape in 29 cell types from the soma and germline using multiple samples from the same individuals. Two ubiquitous mutational signatures, SBS1 and SBS5/40, accounted for the majority of acquired mutations in most cell types, but their absolute and relative contributions varied substantially. SBS18, which potentially reflects oxidative damage2, and several additional signatures attributed to exogenous and endogenous exposures contributed mutations to subsets of cell types. The rate of mutation was lowest in spermatogonia, the stem cells from which sperm are generated and from which most genetic variation in the human population is thought to originate. This was due to low rates of ubiquitous mutational processes and may be partially attributable to a low rate of cell division in basal spermatogonia. These results highlight similarities and differences in the maintenance of the germline and soma.
DOI: 10.1038/nature10995
2012
Cited 194 times
Evidence of non-random mutation rates suggests an evolutionary risk management strategy
DOI: 10.1038/s41467-019-11680-1
2019
Cited 194 times
Genomic landscape and chronological reconstruction of driver events in multiple myeloma
The multiple myeloma (MM) genome is heterogeneous and evolves through preclinical and post-diagnosis phases. Here we report a catalog and hierarchy of driver lesions using sequences from 67 MM genomes serially collected from 30 patients together with public exome datasets. Bayesian clustering defines at least 7 genomic subgroups with distinct sets of co-operating events. Focusing on whole genome sequencing data, complex structural events emerge as major drivers, including chromothripsis and a novel replication-based mechanism of templated insertions, which typically occur early. Hyperdiploidy also occurs early, with individual trisomies often acquired in different chronological windows during evolution, and with a preferred order of acquisition. Conversely, positively selected point mutations, whole genome duplication and chromoplexy events occur in later disease phases. Thus, initiating driver events, drawn from a limited repertoire of structural and numerical chromosomal changes, shape preferred trajectories of evolution that are biologically relevant but heterogeneous across patients.
DOI: 10.1038/ncomms15936
2017
Cited 190 times
Recurrent mutation of IGF signalling genes and distinct patterns of genomic rearrangement in osteosarcoma
Osteosarcoma is a primary malignancy of bone that affects children and adults. Here, we present the largest sequencing study of osteosarcoma to date, comprising 112 childhood and adult tumours encompassing all major histological subtypes. A key finding of our study is the identification of mutations in insulin-like growth factor (IGF) signalling genes in 8/112 (7%) of cases. We validate this observation using fluorescence in situ hybridization (FISH) in an additional 87 osteosarcomas, with IGF1 receptor (IGF1R) amplification observed in 14% of tumours. These findings may inform patient selection in future trials of IGF1R inhibitors in osteosarcoma. Analysing patterns of mutation, we identify distinct rearrangement profiles including a process characterized by chromothripsis and amplification. This process operates recurrently at discrete genomic regions and generates driver mutations. It may represent an age-independent mutational mechanism that contributes to the development of osteosarcoma in children and adults alike.
DOI: 10.1038/s41588-018-0086-z
2018
Cited 185 times
Sequencing of prostate cancers identifies new cancer genes, routes of progression and drug targets
Prostate cancer represents a substantial clinical challenge because it is difficult to predict outcome and advanced disease is often fatal. We sequenced the whole genomes of 112 primary and metastatic prostate cancer samples. From joint analysis of these cancers with those from previous studies (930 cancers in total), we found evidence for 22 previously unidentified putative driver genes harboring coding mutations, as well as evidence for NEAT1 and FOXA1 acting as drivers through noncoding mutations. Through the temporal dissection of aberrations, we identified driver mutations specifically associated with steps in the progression of prostate cancer, establishing, for example, loss of CHD1 and BRCA2 as early events in cancer development of ETS fusion-negative cancers. Computational chemogenomic (canSAR) analysis of prostate cancer mutations identified 11 targets of approved drugs, 7 targets of investigational drugs, and 62 targets of compounds that may be active and should be considered candidates for future clinical trials.
DOI: 10.1038/s41586-022-04786-y
2022
Cited 175 times
Clonal dynamics of haematopoiesis across the human lifespan
Age-related change in human haematopoiesis causes reduced regenerative capacity1, cytopenias2, immune dysfunction3 and increased risk of blood cancer4–6, but the reason for such abrupt functional decline after 70 years of age remains unclear. Here we sequenced 3,579 genomes from single cell-derived colonies of haematopoietic cells across 10 human subjects from 0 to 81 years of age. Haematopoietic stem cells or multipotent progenitors (HSC/MPPs) accumulated a mean of 17 mutations per year after birth and lost 30 base pairs per year of telomere length. Haematopoiesis in adults less than 65 years of age was massively polyclonal, with high clonal diversity and a stable population of 20,000–200,000 HSC/MPPs contributing evenly to blood production. By contrast, haematopoiesis in individuals aged over 75 showed profoundly decreased clonal diversity. In each of the older subjects, 30–60% of haematopoiesis was accounted for by 12–18 independent clones, each contributing 1–34% of blood production. Most clones had begun their expansion before the subject was 40 years old, but only 22% had known driver mutations. Genome-wide selection analysis estimated that between 1 in 34 and 1 in 12 non-synonymous mutations were drivers, accruing at constant rates throughout life, affecting more genes than identified in blood cancers. Loss of the Y chromosome conferred selective benefits in males. Simulations of haematopoiesis, with constant stem cell population size and constant acquisition of driver mutations conferring moderate fitness benefits, entirely explained the abrupt change in clonal structure in the elderly. Rapidly decreasing clonal diversity is a universal feature of haematopoiesis in aged humans, underpinned by pervasive positive selection acting on many more genes than currently identified. 
Haematopoiesis has high clonal diversity up to about 65 years of age, after which diversity drops precipitously owing to positive selection acting on a handful of clones that expand exponentially throughout adulthood.
DOI: 10.1038/s41467-018-05058-y
2018
Cited 162 times
Genomic patterns of progression in smoldering multiple myeloma
We analyzed whole genomes of unique paired samples from smoldering multiple myeloma (SMM) patients progressing to multiple myeloma (MM). We report that the genomic landscape, including mutational profile and structural rearrangements, at the smoldering stage is very similar to MM. Paired sample analysis shows two different patterns of progression: a “static progression model”, where the subclonal architecture is retained as the disease progressed to MM, suggesting that progression solely reflects the time needed to accumulate a sufficient disease burden; and a “spontaneous evolution model”, where a change in the subclonal composition is observed. We also observe that activation-induced cytidine deaminase plays a major role in shaping the mutational landscape of early subclinical phases, while progression is driven by APOBEC cytidine deaminases. These results provide a unique insight into myelomagenesis with potential implications for the definition of smoldering disease and timing of treatment initiation.
DOI: 10.1038/s41586-022-04785-z
2022
Cited 150 times
The longitudinal dynamics and natural history of clonal haematopoiesis
Clonal expansions driven by somatic mutations become pervasive across human tissues with age, including in the haematopoietic system, where the phenomenon is termed clonal haematopoiesis [1–4]. The understanding of how and when clonal haematopoiesis develops, the factors that govern its behaviour, how it interacts with ageing and how these variables relate to malignant progression remains limited [5,6]. Here we track 697 clonal haematopoiesis clones from 385 individuals 55 years of age or older over a median of 13 years. We find that 92.4% of clones expanded at a stable exponential rate over the study period, with different mutations driving substantially different growth rates, ranging from 5% (DNMT3A and TP53) to more than 50% per year (SRSF2 P95H). Growth rates of clones with the same mutation differed by approximately ±5% per year, proportionately affecting slow drivers more substantially. By combining our time-series data with phylogenetic analysis of 1,731 whole-genome sequences of haematopoietic colonies from 7 individuals from an older age group, we reveal distinct patterns of lifelong clonal behaviour. DNMT3A-mutant clones preferentially expanded early in life and displayed slower growth in old age, in the context of an increasingly competitive oligoclonal landscape. By contrast, splicing gene mutations drove expansion only later in life, whereas TET2-mutant clones emerged across all ages. Finally, we show that mutations driving faster clonal growth carry a higher risk of malignant progression. Our findings characterize the lifelong natural history of clonal haematopoiesis and give fundamental insights into the interactions between somatic mutation, ageing and clonal selection.
DOI: 10.1038/s42003-019-0741-7
2020
Cited 146 times
Cancer LncRNA Census reveals evidence for deep functional conservation of long noncoding RNAs in tumorigenesis
Long non-coding RNAs (lncRNAs) are a growing focus of cancer genomics studies, creating the need for a resource of lncRNAs with validated cancer roles. Furthermore, it remains debated whether mutated lncRNAs can drive tumorigenesis, and whether such functions could be conserved during evolution. Here, as part of the ICGC/TCGA Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium, we introduce the Cancer LncRNA Census (CLC), a compilation of 122 GENCODE lncRNAs with causal roles in cancer phenotypes. In contrast to existing databases, CLC requires strong functional or genetic evidence. CLC genes are enriched amongst driver genes predicted from somatic mutations, and display characteristic genomic features. Strikingly, CLC genes are enriched for driver mutations from unbiased, genome-wide transposon-mutagenesis screens in mice. We identified 10 tumour-causing mutations in orthologues of 8 lncRNAs, including LINC-PINT and NEAT1, but not MALAT1. Thus CLC represents a dataset of high-confidence cancer lncRNAs. Mutagenesis maps are a novel means for identifying deeply-conserved roles of lncRNAs in tumorigenesis.
DOI: 10.1038/s41467-020-15912-7
2020
Cited 143 times
Mutational signatures are jointly shaped by DNA damage and repair
Cells possess an armamentarium of DNA repair pathways to counter DNA damage and prevent mutation. Here we use C. elegans whole genome sequencing to systematically quantify the contributions of these factors to mutational signatures. We analyse 2,717 genomes from wild-type and 53 DNA repair defective backgrounds, exposed to 11 genotoxins, including UV-B and ionizing radiation, alkylating compounds, aristolochic acid, aflatoxin B1, and cisplatin. Combined genotoxic exposure and DNA repair deficiency alters mutation rates or signatures in 41% of experiments, revealing how different DNA alterations induced by the same genotoxin are mended by separate repair pathways. Error-prone translesion synthesis causes the majority of genotoxin-induced base substitutions, but averts larger deletions. Nucleotide excision repair prevents up to 99% of point mutations, almost uniformly across the mutation spectrum. Our data show that mutational signatures are joint products of DNA damage and repair and suggest that multiple factors underlie signatures observed in cancer genomes.
DOI: 10.7554/elife.66857
2021
Cited 128 times
Patterns of within-host genetic diversity in SARS-CoV-2
Monitoring the spread of SARS-CoV-2 and reconstructing transmission chains has become a major public health focus for many governments around the world. The modest mutation rate and rapid transmission of SARS-CoV-2 prevent the reconstruction of transmission chains from consensus genome sequences, but within-host genetic diversity could theoretically help identify close contacts. Here we describe the patterns of within-host diversity in 1,181 SARS-CoV-2 samples sequenced to high depth in duplicate. 95.1% of samples show within-host mutations at detectable allele frequencies. Analyses of the mutational spectra revealed strong strand asymmetries suggestive of damage or RNA editing of the plus strand, rather than replication errors, dominating the accumulation of mutations during the SARS-CoV-2 pandemic. Within- and between-host diversity show strong purifying selection, particularly against nonsense mutations. Recurrent within-host mutations, many of which coincide with known phylogenetic homoplasies, display a spectrum and patterns of purifying selection more suggestive of mutational hotspots than recombination or convergent evolution. While allele frequencies suggest that most samples result from infection by a single lineage, we identify multiple putative examples of co-infection. Integrating these results into an epidemiological inference framework, we find that while sharing of within-host variants between samples could help the reconstruction of transmission chains, mutational hotspots and rare cases of superinfection can confound these analyses.
The COVID-19 pandemic has had major health impacts across the globe. The scientific community has focused much attention on finding ways to monitor how the virus responsible for the pandemic, SARS-CoV-2, spreads. One option is to perform genetic tests, known as sequencing, on SARS-CoV-2 samples to determine the genetic code of the virus and to find any differences or mutations in the genes between the viral samples.
Viruses mutate within their hosts and can develop into variants that are able to more easily transmit between hosts. Genetic sequencing can reveal how genetically similar two SARS-CoV-2 samples are. But tracking how SARS-CoV-2 moves from one person to the next through sequencing can be tricky. Even a sample of SARS-CoV-2 viruses from the same individual can display differences in their genetic material or within-host variants. Could genetic testing of within-host variants shed light on factors driving SARS-CoV-2 to evolve in humans? To get to the bottom of this, Tonkin-Hill, Martincorena et al. probed the genetics of SARS-CoV-2 within-host variants using 1,181 samples. The analyses revealed that 95.1% of samples contained within-host variants. A number of variants occurred frequently in many samples, which were consistent with mutational hotspots in the SARS-CoV-2 genome. In addition, within-host variants displayed mutation patterns that were similar to patterns found between infected individuals. The shared within-host variants between samples can help to reconstruct transmission chains. However, the observed mutational hotspots and the detection of multiple strains within an individual can make this challenging. These findings could be used to help predict how SARS-CoV-2 evolves in response to interventions such as vaccines. They also suggest that caution is needed when using information on within-host variants to determine transmission between individuals.
DOI: 10.1038/s41467-019-13983-9
2020
Cited 126 times
Integrative pathway enrichment analysis of multivariate omics data
Multi-omics datasets represent distinct aspects of the central dogma of molecular biology. Such high-dimensional molecular profiles pose challenges to data interpretation and hypothesis generation. ActivePathways is an integrative method that discovers significantly enriched pathways across multiple datasets using statistical data fusion, rationalizes contributing evidence and highlights associated genes. As part of the ICGC/TCGA Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium, which aggregated whole genome sequencing data from 2658 cancers across 38 tumor types, we integrated genes with coding and non-coding mutations and revealed frequently mutated pathways and additional cancer genes with infrequent mutations. We also analyzed prognostic molecular pathways by integrating genomic and transcriptomic features of 1780 breast cancers and highlighted associations with immune response and anti-apoptotic signaling. Integration of ChIP-seq and RNA-seq data for master regulators of the Hippo pathway across normal human tissues identified processes of tissue regeneration and stem cell regulation. ActivePathways is a versatile method that improves systems-level understanding of cellular organization in health and disease through integration of multiple molecular datasets and pathway annotations.
DOI: 10.1016/j.cell.2020.06.036
2020
Cited 125 times
Somatic Evolution in Non-neoplastic IBD-Affected Colon
Inflammatory bowel disease (IBD) is a chronic inflammatory disease associated with increased risk of gastrointestinal cancers. We whole-genome sequenced 446 colonic crypts from 46 IBD patients and compared these to 412 crypts from 41 non-IBD controls from our previous publication on the mutation landscape of the normal colon. The average mutation rate of affected colonic epithelial cells is 2.4-fold that of healthy colon, and this increase is mostly driven by acceleration of mutational processes ubiquitously observed in normal colon. In contrast to the normal colon, where clonal expansions outside the confines of the crypt are rare, we observed widespread millimeter-scale clonal expansions. We discovered non-synonymous mutations in ARID1A, FBXW7, PIGR, ZC3H12A, and genes in the interleukin 17 and Toll-like receptor pathways, under positive selection in IBD. These results suggest distinct selection mechanisms in the colitis-affected colon and that somatic mutations potentially play a causal role in IBD pathogenesis.
DOI: 10.1016/j.xgen.2022.100179
2022
Cited 114 times
Uncovering novel mutational signatures by de novo extraction with SigProfilerExtractor
Mutational signature analysis is commonly performed in cancer genomic studies. Here, we present SigProfilerExtractor, an automated tool for de novo extraction of mutational signatures, and benchmark it against another 13 bioinformatics tools by using 34 scenarios encompassing 2,500 simulated signatures found in 60,000 synthetic genomes and 20,000 synthetic exomes. For simulations with 5% noise, reflecting high-quality datasets, SigProfilerExtractor outperforms other approaches by elucidating between 20% and 50% more true-positive signatures while yielding 5-fold fewer false-positive signatures. Applying SigProfilerExtractor to 4,643 whole-genome- and 19,184 whole-exome-sequenced cancers reveals four novel signatures. Two of the signatures are confirmed in independent cohorts, and one of these signatures is associated with tobacco smoking. In summary, this report provides a reference tool for analysis of mutational signatures, a comprehensive benchmarking of bioinformatics tools for extracting signatures, and several novel mutational signatures, including one putatively attributed to direct tobacco smoking mutagenesis in bladder tissues.
DOI: 10.1038/s41588-020-0624-3
2020
Cited 108 times
Spatial competition shapes the dynamic mutational landscape of normal esophageal epithelium
During aging, progenitor cells acquire mutations, which may generate clones that colonize the surrounding tissue. By middle age, normal human tissues, including the esophageal epithelium (EE), become a patchwork of mutant clones. Despite their relevance for understanding aging and cancer, the processes that underpin mutational selection in normal tissues remain poorly understood. Here, we investigated this issue in the esophageal epithelium of mutagen-treated mice. Deep sequencing identified numerous mutant clones with multiple genes under positive selection, including Notch1, Notch2 and Trp53, which are also selected in human esophageal epithelium. Transgenic lineage tracing revealed strong clonal competition that evolved over time. Clone dynamics were consistent with a simple model in which the proliferative advantage conferred by positively selected mutations depends on the nature of the neighboring cells. When clones with similar competitive fitness collide, mutant cell fate reverts towards homeostasis, a constraint that explains how selection operates in normal-appearing epithelium. Deep sequencing and lineage tracing analysis of esophageal epithelium of mutagen-treated aging mice leads to a model in which the proliferative advantage of positively selected mutations depends on the competitive fitness of neighboring cells.
DOI: 10.1038/s41588-021-00930-y
2021
Cited 93 times
Increased somatic mutation burdens in normal human cells due to defective DNA polymerases
Mutation accumulation in somatic cells contributes to cancer development and is proposed as a cause of aging. DNA polymerases Pol ε and Pol δ replicate DNA during cell division. However, in some cancers, defective proofreading due to acquired POLE/POLD1 exonuclease domain mutations causes markedly elevated somatic mutation burdens with distinctive mutational signatures. Germline POLE/POLD1 mutations cause familial cancer predisposition. Here, we sequenced normal tissue and tumor DNA from individuals with germline POLE/POLD1 mutations. Increased mutation burdens with characteristic mutational signatures were found in normal adult somatic cell types, during early embryogenesis and in sperm. Thus human physiology can tolerate ubiquitously elevated mutation burdens. Except for increased cancer risk, individuals with germline POLE/POLD1 mutations do not exhibit overt features of premature aging. These results do not support a model in which all features of aging are attributable to widespread cell malfunction directly resulting from somatic mutation burdens accrued during life.
DOI: 10.1038/s41586-021-03974-6
2021
Cited 91 times
Convergent somatic mutations in metabolism genes in chronic liver disease
The progression of chronic liver disease to hepatocellular carcinoma is caused by the acquisition of somatic mutations that affect 20-30 cancer genes [1-8]. Burdens of somatic mutations are higher and clonal expansions larger in chronic liver disease [9-13] than in normal liver [13-16], which enables positive selection to shape the genomic landscape [9-13]. Here we analysed somatic mutations from 1,590 genomes across 34 liver samples, including healthy controls, alcohol-related liver disease and non-alcoholic fatty liver disease. Seven of the 29 patients with liver disease had mutations in FOXO1, the major transcription factor in insulin signalling. These mutations affected a single hotspot within the gene, impairing the insulin-mediated nuclear export of FOXO1. Notably, six of the seven patients with FOXO1 S22W hotspot mutations showed convergent evolution, with variants acquired independently by up to nine distinct hepatocyte clones per patient. CIDEB, which regulates lipid droplet metabolism in hepatocytes [17-19], and GPAM, which produces storage triacylglycerol from free fatty acids [20,21], also had a significant excess of mutations. We again observed frequent convergent evolution: up to fourteen independent clones per patient with CIDEB mutations and up to seven clones per patient with GPAM mutations. Mutations in metabolism genes were distributed across multiple anatomical segments of the liver, increased clone size and were seen in both alcohol-related liver disease and non-alcoholic fatty liver disease, but rarely in hepatocellular carcinoma. Master regulators of metabolic pathways are a frequent target of convergent somatic mutation in alcohol-related and non-alcoholic fatty liver disease.
DOI: 10.1038/s41586-021-03790-y
2021
Cited 89 times
Extensive phylogenies of human development inferred from somatic mutations
Starting from the zygote, all cells in the human body continuously acquire mutations. Mutations shared between different cells imply a common progenitor and are thus naturally occurring markers for lineage tracing [1,2]. Here we reconstruct extensive phylogenies of normal tissues from three adult individuals using whole-genome sequencing of 511 laser capture microdissections. Reconstructed embryonic progenitors in the same generation of a phylogeny often contribute to different extents to the adult body. The degree of this asymmetry varies between individuals, with ratios between the two reconstructed daughter cells of the zygote ranging from 60:40 to 93:7. Asymmetries pervade subsequent generations and can differ between tissues in the same individual. The phylogenies resolve the spatial embryonic patterning of tissues, revealing contiguous patches of, on average, 301 crypts in the adult colonic epithelium derived from a most recent embryonic cell and also a spatial effect in brain development. Using data from ten additional men, we investigated the developmental split between soma and germline, with results suggesting an extraembryonic contribution to primordial germ cells. This research demonstrates that, despite reaching the same ultimate tissue patterns, early bottlenecks and lineage commitments lead to substantial variation in embryonic patterns both within and between individuals.
DOI: 10.1126/science.1247167
2014
Cited 156 times
Transmissible Dog Cancer Genome Reveals the Origin and History of an Ancient Cell Lineage
Canine transmissible venereal tumor (CTVT) is the oldest known somatic cell lineage. It is a transmissible cancer that propagates naturally in dogs. We sequenced the genomes of two CTVT tumors and found that CTVT has acquired 1.9 million somatic substitution mutations and bears evidence of exposure to ultraviolet light. CTVT is remarkably stable and lacks subclonal heterogeneity despite thousands of rearrangements, copy-number changes, and retrotransposon insertions. More than 10,000 genes carry nonsynonymous variants, and 646 genes have been lost. CTVT first arose in a dog with low genomic heterozygosity that may have lived about 11,000 years ago. The cancer spawned by this individual dispersed across continents about 500 years ago. Our results provide a genetic identikit of an ancient dog and demonstrate the robustness of mammalian somatic cells to survive for millennia despite a massive mutation burden.
DOI: 10.1016/j.cell.2018.06.001
2018
Cited 116 times
Universal Patterns of Selection in Cancer and Somatic Tissues
(Cell 171, 1029–1041.e1–e15; November 16, 2017) It has come to our attention that in the Results and Discussion of the above article, we neglected to cite Davoli et al., 2013 (Davoli T., Xu A.W., Mengwasser K.E., Sack L.M., Yoon J.C., Park P.J., Elledge S.J. Cumulative haploinsufficiency and triplosensitivity drive aneuploidy patterns and shape the cancer genome. Cell. 2013; 155: 948-962). This paper identified several of the cancer driver genes we mentioned in our paper and provided estimates of the number of genes under positive selection in cancer. The text and references in the online version of our paper have been corrected accordingly. We apologize for the omission and any inconvenience it may have caused to the scientific community.
Original article: Universal Patterns of Selection in Cancer and Somatic Tissues. Martincorena et al., Cell, October 19, 2017. In Brief: Adapting an evolutionary genomics approach to cancer highlights a limited impact of negative selection on cancer genomes and significant variations in the proportion of coding driver mutations per tumor among different tumor types.
DOI: 10.1038/s41467-017-01026-0
2017
Cited 115 times
The driver landscape of sporadic chordoma
Chordoma is a malignant, often incurable bone tumour showing notochordal differentiation. Here, we defined the somatic driver landscape of 104 cases of sporadic chordoma. We reveal somatic duplications of the notochordal transcription factor brachyury (T) in up to 27% of cases. These variants recapitulate the rearrangement architecture of the pathogenic germline duplications of T that underlie familial chordoma. In addition, we find potentially clinically actionable PI3K signalling mutations in 16% of cases. Intriguingly, one of the most frequently altered genes, mutated exclusively by inactivating mutation, was LYST (10%), which may represent a novel cancer gene in chordoma.
Chordoma is a rare, often incurable malignant bone tumour. Here, the authors investigate driver mutations of sporadic chordoma in 104 cases, revealing duplications in notochordal transcription factor brachyury (T), PI3K signalling mutations, and mutations in LYST, a potential novel cancer gene in chordoma.
DOI: 10.1038/ng.2846
2013
Cited 114 times
Inactivating CUX1 mutations promote tumorigenesis
David Adams and colleagues identify inactivating mutations in CUX1 in diverse human cancers. They validate CUX1 as a tumor suppressor using mouse and Drosophila cancer models, and show that CUX1 deficiency activates phosphoinositide 3-kinase signaling through transcriptional downregulation of a PI3K inhibitor. A major challenge in cancer genetics is to determine which low-frequency somatic mutations are drivers of tumorigenesis. Here we interrogate the genomes of 7,651 diverse human cancers and find inactivating mutations in the homeodomain transcription factor gene CUX1 (cut-like homeobox 1) in ∼1–5% of various tumors. Meta-analysis of CUX1 mutational status in 2,519 cases of myeloid malignancies reveals disruptive mutations associated with poor survival, highlighting the clinical significance of CUX1 loss. In parallel, we validate CUX1 as a bona fide tumor suppressor using mouse transposon-mediated insertional mutagenesis and Drosophila cancer models. We demonstrate that CUX1 deficiency activates phosphoinositide 3-kinase (PI3K) signaling through direct transcriptional downregulation of the PI3K inhibitor PIK3IP1 (phosphoinositide-3-kinase interacting protein 1), leading to increased tumor growth and susceptibility to PI3K-AKT inhibition. Thus, our complementary approaches identify CUX1 as a pan-driver of tumorigenesis and uncover a potential strategy for treating CUX1-mutant tumors.
DOI: 10.1007/s10339-010-0372-x
2010
Cited 110 times
The semantic organization of the animal category: evidence from semantic verbal fluency and network theory
DOI: 10.1126/science.aax1323
2019
Cited 105 times
Embryonal precursors of Wilms tumor
Adult cancers often arise from premalignant clonal expansions. Whether the same is true of childhood tumors has been unclear. To investigate whether Wilms tumor (nephroblastoma; a childhood kidney cancer) develops from a premalignant background, we examined the phylogenetic relationship between tumors and corresponding normal tissues. In 14 of 23 cases studied (61%), we found premalignant clonal expansions in morphologically normal kidney tissues that preceded tumor development. These clonal expansions were defined by somatic mutations shared between tumor and normal tissues but absent from blood cells. We also found hypermethylation of the H19 locus, a known driver of Wilms tumor development, in 58% of the expansions. Phylogenetic analyses of bilateral tumors indicated that clonal expansions can evolve before the divergence of left and right kidney primordia. These findings reveal embryonal precursors from which unilateral and multifocal cancers develop.
DOI: 10.1186/s13073-019-0648-4
2019
Cited 102 times
Somatic mutation and clonal expansions in human tissues
Recent sequencing studies on healthy skin and esophagus have found that, as we age, these tissues become colonized by mutant clones of cells carrying driver mutations in traditional cancer genes. This comment summarizes these findings and discusses their possible implications for our understanding of cancer, ageing, and other diseases.
DOI: 10.1038/s41467-018-08081-1
2019
Cited 101 times
Cross-species genomic landscape comparison of human mucosal melanoma with canine oral and equine melanoma
Mucosal melanoma is a rare and poorly characterized subtype of human melanoma. Here we perform a cross-species analysis by sequencing tumor-germline pairs from 46 primary human mucosal, 65 primary canine oral and 28 primary equine melanoma cases from mucosal sites. Analysis of these data reveals recurrently mutated driver genes shared between species such as NRAS, FAT4, PTPRJ, TP53 and PTEN, and pathogenic germline alleles of BRCA1, BRCA2 and TP53. We identify a UV mutation signature in a small number of samples, including human cases from the lip and nasal mucosa. A cross-species comparative analysis of recurrent copy number alterations identifies several candidate drivers including MDM2, B2M, KNSTRN and BUB1B. Comparison of somatic mutations in recurrences and metastases to those in the primary tumor suggests pervasive intra-tumor heterogeneity. Collectively, these studies suggest a convergence of some genetic changes in mucosal melanomas between species but also distinctly different paths to tumorigenesis.
DOI: 10.1073/pnas.1803155115
2018
Cited 91 times
Cancer-mutation network and the number and specificity of driver mutations
Cancer genomics has produced extensive information on cancer-associated genes, but the number and specificity of cancer-driver mutations remains a matter of debate. We constructed a bipartite network in which 7,665 tumors from 30 cancer types are connected via shared mutations in 198 previously identified cancer genes. We show that about 27% of the tumors can be assigned to statistically supported modules, most of which encompass one or two cancer types. The rest of the tumors belong to a diffuse network component suggesting lower gene specificity of driver mutations. Linear regression of the mutational loads in cancer genes was used to estimate the number of drivers required for the onset of different cancers. The mean number of drivers in known cancer genes is approximately two, with a range of one to five. Cancers that are associated with modules had more drivers than those from the diffuse network component, suggesting that unidentified and/or interchangeable drivers exist in the latter.
DOI: 10.1038/s41467-019-13824-9
2020
Cited 90 times
Genomic footprints of activated telomere maintenance mechanisms in cancer
Cancers require telomere maintenance mechanisms for unlimited replicative potential. They achieve this through TERT activation or alternative telomere lengthening associated with ATRX or DAXX loss. Here, as part of the ICGC/TCGA Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium, we dissect whole-genome sequencing data of over 2500 matched tumor-control samples from 36 different tumor types to characterize the genomic footprints of these mechanisms. While the telomere content of tumors with ATRX or DAXX mutations (ATRX/DAXXtrunc) is increased, tumors with TERT modifications show a moderate decrease of telomere content. One quarter of all tumor samples contain somatic integrations of telomeric sequences into non-telomeric DNA. This fraction is increased to 80% prevalence in ATRX/DAXXtrunc tumors, which carry an aberrant telomere variant repeat (TVR) distribution as another genomic marker. The latter feature includes enrichment or depletion of the previously undescribed singleton TVRs TTCGGG and TTTGGG, respectively. Our systematic analysis provides new insight into the recurrent genomic alterations associated with telomere maintenance mechanisms in cancer.
DOI: 10.1038/ncomms4644
2014
Cited 89 times
Processed pseudogenes acquired somatically during cancer development
Cancer evolves by mutation, with somatic reactivation of retrotransposons being one such mutational process. Germline retrotransposition can cause processed pseudogenes, but whether this occurs somatically has not been evaluated. Here we screen sequencing data from 660 cancer samples for somatically acquired pseudogenes. We find 42 events in 17 samples, especially non-small cell lung cancer (5/27) and colorectal cancer (2/11). Genomic features mirror those of germline LINE element retrotranspositions, with frequent target-site duplications (67%), consensus TTTTAA sites at insertion points, inverted rearrangements (21%), 5' truncation (74%) and polyA tails (88%). Transcriptional consequences include expression of pseudogenes from UTRs or introns of target genes. In addition, a somatic pseudogene that integrated into the promoter and first exon of the tumour suppressor gene, MGA, abrogated expression from that allele. Thus, formation of processed pseudogenes represents a new class of mutation occurring during cancer development, with potentially diverse functional consequences depending on genomic context.
DOI: 10.1038/leu.2017.345
2017
Cited 86 times
Biological and prognostic impact of APOBEC-induced mutations in the spectrum of plasma cell dyscrasias and multiple myeloma cell lines
DOI: 10.1016/j.ccell.2019.02.002
2019
Cited 85 times
Undifferentiated Sarcomas Develop through Distinct Evolutionary Pathways
Undifferentiated sarcomas (USARCs) of adults are diverse, rare, and aggressive soft tissue cancers. Recent sequencing efforts have confirmed that USARCs exhibit one of the highest burdens of structural aberrations across human cancer. Here, we sought to unravel the molecular basis of the structural complexity in USARCs by integrating DNA sequencing, ploidy analysis, gene expression, and methylation profiling. We identified whole genome duplication as a prevalent and pernicious force in USARC tumorigenesis. Using mathematical deconvolution strategies to unravel the complex copy-number profiles and mutational timing models, we infer distinct evolutionary pathways of these rare cancers. In addition, 15% of tumors exhibited raised mutational burdens that correlated with gene expression signatures of immune infiltration and with good prognosis.
DOI: 10.1038/s41596-020-00437-6
2020
Cited 84 times
Reliable detection of somatic mutations in solid tissues by laser-capture microdissection and low-input DNA sequencing
DOI: 10.1038/s41467-020-14367-0
2020
Cited 75 times
Pathway and network analysis of more than 2500 whole cancer genomes
The catalog of cancer driver mutations in protein-coding genes has greatly expanded in the past decade. However, non-coding cancer driver mutations are less well-characterized and only a handful of recurrent non-coding mutations, most notably TERT promoter mutations, have been reported. Here, as part of the ICGC/TCGA Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium, which aggregated whole genome sequencing data from 2658 cancers across 38 tumor types, we perform multi-faceted pathway and network analyses of non-coding mutations across 2583 whole cancer genomes from 27 tumor types compiled by the PCAWG project, motivated by the success of pathway and network analyses in prioritizing rare mutations in protein-coding genes. While few non-coding genomic elements are recurrently mutated in this cohort, we identify 93 genes harboring non-coding mutations that cluster into several modules of interacting proteins. Among these are promoter mutations associated with reduced mRNA expression in TP53, TLE4, and TCF4. We find that biological processes had variable proportions of coding and non-coding mutations, with chromatin remodeling and proliferation pathways altered primarily by coding mutations, while developmental pathways, including Wnt and Notch, are altered by both coding and non-coding mutations. RNA splicing is primarily altered by non-coding mutations in this cohort, and samples containing non-coding mutations in well-known RNA splicing factors exhibit similar gene expression signatures as samples with coding mutations in these genes. These analyses contribute a new repertoire of possible cancer genes and mechanisms that are altered by non-coding mutations and offer insights into additional cancer vulnerabilities that can be investigated for potential therapeutic treatments.
DOI: 10.1186/s12861-020-0209-5
2020
Cited 68 times
Long-term expansion, genomic stability and in vivo safety of adult human pancreas organoids
Background: Pancreatic organoid systems have recently been described for the in vitro culture of pancreatic ductal cells from mouse and human. Mouse pancreatic organoids exhibit unlimited expansion potential, while previously reported human pancreas organoid (hPO) cultures do not expand efficiently long-term in a chemically defined, serum-free medium. We sought to generate a 3D culture system for long-term expansion of human pancreas ductal cells as hPOs to serve as the basis for studies of human pancreas ductal epithelium, exocrine pancreatic diseases and the development of a genomically stable replacement cell therapy for diabetes mellitus. Results: Our chemically defined, serum-free, human pancreas organoid culture medium supports the generation and expansion of hPOs with high efficiency from both fresh and cryopreserved primary tissue. hPOs can be expanded from a single cell, enabling their genetic manipulation and generation of clonal cultures. hPOs expanded for months in vitro maintain their ductal morphology, biomarker expression and chromosomal integrity. Xenografts of hPOs survive long-term in vivo when transplanted into the pancreas of immunodeficient mice. Notably, mouse orthotopic transplants show no signs of tumorigenicity. Crucially, our medium also supports the establishment and expansion of hPOs in a chemically defined, modifiable and scalable, biomimetic hydrogel. Conclusions: hPOs can be expanded long-term, from both fresh and cryopreserved human pancreas tissue in a chemically defined, serum-free medium with no detectable tumorigenicity. hPOs can be clonally expanded, genetically manipulated and are amenable to culture in a chemically defined hydrogel. hPOs therefore represent an abundant source of pancreas ductal cells that retain the characteristics of the tissue-of-origin, which opens up avenues for modelling diseases of the ductal epithelium and increasing understanding of human pancreas exocrine biology as well as for potentially producing insulin-secreting cells for the treatment of diabetes.
DOI: 10.1126/science.aau9923
2019
Cited 62 times
Somatic evolution and global expansion of an ancient transmissible cancer lineage
It's a dog's life: Canine transmissible venereal tumor is one of the few cancer lineages that is transferred among individuals through contact. It arose millennia ago and has been evolving independently from its hosts ever since. Baez-Ortega et al. looked at the phylogenetic history of the cancer and describe several distinctive mutational patterns (see the Perspective by Maley and Shibata). Most notably, both positive and negative selection show only weak or distant signals. This suggests that the main driver of the lineage's evolution is neutral genetic drift. Understanding the influence of drift may reshape how we think about long-term cancer evolution. Science, this issue p. eaau9923; see also p. 440
DOI: 10.1038/s41588-019-0551-3
2020
Cited 53 times
Genomic evidence supports a clonal diaspora model for metastases of esophageal adenocarcinoma
The poor outcomes in esophageal adenocarcinoma (EAC) prompted us to interrogate the pattern and timing of metastatic spread. Whole-genome sequencing and phylogenetic analysis of 388 samples across 18 individuals with EAC showed, in 90% of patients, that multiple subclones from the primary tumor spread very rapidly from the primary site to form multiple metastases, including lymph nodes and distant tissues, a mode of dissemination that we term 'clonal diaspora'. Metastatic subclones at autopsy were present in tissue and blood samples from earlier time points. These findings have implications for our understanding and clinical evaluation of EAC.
DOI: 10.1038/s41467-022-31341-0
2022
Cited 38 times
Inherited MUTYH mutations cause elevated somatic mutation rates and distinctive mutational signatures in normal human cells
Cellular DNA damage caused by reactive oxygen species is repaired by the base excision repair (BER) pathway which includes the DNA glycosylase MUTYH. Inherited biallelic MUTYH mutations cause predisposition to colorectal adenomas and carcinoma. However, the mechanistic progression from germline MUTYH mutations to MUTYH-Associated Polyposis (MAP) is incompletely understood. Here, we sequence normal tissue DNAs from 10 individuals with MAP. Somatic base substitution mutation rates in intestinal epithelial cells were elevated 2 to 4-fold in all individuals, except for one showing a 31-fold increase, and were also increased in other tissues. The increased mutation burdens were of multiple mutational signatures characterised by C > A changes. Different mutation rates and signatures between individuals are likely due to different MUTYH mutations or additional inherited mutations in other BER pathway genes. The elevated base substitution rate in normal cells likely accounts for the predisposition to neoplasia in MAP. Despite ubiquitously elevated mutation rates, individuals with MAP do not display overt evidence of premature ageing. Thus, accumulation of somatic mutations may not be sufficient to cause the global organismal functional decline of ageing.
DOI: 10.1038/s41586-022-05072-7
2022
Cited 36 times
Diverse mutational landscapes in human lymphocytes
The lymphocyte genome is prone to many threats, including programmed mutation during differentiation, antigen-driven proliferation and residency in diverse microenvironments. Here, after developing protocols for expansion of single-cell lymphocyte cultures, we sequenced whole genomes from 717 normal naive and memory B and T cells and haematopoietic stem cells. All lymphocyte subsets carried more point mutations and structural variants than haematopoietic stem cells, with higher burdens in memory cells than in naive cells, and with T cells accumulating mutations at a higher rate throughout life. Off-target effects of immunological diversification accounted for approximately half of the additional differentiation-associated mutations in lymphocytes. Memory B cells acquired, on average, 18 off-target mutations genome-wide for every on-target IGHV mutation during the germinal centre reaction. Structural variation was 16-fold higher in lymphocytes than in stem cells, with around 15% of deletions being attributable to off-target recombinase-activating gene activity. DNA damage from ultraviolet light exposure and other sporadic mutational processes generated hundreds to thousands of mutations in some memory cells. The mutation burden and signatures of normal B cells were broadly similar to those seen in many B-cell cancers, suggesting that malignant transformation of lymphocytes arises from the same mutational processes that are active across normal ontogeny. The mutational landscape of normal lymphocytes chronicles the off-target effects of programmed genome engineering during immunological diversification and the consequences of differentiation, proliferation and residency in diverse microenvironments.
DOI: 10.1038/s41588-022-01147-3
2022
Cited 35 times
Substantial somatic genomic variation and selection for BCOR mutations in human induced pluripotent stem cells
We explored human induced pluripotent stem cells (hiPSCs) derived from different tissues to gain insights into genomic integrity at single-nucleotide resolution. We used genome sequencing data from two large hiPSC repositories involving 696 hiPSCs and daughter subclones. We find ultraviolet light (UV)-related damage in ~72% of skin fibroblast-derived hiPSCs (F-hiPSCs), occasionally resulting in substantial mutagenesis (up to 15 mutations per megabase). We demonstrate remarkable genomic heterogeneity between independent F-hiPSC clones derived during the same round of reprogramming due to oligoclonal fibroblast populations. In contrast, blood-derived hiPSCs (B-hiPSCs) had fewer mutations and no UV damage but a high prevalence of acquired BCOR mutations (26.9% of lines). We reveal strong selection pressure for BCOR mutations in F-hiPSCs and B-hiPSCs and provide evidence that they arise in vitro. Directed differentiation of hiPSCs and RNA sequencing showed that BCOR mutations have functional consequences. Our work strongly suggests that detailed nucleotide-resolution characterization is essential before using hiPSCs.
DOI: 10.1038/s41588-022-01296-5
2023
Cited 13 times
APOBEC mutagenesis is a common process in normal human small intestine
APOBEC mutational signatures SBS2 and SBS13 are common in many human cancer types. However, there is an incomplete understanding of its stimulus, when it occurs in the progression from normal to cancer cell and the APOBEC enzymes responsible. Here we whole-genome sequenced 342 microdissected normal epithelial crypts from the small intestines of 39 individuals and found that SBS2/SBS13 mutations were present in 17% of crypts, more frequent than most other normal tissues. Crypts with SBS2/SBS13 often had immediate crypt neighbors without SBS2/SBS13, suggesting that the underlying cause of SBS2/SBS13 is cell-intrinsic. APOBEC mutagenesis occurred in an episodic manner throughout the human lifespan, including in young children. APOBEC1 mRNA levels were very high in the small intestine epithelium, but low in the large intestine epithelium and other tissues. The results suggest that the high levels of SBS2/SBS13 in the small intestine are collateral damage from APOBEC1 fulfilling its physiological function of editing APOB mRNA.
DOI: 10.1002/bies.201200150
2012
Cited 77 times
Non‐random mutation: The evolution of targeted hypermutation and hypomutation
A widely accepted tenet of evolutionary biology is that spontaneous mutations occur randomly with regard to their fitness effect. However, since the mutation rate varies along a genome and this variation can be subject to selection, organisms might evolve lower mutation rates at loci where mutations are most deleterious or increased rates where mutations are most needed. In fact, mechanisms of targeted hypermutation are known in organisms ranging from bacteria to humans. Here we review the main forces driving the evolution of local mutation rates and identify the main limiting factors. Both targeted hyper‐ and hypomutation can evolve, although the former is restricted to loci under very frequent positive selection and the latter is severely limited by genetic drift. Nevertheless, we show how an association of repair with transcription or chromatin‐associated proteins could overcome the drift limit and lead to non‐random hypomutation along the genome in most organisms.
DOI: 10.1038/s41588-018-0258-x
2018
Cited 61 times
Neutral tumor evolution?
Williams et al. (Nat. Genet. 48:238-244, 2016) recently reported neutral tumor evolution in one third of 904 samples from The Cancer Genome Atlas. Here, we assess the reproducibility and validity of their method and the extent of positive selection in subclonal mutations across cancer types. Our results do not support observable neutral tumor evolution and uncover strong positive selection within subclonal mutations across cancers.
DOI: 10.1371/journal.pone.0174744
2017
Cited 61 times
GOTHiC, a probabilistic model to resolve complex biases and to identify real interactions in Hi-C data
Hi-C is one of the main methods for investigating spatial co-localisation of DNA in the nucleus. However, the raw sequencing data obtained from Hi-C experiments suffer from large biases and spurious contacts, making it difficult to identify true interactions. Existing methods use complex models to account for biases and do not provide a significance threshold for detecting interactions. Here we introduce a simple binomial probabilistic model that resolves complex biases and distinguishes between true and false interactions. The model corrects biases of known and unknown origin and yields a p-value for each interaction, providing a reliable threshold based on significance. We demonstrate this experimentally by testing the method against a random ligation dataset. Our method outperforms previous methods and provides a statistical framework for further data analysis, such as comparisons of Hi-C interactions between different conditions. GOTHiC is available as a BioConductor package (http://www.bioconductor.org/packages/release/bioc/html/GOTHiC.html).
DOI: 10.1038/s41598-018-31659-0
2018
Cited 52 times
An integrated genomic analysis of anaplastic meningioma identifies prognostic molecular signatures
Anaplastic meningioma is a rare and aggressive brain tumor characterised by intractable recurrences and dismal outcomes. Here, we present an integrated analysis of the whole genome, transcriptome and methylation profiles of primary and recurrent anaplastic meningioma. A key finding was the delineation of distinct molecular subgroups that were associated with diametrically opposed survival outcomes. Relative to lower grade meningiomas, anaplastic tumors harbored frequent driver mutations in SWI/SNF complex genes, which were confined to the poor prognosis subgroup. Aggressive disease was further characterised by transcriptional evidence of increased PRC2 activity, stemness and epithelial-to-mesenchymal transition. Our analyses discern biologically distinct variants of anaplastic meningioma with prognostic and therapeutic significance.
DOI: 10.7554/elife.14552
2016
Cited 51 times
Mitochondrial genetic diversity, selection and recombination in a canine transmissible cancer
Canine transmissible venereal tumour (CTVT) is a clonally transmissible cancer that originated approximately 11,000 years ago and affects dogs worldwide. Despite the clonal origin of the CTVT nuclear genome, CTVT mitochondrial genomes (mtDNAs) have been acquired by periodic capture from transient hosts. We sequenced 449 complete mtDNAs from a global population of CTVTs, and show that mtDNA horizontal transfer has occurred at least five times, delineating five tumour clades whose distributions track two millennia of dog global migration. Negative selection has operated to prevent accumulation of deleterious mutations in captured mtDNA, and recombination has caused occasional mtDNA re-assortment. These findings implicate functional mtDNA as a driver of CTVT global metastatic spread, further highlighting the important role of mtDNA in cancer evolution.
DOI: 10.1038/s41467-019-13929-1
2020
Cited 39 times
Combined burden and functional impact tests for cancer driver discovery using DriverPower
The discovery of driver mutations is one of the key motivations for cancer genome sequencing. Here, as part of the ICGC/TCGA Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium, which aggregated whole genome sequencing data from 2658 cancers across 38 tumour types, we describe DriverPower, a software package that uses mutational burden and functional impact evidence to identify driver mutations in coding and non-coding sites within cancer whole genomes. Using a total of 1373 genomic features derived from public sources, DriverPower’s background mutation model explains up to 93% of the regional variance in the mutation rate across multiple tumour types. By incorporating functional impact scores, we are able to further increase the accuracy of driver discovery. Testing across a collection of 2583 cancer genomes from the PCAWG project, DriverPower identifies 217 coding and 95 non-coding driver candidates. Comparing to six published methods used by the PCAWG Drivers and Functional Interpretation Working Group, DriverPower has the highest F1 score for both coding and non-coding driver discovery. This demonstrates that DriverPower is an effective framework for computational driver discovery.
DOI: 10.1016/j.stem.2021.02.005
2021
Cited 30 times
Development, maturation, and maintenance of human prostate inferred from somatic mutations
Clonal dynamics and mutation burden in healthy human prostate epithelium are relevant to prostate cancer. We sequenced whole genomes from 409 microdissections of normal prostate epithelium across 8 donors, using phylogenetic reconstruction with spatial mapping in a 59-year-old man’s prostate to reconstruct tissue dynamics across the lifespan. Somatic mutations accumulate steadily at ∼16 mutations/year/clone, with higher rates in peripheral than peri-urethral regions. The 24–30 independent glandular subunits are established as rudimentary ductal structures during fetal development by 5–10 embryonic cells each. Puberty induces formation of further side and terminal branches by local stem cells disseminated throughout the rudimentary ducts during development. During adult tissue maintenance, clonal expansions have limited geographic scope and minimal migration. Driver mutations are rare in aging prostate epithelium, but the one driver we did observe generated a sizable intraepithelial clonal expansion. Leveraging unbiased clock-like mutations, we define prostate stem cell dynamics through fetal development, puberty, and aging.
DOI: 10.1038/s41467-022-29920-2
2022
Cited 22 times
Mutational landscape of normal epithelial cells in Lynch Syndrome patients
Lynch Syndrome (LS) is an autosomal dominant disease conferring a high risk of colorectal cancer due to germline heterozygous mutations in a DNA mismatch repair (MMR) gene. Although cancers in LS patients show elevated somatic mutation burdens, information on mutation rates in normal tissues and understanding of the trajectory from normal to cancer cell is limited. Here we whole genome sequence 152 crypts from normal and neoplastic epithelial tissues from 10 LS patients. In normal tissues the repertoire of mutational processes and mutation rates is similar to that found in wild type individuals. A morphologically normal colonic crypt with an increased mutation burden and MMR deficiency-associated mutational signatures is identified, which may represent a very early stage of LS pathogenesis. Phylogenetic trees of tumour crypts indicate that the most recent ancestor cell of each tumour is already MMR deficient and has experienced multiple cycles of clonal evolution. This study demonstrates the genomic stability of epithelial cells with heterozygous germline MMR gene mutations and highlights important differences in the pathogenesis of LS from other colorectal cancer predisposition syndromes.
DOI: 10.1101/190330
2017
Cited 48 times
The whole-genome panorama of cancer drivers
SUMMARY The advance of personalized cancer medicine requires the accurate identification of the mutations driving each patient’s tumor. However, to date, we have only been able to obtain partial insights into the contribution of genomic events to tumor development. Here, we design a comprehensive approach to identify the driver mutations in each patient’s tumor and obtain a whole-genome panorama of driver events across more than 2,500 tumors from 37 types of cancer. This panorama includes coding and non-coding point mutations, copy number alterations and other genomic rearrangements of somatic origin, and potentially predisposing germline variants. We demonstrate that genomic events are at the root of virtually all tumors, with each carrying on average 4.6 driver events. Most individual tumors harbor a unique combination of drivers, and we uncover the most frequent co-occurring driver events. Half of all cancer genes are affected by several types of driver mutations. In summary, the panorama described here provides answers to fundamental questions in cancer genomics and bridges the gap between cancer genomics and personalized cancer medicine.
DOI: 10.1038/s41467-020-14351-8
2020
Cited 37 times
Inferring structural variant cancer cell fraction
We present SVclone, a computational method for inferring the cancer cell fraction of structural variant (SV) breakpoints from whole-genome sequencing data. SVclone accurately determines the variant allele frequencies of both SV breakends, then simultaneously estimates the cancer cell fraction and SV copy number. We assess performance using in silico mixtures of real samples, at known proportions, created from two clonal metastases from the same patient. We find that SVclone’s performance is comparable to single-nucleotide variant-based methods, despite having an order of magnitude fewer data points. As part of the Pan-Cancer Analysis of Whole Genomes (PCAWG) consortium, which aggregated whole-genome sequencing data from 2658 cancers across 38 tumour types, we use SVclone to reveal a subset of liver, ovarian and pancreatic cancers with subclonally enriched copy-number neutral rearrangements that show decreased overall survival. SVclone enables improved characterisation of SV intra-tumour heterogeneity.
DOI: 10.1038/s41467-020-14352-7
2020
Cited 37 times
Reconstructing evolutionary trajectories of mutation signature activities in cancer using TrackSig
The type and genomic context of cancer mutations depend on their causes. These causes have been characterized using signatures that represent mutation types that co-occur in the same tumours. However, it remains unclear how mutation processes change during cancer evolution due to the lack of reliable methods to reconstruct evolutionary trajectories of mutational signature activity. Here, as part of the ICGC/TCGA Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium, which aggregated whole-genome sequencing data from 2658 cancers across 38 tumour types, we present TrackSig, a new method that reconstructs these trajectories using optimal, joint segmentation and deconvolution of mutation type and allele frequencies from a single tumour sample. In simulations, we find TrackSig has a 3–5% activity reconstruction error, and 12% false detection rate. It outperforms an aggressive baseline in situations with branching evolution, CNA gain, and neutral mutations. Applied to data from 2658 tumours and 38 cancer types, TrackSig permits pan-cancer insight into evolutionary changes in mutational processes.
DOI: 10.1101/2022.06.10.22276179
2022
Cited 14 times
Saturation genome editing of DDX3X clarifies pathogenicity of germline and somatic variation
Loss-of-function of DDX3X is a leading cause of neurodevelopmental disorders (NDD) in females. DDX3X is also a somatically mutated cancer driver gene proposed to have tumour promoting and suppressing effects. We performed saturation genome editing of DDX3X, testing in vitro the functional impact of 12,776 nucleotide variants. We identified 3,432 functionally abnormal variants, in three distinct classes. We trained a machine learning classifier to identify functionally abnormal variants of NDD-relevance. This classifier has at least 97% sensitivity and 99% specificity to detect variants pathogenic for NDD, substantially out-performing in silico predictors, and resolving up to 93% of variants of uncertain significance. Moreover, functionally-abnormal variants could account for almost all of the excess nonsynonymous DDX3X somatic mutations seen in DDX3X-driven cancers. Systematic maps of variant effects generated in experimentally tractable cell types have the potential to transform clinical interpretation of both germline and somatic disease-associated variation.
DOI: 10.1038/s41467-023-43041-4
2023
Cited 6 times
Saturation genome editing of DDX3X clarifies pathogenicity of germline and somatic variation
Loss-of-function of DDX3X is a leading cause of neurodevelopmental disorders (NDD) in females. DDX3X is also a somatically mutated cancer driver gene proposed to have tumour promoting and suppressing effects. We perform saturation genome editing of DDX3X, testing in vitro the functional impact of 12,776 nucleotide variants. We identify 3432 functionally abnormal variants, in three distinct classes. We train a machine learning classifier to identify functionally abnormal variants of NDD-relevance. This classifier has at least 97% sensitivity and 99% specificity to detect variants pathogenic for NDD, substantially out-performing in silico predictors, and resolving up to 93% of variants of uncertain significance. Moreover, functionally-abnormal variants can account for almost all of the excess nonsynonymous DDX3X somatic mutations seen in DDX3X-driven cancers. Systematic maps of variant effects generated in experimentally tractable cell types have the potential to transform clinical interpretation of both germline and somatic disease-associated variation.
DOI: 10.1186/gb-2013-14-10-r113
2013
Cited 41 times
The genetic heterogeneity and mutational burden of engineered melanomas in zebrafish models
Melanoma is the most deadly form of skin cancer. Expression of oncogenic BRAF or NRAS, which are frequently mutated in human melanomas, promotes the formation of nevi but is not sufficient for tumorigenesis. Even with germline mutated p53, these engineered melanomas present with variable onset and pathology, implicating additional somatic mutations in a multi-hit tumorigenic process. To decipher the genetics of these melanomas, we sequence the protein coding exons of 53 primary melanomas generated from several BRAF(V600E) or NRAS(Q61K) driven transgenic zebrafish lines. We find that engineered zebrafish melanomas show an overall low mutation burden, which has a strong, inverse association with the number of initiating germline drivers. Although tumors reveal distinct mutation spectrums, they show mostly C > T transitions without UV light exposure, and enrichment of mutations in melanogenesis, p53 and MAPK signaling. Importantly, a recurrent amplification occurring with pre-configured drivers BRAF(V600E) and p53-/- suggests a novel path of BRAF cooperativity through the protein kinase A pathway. This is the first analysis of a melanoma mutational landscape in the absence of UV light, where tumors manifest with remarkably low mutation burden and high heterogeneity. Genotype specific amplification of protein kinase A in cooperation with BRAF and p53 mutation suggests the involvement of melanogenesis in these tumors. This work is important for defining the spectrum of events in BRAF or NRAS driven melanoma in the absence of UV light, and for informed exploitation of models such as transgenic zebrafish to better understand mechanisms leading to human melanoma formation.
DOI: 10.1101/797787
2019
Cited 33 times
Integrating healthcare and research genetic data empowers the discovery of 28 novel developmental disorders
Summary De novo mutations (DNMs) in protein-coding genes are a well-established cause of developmental disorders (DD). However, known DD-associated genes only account for a minority of the observed excess of such DNMs. To identify novel DD-associated genes, we integrated healthcare and research exome sequences on 31,058 DD parent-offspring trios, and developed a simulation-based statistical test to identify gene-specific enrichments of DNMs. We identified 285 significantly DD-associated genes, including 28 not previously robustly associated with DDs. Despite detecting more DD-associated genes than in any previous study, much of the excess of DNMs of protein-coding genes remains unaccounted for. Modelling suggests that over 1,000 novel DD-associated genes await discovery, many of which are likely to be less penetrant than the currently known genes. Research access to clinical diagnostic datasets will be critical for completing the map of dominant DDs.
DOI: 10.1186/s12859-020-03772-3
2020
Cited 28 times
Generating realistic null hypothesis of cancer mutational landscapes using SigProfilerSimulator
Background: Performing a statistical test requires a null hypothesis. In cancer genomics, a key challenge is the fast generation of accurate somatic mutational landscapes that can be used as a realistic null hypothesis for making biological discoveries. Results: Here we present SigProfilerSimulator, a powerful tool that is capable of simulating the mutational landscapes of thousands of cancer genomes at different resolutions within seconds. Applying SigProfilerSimulator to 2144 whole-genome sequenced cancers reveals: (i) that most doublet base substitutions are not due to two adjacent single base substitutions but likely occur as single genomic events; (ii) that an extended sequencing context of ± 2 bp is required to more completely capture the patterns of substitution mutational signatures in human cancer; (iii) information on false-positive discovery rate of commonly used bioinformatics tools for detecting driver genes. Conclusions: SigProfilerSimulator’s breadth of features allows one to construct a tailored null hypothesis and use it for evaluating the accuracy of other bioinformatics tools or for downstream statistical analysis for biological discoveries. SigProfilerSimulator is freely available at https://github.com/AlexandrovLab/SigProfilerSimulator with an extensive documentation at https://osf.io/usxjz/wiki/home/.
DOI: 10.1016/j.xcrm.2021.100472
2021
Cited 21 times
Stage-stratified molecular profiling of non-muscle-invasive bladder cancer enhances biological, clinical, and therapeutic insight
Understanding the molecular determinants that underpin the clinical heterogeneity of non-muscle-invasive bladder cancer (NMIBC) is essential for prognostication and therapy development. Stage T1 disease in particular presents a high risk of progression and requires improved understanding. We present a detailed multi-omics study containing gene expression, copy number, and mutational profiles that show relationships to immune infiltration, disease recurrence, and progression to muscle invasion. We compare expression and genomic subtypes derived from all NMIBCs with those derived from the individual disease stages Ta and T1. We show that sufficient molecular heterogeneity exists within the separate stages to allow subclassification and that this is more clinically meaningful for stage T1 disease than that derived from all NMIBCs. This provides improved biological understanding and identifies subtypes of T1 tumors that may benefit from chemo- or immunotherapy.
DOI: 10.1186/s13059-023-03026-4
2023
Cited 4 times
Cross-species oncogenomics offers insight into human muscle-invasive bladder cancer
In humans, muscle-invasive bladder cancer (MIBC) is highly aggressive and associated with a poor prognosis. With a high mutation load and large number of altered genes, strategies to delineate key driver events are necessary. Dogs and cats develop urothelial carcinoma (UC) with histological and clinical similarities to human MIBC. Cattle that graze on bracken fern also develop UC, associated with exposure to the carcinogen ptaquiloside. These species may represent relevant animal models of spontaneous and carcinogen-induced UC that can provide insight into human MIBC. Whole-exome sequencing of domestic canine (n = 87) and feline (n = 23) UC, and comparative analysis with human MIBC reveals a lower mutation rate in animal cases and the absence of APOBEC mutational signatures. A convergence of driver genes (ARID1A, KDM6A, TP53, FAT1, and NRAS) is discovered, along with common focally amplified and deleted genes involved in regulation of the cell cycle and chromatin remodelling. We identify mismatch repair deficiency in a subset of canine and feline UCs with biallelic inactivation of MSH2. Bovine UC (n = 8) is distinctly different; we identify novel mutational signatures which are recapitulated in vitro in human urinary bladder UC cells treated with bracken fern extracts or purified ptaquiloside. Canine and feline urinary bladder UC represent relevant models of MIBC in humans, and cross-species analysis can identify evolutionarily conserved driver genes. We characterize mutational signatures in bovine UC associated with bracken fern and ptaquiloside exposure, a human-linked cancer exposure. Our work demonstrates the relevance of cross-species comparative analysis in understanding both human and animal UC.
DOI: 10.1080/13803395.2010.499354
2010
Cited 43 times
Lexical access changes in patients with multiple sclerosis: A two-year follow-up study
The aim of the study was to analyze lexical access strategies in patients with multiple sclerosis (MS) and their changes over time. We studied lexical access strategies during semantic and phonemic verbal fluency tests and also confrontation naming in a 2-year prospective cohort of 45 MS patients and 20 healthy controls. At baseline, switching lexical access strategy (both in semantic and in phonemic verbal fluency tests) and confrontation naming were significantly impaired in MS patients compared with controls. After 2 years of follow-up, switching score decreased, and cluster size increased over time in semantic verbal fluency tasks, suggesting a failure in the retrieval of lexical information rather than an impairment of the lexical pool. In conclusion, these findings underline the significant presence of lexical access problems in patients with MS and could point to their key role in the alterations of high-level communication abilities in MS.
DOI: 10.1101/312041
2018
Cited 28 times
Characterizing genetic intra-tumor heterogeneity across 2,658 human cancer genomes
Summary: Intra-tumor heterogeneity (ITH) is a mechanism of therapeutic resistance and therefore an important clinical challenge. However, the extent, origin and drivers of ITH across cancer types are poorly understood. To address this question, we extensively characterize ITH across whole-genome sequences of 2,658 cancer samples, spanning 38 cancer types. Nearly all informative samples (95.1%) contain evidence of distinct subclonal expansions, with frequent branching relationships between subclones. We observe positive selection of subclonal driver mutations across most cancer types, and identify cancer-type-specific subclonal patterns of driver gene mutations, fusions, structural variants and copy-number alterations, as well as dynamic changes in mutational processes between subclonal expansions. Our results underline the importance of ITH and its drivers in tumor evolution, and provide an unprecedented pan-cancer resource of comprehensively annotated subclonal events from whole-genome sequencing data.
DOI: 10.3324/haematol.2020.262659
2020
Cited 21 times
CDKN2A deletion is a frequent event associated with poor outcome in patients with peripheral T-cell lymphoma not otherwise specified (PTCL-NOS)
Nodal peripheral T-cell lymphoma not otherwise specified (PTCL-NOS) remains a diagnosis encompassing a heterogeneous group of PTCL cases not fitting criteria for more homogeneous subtypes. They are characterized by a poor clinical outcome when treated with anthracycline-containing regimens. A better understanding of their biology could improve prognostic stratification and foster the development of novel therapeutic approaches. Recent targeted and whole exome sequencing studies have shown recurrent copy number abnormalities (CNA) with prognostic significance. Here, investigating five formalin-fixed, paraffin-embedded cases of PTCL-NOS by whole genome sequencing, we found a high prevalence of structural variants and complex events, such as chromothripsis, likely responsible for the observed CNA. Among them, CDKN2A and PTEN deletions emerged as the most frequent aberrations, as confirmed in a final cohort of 143 patients with nodal PTCL. The incidence of CDKN2A and PTEN deletions among PTCL-NOS was 46% and 26%, respectively. Furthermore, we found that co-occurrence of CDKN2A and PTEN deletions is an event associated with PTCL-NOS with absolute specificity. In contrast, these deletions are rare and never co-occur in angioimmunoblastic and anaplastic lymphomas. CDKN2A deletion was associated with shorter overall survival in multivariate analysis corrected by age, International Prognostic Index, transplant eligibility and GATA3 expression (adjusted Hazard Ratio = 2.53; 95% Confidence Interval: 1.006-6.3; P = 0.048). These data suggest that CDKN2A deletions may be relevant for refining the prognosis of PTCL-NOS and their significance should be evaluated in prospective trials.