Carsten Russ

Here are all the papers by Carsten Russ that you can download and read on OA.mg.

DOI: 10.1038/nature06008
2007
Cited 3,771 times
Genome-wide maps of chromatin state in pluripotent and lineage-committed cells
We report the application of single-molecule-based sequencing technology for high-throughput profiling of histone modifications in mammalian cells. By obtaining over four billion bases of sequence from chromatin immunoprecipitated DNA, we generated genome-wide chromatin-state maps of mouse embryonic stem cells, neural progenitor cells and embryonic fibroblasts. We find that lysine 4 and lysine 27 trimethylation effectively discriminates genes that are expressed, poised for expression, or stably repressed, and therefore reflect cell state and lineage potential. Lysine 36 trimethylation marks primary coding and non-coding transcripts, facilitating gene annotation. Trimethylation of lysine 9 and lysine 20 is detected at satellite, telomeric and active long-terminal repeats, and can spread into proximal unique sequences. Lysine 4 and lysine 9 trimethylation marks imprinting control regions. Finally, we show that chromatin state can be read in an allele-specific manner by using single nucleotide polymorphisms. This study provides a framework for the application of comprehensive chromatin profiling towards characterization of diverse mammalian cell populations.
DOI: 10.1126/science.1188021
2010
Cited 3,694 times
A Draft Sequence of the Neandertal Genome
Neandertals, the closest evolutionary relatives of present-day humans, lived in large parts of Europe and western Asia before disappearing 30,000 years ago. We present a draft sequence of the Neandertal genome composed of more than 4 billion nucleotides from three individuals. Comparisons of the Neandertal genome to the genomes of five present-day humans from different parts of the world identify a number of genomic regions that may have been affected by positive selection in ancestral modern humans, including genes involved in metabolism and in cognitive and skeletal development. We show that Neandertals shared more genetic variants with present-day humans in Eurasia than with present-day humans in sub-Saharan Africa, suggesting that gene flow from Neandertals into the ancestors of non-Africans occurred before the divergence of Eurasian groups from each other.
DOI: 10.1126/science.1166066
2009
Cited 2,318 times
Mutations in the FUS/TLS Gene on Chromosome 16 Cause Familial Amyotrophic Lateral Sclerosis
Amyotrophic lateral sclerosis (ALS) is a fatal degenerative motor neuron disorder. Ten percent of cases are inherited; most involve unidentified genes. We report here 13 mutations in the fused in sarcoma/translated in liposarcoma (FUS/TLS) gene on chromosome 16 that were specific for familial ALS. The FUS/TLS protein binds to RNA, functions in diverse processes, and is normally located predominantly in the nucleus. In contrast, the mutant forms of FUS/TLS accumulated in the cytoplasm of neurons, a pathology that is similar to that of the gene TAR DNA-binding protein 43 (TDP43), whose mutations also cause ALS. Neuronal cytoplasmic protein aggregation and defective RNA metabolism thus appear to be common pathogenic mechanisms involved in ALS and possibly in other neurodegenerative disorders.
DOI: 10.1038/nbt.1523
2009
Cited 1,278 times
Solution hybrid selection with ultra-long oligonucleotides for massively parallel targeted sequencing
Targeting genomic loci by massively parallel sequencing requires new methods to enrich templates to be sequenced. We developed a capture method that uses biotinylated RNA 'baits' to fish targets out of a 'pond' of DNA fragments. The RNA is transcribed from PCR-amplified oligodeoxynucleotides originally synthesized on a microarray, generating sufficient bait for multiple captures at concentrations high enough to drive the hybridization. We tested this method with 170-mer baits that target >15,000 coding exons (2.5 Mb) and four regions (1.7 Mb total) using Illumina sequencing as read-out. About 90% of uniquely aligning bases fell on or near bait sequence; up to 50% lay on exons proper. The uniformity was such that approximately 60% of target bases in the exonic 'catch', and approximately 80% in the regional catch, had at least half the mean coverage. One lane of Illumina sequence was sufficient to call high-confidence genotypes for 89% of the targeted exon space.
DOI: 10.1186/gb-2011-12-2-r18
2011
Cited 1,000 times
Analyzing and minimizing PCR amplification bias in Illumina sequencing libraries
Despite the ever-increasing output of Illumina sequencing data, loci with extreme base compositions are often under-represented or absent. To evaluate sources of base-composition bias, we traced genomic sequences ranging from 6% to 90% GC through the process by quantitative PCR. We identified PCR during library preparation as a principal source of bias and optimized the conditions. Our improved protocol significantly reduces amplification bias and minimizes the previously severe effects of PCR instrument and temperature ramp rate.
DOI: 10.1038/s41591-018-0050-6
2018
Cited 764 times
p53 inhibits CRISPR–Cas9 engineering in human pluripotent stem cells
CRISPR/Cas9 has revolutionized our ability to engineer genomes and conduct genome-wide screens in human cells1–3. Whereas some cell types are amenable to genome engineering, genomes of human pluripotent stem cells (hPSCs) have been difficult to engineer, with reduced efficiencies relative to tumour cell lines or mouse embryonic stem cells3–13. Here, using hPSC lines with stable integration of Cas9 or transient delivery of Cas9-ribonucleoproteins (RNPs), we achieved an average insertion or deletion (indel) efficiency greater than 80%. This high efficiency of indel generation revealed that double-strand breaks (DSBs) induced by Cas9 are toxic and kill most hPSCs. In previous studies, the toxicity of Cas9 in hPSCs was less apparent because of low transfection efficiency and subsequently low DSB induction3. The toxic response to DSBs was P53/TP53-dependent, such that the efficiency of precise genome engineering in hPSCs with a wild-type P53 gene was severely reduced. Our results indicate that Cas9 toxicity creates an obstacle to the high-throughput use of CRISPR/Cas9 for genome engineering and screening in hPSCs. Moreover, as hPSCs can acquire P53 mutations14, cell replacement therapies using CRISPR/Cas9-engineered hPSCs should proceed with caution, and such engineered hPSCs should be monitored for P53 function. CRISPR–Cas9-induced DNA damage triggers p53 to limit the efficiency of gene editing in human pluripotent cells.
DOI: 10.1186/gb-2013-14-5-r51
2013
Cited 756 times
Characterizing and measuring bias in sequence data
DNA sequencing technologies deviate from the ideal uniform distribution of reads. These biases impair scientific and medical applications. Accordingly, we have developed computational methods for discovering, describing and measuring bias. We applied these methods to the Illumina, Ion Torrent, Pacific Biosciences and Complete Genomics sequencing platforms, using data from human and from a set of microbes with diverse base compositions. As in previous work, library construction conditions significantly influence sequencing bias. Pacific Biosciences coverage levels are the least biased, followed by Illumina, although all technologies exhibit error-rate biases in high- and low-GC regions and at long homopolymer runs. The GC-rich regions prone to low coverage include a number of human promoters, so we therefore catalog 1,000 that were exceptionally resistant to sequencing. Our results indicate that combining data from two technologies can reduce coverage bias if the biases in the component technologies are complementary and of similar magnitude. Analysis of Illumina data representing 120-fold coverage of a well-studied human sample reveals that 0.20% of the autosomal genome was covered at less than 10% of the genome-wide average. Excluding locations that were similar to known bias motifs or likely due to sample-reference variations left only 0.045% of the autosomal genome with unexplained poor coverage. The assays presented in this paper provide a comprehensive view of sequencing bias, which can be used to drive laboratory improvements and to monitor production processes. Development guided by these assays should result in improved genome assemblies and better coverage of biologically important loci.
DOI: 10.1101/gad.1884710
2010
Cited 744 times
Mammalian microRNAs: experimental evaluation of novel and previously annotated genes
MicroRNAs (miRNAs) are small regulatory RNAs that derive from distinctive hairpin transcripts. To learn more about the miRNAs of mammals, we sequenced 60 million small RNAs from mouse brain, ovary, testes, embryonic stem cells, three embryonic stages, and whole newborns. Analysis of these sequences confirmed 398 annotated miRNA genes and identified 108 novel miRNA genes. More than 150 previously annotated miRNAs and hundreds of candidates failed to yield sequenced RNAs with miRNA-like features. Ectopically expressing these previously proposed miRNA hairpins also did not yield small RNAs, whereas ectopically expressing the confirmed and newly identified hairpins usually did yield small RNAs with the classical miRNA features, including dependence on the Drosha endonuclease for processing. These experiments, which suggest that previous estimates of conserved mammalian miRNAs were inflated, provide a substantially revised list of confidently identified murine miRNAs from which to infer the general features of mammalian miRNAs. Our analyses also revealed new aspects of miRNA biogenesis and modification, including tissue-specific strand preferences, sequential Dicer cleavage of a metazoan precursor miRNA (pre-miRNA), consequential 5′ heterogeneity, newly identified instances of miRNA editing, and evidence for widespread pre-miRNA uridylation reminiscent of miRNA regulation by Lin28.
DOI: 10.1038/nature11329
2012
Cited 675 times
Medulloblastoma exome sequencing uncovers subtype-specific somatic mutations
Medulloblastoma is the most common brain tumour in children; using exome sequencing of tumour samples the authors show that these cancers have low mutation rates and identify 12 significantly mutated genes, among them the gene encoding RNA helicase DDX3X. Medulloblastoma is the most common malignant brain tumour in children. Four papers published in the 2 August 2012 issue of Nature use whole-genome and other sequencing techniques to produce a detailed picture of the genetics and genomics of this condition. Notable findings include the identification of recurrent mutations in genes not previously implicated in medulloblastoma, with significant genetic differences associated with the four biologically distinct subgroups and clinical outcomes in each. Potential avenues for therapy are suggested by the identification of targetable somatic copy-number alterations, including recurrent events targeting TGFβ signalling in Group 3, and NF-κB signalling in Group 4 medulloblastomas. Medulloblastomas are the most common malignant brain tumours in children1. Identifying and understanding the genetic events that drive these tumours is critical for the development of more effective diagnostic, prognostic and therapeutic strategies. Recently, our group and others described distinct molecular subtypes of medulloblastoma on the basis of transcriptional and copy number profiles2,3,4,5. Here we use whole-exome hybrid capture and deep sequencing to identify somatic mutations across the coding regions of 92 primary medulloblastoma/normal pairs. Overall, medulloblastomas have low mutation rates consistent with other paediatric tumours, with a median of 0.35 non-silent mutations per megabase. We identified twelve genes mutated at statistically significant frequencies, including previously known mutated genes in medulloblastoma such as CTNNB1, PTCH1, MLL2, SMARCA4 and TP53. Recurrent somatic mutations were newly identified in an RNA helicase gene, DDX3X, often concurrent with CTNNB1 mutations, and in the nuclear co-repressor (N-CoR) complex genes GPS2, BCOR and LDB1. We show that mutant DDX3X potentiates transactivation of a TCF promoter and enhances cell viability in combination with mutant, but not wild-type, β-catenin. Together, our study reveals the alteration of WNT, hedgehog, histone methyltransferase and now N-CoR pathways across medulloblastomas and within specific subtypes of this disease, and nominates the RNA helicase DDX3X as a component of pathogenic β-catenin signalling in medulloblastoma.
DOI: 10.1038/ng.2329
2012
Cited 615 times
De novo somatic mutations in components of the PI3K-AKT3-mTOR pathway cause hemimegalencephaly
De novo somatic mutations in focal areas are well documented in diseases such as neoplasia but are rarely reported in malformation of the developing brain. Hemimegalencephaly (HME) is characterized by overgrowth of either one of the two cerebral hemispheres. The molecular etiology of HME remains a mystery. The intractable epilepsy that is associated with HME can be relieved by the surgical treatment hemispherectomy, allowing sampling of diseased tissue. Exome sequencing and mass spectrometry analysis in paired brain-blood samples from individuals with HME (n = 20 cases) identified de novo somatic mutations in 30% of affected individuals in the PIK3CA, AKT3 and MTOR genes. A recurrent PIK3CA c.1633G>A mutation was found in four separate cases. Identified mutations were present in 8-40% of sequenced alleles in various brain regions and were associated with increased neuronal S6 protein phosphorylation in the brains of affected individuals, indicating aberrant activation of mammalian target of rapamycin (mTOR) signaling. Thus HME is probably a genetically mosaic disease caused by gain of function in phosphatidylinositol 3-kinase (PI3K)-AKT3-mTOR signaling.
DOI: 10.1038/ng1742
2006
Cited 612 times
ANG mutations segregate with familial and 'sporadic' amyotrophic lateral sclerosis
DOI: 10.1126/science.1183605
2010
Cited 597 times
A Catalog of Reference Genomes from the Human Microbiome
News from the Inner Tube of Life. A major initiative by the U.S. National Institutes of Health to sequence 900 genomes of microorganisms that live on the surfaces and orifices of the human body has established standardized protocols and methods for such large-scale reference sequencing. By combining previously accumulated data with new data, Nelson et al. (p. 994) present an initial analysis of 178 bacterial genomes. The sampling so far barely scratches the surface of the microbial diversity found on humans, but the work provides an important baseline for future analyses.
DOI: 10.1038/nmeth.1276
2008
Cited 477 times
High-resolution mapping of copy-number alterations with massively parallel sequencing
Cancer results from somatic alterations in key genes, including point mutations, copy-number alterations and structural rearrangements. A powerful way to discover cancer-causing genes is to identify genomic regions that show recurrent copy-number alterations (gains and losses) in tumor genomes. Recent advances in sequencing technologies suggest that massively parallel sequencing may provide a feasible alternative to DNA microarrays for detecting copy-number alterations. Here we present: (i) a statistical analysis of the power to detect copy-number alterations of a given size; (ii) SegSeq, an algorithm to segment equal copy numbers from massively parallel sequence data; and (iii) analysis of experimental data from three matched pairs of tumor and normal cell lines. We show that a collection of approximately 14 million aligned sequence reads from human cell lines has comparable power to detect events as the current generation of DNA microarrays and has over twofold better precision for localizing breakpoints (typically, to within approximately 1 kilobase).
DOI: 10.1126/science.1203357
2011
Cited 465 times
Comparative Functional Genomics of the Fission Yeasts
A combined analysis of genome sequence, structure, and expression gives insights into fission yeast biology.
DOI: 10.1016/j.molcel.2016.06.037
2016
Cited 387 times
DNA Repair Profiling Reveals Nonrandom Outcomes at Cas9-Mediated Breaks
The repair outcomes at site-specific DNA double-strand breaks (DSBs) generated by the RNA-guided DNA endonuclease Cas9 determine how gene function is altered. Despite the widespread adoption of CRISPR-Cas9 technology to induce DSBs for genome engineering, the resulting repair products have not been examined in depth. Here, the DNA repair profiles of 223 sites in the human genome demonstrate that the pattern of DNA repair following Cas9 cutting at each site is nonrandom and consistent across experimental replicates, cell lines, and reagent delivery methods. Furthermore, the repair outcomes are determined by the protospacer sequence rather than genomic context, indicating that DNA repair profiling in cell lines can be used to anticipate repair outcomes in primary cells. Chemical inhibition of DNA-PK enabled dissection of the DNA repair profiles into contributions from c-NHEJ and MMEJ. Finally, this work elucidates a strategy for using “error-prone” DNA-repair machinery to generate precise edits.
DOI: 10.1056/nejmoa1505819
2015
Cited 337 times
Genetic Diversity and Protective Efficacy of the RTS,S/AS01 Malaria Vaccine
The RTS,S/AS01 vaccine targets the circumsporozoite protein of Plasmodium falciparum and has partial protective efficacy against clinical and severe malaria disease in infants and children. We investigated whether the vaccine efficacy was specific to certain parasite genotypes at the circumsporozoite protein locus. We used polymerase chain reaction-based next-generation sequencing of DNA extracted from samples from 4985 participants to survey circumsporozoite protein polymorphisms. We evaluated the effect that polymorphic positions and haplotypic regions within the circumsporozoite protein had on vaccine efficacy against first episodes of clinical malaria within 1 year after vaccination. In the per-protocol group of 4577 RTS,S/AS01-vaccinated participants and 2335 control-vaccinated participants who were 5 to 17 months of age, the 1-year cumulative vaccine efficacy was 50.3% (95% confidence interval [CI], 34.6 to 62.3) against clinical malaria in which parasites matched the vaccine in the entire circumsporozoite protein C-terminal (139 infections), as compared with 33.4% (95% CI, 29.3 to 37.2) against mismatched malaria (1951 infections) (P=0.04 for differential vaccine efficacy). The vaccine efficacy based on the hazard ratio was 62.7% (95% CI, 51.6 to 71.3) against matched infections versus 54.2% (95% CI, 49.9 to 58.1) against mismatched infections (P=0.06). In the group of infants 6 to 12 weeks of age, there was no evidence of differential allele-specific vaccine efficacy. These results suggest that among children 5 to 17 months of age, the RTS,S vaccine has greater activity against malaria parasites with the matched circumsporozoite protein allele than against mismatched malaria. The overall vaccine efficacy in this age category will depend on the proportion of matched alleles in the local parasite population; in this trial, less than 10% of parasites had matched alleles. (Funded by the National Institutes of Health and others.)
DOI: 10.1073/pnas.1121491109
2012
Cited 260 times
Genomic epidemiology of the Escherichia coli O104:H4 outbreaks in Europe, 2011
The degree to which molecular epidemiology reveals information about the sources and transmission patterns of an outbreak depends on the resolution of the technology used and the samples studied. Isolates of Escherichia coli O104:H4 from the outbreak centered in Germany in May-July 2011, and the much smaller outbreak in southwest France in June 2011, were indistinguishable by standard tests. We report a molecular epidemiological analysis using multiplatform whole-genome sequencing and analysis of multiple isolates from the German and French outbreaks. Isolates from the German outbreak showed remarkably little diversity, with only two single nucleotide polymorphisms (SNPs) found in isolates from four individuals. Surprisingly, we found much greater diversity (19 SNPs) in isolates from seven individuals infected in the French outbreak. The German isolates form a clade within the more diverse French outbreak strains. Moreover, five isolates derived from a single infected individual from the French outbreak had extremely limited diversity. The striking difference in diversity between the German and French outbreak samples is consistent with several hypotheses, including a bottleneck that purged diversity in the German isolates, variation in mutation rates in the two E. coli outbreak populations, or uneven distribution of diversity in the seed populations that led to each outbreak.
DOI: 10.1038/ncomms3325
2013
Cited 258 times
The Capsaspora genome reveals a complex unicellular prehistory of animals
To reconstruct the evolutionary origin of multicellular animals from their unicellular ancestors, the genome sequences of diverse unicellular relatives are essential. However, only the genome of the choanoflagellate Monosiga brevicollis has been reported to date. Here we completely sequence the genome of the filasterean Capsaspora owczarzaki, the closest known unicellular relative of metazoans besides choanoflagellates. Analyses of this genome alter our understanding of the molecular complexity of metazoans' unicellular ancestors showing that they had a richer repertoire of proteins involved in cell adhesion and transcriptional regulation than previously inferred only with the choanoflagellate genome. Some of these proteins were secondarily lost in choanoflagellates. In contrast, most intercellular signalling systems controlling development evolved later concomitant with the emergence of the first metazoans. We propose that the acquisition of these metazoan-specific developmental systems and the co-option of pre-existing genes drove the evolutionary transition from unicellular protists to metazoans.
DOI: 10.1038/ng.594
2010
Cited 257 times
Mutations in TMEM216 perturb ciliogenesis and cause Joubert, Meckel and related syndromes
Joubert syndrome (JBTS), related disorders (JSRDs) and Meckel syndrome (MKS) are ciliopathies. We now report that MKS2 and CORS2 (JBTS2) loci are allelic and caused by mutations in TMEM216, which encodes an uncharacterized tetraspan transmembrane protein. Individuals with CORS2 frequently had nephronophthisis and polydactyly, and two affected individuals conformed to the oro-facio-digital type VI phenotype, whereas skeletal dysplasia was common in fetuses affected by MKS. A single G218T mutation (R73L in the protein) was identified in all cases of Ashkenazi Jewish descent (n=10). TMEM216 localized to the base of primary cilia, and loss of TMEM216 in mutant fibroblasts or after knockdown caused defective ciliogenesis and centrosomal docking, with concomitant hyperactivation of RhoA and Dishevelled. TMEM216 formed a complex with Meckelin, which is encoded by a gene also mutated in JSRDs and MKS. Disruption of tmem216 expression in zebrafish caused gastrulation defects similar to those in other ciliary morphants. These data implicate a new family of proteins in the ciliopathies and further support allelism between ciliopathy disorders.
DOI: 10.1101/gr.070227.107
2008
Cited 249 times
Quality scores and SNP detection in sequencing-by-synthesis systems
Promising new sequencing technologies, based on sequencing-by-synthesis (SBS), are starting to deliver large amounts of DNA sequence at very low cost. Polymorphism detection is a key application. We describe general methods for improved quality scores and accurate automated polymorphism detection, and apply them to data from the Roche (454) Genome Sequencer 20. We assess our methods using known-truth data sets, which is critical to the validity of the assessments. We developed informative, base-by-base error predictors for this sequencer and used a variant of the phred binning algorithm to combine them into a single empirically derived quality score. These quality scores are more useful than those produced by the system software: They both better predict actual error rates and identify many more high-quality bases. We developed a SNP detection method, with variants for low coverage, high coverage, and PCR amplicon applications, and evaluated it on known-truth data sets. We demonstrate good specificity in single reads, and excellent specificity (no false positives in 215 kb of genome) in high-coverage data.
DOI: 10.1186/1471-2164-13-375
2012
Cited 233 times
Pacific biosciences sequencing technology for genotyping and variation discovery in human data
Pacific Biosciences technology provides a fundamentally new data type that provides the potential to overcome some limitations of current next generation sequencing platforms by providing significantly longer reads, single molecule sequencing, low composition bias and an error profile that is orthogonal to other platforms. With these potential advantages in mind, we here evaluate the utility of the Pacific Biosciences RS platform for human medical amplicon resequencing projects. We evaluated the Pacific Biosciences technology for SNP discovery in medical resequencing projects using the Genome Analysis Toolkit, observing high sensitivity and specificity for calling differences in amplicons containing known true or false SNPs. We assessed data quality: most errors were indels (~14%) with few apparent miscalls (~1%). In this work, we define a custom data processing pipeline for Pacific Biosciences data for human data analysis. Critically, the error properties were largely free of the context-specific effects that affect other sequencing technologies. These data show excellent utility for follow-up validation and extension studies in human data and medical genetics projects, but can be extended to other organisms with a reference genome.
DOI: 10.1126/scitranslmed.3003544
2012
Cited 225 times
Exome Sequencing Can Improve Diagnosis and Alter Patient Management
Exome sequencing of 118 patients with neurodevelopmental disorders shows that this technique is useful for identifying new pathogenic mutations and for correcting diagnosis in ~10% of cases.
DOI: 10.1186/gb-2013-14-2-r15
2013
Cited 223 times
Premetazoan genome evolution and the regulation of cell differentiation in the choanoflagellate Salpingoeca rosetta
Metazoan multicellularity is rooted in mechanisms of cell adhesion, signaling, and differentiation that first evolved in the progenitors of metazoans. To reconstruct the genome composition of metazoan ancestors, we sequenced the genome and transcriptome of the choanoflagellate Salpingoeca rosetta, a close relative of metazoans that forms rosette-shaped colonies of cells. A comparison of the 55 Mb S. rosetta genome with genomes from diverse opisthokonts suggests that the origin of metazoans was preceded by a period of dynamic gene gain and loss. The S. rosetta genome encodes homologs of cell adhesion, neuropeptide, and glycosphingolipid metabolism genes previously found only in metazoans and expands the repertoire of genes inferred to have been present in the progenitors of metazoans and choanoflagellates. Transcriptome analysis revealed that all four S. rosetta septins are upregulated in colonies relative to single cells, suggesting that these conserved cytokinesis proteins may regulate incomplete cytokinesis during colony development. Furthermore, genes shared exclusively by metazoans and choanoflagellates were disproportionately upregulated in colonies and the single cells from which they develop. The S. rosetta genome sequence refines the catalog of metazoan-specific genes while also extending the evolutionary history of certain gene families that are central to metazoan biology. Transcriptome data suggest that conserved cytokinesis genes, including septins, may contribute to S. rosetta colony formation and indicate that the initiation of colony development may preferentially draw upon genes shared with metazoans, while later stages of colony maturation are likely regulated by genes unique to S. rosetta.
DOI: 10.1038/ng.3121
2014
Cited 213 times
Comprehensive variation discovery in single human genomes
Complete knowledge of the genetic variation in individual human genomes is a crucial foundation for understanding the etiology of disease. Genetic variation is typically characterized by sequencing individual genomes and comparing reads to a reference. Existing methods do an excellent job of detecting variants in approximately 90% of the human genome; however, calling variants in the remaining 10% of the genome (largely low-complexity sequence and segmental duplications) is challenging. To improve variant calling, we developed a new algorithm, DISCOVAR, and examined its performance on improved, low-cost sequence data. Using a newly created reference set of variants from the finished sequence of 103 randomly chosen fosmids, we find that some standard variant call sets miss up to 25% of variants. We show that the combination of new methods and improved data increases sensitivity by several fold, with the greatest impact in challenging regions of the human genome.
DOI: 10.1101/gr.141515.112
2012
Cited 207 times
Finished bacterial genomes from shotgun sequence data
Exceptionally accurate genome reference sequences have proven to be of great value to microbial researchers. Thus, to date, about 1800 bacterial genome assemblies have been "finished" at great expense with the aid of manual laboratory and computational processes that typically iterate over a period of months or even years. By applying a new laboratory design and new assembly algorithm to 16 samples, we demonstrate that assemblies exceeding finished quality can be obtained from whole-genome shotgun data and automated computation. Cost and time requirements are thus dramatically reduced.
DOI: 10.1038/ng.2585
2013
Cited 176 times
Genomics of Loa loa, a Wolbachia-free filarial parasite of humans
Loa loa, the African eyeworm, is a major filarial pathogen of humans. Unlike most filariae, L. loa does not contain the obligate intracellular Wolbachia endosymbiont. We describe the 91.4-Mb genome of L. loa and that of the related filarial parasite Wuchereria bancrofti and predict 14,907 L. loa genes on the basis of microfilarial RNA sequencing. By comparing these genomes to that of another filarial parasite, Brugia malayi, and to those of several other nematodes, we demonstrate synteny among filariae but not with nonparasitic nematodes. The L. loa genome encodes many immunologically relevant genes, as well as protein kinases targeted by drugs currently approved for use in humans. Despite lacking Wolbachia, L. loa shows no new metabolic synthesis or transport capabilities compared to other filariae. These results suggest that the role of Wolbachia in filarial biology is more subtle than previously thought and reveal marked differences between parasitic and nonparasitic nematodes.
DOI: 10.1016/j.stem.2019.04.005
2019
Cited 157 times
YAP, but Not RSPO-LGR4/5, Signaling in Biliary Epithelial Cells Promotes a Ductular Reaction in Response to Liver Injury
Biliary epithelial cells (BECs) form bile ducts in the liver and are facultative liver stem cells that establish a ductular reaction (DR) to support liver regeneration following injury. Liver damage induces periportal LGR5+ putative liver stem cells that can form BEC-like organoids, suggesting that RSPO-LGR4/5-mediated WNT/β-catenin activity is important for a DR. We addressed the roles of this and other signaling pathways in a DR by performing a focused CRISPR-based loss-of-function screen in BEC-like organoids, followed by in vivo validation and single-cell RNA sequencing. We found that BECs lack and do not require LGR4/5-mediated WNT/β-catenin signaling during a DR, whereas YAP and mTORC1 signaling are required for this process. Upregulation of AXIN2 and LGR5 is required in hepatocytes to enable their regenerative capacity in response to injury. Together, these data highlight heterogeneity within the BEC pool, delineate signaling pathways involved in a DR, and clarify the identity and roles of injury-induced periportal LGR5+ cells.
DOI: 10.15252/embr.201845889
2018
Cited 135 times
TMEM41B is a novel regulator of autophagy and lipid mobilization
Autophagy maintains cellular homeostasis by targeting damaged organelles, pathogens, or misfolded protein aggregates for lysosomal degradation. The autophagic process is initiated by the formation of autophagosomes, which can selectively enclose cargo via autophagy cargo receptors. A machinery of well-characterized autophagy-related proteins orchestrates the biogenesis of autophagosomes; however, the origin of the required membranes is incompletely understood. Here, we have applied sensitized pooled CRISPR screens and identify the uncharacterized transmembrane protein TMEM41B as a novel regulator of autophagy. In the absence of TMEM41B, autophagosome biogenesis is stalled, LC3 accumulates at WIPI2- and DFCP1-positive isolation membranes, and lysosomal flux of autophagy cargo receptors and intracellular bacteria is impaired. In addition to defective autophagy, TMEM41B knockout cells display significantly enlarged lipid droplets and reduced mobilization and β-oxidation of fatty acids. Immunostaining and interaction proteomics data suggest that TMEM41B localizes to the endoplasmic reticulum (ER). Taken together, we propose that TMEM41B is a novel ER-localized regulator of autophagosome biogenesis and lipid mobilization.
DOI: 10.1093/hmg/8.2.157
1999
Cited 337 times
Deletions of the heavy neurofilament subunit tail in amyotrophic lateral sclerosis
Amyotrophic lateral sclerosis (ALS) is a progressive motor neuron degeneration resulting in paralysis and death, usually within 3 years of onset. Pathological and animal studies implicate neurofilament involvement in ALS, but whether this is primary or secondary is not clear. The heavy neurofilament subunit (NFH) tail is composed of a repeating amino acid motif, usually X-lysine-serine-proline-Y-lysine (XKSPYK), where X is a single amino acid and Y is one to three amino acids. There are two common polymorphic variants of 44 or 45 repeats. The tail probably regulates axonal calibre, with interfilament spacing determined by phosphorylation of the KSP motifs. A previous study suggested an association between sporadic cases of ALS and NFH tail deletions, but two subsequent studies have found none. We have analysed samples from two different populations (UK 207, Scandinavia 323) with age-matched controls for each group (UK 219, Scandinavia 228) and have found four novel NFH tail deletions, each involving a whole motif. These were found in three patients with sporadic ALS and a family with autosomal dominant ALS, although another was also found in two young controls. In all cases motif deletions were only associated with disease when paired with the long NFH allele. The deletions all occurred within a small region of the NFH tail. This has allowed us to propose a structural organization of the tail as well as allowing observed deletions both from this study and previous reports to be organized into logical groups. These results strongly suggest that NFH motif deletions can be a primary event in ALS but that they are not common.
DOI: 10.1038/5009
1999
Cited 243 times
Variation in DCP1, encoding ACE, is associated with susceptibility to Alzheimer disease
DOI: 10.1371/journal.pone.0005683
2009
Cited 209 times
Quantitative Deep Sequencing Reveals Dynamic HIV-1 Escape and Large Population Shifts during CCR5 Antagonist Therapy In Vivo
High-throughput sequencing platforms provide an approach for detecting rare HIV-1 variants and documenting more fully quasispecies diversity. We applied this technology to the V3 loop-coding region of env in samples collected from 4 chronically HIV-infected subjects in whom CCR5 antagonist (vicriviroc [VVC]) therapy failed. Between 25,000 and 140,000 amplified sequences were obtained per sample. Profound baseline V3 loop sequence heterogeneity existed; predicted CXCR4-using populations were identified in a largely CCR5-using population. The V3 loop forms associated with subsequent virologic failure, either through CXCR4 use or the emergence of high-level VVC resistance, were present as minor variants at 0.8-2.8% of baseline samples. Extreme, rapid shifts in population frequencies toward these forms occurred, and deep sequencing provided a detailed view of the rapid evolutionary impact of VVC selection. Greater V3 diversity was observed post-selection. This previously unreported degree of V3 loop sequence diversity has implications for viral pathogenesis, vaccine design, and the optimal use of HIV-1 CCR5 antagonists.
DOI: 10.1371/journal.pgen.1003272
2013
Cited 168 times
Distinctive Expansion of Potential Virulence Genes in the Genome of the Oomycete Fish Pathogen Saprolegnia parasitica
Oomycetes in the class Saprolegniomycetidae of the Eukaryotic kingdom Stramenopila have evolved as severe pathogens of amphibians, crustaceans, fish and insects, resulting in major losses in aquaculture and damage to aquatic ecosystems. We have sequenced the 63 Mb genome of the fresh water fish pathogen, Saprolegnia parasitica. Approximately 1/3 of the assembled genome exhibits loss of heterozygosity, indicating an efficient mechanism for revealing new variation. Comparison of S. parasitica with plant pathogenic oomycetes suggests that during evolution the host cellular environment has driven distinct patterns of gene expansion and loss in the genomes of plant and animal pathogens. S. parasitica possesses one of the largest repertoires of proteases (270) among eukaryotes that are deployed in waves at different points during infection as determined from RNA-Seq data. In contrast, despite being capable of living saprotrophically, parasitism has led to loss of inorganic nitrogen and sulfur assimilation pathways, strikingly similar to losses in obligate plant pathogenic oomycetes and fungi. The large gene families that are hallmarks of plant pathogenic oomycetes such as Phytophthora appear to be lacking in S. parasitica, including those encoding RXLR effectors, Crinklers, and Necrosis Inducing-Like Proteins (NLP). S. parasitica also has a very large kinome of 543 kinases, 10% of which is induced upon infection. Moreover, S. parasitica encodes several genes typical of animals or animal-pathogens and lacking from other oomycetes, including disintegrins and galactose-binding lectins, whose expression and evolutionary origins implicate horizontal gene transfer in the evolution of animal pathogenesis in S. parasitica.
DOI: 10.7554/elife.01287
2013
Cited 146 times
Regulated aggregative multicellularity in a close unicellular relative of metazoa
The evolution of metazoans from their unicellular ancestors was one of the most important events in the history of life. However, the cellular and genetic changes that ultimately led to the evolution of multicellularity are not known. In this study, we describe an aggregative multicellular stage in the protist Capsaspora owczarzaki, a close unicellular relative of metazoans. Remarkably, transition to the aggregative stage is associated with significant upregulation of orthologs of genes known to establish multicellularity and tissue architecture in metazoans. We further observe transitions in regulated alternative splicing during the C. owczarzaki life cycle, including the deployment of an exon network associated with signaling, a feature of splicing regulation so far only observed in metazoans. Our results reveal the existence of a highly regulated aggregative stage in C. owczarzaki and further suggest that features of aggregative behavior in an ancestral protist may have been co-opted to develop some multicellular properties currently seen in metazoans.
DOI: 10.7554/elife.19090
2016
Cited 133 times
Structure of the germline genome of Tetrahymena thermophila and relationship to the massively rearranged somatic genome
The germline genome of the binucleated ciliate Tetrahymena thermophila undergoes programmed chromosome breakage and massive DNA elimination to generate the somatic genome. Here, we present a complete sequence assembly of the germline genome and analyze multiple features of its structure and its relationship to the somatic genome, shedding light on the mechanisms of genome rearrangement as well as the evolutionary history of this remarkable germline/soma differentiation. Our results strengthen the notion that a complex, dynamic, and ongoing interplay between mobile DNA elements and the host genome has shaped Tetrahymena chromosome structure, locally and globally. Non-standard outcomes of rearrangement events, including the generation of short-lived somatic chromosomes and excision of DNA interrupting protein-coding regions, may represent novel forms of developmental gene regulation. We also compare Tetrahymena's germline/soma differentiation to that of other characterized ciliates, illustrating the wide diversity of adaptations that have occurred within this phylum.
DOI: 10.7554/elife.17290
2016
Cited 121 times
Functional CRISPR screening identifies the ufmylation pathway as a regulator of SQSTM1/p62
SQSTM1 is an adaptor protein that integrates multiple cellular signaling pathways and whose expression is tightly regulated at the transcriptional and post-translational level. Here, we describe a forward genetic screening paradigm exploiting CRISPR-mediated genome editing coupled to a cell selection step by FACS to identify regulators of SQSTM1. Through systematic comparison of pooled libraries, we show that CRISPR is superior to RNAi in identifying known SQSTM1 modulators. A genome-wide CRISPR screen exposed MTOR signalling and the entire macroautophagy machinery as key regulators of SQSTM1 and identified several novel modulators including HNRNPM, SLC39A14, SRRD, PGK1 and the ufmylation cascade. We show that ufmylation regulates SQSTM1 by eliciting a cell type-specific ER stress response which induces SQSTM1 expression and results in its accumulation in the cytosol. This study validates pooled CRISPR screening as a powerful method to map the repertoire of cellular pathways that regulate the fate of an individual target protein.
DOI: 10.1371/journal.pone.0096094
2014
Cited 114 times
Massively Parallel Sequencing of Human Urinary Exosome/Microvesicle RNA Reveals a Predominance of Non-Coding RNA
Intact RNA from exosomes/microvesicles (collectively referred to as microvesicles) has sparked much interest as a potential source of biomarkers for the non-invasive analysis of disease. Here we use the Illumina Genome Analyzer to determine the comprehensive array of nucleic acid reads present in urinary microvesicles. Extraneous nucleic acids were digested using RNase and DNase treatment and the microvesicle inner nucleic acid cargo was analyzed with and without DNase digestion to examine both DNA and RNA sequences contained in microvesicles. Results revealed that a substantial proportion (∼87%) of reads aligned to ribosomal RNA. Of the non-ribosomal RNA sequences, ∼60% aligned to non-coding RNA and repeat sequences including LINE, SINE, satellite repeats, and RNA repeats (tRNA, snRNA, scRNA and srpRNA). The remaining ∼40% of non-ribosomal RNA reads aligned to protein coding genes and splice sites encompassing approximately 13,500 of the known 21,892 protein coding genes of the human genome. Analysis of protein coding genes specific to the renal and genitourinary tract revealed that complete segments of the renal nephron and collecting duct as well as genes indicative of the bladder and prostate could be identified. This study reveals that the entire genitourinary system may be mapped using microvesicle transcript analysis and that the majority of non-ribosomal RNA sequences contained in microvesicles is potentially functional non-coding RNA, which plays an emerging role in cell regulation.
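The percentages in this abstract are nested: the ∼60% and ∼40% figures are shares of the non-ribosomal reads, not of all reads. A minimal sketch of the arithmetic follows, using only the approximate figures quoted above; the variable names are illustrative and nothing here comes from the paper's own analysis pipeline.

```python
# Approximate figures quoted in the abstract; names are illustrative only.
RRNA_FRACTION = 0.87                # share of all reads aligning to rRNA
NONCODING_SHARE_OF_NON_RRNA = 0.60  # non-coding RNA and repeats
CODING_SHARE_OF_NON_RRNA = 0.40     # protein-coding genes and splice sites
GENES_DETECTED = 13_500
GENES_KNOWN = 21_892

# Convert the nested shares into fractions of all reads.
non_rrna = 1.0 - RRNA_FRACTION
noncoding_of_total = non_rrna * NONCODING_SHARE_OF_NON_RRNA
coding_of_total = non_rrna * CODING_SHARE_OF_NON_RRNA

# Share of known protein-coding genes with at least some read support.
gene_coverage = GENES_DETECTED / GENES_KNOWN

print(f"non-rRNA reads:        {non_rrna:.1%} of all reads")
print(f"non-coding/repeat:     {noncoding_of_total:.1%} of all reads")
print(f"protein-coding:        {coding_of_total:.1%} of all reads")
print(f"coding genes detected: {gene_coverage:.1%} of known genes")
```

On these rounded inputs, only about 5% of all reads fall on protein-coding genes, yet they touch roughly 62% of the known protein-coding genes.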
DOI: 10.1016/j.cub.2009.11.035
2010
Cited 113 times
A Cellular Memory of Developmental History Generates Phenotypic Diversity in C. elegans
Early life experiences have a major impact on adult phenotypes [1-3]. However, the mechanisms by which animals retain a cellular memory of early experience are not well understood. Here we show that adult wild-type Caenorhabditis elegans that transiently pass through the stress-resistant dauer larval stage exhibit distinct gene expression profiles and life history traits, as compared to adult animals that bypassed this stage. Using chromatin immunoprecipitation experiments coupled with massively parallel sequencing, we found that genome-wide levels of specific histone tail modifications are markedly altered in postdauer animals. Mutations in subsets of genes implicated in chromatin remodeling abolish, or alter, the observed changes in gene expression and life history traits in postdauer animals. Modifications to the epigenome as a consequence of early experience may contribute in part to a memory of early experience and generate phenotypic variation in an isogenic population.
DOI: 10.1186/gb-2011-12-8-r73
2011
Cited 103 times
Hybrid selection for sequencing pathogen genomes from clinical samples
We have adapted a solution hybrid selection protocol to enrich pathogen DNA in clinical samples dominated by human genetic material. Using mock mixtures of human and Plasmodium falciparum malaria parasite DNA as well as clinical samples from infected patients, we demonstrate an average of approximately 40-fold enrichment of parasite DNA after hybrid selection. This approach will enable efficient genome sequencing of pathogens from clinical samples, as well as sequencing of endosymbiotic organisms such as Wolbachia that live inside diverse metazoan phyla.
DOI: 10.1073/pnas.1711023115
2017
Cited 77 times
Genome-wide CRISPR screen for PARKIN regulators reveals transcriptional repression as a determinant of mitophagy
Significance: In mitophagy, damaged mitochondria are targeted for disposal by the autophagy machinery. PARKIN promotes signaling of mitochondrial damage to the autophagy machinery for engagement, and PARKIN mutations cause Parkinson’s disease, possibly because damaged mitochondria accumulate in neurons. Because regulation of PARKIN abundance and the impact on signaling are poorly understood, we performed a genetic screen to identify PARKIN abundance regulators. Both positive and negative regulators were identified and will help us to further understand mitophagy and Parkinson’s disease. We show that some of the identified genes negatively regulate PARKIN gene expression, which impacts signaling of mitochondrial damage in mitophagy. This link between transcriptional repression and mitophagy is also apparent in neurons in culture, bearing implications for disease.
DOI: 10.1038/s41467-019-12143-3
2019
Cited 75 times
USP7 inhibits Wnt/β-catenin signaling through promoting stabilization of Axin
Axin is a key scaffolding protein responsible for the formation of the β-catenin destruction complex. Stability of Axin protein is regulated by the ubiquitin-proteasome system, and modulation of cellular concentration of Axin protein has a profound effect on Wnt/β-catenin signaling. Although E3s promoting Axin ubiquitination have been identified, the deubiquitinase responsible for Axin deubiquitination and stabilization remains unknown. Here, we identify USP7 as a potent negative regulator of Wnt/β-catenin signaling through CRISPR screens. Genetic ablation or pharmacological inhibition of USP7 robustly increases Wnt/β-catenin signaling in multiple cellular systems. USP7 directly interacts with Axin through its TRAF domain, and promotes deubiquitination and stabilization of Axin. Inhibition of USP7 regulates osteoblast differentiation and adipocyte differentiation through increasing Wnt/β-catenin signaling. Our study reveals a critical mechanism that prevents excessive degradation of Axin and identifies USP7 as a target for sensitizing cells to Wnt/β-catenin signaling.
DOI: 10.1016/j.celrep.2019.03.043
2019
Cited 66 times
Genome-Scale CRISPR Screens Identify Human Pluripotency-Specific Genes
Human pluripotent stem cells (hPSCs) generate a variety of disease-relevant cells that can be used to improve the translation of preclinical research. Despite the potential of hPSCs, their use for genetic screening has been limited by technical challenges. We developed a scalable and renewable Cas9 and sgRNA-hPSC library in which loss-of-function mutations can be induced at will. Our inducible mutant hPSC library can be used for multiple genome-wide CRISPR screens in a variety of hPSC-induced cell types. As proof of concept, we performed three screens for regulators of properties fundamental to hPSCs: their ability to self-renew and/or survive (fitness), their inability to survive as single-cell clones, and their capacity to differentiate. We identified the majority of known genes and pathways involved in these processes, as well as a plethora of genes with unidentified roles. This resource will increase the understanding of human development and genetics. This approach will be a powerful tool to identify disease-modifying genes and pathways.
DOI: 10.1016/j.biopsych.2004.08.008
2004
Cited 109 times
Functional effects of a tandem duplication polymorphism in the 5′flanking region of the DRD4 gene
Background: Several polymorphisms have been identified in the 5′flanking region of the human dopamine D4 receptor gene (DRD4), including a tandem duplication polymorphism. This comprises a 120-base-pair repeat sequence that is known to have different allele frequencies in various populations around the world. Furthermore, various studies have revealed evidence of linkage to attention-deficit/hyperactivity disorder and association with schizophrenia and methamphetamine abuse. The location of the polymorphism in the 5′regulatory region of the DRD4 gene and the fact that it consists of potential transcription factor binding sites suggest that it might confer differential transcriptional activity of the alleles. Methods: We investigated the functional effects of this gene variant with transient transfection methods in four human cell lines and then assessed transcriptional activity with luciferase reporter gene assays. Results: The longer allele has lower transcriptional activity than the shorter allele in SK-N-MC, SH-SY5Y, HEK293, and HeLa cell lines. Conclusions: This evidence suggests that the duplication might have a role in regulating the expression of the DRD4 gene and provides an understanding of the biological mechanisms underlying the etiology of neuropsychiatric disorders such as ADHD, schizophrenia, and methamphetamine abuse.
DOI: 10.1186/gb-2010-11-2-r15
2010
Cited 101 times
A scalable, fully automated process for construction of sequence-ready barcoded libraries for 454
We present an automated, high throughput library construction process for 454 technology. Sample handling errors and cross-contamination are minimized via end-to-end barcoding of plasticware, along with molecular DNA barcoding of constructs. Automation-friendly magnetic bead-based size selection and cleanup steps have been devised, eliminating major bottlenecks and significant sources of error. Using this methodology, one technician can create 96 sequence-ready 454 libraries in 2 days, a dramatic improvement over the standard method.
DOI: 10.1186/gb-2012-13-4-r27
2012
Cited 91 times
Genome-wide identification and characterization of replication origins by deep sequencing
DNA replication initiates at distinct origins in eukaryotic genomes, but the genomic features that define these sites are not well understood. We have taken a combined experimental and bioinformatic approach to identify and characterize origins of replication in three distantly related fission yeasts: Schizosaccharomyces pombe, Schizosaccharomyces octosporus and Schizosaccharomyces japonicus. Using single-molecule deep sequencing to construct amplification-free high-resolution replication profiles, we located origins and identified sequence motifs that predict origin function. We then mapped nucleosome occupancy by deep sequencing of mononucleosomal DNA from the corresponding species, finding that origins tend to occupy nucleosome-depleted regions. The sequences that specify origins are evolutionarily plastic, with low complexity nucleosome-excluding sequences functioning in S. pombe and S. octosporus, and binding sites for trans-acting nucleosome-excluding proteins functioning in S. japonicus. Furthermore, chromosome-scale variation in replication timing is conserved independently of origin location and via a mechanism distinct from known heterochromatic effects on origin function. These results are consistent with a model in which origins are simply the nucleosome-depleted regions of the genome with the highest affinity for the origin recognition complex. This approach provides a general strategy for understanding the mechanisms that define DNA replication origins in eukaryotes.
DOI: 10.1126/science.1213506
2012
Cited 87 times
Evolutionarily Assembled cis-Regulatory Module at a Human Ciliopathy Locus
Distinguishing Ciliopathy: Cilia were once thought to be evolutionary remnants, but structural defects reveal their importance in signaling pathways and human disease, such as Joubert syndrome. Either of the genes TMEM138 and TMEM216 can be found mutated in phenotypically indistinguishable ciliopathy patients. Interestingly, despite their lack of sequence homology, these genes have always been aligned in head-to-tail configuration during vertebrate evolution. The proteins expressed by these genes mark distinct tethered vesicles, which differentially carry ciliary proteins for assembly. Lee et al. (p. 966, published online 26 January; see the Perspective by Chakravarti and Kapoor) show that the coordinated expression of these adjacent genes depends upon a coevolved regulatory element in the noncoding intergenic region, which thus integrates the roles of both gene products. This discovery explains not only the indistinguishable pathogenesis of the patients' genotypes but also how the evolutionary clustering of genes unrelated in sequence may correlate with coordinated control of expression and function.
DOI: 10.1038/s41592-018-0149-1
2018
Cited 66 times
Guide Swap enables genome-scale pooled CRISPR–Cas9 screening in human primary cells
DOI: 10.1016/j.celrep.2017.03.071
2017
Cited 65 times
The Natural Product Cavinafungin Selectively Interferes with Zika and Dengue Virus Replication by Inhibition of the Host Signal Peptidase
Flavivirus infections by Zika and dengue virus impose a significant global healthcare threat with no US Food and Drug Administration (FDA)-approved vaccination or specific antiviral treatment available. Here, we present the discovery of an anti-flaviviral natural product named cavinafungin. Cavinafungin is a potent and selectively active compound against Zika and all four dengue virus serotypes. Unbiased, genome-wide genomic profiling in human cells using a novel CRISPR/Cas9 protocol identified the endoplasmic-reticulum-localized signal peptidase as the efficacy target of cavinafungin. Orthogonal profiling in S. cerevisiae followed by the selection of resistant mutants pinpointed the catalytic subunit of the signal peptidase SEC11 as the evolutionary conserved target. Biochemical analysis confirmed a rapid block of signal sequence cleavage of both host and viral proteins by cavinafungin. This study provides an effective compound against the eukaryotic signal peptidase and independent confirmation of the recently identified critical role of the signal peptidase in the replicative cycle of flaviviruses.
DOI: 10.1074/jbc.m116.722967
2016
Cited 64 times
Tankyrase Inhibitor Sensitizes Lung Cancer Cells to Endothelial Growth Factor Receptor (EGFR) Inhibition via Stabilizing Angiomotins and Inhibiting YAP Signaling
YAP signaling pathway plays critical roles in tissue homeostasis, and aberrant activation of YAP signaling has been implicated in cancers. To identify tractable targets of YAP pathway, we have performed a pathway-based pooled CRISPR screen and identified tankyrase and its associated E3 ligase RNF146 as positive regulators of YAP signaling. Genetic ablation or pharmacological inhibition of tankyrase prominently suppresses YAP activity and YAP target gene expression. Using a proteomic approach, we have identified angiomotin family proteins, which are known negative regulators of YAP signaling, as novel tankyrase substrates. Inhibition of tankyrase or depletion of RNF146 stabilizes angiomotins. Angiomotins physically interact with tankyrase through a highly conserved motif at their N terminus, and mutation of this motif leads to their stabilization. Tankyrase inhibitor-induced stabilization of angiomotins reduces YAP nuclear translocation and decreases downstream YAP signaling. We have further shown that knock-out of YAP sensitizes non-small cell lung cancer to EGFR inhibitor Erlotinib. Tankyrase inhibitor, but not porcupine inhibitor, which blocks Wnt secretion, enhances growth inhibitory activity of Erlotinib. This activity is mediated by YAP inhibition and not Wnt/β-catenin inhibition. Our data suggest that tankyrase inhibition could serve as a novel strategy to suppress YAP signaling for combinatorial targeted therapy.
DOI: 10.1073/pnas.1808575115
2018
Cited 50 times
mTORC1 signaling suppresses Wnt/β-catenin signaling through DVL-dependent regulation of Wnt receptor FZD level
Significance: The Wnt/β-catenin signaling pathway plays prominent roles during embryonic development and adult tissue homeostasis by maintaining somatic stem cell functions. The mammalian target of rapamycin complex 1 (mTORC1) signaling pathway has also been implicated in regulating stem cell functions in multiple tissue types. However, the crosstalk between these two pathways remains largely unclear. Herein, using in vitro cell lines, ex vivo organoids, and an in vivo mouse model, we made striking findings in support of a paradigm that mTORC1 signaling cell autonomously suppresses Wnt/β-catenin signaling through down-regulating the Wnt receptor FZD level to influence stem cell functions, with implications in the aging process.
DOI: 10.1016/j.celrep.2019.10.113
2019
Cited 48 times
A Genome-wide CRISPR Screen Identifies ZCCHC14 as a Host Factor Required for Hepatitis B Surface Antigen Production
A hallmark of chronic hepatitis B (CHB) virus infection is the presence of high circulating levels of non-infectious small lipid HBV surface antigen (HBsAg) vesicles. Although rare, sustained HBsAg loss is the idealized endpoint of any CHB therapy. A small molecule, RG7834, has been previously reported to inhibit HBsAg expression by targeting terminal nucleotidyltransferase proteins 4A and 4B (TENT4A and TENT4B). In this study, we describe a genome-wide CRISPR screen to identify other potential host factors required for HBsAg expression and to gain further insights into the mechanism of RG7834. We report more than 60 genes involved in regulating HBsAg and identify additional factors involved in RG7834 activity, including a zinc finger CCHC-type containing 14 (ZCCHC14) protein. We show that ZCCHC14, together with TENT4A/B, stabilizes HBsAg expression through HBV RNA tailing, providing a potential new therapeutic target to achieve functional cure in CHB patients.
DOI: 10.1074/jbc.ra118.005103
2019
Cited 44 times
Bile acid analogues are activators of pyrin inflammasome
Bile acids are critical metabolites in the gastrointestinal tract and contribute to maintaining intestinal immune homeostasis through cross-talk with the gut microbiota. The conversion of bile acids by the gut microbiome is now recognized as a factor affecting both host metabolism and immune responses, but its physiological roles remain unclear. We conducted a screen for microbiome metabolites that would function as inflammasome activators and herein report the identification of 12-oxo-lithocholic acid (BAA485), a potential microbiome-derived bile acid metabolite. We demonstrate that the more potent analogue 11-oxo-12S-hydroxylithocholic acid methyl ester (BAA473) can induce secretion of interleukin-18 (IL-18) through activation of the inflammasome in both myeloid and intestinal epithelial cells. Using a genome-wide CRISPR screen with compound induced pyroptosis in THP-1 cells, we identified that inflammasome activation by BAA473 is pyrin-dependent (MEFV). To our knowledge, the bile acid analogues BAA485 and BAA473 are the first small molecule activators of the pyrin inflammasome. We surmise that pyrin inflammasome activation through microbiota-modified bile acid metabolites such as BAA473 and BAA485 plays a role in gut microbiota regulated intestinal immune response. The discovery of these two bioactive compounds may help to further unveil the importance of pyrin in gut homeostasis and autoimmune diseases.
DOI: 10.1111/j.1365-294x.2010.04472.x
2010
Cited 67 times
Key considerations for measuring allelic expression on a genomic scale using high‐throughput sequencing
Differences in gene expression are thought to be an important source of phenotypic diversity, so dissecting the genetic components of natural variation in gene expression is important for understanding the evolutionary mechanisms that lead to adaptation. Gene expression is a complex trait that, in diploid organisms, results from transcription of both maternal and paternal alleles. Directly measuring allelic expression rather than total gene expression offers greater insight into regulatory variation. The recent emergence of high-throughput sequencing offers an unprecedented opportunity to study allelic transcription at a genomic scale for virtually any species. By sequencing transcript pools derived from heterozygous individuals, estimates of allelic expression can be directly obtained. The statistical power of this approach is influenced by the number of transcripts sequenced and the ability to unambiguously assign individual sequence fragments to specific alleles on the basis of transcribed nucleotide polymorphisms. Here, using mathematical modelling and computer simulations, we determine the minimum sequencing depth required to accurately measure relative allelic expression and detect allelic imbalance via high-throughput sequencing under a variety of conditions. We conclude that, within a species, a minimum of 500-1000 sequencing reads per gene is needed to test for allelic imbalance, and consequently, at least five to 10 million reads are required for studying a genome expressing 10 000 genes. Finally, using 454 sequencing, we illustrate an application of allelic expression by testing for cis-regulatory divergence between closely related Drosophila species.
DOI: 10.7554/elife.50223
2019
Cited 34 times
Genome-wide CRISPR screening reveals genetic modifiers of mutant EGFR dependence in human NSCLC
EGFR-mutant NSCLCs frequently respond to EGFR tyrosine kinase inhibitors (TKIs). However, the responses are not durable, and the magnitude of tumor regression is variable, suggesting the existence of genetic modifiers of EGFR dependency. Here, we applied a genome-wide CRISPR-Cas9 screening to identify genetic determinants of EGFR TKI sensitivity and uncovered putative candidates. We show that knockout of RIC8A, essential for G-alpha protein activation, enhanced EGFR TKI-induced cell death. Mechanistically, we demonstrate that RIC8A is a positive regulator of YAP signaling, activation of which rescued the EGFR TKI sensitizing phenotype resulting from RIC8A knockout. We also show that knockout of ARIH2, or other components in the Cullin-5 E3 complex, conferred resistance to EGFR inhibition, in part by promoting nascent protein synthesis through METAP2. Together, these data uncover a spectrum of previously unidentified regulators of EGFR TKI sensitivity in EGFR-mutant human NSCLC, providing insights into the heterogeneity of EGFR TKI treatment responses.
DOI: 10.1038/s41467-022-28567-3
2022
Cited 14 times
Cell adhesion molecule KIRREL1 is a feedback regulator of Hippo signaling recruiting SAV1 to cell-cell contact sites
The Hippo/YAP pathway controls cell proliferation through sensing physical and spatial organization of cells. How cell-cell contact is sensed by Hippo signaling is poorly understood. Here, we identified the cell adhesion molecule KIRREL1 as an upstream positive regulator of the mammalian Hippo pathway. KIRREL1 physically interacts with SAV1 and recruits SAV1 to cell-cell contact sites. Consistent with the hypothesis that KIRREL1-mediated cell adhesion suppresses YAP activity, knockout of KIRREL1 increases YAP activity in neighboring cells. Analyzing pan-cancer CRISPR proliferation screen data reveals KIRREL1 as the top plasma membrane protein showing strong correlation with known Hippo regulators, highlighting a critical role of KIRREL1 in regulating Hippo signaling and cell proliferation. During liver regeneration in mice, KIRREL1 is upregulated, and its genetic ablation enhances hepatic YAP activity, hepatocyte reprogramming and biliary epithelial cell proliferation. Our data suggest that KIRREL1 functions as a feedback regulator of the mammalian Hippo pathway through sensing cell-cell interaction and recruiting SAV1 to cell-cell contact sites.
DOI: 10.1017/s0033291701003257
2001
Cited 84 times
Depression, APOE genotype and subjective memory impairment: a cross-sectional study in an African-Caribbean population
Subjective memory impairment (SMI) is common in older populations but its aetiology and clinical significance is uncertain. Depression has been reported to be strongly associated with SMI. Associations with objective cognitive impairment are less clear cut. Other factors suggested to be associated with SMI include poor physical health and the apolipoprotein E (APOE) epsilon4 allele. Studies of SMI have been predominantly confined to white Caucasian populations. A community study was carried out in a UK African-Caribbean population aged 55-75, sampled from primary care lists. Twenty-three per cent were classified with SMI. Depression was defined using the 10-item Geriatric Depression Scale. Other aetiological factors investigated were education, objective cognitive function, APOE genotype, disablement and vascular disease/risk. The principal analysis was restricted to 243 participants scoring > 20 on the Mini-Mental State Examination (85%). A second analysis included all 290 participants. Depression, self-reported physical impairment and APOE epsilon4 were associated with SMI. The association between SMI and physical impairment was not explained by depression, vascular disease/risk, or disability/handicap. The association between epsilon4 and SMI increased as MMSE scores decreased and was particularly strong in those with depression. The epsilon4 allele was present in 69% (95% CI 41-89%) of those with depression and SMI compared with 28% (20-36%) of those with neither. Depression may not be a sufficient explanation for subjective memory complaints. Memory complaints in the presence of depression are associated with high prevalence of epsilon4 and therefore, presumably, a raised risk of subsequent dementia.
DOI: 10.1074/jbc.m111.317370
2012
Cited 47 times
Yeast Sterol Regulatory Element-binding Protein (SREBP) Cleavage Requires Cdc48 and Dsc5, a Ubiquitin Regulatory X Domain-containing Subunit of the Golgi Dsc E3 Ligase
Schizosaccharomyces pombe Sre1 is a membrane-bound transcription factor that controls adaptation to hypoxia. Like its mammalian homolog, sterol regulatory element-binding protein (SREBP), Sre1 activation requires release from the membrane. However, in fission yeast, this release occurs through a strikingly different mechanism that requires the Golgi Dsc E3 ubiquitin ligase complex and the proteasome. The mechanistic details of Sre1 cleavage, including the link between the Dsc E3 ligase complex and proteasome, are not well understood. Here, we present results of a genetic selection designed to identify additional components required for Sre1 cleavage. From the selection, we identified two new components of the fission yeast SREBP pathway: Dsc5 and Cdc48. The AAA (ATPase associated with diverse cellular activities) ATPase Cdc48 and Dsc5, a ubiquitin regulatory X domain-containing protein, interact with known Dsc complex components and are required for SREBP cleavage. These findings provide a mechanistic link between the Dsc E3 ligase complex and the proteasome in SREBP cleavage and add to a growing list of similarities between the Dsc E3 ligase and membrane E3 ligases involved in endoplasmic reticulum-associated degradation.
DOI: 10.1038/srep42728
2017
Cited 37 times
Identification of a novel NAMPT inhibitor by CRISPR/Cas9 chemogenomic profiling in mammalian cells
Chemogenomic profiling is a powerful and unbiased approach to elucidate pharmacological targets and the mechanism of bioactive compounds. Until recently, genome-wide, high-resolution experiments of this nature have been limited to fungal systems due to lack of mammalian genome-wide deletion collections. With the example of a novel nicotinamide phosphoribosyltransferase (NAMPT) inhibitor, we demonstrate that the CRISPR/Cas9 system enables the generation of transient homo- and heterozygous deletion libraries and allows for the identification of efficacy targets and pathways mediating hypersensitivity and resistance relevant to the compound mechanism of action.
DOI: 10.1038/s41598-018-34601-6
2018
Cited 35 times
TRIAMF: A New Method for Delivery of Cas9 Ribonucleoprotein Complex to Human Hematopoietic Stem Cells
CRISPR/Cas9 mediated gene editing of patient-derived hematopoietic stem and progenitor cells (HSPCs) ex vivo followed by autologous transplantation of the edited HSPCs back to the patient can provide a potential cure for monogenic blood disorders such as β-hemoglobinopathies. One challenge for this strategy is efficient delivery of the ribonucleoprotein (RNP) complex, consisting of purified Cas9 protein and guide RNA, into HSPCs. Because β-hemoglobinopathies are most prevalent in developing countries, it is desirable to have a reliable, efficient, easy-to-use and cost effective delivery method. With this goal in mind, we developed TRansmembrane Internalization Assisted by Membrane Filtration (TRIAMF), a new method to quickly and effectively deliver RNPs into HSPCs by passing an RNP and cell mixture through a filter membrane. We achieved robust gene editing in HSPCs using TRIAMF and demonstrated that the multilineage colony forming capacities and the competence for engraftment in immunocompromised mice of HSPCs were preserved post TRIAMF treatment. TRIAMF is a custom designed system using inexpensive components and has the capacity to process HSPCs at clinical scale.
DOI: 10.1038/s41467-023-39527-w
2023
Cited 4 times
Cancer lineage-specific regulation of YAP responsive elements revealed through large-scale functional epigenomic screens
YAP is a key transcriptional co-activator of TEADs; it regulates cell growth and is frequently activated in cancer. In Malignant Pleural Mesothelioma (MPM), YAP is activated by loss-of-function mutations in upstream components of the Hippo pathway, while, in Uveal Melanoma (UM), YAP is activated in a Hippo-independent manner. To date, it is unclear if and how the different oncogenic lesions activating YAP impact its oncogenic program, which is particularly relevant for designing selective anti-cancer therapies. Here we show that, despite YAP being essential in both MPM and UM, its interaction with TEAD is unexpectedly dispensable in UM, limiting the applicability of TEAD inhibitors in this cancer type. Systematic functional interrogation of YAP regulatory elements in both cancer types reveals convergent regulation of broad oncogenic drivers in both MPM and UM, but also strikingly selective programs. Our work reveals unanticipated lineage-specific features of the YAP regulatory network that provide important insights to guide the design of tailored therapeutic strategies to inhibit YAP signaling across different cancer types.
DOI: 10.1117/12.814680
2009
Cited 40 times
Experimental result and simulation analysis for the use of pixelated illumination from source mask optimization for 22nm logic lithography process
We demonstrate experimentally for the first time the feasibility of applying SMO technology using pixelated illumination. Wafer images of SRAM contact holes were obtained to confirm the feasibility of using SMO for 22nm node lithography. There are still challenges in other areas of SMO integration such as mask build, mask inspection and repair, process modeling, full chip design issues and pixelated illumination, which is the emphasis in this paper. In this first attempt we successfully designed a manufacturable pixelated source and had it fabricated and installed in an exposure tool. The printing result is satisfactory, although there are still some deviations of the wafer image from simulation prediction. Further experiments and modeling of the impact of errors in source design and manufacturing will proceed in more detail. We believe that tightening all kinds of specifications and optimizing all procedures will make pixelated illumination a viable technology for 22nm or beyond. Publisher's Note: The author listing for this paper has been updated to include Carsten Russ. The PDF has been updated to reflect this change.
DOI: 10.1038/sj.mp.4000852
2001
Cited 59 times
Identification of sequence variants and analysis of the role of the glycogen synthase kinase 3 β gene and promoter in late onset Alzheimer's disease
Alzheimer's disease (AD) is a disorder characterised by a progressive deterioration in memory and other cognitive functions. Glycogen synthase kinase 3 beta (GSK3 beta) phosphorylates the microtubule associated protein tau at sites that are aberrantly phosphorylated in AD. GSK3 beta binds to presenilin 1 and plays a role in wnt and insulin signalling cascades, both of which have been proposed to be linked to AD. Moreover GSK3 beta activity may be altered in AD brain. These observations suggest a central role for GSK3 beta in AD and led us to investigate GSK3 beta as a candidate gene for AD. We sought to identify sequence variations in the gene and its promoter, as these could have an effect on activity and expression leading to abnormal function. Sequencing over 3000 bp of the GSK3 beta putative promoter revealed there to be five sequence variations, two of which were common (>10%). However on further examination none of these, either alone or in synergy, had any association with late onset AD. Stratification of the data by APOE epsilon 4 status also produced no significant association. Sequencing of the GSK3 beta coding region revealed no variations. This would suggest that the aberrant phosphorylation of tau by GSK3 beta in AD is not due to sequence variations in the gene or its promoter.
DOI: 10.1212/wnl.54.2.397
2000
Cited 55 times
The association between APOE and dementia does not seem to be mediated by vascular factors
Objective: The effect of APOE on dementia may be mediated through dyslipidemia and atherogenesis through its effect on cholesterol metabolism. The authors investigated this possibility among aged survivors from the UK Medical Research Council Trial of the Treatment of Hypertension in Older Adults. Design: A total of 370 of 657 survivors from an initial cohort of 1,088 recruited into the trial between 1983 and 1985 were traced in 1994 and agreed to be screened for dementia. Blood samples were analyzed for APOE genotype and serum fibrinogen. Cholesterol level, smoking behavior, blood pressure, body mass index, and EKG recordings had been measured at recruitment 10 to 12 years earlier. Odds ratios (ORs) for the association between APOE ε4/* and both AD and dementia were estimated and adjusted incrementally for the effect of age and premorbid intelligence, cholesterol, other risk factors for vascular disease, and EKG evidence of cardiovascular disease. Results: The authors diagnosed 24 cases of National Institute of Neurological and Communicative Disorders and Stroke AD from 41 cases of dementia. The crude OR for the association between APOE ε4/* and AD was 3.40 (95% CI 1.30 to 8.91). APOE genotype was associated with serum cholesterol level, and there was a nonsignificant trend for an association with smoking behavior. After adjusting for these and all other vascular risk factors and vascular disease variables listed earlier, the OR for the association between APOE ε4/* and AD increased to 4.81 (1.60 to 14.4). Conclusion: Presence of APOE ε4/* seems to increase the risk for dementia and AD independently of its effect on dyslipidemia and atherogenesis.
DOI: 10.1016/s0304-3940(01)02289-3
2001
Cited 52 times
The microtubule associated protein Tau gene and Alzheimer's disease – an association study and meta-analysis
Several studies have suggested an association between polymorphisms and an extended haplotype of the microtubule associated protein tau gene and Alzheimer's disease (AD) in synergy with apolipoprotein E (APOE) epsilon 4 status. However, these findings have not been consistently replicated. We investigated the role of the tau haplotype in AD by conducting an association study as well as a meta-analysis of all the studies conducted to date. We examined six polymorphisms known to be in the extended tau haplotypes, one in exon 7 and five in and around exon 9, in 200 late onset AD and 189 control samples. All the polymorphisms examined fell into the recognised tau haplotypes. There was no statistically significant association between any of the polymorphisms and late onset AD. Stratification of data by APOE epsilon 4 status also produced no strongly significant association. The meta-analysis showed no significant differences between AD cases and controls; however, stratification of data by APOE epsilon 4 status showed a small significant decrease in the H1 haplotype in AD before correction for multiple testing.
DOI: 10.1016/j.molcel.2018.07.040
2018
Cited 25 times
Identification of ICAT as an APC Inhibitor, Revealing Wnt-Dependent Inhibition of APC-Axin Interaction
Adenomatous polyposis coli (APC) and Axin are core components of the β-catenin destruction complex. How APC's function is regulated and whether Wnt signaling influences the direct APC-Axin interaction to inhibit the β-catenin destruction complex is not clear. Through a CRISPR screen of β-catenin stability, we have identified ICAT, a polypeptide previously known to block β-catenin-TCF interaction, as a natural inhibitor of APC. ICAT blocks β-catenin-APC interaction and prevents β-catenin-mediated APC-Axin interaction, enhancing stabilization of β-catenin in cells harboring truncated APC or stimulated with Wnt, but not in cells deprived of a Wnt signal. Using ICAT as a tool to disengage β-catenin-mediated APC-Axin interaction, we demonstrate that Wnt quickly inhibits the direct interaction between APC and Axin. Our study highlights an important scaffolding function of β-catenin in the assembly of the destruction complex and suggests Wnt-inhibited APC-Axin interaction as a mechanism of Wnt-dependent inhibition of the destruction complex.
DOI: 10.1038/s41467-019-12559-x
2019
Cited 20 times
IRF2 is a master regulator of human keratinocyte stem cell fate
Resident adult epithelial stem cells maintain tissue homeostasis by balancing self-renewal and differentiation. The stem cell potential of human epidermal keratinocytes is retained in vitro but lost over time suggesting extrinsic and intrinsic regulation. Transcription factor-controlled regulatory circuitries govern cell identity, are sufficient to induce pluripotency and transdifferentiate cells. We investigate whether transcriptional circuitry also governs phenotypic changes within a given cell type by comparing human primary keratinocytes with intrinsically high versus low stem cell potential. Using integrated chromatin and transcriptional profiling, we implicate IRF2 as antagonistic to stemness and show that it binds and regulates active cis-regulatory elements at interferon response and antigen presentation genes. CRISPR-KD of IRF2 in keratinocytes with low stem cell potential increases self-renewal, migration and epidermis formation. These data demonstrate that transcription factor regulatory circuitries, in addition to maintaining cell identity, control plasticity within cell types and offer potential for therapeutic modulation of cell function.
DOI: 10.1038/s42003-021-02272-1
2021
Cited 15 times
Genome-wide CRISPR screen identifies protein pathways modulating tau protein levels in neurons
Aggregates of hyperphosphorylated tau protein are a pathological hallmark of more than 20 distinct neurodegenerative diseases, including Alzheimer’s disease, progressive supranuclear palsy, and frontotemporal dementia. While the exact mechanism of tau aggregation is unknown, the accumulation of aggregates correlates with disease progression. Here we report a genome-wide CRISPR screen to identify modulators of endogenous tau protein for the first time. Primary screens performed in SH-SY5Y cells identified positive and negative regulators of tau protein levels. Hit validation of the top 43 candidate genes was performed using Ngn2-induced human cortical excitatory neurons. Using this approach, genes and pathways involved in modulation of endogenous tau levels were identified, including chromatin modifying enzymes, neddylation and ubiquitin pathway members, and components of the mTOR pathway. TSC1, a critical component of the mTOR pathway, was further validated in vivo, demonstrating the relevance of this screening strategy. These findings may have implications for treating neurodegenerative diseases in the future.
DOI: 10.1021/acschembio.1c00920
2022
Cited 9 times
DRUG-seq Provides Unbiased Biological Activity Readouts for Neuroscience Drug Discovery
Unbiased transcriptomic RNA-seq data has provided deep insights into biological processes. However, its impact in drug discovery has been narrow given high costs and low throughput. Proof-of-concept studies with Digital RNA with pertUrbation of Genes (DRUG)-seq demonstrated the potential to address this gap. We extended the DRUG-seq platform by subjecting it to rigorous testing and by adding an open-source analysis pipeline. The results demonstrate high reproducibility and ability to resolve the mechanism(s) of action for a diverse set of compounds. Furthermore, we demonstrate how this data can be incorporated into a drug discovery project aiming to develop therapeutics for schizophrenia using human stem cell-derived neurons. We identified both an on-target activation signature, induced by a set of chemically distinct positive allosteric modulators of the N-methyl-d-aspartate (NMDA) receptor, and independent off-target effects. Overall, the protocol and open-source analysis pipeline are a step toward industrializing RNA-seq for high-complexity transcriptomics studies performed at a saturating scale.
DOI: 10.1016/s0006-3223(97)00326-0
1998
Cited 45 times
Apolipoprotein E: Depressive illness, depressive symptoms, and Alzheimer's disease
Background: The apolipoprotein E (ApoE) ɛ4 and ɛ2 alleles may influence the age of onset of depressive illness. Depressive illness of late onset is also a risk factor for Alzheimer's disease (AD), and there is some evidence that the ApoE ɛ2 allele is associated with depressive symptomatology in AD. Depressive symptomatology in AD may thus share common genetic risk factors with late-onset depressive illness. Methods: The frequency of the ɛ2 and ɛ4 alleles of ApoE and their effects on age of onset of disease in three independent groups of subjects, with depressive illness, with AD, and controls, were compared in a defined population from Southeast London. Results: The frequency of the ApoE ɛ2 allele was significantly lower in the depressive illness group compared with the control group and was associated with a later mean age at onset. Subjects with depressive symptomatology in AD had a higher frequency of the ApoE ɛ2 allele and had a significantly later age of onset of depressive illness compared with the nondepressed AD group. Conclusions: The presence of the ApoE ɛ2 allele in AD is found to be highly associated with depressive symptomatology, and it is proposed that this subgroup represents the presence of delayed depressive illness and that there are common genetic risk factors between AD and depressive illness.
DOI: 10.1038/sj.mp.4000559
1999
Cited 45 times
Two novel variants in the DOPA decarboxylase gene: association with bipolar affective disorder
DOPA decarboxylase (DDC), also known as aromatic L-amino acid decarboxylase (AADC), is an enzyme involved in the synthesis of the important neurotransmitters dopamine, norepinephrine, and serotonin. In addition, it participates in the synthesis of trace amines; compounds suggested to act as endogenous modulators of central neurotransmission. Thus, DDC is regarded as a potential susceptibility gene for a variety of neuropsychiatric disorders. The aim of the present study was to examine the role of DDC in bipolar affective disorder (BPAD). By screening 10 individuals for sequence variations in the coding region of the DDC gene as well as in the neuron-specific promoter and 5' untranslated regions we were able to identify two fairly frequent variants: a 1-bp deletion in the promoter and a 4-bp deletion in the untranslated exon 1. Both deletions affect putative binding sites for known transcription factors, suggesting a possible functional impact at the level of expression. The two variants were applied in an association study including 80 Danish bipolar patients, 112 English bipolar patients, 223 Danish controls, and 349 English controls. Analyzing the combined material, a significant association was found between the 1-bp deletion and BPAD with P-values of 0.037 (allelic) and 0.021 (genotypic). The frequency of the 1-bp deletion was 13.3% in patients and 9.4% in controls with a corresponding odds ratio of 1.48 (95% CI: 1.02-2.15). The results presented suggest that DDC may act as a minor susceptibility gene for bipolar affective disorder.
DOI: 10.1007/s00401-007-0280-z
2007
Cited 34 times
Increase in the relative expression of tau with four microtubule binding repeat regions in frontotemporal lobar degeneration and progressive supranuclear palsy brains
DOI: 10.1080/17482960801934403
2008
Cited 32 times
Evaluation of the Golgi trafficking protein VPS54 (wobbler) as a candidate for ALS
VPS54 is a component of the Golgi-associated retrograde protein (GARP) complex of vesicle sorting proteins. A missense mutation of Vps54 is responsible for motor neuron disease in the wobbler mouse, but the human gene on chromosome 2p14–15 has not been evaluated as a disease gene. We completely sequenced the 22 coding exons from 96 individuals with sporadic ALS, 96 individuals with familial ALS, and 96 controls. Twenty-one novel SNPs were identified. The non-synonymous variant, T360A, was observed in one patient and 0/910 controls. Several polymorphic non-synonymous SNPs were also observed in patients and controls. These initial data suggest that mutations in VPS54 are not a major cause of ALS.
DOI: 10.1007/s00294-015-0480-3
2015
Cited 22 times
Mitochondrial genome sequences reveal evolutionary relationships of the Phytophthora 1c clade species
Phytophthora infestans is one of the most destructive plant pathogens of potato and tomato globally. The pathogen is closely related to four other Phytophthora species in the 1c clade including P. phaseoli, P. ipomoeae, P. mirabilis and P. andina that are important pathogens of other wild and domesticated hosts. P. andina is an interspecific hybrid between P. infestans and an unknown Phytophthora species. We have sequenced mitochondrial genomes of the sister species of P. infestans and examined the evolutionary relationships within the clade. Phylogenetic analysis indicates that the P. phaseoli mitochondrial lineage is basal within the clade. P. mirabilis and P. ipomoeae are sister lineages and share a common ancestor with the Ic mitochondrial lineage of P. andina. These lineages in turn are sister to the P. infestans and P. andina Ia mitochondrial lineages. The P. andina Ic lineage diverged much earlier than the P. andina Ia mitochondrial lineage and P. infestans. The presence of two mitochondrial lineages in P. andina supports the hybrid nature of this species. The ancestral state of the P. andina Ic lineage in the tree and its occurrence only in the Andean regions of Ecuador, Colombia and Peru suggests that the origin of this species hybrid in nature may occur there.
DOI: 10.1128/genomea.00849-16
2016
Cited 20 times
Genome Sequence of Spizellomyces punctatus
Spizellomyces punctatus is a basally branching chytrid fungus that is found in the Chytridiomycota phylum. Spizellomyces species are common in soil and of importance in terrestrial ecosystems. Here, we report the genome sequence of S. punctatus, which will facilitate the study of this group of early diverging fungi.
DOI: 10.1002/1096-8628(20001204)96:6<736::aid-ajmg8>3.0.co;2-2
2000
Cited 42 times
Systematic screening of the 14-3-3 eta (η) chain gene for polymorphic variants and case-control analysis in schizophrenia
The neuronal protein 14-3-3 eta is a candidate gene for schizophrenia because it maps to chromosome 22q12, a region implicated in the disease by linkage analysis, and is involved in brain development. We systematically screened this gene for polymorphic variants by comparison of public EST sequence data (five cDNAs and 72 ESTs, 21,155 bp of sequence) in parallel with single-stranded conformational polymorphism analysis, and we compared these methods by using a simple power calculation. Twelve potential polymorphisms were identified from EST sequence comparison, and two of these (a 5'-VNTR and 753G/A) were confirmed by SSCP analysis and sequencing. Three additional infrequent polymorphisms (-408T/G; 177 C/G; and 989 A/G) were found by SSCP only. We next examined these variants for association with schizophrenia. One variant in the untranslated region of exon 1 (-408 T/G) was found to occur more frequently in the schizophrenic subjects (8%) than the controls (3%; P = 0.01). After fivefold correction of the P value for multiple testing, marginal association was found. Haplotype analysis of pairs of polymorphisms provided no evidence for association of this gene with schizophrenia in the population studied. Am. J. Med. Genet. (Neuropsychiatr. Genet.) 96:736-743, 2000.
DOI: 10.1080/17482960802103107
2008
Cited 27 times
50bp deletion in the promoter for superoxide dismutase 1 (SOD1) reduces SOD1 expression in vitro and may correlate with increased age of onset of sporadic amyotrophic lateral sclerosis
The objective was to test the hypothesis that a described association between homozygosity for a 50bp deletion in the SOD1 promoter 1684bp upstream of the SOD1 ATG and an increased age of onset in SALS can be replicated in additional SALS and control sample sets from other populations. Our second objective was to examine whether this deletion attenuates expression of the SOD1 gene. Genomic DNA from more than 1200 SALS cases from Ireland, Scotland, Quebec and the USA was genotyped for the 50bp SOD1 promoter deletion. Reporter gene expression analysis, electrophoretic mobility shift assays and chromatin immunoprecipitation studies were utilized to examine the functional effects of the deletion. The genetic association for homozygosity for the promoter deletion with an increased age of symptom onset was confirmed overall in this further study (p=0.032), although it was only statistically significant in the Irish subset, and remained highly significant in the combined set of all cohorts (p=0.001). Functional studies demonstrated that this polymorphism reduces the activity of the SOD1 promoter by approximately 50%. In addition we revealed that the transcription factor SP1 binds within the 50bp deletion region in vitro and in vivo. Our findings suggest the hypothesis that this deletion reduces expression of the SOD1 gene and that levels of the SOD1 protein may modify the phenotype of SALS within selected populations.
DOI: 10.1016/s0304-3940(00)01785-7
2001
Cited 33 times
The extended haplotype of the microtubule associated protein tau gene is not associated with Pick's disease
Pick's disease (PiD) is a rare neurodegenerative condition and is a member of a heterogeneous group of disorders known as tauopathies, so-called because of the predominantly neuronal aberrant tau accumulations found in these diseases. The tauopathy, familial frontotemporal dementia (FTD), is caused by mutations in the tau gene. Moreover, progressive supranuclear palsy (PSP) is associated with the tau H1 haplotype. In certain familial forms of FTD and in PSP the microtubule-binding four repeat tau isoform principally accumulates in neuropathological lesions. However, in PiD three repeat tau accumulations are found. We therefore investigated whether either the tau H1 or H2 haplotype was associated with PiD. Our results indicate a slight increase in H2H2 frequency in Pick's cases which is not statistically significant.
DOI: 10.1038/sj.mp.4001049
2002
Cited 32 times
Novel polymorphisms in the somatostatin receptor 5 (SSTR5) gene associated with bipolar affective disorder
The somatostatin receptor 5 (SSTR5) gene is a candidate gene for bipolar affective disorder (BPAD) as well as for other neuropsychiatric disorders. The gene is positioned on chromosome 16p13.3, a region that has been implicated by a few linkage studies to potentially harbor a disease susceptibility gene for BPAD. Recent evidence shows that the dopamine D2 receptor (DRD2) and SSTR5 interact physically to form heterodimers with enhanced functional activity. Brain D2 dopamine receptors are one of the major targets of neuroleptic treatments in psychiatric disorders. In this study we systematically screened the promoter and coding region of the SSTR5 gene for genetic variation that could contribute to the development of neuropsychiatric disorders. Eleven novel single nucleotide polymorphisms (SNPs) were identified including four missense SNPs, Leu48Met, Ala52Val, Pro109Ser and Pro335Leu. We carried out an association study of BPAD using 80 Danish cases and 144 control subjects, and replication analysis using 55 British cases and 88 control subjects. For the Danish population, association was suggested between silent SNP G573A and BPAD (P = 0.008). For the British population we found association to BPAD with missense mutation Leu48Met (P = 0.003) and missense mutation Pro335Leu (P = 0.004). The statistical significance of the association was, however, greatly reduced after correcting for multiple testing. When combining genotypes from Leu48Met and Pro335Leu into haplotypes, association to BPAD was found in the British population (P = 0.0007). This haplotype association was not replicated in the Danish population. Our results may indicate that the SSTR5 gene is involved in the etiology of BPAD or may exist in linkage disequilibrium with a susceptibility gene close to SSTR5. However, given the marginal statistical significance and the potential for false-positive results in association studies with candidate genes, further studies are needed to clarify this hypothesis.
DOI: 10.1016/j.neulet.2005.10.042
2006
Cited 29 times
Association analysis of the glycogen synthase kinase-3β gene in bipolar disorder
Glycogen synthase kinase-3β (GSK3β) is a target of lithium as well as sodium valproate, both of which are effective mood stabilizing prophylactics/treatments for bipolar disorder, a highly heritable psychiatric disorder. Though it is not clear whether the mood stabilizing effects of these drugs act directly through GSK3β, it is a good candidate for mediating at least part of lithium's action, and is an aetiological candidate gene for the disease itself. Recently, a potential locus for bipolar disorder was reported on chromosome 3q, close to 3q13.37 where GSK3β maps. We conducted an association study to test the hypothesis that polymorphism of GSK3β is involved in susceptibility to bipolar disorder by examining association between GSK3β-gene polymorphisms and bipolar disorder. Of the five polymorphisms we examined, three were very rare in the study population and were not examined further. Neither of the remaining two polymorphisms we examined showed association with bipolar disorder. Thus, it is unlikely that the GSK3β-gene is a risk factor for bipolar disorder in our sample, but we cannot exclude the gene completely as other unknown polymorphisms in the gene may increase susceptibility.
DOI: 10.1083/jcb.201706106
2018
Cited 15 times
TRRAP is a central regulator of human multiciliated cell formation
The multiciliated cell (MCC) is an evolutionarily conserved cell type, which in vertebrates functions to promote directional fluid flow across epithelial tissues. In the conducting airway, MCCs are generated by basal stem/progenitor cells and act in concert with secretory cells to perform mucociliary clearance to expel pathogens from the lung. Studies in multiple systems, including Xenopus laevis epidermis, murine trachea, and zebrafish kidney, have uncovered a transcriptional network that regulates multiple steps of multiciliogenesis, ultimately leading to an MCC with hundreds of motile cilia extended from their apical surface, which beat in a coordinated fashion. Here, we used a pool-based short hairpin RNA screening approach and identified TRRAP, an essential component of multiple histone acetyltransferase complexes, as a central regulator of MCC formation. Using a combination of immunofluorescence, signaling pathway modulation, and genomic approaches, we show that (a) TRRAP acts downstream of the Notch2-mediated basal progenitor cell fate decision and upstream of Multicilin to control MCC differentiation; and (b) TRRAP binds to the promoters and regulates the expression of a network of genes involved in MCC differentiation and function, including several genes associated with human ciliopathies.
DOI: 10.1016/s0140-6736(05)70292-0
1998
Cited 31 times
K variant of butyrylcholinesterase and late-onset Alzheimer's disease
The function of butyrylcholinesterase and its relation to acetylcholinesterase has been much debated. However, evidence suggesting a role of the cholinesterases in the formation of plaques and tangles (Gomez-Ramos P, Moran MA. Ultrastructural localisation of butyrylcholinesterase in senile plaques in the brains of aged and Alzheimer's disease patients. Mol Chem Neuropathol. 1997; 30: 161-173) has given rise to speculation that genetic variants of butyrylcholinesterase, with altered enzyme activity, may be associated with Alzheimer's disease. Lehmann and co-workers (Lehmann DJ, Johnston C, Smith AD. Synergy between the genes for butyrylcholinesterase K variant and apolipoprotein E4 in late-onset confirmed Alzheimer's disease. Hum Mol Genet. 1997; 6: 1933-1936) proposed that the allelic frequency of the most common variant of butyrylcholinesterase, the K variant, is increased in Alzheimer's disease compared with controls. In addition they found that the observed odds ratio of developing Alzheimer's disease in people carrying both the K variant and the ApoE ɛ4 alleles was greater than would have been predicted if these alleles had independent effects. This strong synergistic effect between the K variant and the ApoE ɛ4 alleles appeared greater in people over the age of 75 years.
DOI: 10.1038/ncomms3672
2013
Cited 15 times
Semiconductor-based DNA sequencing of histone modification states
The recent development of a semiconductor-based, non-optical DNA sequencing technology promises scalable, low-cost and rapid sequence data production. The technology has previously been applied mainly to genomic sequencing and targeted re-sequencing. Here we demonstrate the utility of Ion Torrent semiconductor-based sequencing for sensitive, efficient and rapid chromatin immunoprecipitation followed by sequencing (ChIP-seq) through the application of sample preparation methods that are optimized for ChIP-seq on the Ion Torrent platform. We leverage this method for epigenetic profiling of tumour tissues. Semiconductor-based, non-optical DNA sequencing technologies such as Ion Torrent sequencing offer speed and cost advantages compared with alternative techniques. Cheng et al. demonstrate a protocol allowing the use of Ion Torrent technology to sequence DNA from chromatin immunoprecipitation experiments.
DOI: 10.1055/s-2008-1034906
1998
Cited 26 times
Die Diagnostik der herpetischen Uveitis und Keratouveitis (The diagnosis of herpetic uveitis and keratouveitis)
In epithelial viral keratitis as in viral retinitis, the diagnosis is made on the basis of typical clinical findings. A laboratory confirmation is achieved in over 80% using routine laboratory methods. In contrast, it is almost impossible to confirm the diagnosis of stromal herpetic keratitis in vivo using the currently available laboratory methods. Nothing is known about the situation in cases of viral anterior uveitis. Of 52 patients with granulomatous anterior uveitis, 31 were diagnosed on the basis of clinical findings as active herpetic uveitis (group 1), 14 as active granulomatous uveitis of unknown origin (group 2), and 7 had inactive disease after quietening down of herpetic uveitis (group 3). From all patients, aqueous humor was collected at the time of diagnosis and processed for viral culture, Herpes antigen ELISA, and amplification of viral DNA of HSV-1 and VZV. Viral growth in culture was found in only one case in group 3. In this group, viral antigen or viral DNA were detected in no case. Herpes antigen was found in 5/31 cases (16%) in group 1 and in 1/11 cases (9%) in group 2, and viral DNA was found in 8/31 cases from group 1 (5x HSV-1 and 3x VZV) and in 5/14 cases (31%) from group 2. After combination of antigen detection and DNA amplification, the presence of virus was confirmed in 14/45 cases (29%). Virus culture has not proven useful in the diagnosis of viral anterior segment disease. Despite their high overall sensitivity, neither antigen ELISA nor the amplification of viral DNA proved sensitive enough to establish a viral etiology. Nevertheless, a laboratory confirmation should be attempted in granulomatous uveitis of unknown origin after preclusion of an underlying systemic disease because of the consequences of a diagnosis of viral anterior segment disease for treatment and prognosis.
DOI: 10.1159/000051267
2001
Cited 26 times
Apolipoprotein E Genotype, Vascular Risk and Early Cognitive Impairment in an African Caribbean Population
A reduced risk of Alzheimer’s disease (AD) associated with the apolipoprotein E (APOE) ε4 allele is reported in populations of African origin. In order to clarify possible reasons for this, we examined the association between APOE genotype and early cognitive impairment in a community-based African Caribbean UK population aged 55–75 years. APOE genotype was available for 202 participants, 57 (28%) of whom were classified as having relative cognitive impairment on a battery of neuropsychological tests. Cognitive impairment was negatively associated with ε2 and positively but more weakly associated with ε4. Effects of both alleles increased markedly after age 70. The effect of ε4 was increased in combination with hypertension, diabetes or lower educational attainment, but these factors did not influence ε2 effects. Cholesterol and triglyceride levels partially explained effects of ε2, but did not account for those of ε4. A reduced association between ε4 and later AD in populations of African origin is unlikely to be explained by reduced cognitive effects or by differential mortality. However, it may be accounted for by vascular comorbidity. The different patterns of association between ε2 and ε4 alleles suggest different pathways of effect.
DOI: 10.1097/00019442-200403000-00011
2004
Cited 23 times
Genes Related to Vascular Disease (APOE, VLDL-R, DCP-1) and Other Vascular Factors in Late-Life Depression
The authors asked whether polymorphic variation at three genes related to vascular disease, and other vascular disease risk factors, determine late-life depression. A group of 370 participants, representing 57% of survivors of an initial cohort of 1,083 participants in the Medical Research Council treatment trial of hypertension in older adults, had been screened for depression at baseline and were traced and genotyped for genetic analysis 11 years later. Genetic analyses were performed to establish variability at three polymorphisms related to vascular disease: APOE encoding for apolipoprotein-E, VLDL-R encoding for the VLDL cholesterol-receptor, and DCP-1 encoding for angiotensin-converter enzyme. Information on vascular disease and its risk factors (ECG ischemia or arrhythmia, body mass index, serum cholesterol, smoking status, and systolic/diastolic blood pressure) and cognitive functioning was also available from baseline. The authors found no association between the three studied polymorphisms and depression. Female gender, higher diastolic blood pressure, poorer cognitive functioning, and smoking status at baseline were all associated with depression independently of antidepressant and NSAIDs use, age, ECG-established vascular disease, and the remaining vascular disease risk factors studied. This study found no association between late-life depression and three polymorphisms related to vascular disease. Depression was found to be independently associated with smoking, female gender, poorer cognitive functioning, and higher diastolic blood pressure. Taken together, this study does not seem to support the notion of a specific link between the studied vascular risk factors or these vascular-related loci and late-life depression.
DOI: 10.1016/j.neulet.2007.11.004
2008
Cited 17 times
SOD1A4V-mediated ALS: Absence of a closely linked modifier gene and origination in Asia
Familial amyotrophic lateral sclerosis (ALS) accounts for 10% of all ALS. Approximately 20% of cases are due to mutations in the Cu/Zn superoxide dismutase gene (SOD1). In North America, SOD1A4V is the most common SOD1 mutation. Carriers of the SOD1A4V mutation share a common phenotype with rapid disease progression and death on average occurring at 1.4 years (versus 3–5 years with other dominant SOD1 mutations). Previous studies of SOD1A4V carriers identified a common haplotype around the SOD1 locus, suggesting a common founder for most SOD1A4V patients. In the current study we sequenced the entire common haplotypic region around SOD1 to test the hypothesis that polymorphisms in either previously undescribed coding regions or non-coding regions around SOD1 are responsible for the more aggressive phenotype in SOD1A4V–mediated ALS. We narrowed the conserved region around the SOD1 gene in SOD1A4V ALS to 2.8 Kb and identified five novel SNPs therein. None of these variants was specifically found in all SOD1A4V patients. It therefore appears likely that the aggressive nature of the SOD1A4V mutation is not a result of a modifying factor within the region around the SOD1 gene. Founder analysis estimates that the A4V mutation occurred 540 generations (∼12,000 years) ago (95% CI 480–700). The conserved minimal haplotype is statistically more similar to Asian than European population DNA sets, suggesting that the A4V mutation arose in native Asian–Americans who reached the Americas through the Bering Strait.
DOI: 10.1016/j.neulet.2005.08.058
2006
Cited 18 times
Variants in candidate ALS modifier genes linked to Cu/Zn superoxide dismutase do not explain divergent survival phenotypes
Familial amyotrophic lateral sclerosis (ALS) accounts for 10% of all ALS cases; approximately 25% are due to mutations in the Cu/Zn superoxide dismutase gene (SOD1). In North America, SOD1(A4V) is the most common SOD1 mutation. A4V ALS cases typically have a very short survival (1-1.5 years versus 3-5 years for other dominant SOD1 mutations). A recent study of A4V carriers identified a common haplotype around the SOD1 locus, suggesting the hypothesis that genetic variations within the haplotypic region might accelerate the course of A4V cases. By contrast, SOD1(D90A/D90A) ALS cases have a very slow progression (>10 years), raising the reciprocal hypothesis that modifier genes linked to SOD1 ameliorate the phenotype of recessively inherited SOD1(D90A/D90A) mutations. In the present study, DNA sequencing of four genes within the haplotypic region shared in A4V and D90A ALS patients revealed 15 novel variants, but none result in changes in amino acid sequences specifically associated with SOD1(D90A/D90A) or SOD1(A4V) ALS. We conclude that mutations within coding regions of genes around the SOD1 locus are not responsible for the more aggressive and more benign natures of the SOD1(A4V) and SOD1(D90A/D90A) mutations, respectively.
DOI: 10.1371/journal.pone.0235551
2020
Cited 8 times
An iron-dependent metabolic vulnerability underlies VPS34-dependence in RKO cancer cells
VPS34 is a key regulator of endomembrane dynamics and cargo trafficking, and is essential in cultured cell lines and in mice. To better characterize the role of VPS34 in cell growth, we performed unbiased cell line profiling studies with the selective VPS34 inhibitor PIK-III and identified RKO as a VPS34-dependent cellular model. A pooled CRISPR screen in the presence of PIK-III revealed endolysosomal genes as genetic suppressors. Dissecting VPS34-dependent alterations with transcriptional profiling, we found the induction of hypoxia response and cholesterol biosynthesis as key signatures. Mechanistically, acute VPS34 inhibition enhanced lysosomal degradation of transferrin and low-density lipoprotein receptors leading to impaired iron and cholesterol uptake. Excess soluble iron, but not cholesterol, was sufficient to partially rescue the effects of VPS34 inhibition on mitochondrial respiration and cell growth, indicating that iron limitation is the primary driver of VPS34-dependency in RKO cells. Loss of RAB7A, an endolysosomal marker and top suppressor in our genetic screen, blocked transferrin receptor degradation, restored iron homeostasis and reversed the growth defect as well as metabolic alterations due to VPS34 inhibition. Altogether, our findings suggest that impaired iron mobilization via the VPS34-RAB7A axis drives VPS34-dependence in certain cancer cells.
DOI: 10.1212/01.wnl.0000147264.60349.eb
2004
Cited 16 times
No association of the SOD1 locus and disease susceptibility or phenotype in sporadic ALS
Mutations in the copper zinc superoxide dismutase gene (SOD1) are found in 20% of familial and 3% of sporadic ALS patients. SOD1 protein aggregation can be detected in motor neurons of mutation-negative sporadic cases but a pathogenic role for wild-type SOD1 in ALS has not been demonstrated. In this study of 233 ALS cases and 248 controls the authors found no significant association between four individual single nucleotide polymorphisms and a deletion spanning the SOD1 locus (or their combined haplotypes), and disease susceptibility, or phenotype.
DOI: 10.1186/gb-2010-11-s1-p3
2010
Cited 9 times
Analyzing and minimizing bias in Illumina sequencing libraries
Although Illumina shot-gun reads cover most genomes almost completely, sequences with extreme base compositions are often underrepresented or missing. Bias can potentially be introduced at any step during the library construction in the lab, on the Illumina instrument, in data processing or at the sequence analysis stage. Here we set out to evaluate sources of bias and ameliorate the effects. To dissect the library construction process, we developed a panel of qPCR assays for loci ranging from 6% to 90% GC that work well in a pool of three microbial DNA samples of different base composition: Plasmodium falciparum (19% GC), Escherichia coli (51% GC) and Rhodobacter sphaeroides (69% GC). We also developed qPCR assays for loci in the human genome that represent four categories of underrepresented sequence motifs as well as GC-rich promoters known to be underrepresented or missing in 'whole' genome sequencing data sets. We tracked the relative abundance of these loci throughout the standard Illumina library protocol and saw no significant introduction of bias in the initial steps including shearing, end repair, adaptor ligation and size selection. However, GC-rich and extremely GC-poor sequences were depleted during the subsequent PCR-enrichment step. Using qPCR as a readout, we tested different PCR enzymes, the addition of betaine and/or DMSO, and thermocycling profile variations. The choice of PCR instrument itself and the ramp rate had a significant effect on the GC profile of the PCR product, especially when using the recommended amplification conditions (Phusion HF and 10s denaturation per cycle). Our optimized conditions produce PCR-amplified libraries that display little systematic bias between 15% and 80% GC that resulted during sample preparation. We saw significantly improved representation of challenging human sequence motifs both in the PCR-amplified library (qPCR assay) and in the final Illumina reads. 
Our conditions are also more reliable and robust because they minimize the effect of PCR instrument and ramp rate. These conditions are currently being implemented in the Sequencing Platform at the Broad Institute. Finally, we still observe some bias in the sequencing readout, which is introduced by steps subsequent to sample preparation, including cluster generation and sequencing. These sources of bias are the object of ongoing investigations.
DOI: 10.1101/168443
2017
Cited 7 times
P53 toxicity is a hurdle to CRISPR/CAS9 screening and engineering in human pluripotent stem cells
CRISPR/Cas9 has revolutionized our ability to engineer genomes and to conduct genome-wide screens in human cells. While some cell types are easily modified with Cas9, human pluripotent stem cells (hPSCs) poorly tolerate Cas9 and are difficult to engineer. Using a stable Cas9 cell line or transient delivery of ribonucleoproteins (RNPs) we achieved an average insertion or deletion efficiency greater than 80%. This high efficiency made it apparent that double strand breaks (DSBs) induced by Cas9 are toxic and kill most treated hPSCs. Cas9 toxicity creates an obstacle to the high-throughput use of CRISPR/Cas9 for genome-engineering and screening in hPSCs. We demonstrated the toxic response is tp53-dependent and the toxic effect of tp53 severely reduces the efficiency of precise genome-engineering in hPSCs. Our results highlight that CRISPR-based therapies derived from hPSCs should proceed with caution. Following engineering, it is critical to monitor for tp53 function, especially in hPSCs which spontaneously acquire tp53 mutations.
DOI: 10.1016/s0006-3223(98)00060-2
1999
Cited 16 times
CYP2D6 polymorphisms in Alzheimer’s disease, with and without extrapyramidal signs, showing no apolipoprotein E ε4 effect modification
Allelic variation at the CYP2D6 gene has been reported to be associated with Parkinson's disease (PD) and Lewy body dementia (LBD), but not with Alzheimer's disease (AD). AD has been associated with apolipoprotein E (apoE) epsilon 4 allele loading. We examined CYP2D6 and apoE polymorphisms in a sample of 259 patients with dementia, 210 of whom had a diagnosis of AD, and 107 healthy controls. We found that the allelic frequency in our AD sample did not vary from that in the controls. The debrisoquine hydroxylase poor metabolizer phenotype was not more prevalent among AD cases than among controls in contrast to that reported for PD and LBD. We also found that CYP2D6 status does not modify the risk effect for AD conferred by apoE epsilon 4 alleles. These findings provide some support to the notion that, at a genetic level, at least at this locus, AD could be distinct from PD and LBD.
DOI: 10.1101/812628
2019
Cited 6 times
In-Depth Characterization and Validation in BRG1-Mutant Lung Cancers Define Novel Catalytic Inhibitors of SWI/SNF Chromatin Remodeling
Members of the ATP-dependent SWI/SNF chromatin remodeling complexes are among the most frequently mutated genes in cancer, suggesting their dysregulation plays a critical role. The synthetic lethality between SWI/SNF catalytic subunits BRM/SMARCA2 and BRG1/SMARCA4 has instigated great interest in targeting BRM. Here we have performed a critical and in-depth investigation of novel dual inhibitors (BRM011 and BRM014) of BRM and BRG1 in order to validate their utility as chemical probes of SWI/SNF catalytic function, while obtaining insights into the therapeutic potential of SWI/SNF inhibition. In corroboration of on-target activity, we discovered compound resistant mutations through pooled screening of BRM variants in BRG1-mut cancer cells. Strikingly, genome-wide transcriptional and chromatin profiling (ATAC-Seq) provided further evidence of pharmacological perturbation of SWI/SNF chromatin remodeling as BRM011 treatment induced specific changes in chromatin accessibility and gene expression similar to genetic depletion of BRM. Finally, these compounds have the capacity to inhibit the growth of tumor-xenografts, yielding important insights into the feasibility of developing BRM/BRG1 ATPase inhibitors for the treatment of BRG1-mut lung cancers. Overall, our studies not only establish the feasibility of inhibiting SWI/SNF catalytic function, providing a framework for SWI/SNF therapeutic targeting, but have also yielded successful elucidation of small-molecule inhibitors that will be of importance in probing SWI/SNF function in various disease contexts.
DOI: 10.1080/14660820500397057
2006
Cited 9 times
No association of DYNC1H1 with sporadic ALS in a case-control study of a northern European derived population: A tagging SNP approach
The cytoplasmic dynein-dynactin complex has been implicated in the aetiology of motor neuron degeneration in both mouse models and human forms of motor neuron disease. We have previously shown that mutations in the cytoplasmic dynein 1 heavy chain 1 gene (Dync1h1) are causal in a mouse model of late-onset motor neuron degeneration but have found no association of the homologous sites in human DYNC1H1 with human motor neuron disease. Here we extend these analyses to investigate the DYNC1H1 genomic locus to determine if it is associated with sporadic amyotrophic lateral sclerosis (ALS) in a northern European-derived population. Among the 16 single nucleotide polymorphisms (SNPs) we examined, just two SNPs (rs2251644 and rs941793) were sufficient to tag the majority of haplotypic variation (r² ≥ 0.85) and these were tested in a case-control association study with 266 North American sporadic ALS patients and 225 matched controls. We found no association between genetic variation at DYNC1H1 and sporadic ALS (rs2251644; p = 0.538, rs941793; p = 0.204, haplotype; p = 0.956). In addition we investigated patterns of diversity at DYNC1H1 in Japanese and Cameroonian populations to establish the evolutionary history for this gene and observed reduced genetic diversity in the northern Europeans suggestive of selection at this locus.
DOI: 10.1021/acschembio.8b00656
2018
Cited 6 times
Previously Uncharacterized Vacuolar-type ATPase Binding Site Discovered from Structurally Similar Compounds with Distinct Mechanisms of Action
Using a comprehensive chemical genetics approach, we identified a member of the lignan natural product family, HTP-013, which exhibited significant cytotoxicity across various cancer cell lines. Correlation of compound activity across a panel of reporter gene assays suggested the vacuolar-type ATPase (v-ATPase) as a potential target for this compound. Additional cellular studies and a yeast haploinsufficiency screen strongly supported this finding. Competitive photoaffinity labeling experiments demonstrated that the ATP6V0A2 subunit of the v-ATPase complex binds directly to HTP-013, and further mutagenesis library screening identified resistance-conferring mutations in ATP6V0A2. The positions of these mutations suggest the molecule binds a novel pocket within the domain of the v-ATPase complex responsible for proton translocation. While other mechanisms of v-ATPase regulation have been described, such as dissociation of the complex or inhibition by natural products including bafilomycin A1 and concanamycin, this work provides detailed insight into a distinct binding pocket within the v-ATPase complex.
DOI: 10.1016/j.neulet.2009.07.010
2009
Cited 6 times
DNA sequence analysis of the conserved region around the SOD1 gene locus in recessively inherited ALS
Familial amyotrophic lateral sclerosis (ALS) accounts for 10% of all ALS cases; 12-23% are associated with mutations in the Cu/Zn superoxide dismutase gene (SOD1). All ALS-linked SOD1 mutations present with a dominant pattern of inheritance apart from the aspartate to alanine mutation in exon 4 (D90A). This mutation has been observed in dominant, recessive and apparently sporadic cases. SOD1(D90A/D90A) ALS cases have a very slow disease progression (>10 years), raising the hypothesis that modifier genes linked to SOD1 ameliorate the phenotype of recessively inherited SOD1(D90A/D90A) mutations. Previous sequence analysis of a conserved haplotype region around the SOD1 gene did not reveal any functional polymorphisms within known coding or putative regulatory regions. In the current study we expanded the previous analyses by sequencing the entire SOD1 conserved haplotypic region. Although many polymorphisms were identified, none of these variants explains the slowly progressive phenotype observed in patients with recessive SOD1(D90A) mutations. This study disproves the hypothesis that there is a tightly linked genetic protective factor specifically located close to the SOD1 gene in SOD1(D90A) mediated ALS.
DOI: 10.1186/1742-4690-6-s3-p400
2009
Cited 5 times
P09-20 LB. Ultra-deep sequencing of full-length HIV-1 genomes identifies rapid viral evolution during acute infection
Poster presentation. Authors: MR Henn, C Boutwell, N Lennon, K Power, C Malboeuf, P Charlebois, A Gladden, J Levin, M Casali, L Philips, A Berlin, A Berical, R Erlich, S Anderson, H Streeck, M Kemper, E Ryan, Y Wang, L Green, K Axten, Z Brumme, C Brumme, C Russ, E Rosenberg, H Jessen, M Altfeld, C Nusbaum, B Walker, B Birren and TM Allen.
DOI: 10.7554/elife.17290.014
2016
Cited 3 times
Author response: Functional CRISPR screening identifies the ufmylation pathway as a regulator of SQSTM1/p62
DOI: 10.1097/nen.0b013e318093f40d
2007
Cited 4 times
A Familial Form of Pallidoluysionigral Degeneration and Amyotrophic Lateral Sclerosis With Divergent Clinical Presentations
We describe a family with a rapidly progressive neurodegenerative disorder characterized by amyotrophic lateral sclerosis (ALS) and frontotemporal dementia (FTD) but with unusual neuropathologic features that include pallidoluysionigral degeneration. The proband presented with primary progressive aphasia that evolved into mutism. He subsequently developed dementia with mild disinhibition and parkinsonism and late in the disease showed evidence of motor neuron disease. Two other cases (the proband's mother and maternal uncle) had features of ALS exclusively. All 3 had a young onset (fourth decade) and rapid clinical course, with average time from onset of symptoms to death of 4 years. Postmortem neuropathologic examination of the proband and his uncle showed ALS changes and extensive pallidoluysionigral degeneration without neurofibrillary tangles, ubiquitin inclusions, or detectable abnormalities in the dentate nucleus of the cerebellum. Although this exceptional combination of neuropathologic features has been described in rare cases of sporadic ALS-FTD, no pedigrees have ever been reported. In 2 affected members of this family, we failed to identify mutations in genes associated with weakness, movement disorders, or dementia, including ALS, FTD, selected spinocerebellar ataxias, and Huntington disease. Thus, this disorder may represent a novel autosomal dominantly inherited and rapidly progressive neurodegenerative disorder with a spectrum of clinical presentations but common neuropathologic features.
DOI: 10.1038/sj.mp.4000941
2002
Cited 5 times
Identification of genomic organisation, sequence variants and analysis of the role of the human dishevelled 1 gene in late onset Alzheimer's disease
Alzheimer's disease (AD) is a disorder characterised by a progressive deterioration in memory and other cognitive functions. Neurofibrillary tangles (NFT) are a major pathological hallmark of AD; these are aggregations of paired helical filaments (PHF) composed of the hyperphosphorylated microtubule associated protein tau. Several kinases, such as glycogen synthase kinase 3 beta (GSK3beta) and c-Jun N-terminal kinase (JNK), phosphorylate tau at sites that are phosphorylated in PHF. Dishevelled 1 (DVL1) is thought to act as a positive regulator of the wnt signalling pathway, and inhibits GSK3beta activity, preventing beta-catenin degradation and thus allowing wnt target gene expression. JNK activation is also regulated by DVL1; however, it is unclear whether this is via the wnt signalling pathway. These observations suggest a central role for DVL1 in tau phosphorylation and AD and led us to investigate DVL1 as a candidate gene for this disorder. We determined the genomic structure of the DVL1 gene by sequencing and data mining and searched for sequence variations in the coding sequences and flanking introns. The DVL1 gene spans a region of approximately 13.8 kb (not including the 5' untranslated region) and is encoded by 15 exons. Analysis of over 4.3 kb of sequence, including 98% of exonic sequences and introns 2, 3, 6, 7, 9, 10, 11 and 12, revealed six rare (≤6%) sequence variations. None of these had any association with late onset AD. This suggests that polymorphic variations in the coding sequences of DVL1 are not important in AD. However, further analysis of regulatory regions may lead to the identification of other sequence variations that may be implicated in AD.
DOI: 10.7554/elife.19090.045
2016
Author response: Structure of the germline genome of Tetrahymena thermophila and relationship to the massively rearranged somatic genome
Abstract
The germline genome of the binucleated ciliate Tetrahymena thermophila undergoes programmed chromosome breakage and massive DNA elimination to generate the somatic genome. Here, we present a complete sequence assembly of the germline genome and analyze multiple features of its structure and its relationship to the somatic genome, shedding light on the mechanisms of genome rearrangement as well as the evolutionary history of this remarkable germline/soma differentiation. Our results strengthen the notion that a complex, dynamic, and ongoing interplay between mobile DNA elements and the host genome has shaped Tetrahymena chromosome structure, locally and globally. Non-standard outcomes of rearrangement events, including the generation of short-lived somatic chromosomes and excision of DNA interrupting protein-coding regions, may represent novel forms of developmental gene regulation. We also compare Tetrahymena’s germline/soma differentiation to that of other characterized ciliates, illustrating the wide diversity of adaptations that have occurred within this phylum. https://doi.org/10.7554/eLife.19090.001
Introduction
The establishment of distinct genomic lineages (cellular or nuclear) in the life cycles of phylogenetically diverse organisms has allowed the evolution of a wide variety of programmed, somatic lineage-specific DNA rearrangement mechanisms. Some cases mediate the generation of protein products specific to a differentiated cell type, such as sigmaK of the Bacillus subtilis mother cell (Kunkel et al., 1990) or the vast diversity of vertebrate immunoglobulins (Schatz, 2004).
Other cases result in genome-wide chromosome restructuring, as was first recognized by microscopic observation of parasitic nematodes over 125 years ago (Boveri, 1887) and since documented in several eukaryotic branches, including vertebrates (Bachmann-Waldmann et al., 2004; Smith et al., 2012; Sun et al., 2014; Wang and Davis, 2014). This large-scale phenomenon has been most thoroughly studied in the phylum Ciliophora, or ciliates, a deep-branching and diverse group of protozoa (Bracht et al., 2013; Chalker and Yao, 2011; Coyne et al., 2012; Vogt et al., 2013; Yao et al., 2014). Although unicellular, ciliates carry two distinct nuclei that display a remarkable form of germline/soma differentiation (Figure 1A; Orias et al., 2011); the smaller, diploid, transcriptionally silent germline nucleus (micronucleus or MIC) contains the genetic material transmitted across sexual generations, whereas the larger, polyploid, actively expressed somatic nucleus (macronucleus or MAC) supports all the vegetative functions of the cell. Despite differing in several fundamental features of eukaryotic nuclei, the MAC is derived from a mitotic sibling of the MIC during sexual reproduction in a process that involves extensive, genome-wide programmed DNA rearrangements.
Figure 1. Nuclear dualism and genome rearrangement in Tetrahymena. (A) Schematic of two stages of Tetrahymena life cycle showing major characteristics of micronuclei (MIC; red) and macronuclei (MAC; blue) and nuclear events of conjugation. (B) Main events of programmed genome rearrangement. A portion of the MIC genome is shown in red, with internal eliminated sequences (IES) shown as open boxes and the Cbs sequence in black. The corresponding MAC regions (blue) lack the IESs, with the flanking MAC-destined sequences (MDSs) joined (represented by ^ symbols). Breakage and addition of telomeres (orange boxes) has occurred at the former site of the Cbs.
https://doi.org/10.7554/eLife.19090.002
The extent and nature of ciliate genome rearrangement vary widely within the phylum, but the two main events are chromosome fragmentation and DNA elimination (Figure 1B). In the widely studied model organism, Tetrahymena thermophila, the five MIC chromosomes are fragmented at sites of the 15 bp Chromosome breakage sequence (Cbs) (Yao et al., 1990) into about 200 MAC chromosomes (Eisen et al., 2006). Other characterized ciliates also undergo extensive chromosome fragmentation but do not display a conserved cis-acting breakage signal. It has been suggested that the evolutionary advantage of chromosome fragmentation may relate to the high ploidy of MACs (~45N for all but one chromosome in Tetrahymena, ~800N in Paramecium, ~2000N in Oxytricha) and their amitotic division mechanism, which could damage larger chromosomes or be physically constrained by their entanglement (Coyne et al., 1996). This amitotic mechanism also results in unequal chromosome segregation, which can lead to the generation of phenotypic diversity among the vegetative descendants of a single cell ('phenotypic assortment', documented in Tetrahymena [Orias and Flacks, 1975]). In addition, fragmentation permits differential copy number control (observed in Tetrahymena [reviewed in Yao et al., 1979], Oxytricha and other ciliates [Baird and Klobutcher, 1991; Steinbruck, 1983; Swart et al., 2013]). Concomitantly with fragmentation, thousands of Internal Eliminated Sequences (IESs; first described in Tetrahymena [Yao et al., 1984]) are spliced from the Tetrahymena MIC genome. In Paramecium tetraurelia, a fellow oligohymenophorean ciliate distantly related to Tetrahymena (Baroin-Tourancheau et al., 1992), partial assembly of the MIC genome has revealed the presence of about 45,000 short, unique copy IESs, many lying within the MIC progenitors of MAC genes (Arnaiz et al., 2012).
The more distantly related spirotrichous ciliate, Oxytricha trifallax undergoes an extreme type of genome rearrangement. Roughly 16,000 MAC chromosomes (most carrying only a single gene) (Swart et al., 2013) are derived from a MIC genome ten times the size of the MAC genome, in a process that also involves extensive 'unscrambling' of non-contiguous MIC genome sequences (Chen et al., 2014). A leitmotif of programmed genome rearrangements in many organisms is the involvement of mobile DNA elements. In some cases, this involvement is as an agent of the event, through domesticated gene products (e.g. Rag recombinases [Fugmann, 2010; Jones and Gellert, 2004; Kapitonov and Koonin, 2015], HO endonuclease [Koufopanou and Burt, 2005]); in other cases, mobile elements are a target of programmed rearrangement events (e.g. the B. subtilis Skin element that interrupts the sigK gene [Takemaru et al., 1995]). It has long been recognized that many ciliate IESs contain transposable elements (TEs) and/or their remnants and hypothesized that their elimination is a form of MAC genome self-defense (Klobutcher and Herrick, 1997). In both Tetrahymena and Paramecium, IES elimination requires the action of proteins domesticated from piggyBac transposases (Baudry et al., 2009; Cheng et al., 2010; Shieh and Chalker, 2013), as well as proteins and histone modifications associated with epigenetic TE silencing in other organisms (Chalker et al., 2013). In Oxytricha, germline-limited transposons mediate their own excision and also contribute to other programmed rearrangement events (Nowacki et al., 2009). The evolutionary origins of chromosome fragmentation are less clear, but, at least in Tetrahymena, features of Cbs suggest a possible link to mobile elements ([Ashlock et al., 2016; Fan and Yao, 2000; Hamilton et al., 2006b] and this study). 
Thus, the study of programmed DNA rearrangement in ciliates may help shed light on the delicate evolutionary balance that exists between mobile elements and the genomes they occupy. Despite germline sequencing efforts in three model ciliates, Tetrahymena (Fass et al., 2011), Paramecium (Arnaiz et al., 2012), and Oxytricha (Chen et al., 2014), there is no complete picture of the architectural relationship between ciliate germline and somatic genomes. Here, we report the sequencing, assembly, and analysis of the 157 Mb MIC genome of T. thermophila strain SB210, the same strain whose 103 Mb MAC genome sequence we have previously characterized (Coyne et al., 2008; Eisen et al., 2006; Hamilton et al., 2006a). We constructed full-length super-assemblies of all five MIC chromosomes, providing a unique resource for ciliate genome analysis. By mapping a set of germline deletions against these super-assemblies, we delimited the locations of the five MIC centromeres. We mapped 225 instances of the Cbs, which define the ends of all 181 stably maintained MAC chromosomes as well as several short-lived, ‘Non-Maintained Chromosomes’ (NMCs), some of which contain a number of active genes. Additionally, we report multiple cases of short and long-range Cbs duplications in T. thermophila and the conservation of Cbs sequence and location in three other Tetrahymena species. We showed that approximately one third (54 Mb) of the MIC genome is eliminated in the form of around 12,000 IESs, and mapped the precise locations of over 7500, revealing their enrichment at the centers and ends of MIC chromosomes. Our comparative analysis of MIC-limited TEs shows that the majority are related to DNA (Class 2) transposons from a variety of families and suggests multiple invasions of the genome and potentially recent transpositional activity. We analyzed IES junctions and excision variability genome-wide, greatly extending previous reports of their imprecision (e.g. 
[Austerberry et al., 1989; Li and Pearlman, 1996; Wells et al., 1994]), and yet we also report a very limited number of unusual, precisely excised IESs that interrupt protein-coding regions. Our results provide the first genome-wide picture of programmed DNA rearrangements in T. thermophila, and support a view of the germline genome as a complex and dynamic entity, on both developmental and evolutionary timescales.
Results and discussion
Germline chromosome structure
MIC genome sequencing and chromosome-length assembly
Shotgun sequencing and assembly of the T. thermophila MIC genome are described in 'Materials and methods', and statistics are summarized in Supplementary file 1A. The final assembly is 157 Mb in length and composed of 1464 scaffolds, whereas the MAC genome assembly is 103 Mb and contains 1158 scaffolds. To fully understand the inter-relationship of the MAC and MIC genomes, it is essential to join the scaffolds of each separate assembly into complete MAC and MIC chromosomes. Extensive genome closure and HAPPY mapping efforts have produced super-assemblies of every MAC chromosome ([Coyne et al., 2008; Hamilton et al., 2006a]; Supplementary file 1B) but considerable uncertainty remains as to scaffold placement and/or orientation on several chromosomes. Likewise, although genetic mapping can assign some MAC chromosomes/scaffolds to locations on one of the five MIC chromosomes, their order and orientation can be hard to determine. By a MIC-MAC cross-alignment ‘tiling’ method (described in Materials and methods and Figure 1—figure supplement 1), we used each assembly to improve the other. By this process, most of the larger MIC scaffolds were linked into five chromosome-length super-assemblies that together incorporate 152 Mb of the total 157 Mb MIC assembly (Supplementary file 1C,D; also see ‘MIC-scaff’ and corresponding ‘MAC-scaff’ schematic concatenations in Figure 2).
While the super-assemblies are admittedly not perfect, their uncertainties are on a small scale, and thus the maps allow observations of general trends in MIC chromosome architecture. To our knowledge, these are the first assemblies of nearly full-length ciliate MIC chromosomes and thus represent novel resources for genomic analyses. We have incorporated them into a browser (http://www.jcvi.org/jbrowse/?data=tta2mic) that relates the MIC and MAC genomes and includes many other features described below.
Figure 2. MIC chromosome landscapes. For each chromosome, the top panel shows the density of several genomic features, measured as number of base pairs (span) per 500 kb sliding window (100 kb slide increment). Purple = simple sequence repetitive DNA (note that exclusion of those simple sequence repeats that overlap with TEs has minimal effect on the distribution pattern). Blue = putative TEs. Green = high-confidence IESs. Orange = protein-coding sequences. The corresponding chromosome-length super-assembly (Super-Asm) is shown immediately below, each Cbs indicated by a vertical tick. Red ticks indicate Cbs’s flanking putative centromeres (see main text and Figure 2—figure supplement 1). In the 'MIC-scaff' schematic, the scaffolds comprising each MIC chromosome super-assembly are depicted as horizontal lines (alternating in vertical position to delineate each from its neighbors). The ‘MAC-scaff’ schematic indicates the positions of MAC scaffolds (many of which are complete, fully sequenced MAC chromosomes) derived from the corresponding regions of the MIC chromosome. Note that, because IESs are absent from MAC scaffolds, their lengths are actually shorter, but for simplicity of viewing, these lengths have been stretched so that MAC-scaff endpoints line up with their corresponding positions in the MIC. Chromosomes are stacked so that their centers align vertically.
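The density measure in the Figure 2 caption (base pairs of a feature per 500 kb window, sliding in 100 kb increments) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function name `window_density`, the half-open coordinate convention, the assumption that feature intervals do not overlap one another, and the choice to truncate windows at the chromosome end are all assumptions made here.

```python
def window_density(intervals, chrom_length, window=500_000, step=100_000):
    """Return a list of (window_start, covered_bp) pairs.

    intervals: half-open (start, end) feature coordinates in bp,
               assumed here to be non-overlapping.
    Windows that run past the chromosome end are truncated (an assumption).
    """
    densities = []
    start = 0
    while start < chrom_length:
        end = min(start + window, chrom_length)
        # Sum the part of each feature that falls inside [start, end).
        covered = sum(max(0, min(e, end) - max(s, start)) for s, e in intervals)
        densities.append((start, covered))
        start += step
    return densities


# Toy example with scaled-down coordinates (hypothetical data).
toy = window_density([(0, 100), (450, 650)], chrom_length=1000,
                     window=500, step=100)
for window_start, covered_bp in toy:
    print(window_start, covered_bp)
```

Because each window overlaps its neighbours by 80%, a single large feature contributes to five consecutive windows, which smooths the plotted tracks.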
https://doi.org/10.7554/eLife.19090.004
MIC centromeres
Centromeric loci play essential, highly conserved roles in the faithful segregation of chromosomes during meiosis and mitosis (Bloom, 2014). Recent studies (Plohl et al., 2014; Topp and Dawe, 2006) have greatly increased understanding of centromere structure and function, but much still remains unclear. Several biological features of Tetrahymena, as well as its powerful experimental toolbox, have made this organism a useful model for studies of centromeric heterochromatin (Cervantes et al., 2006; Cui and Gorovsky, 2006; Papazyan et al., 2014), recombination (Lukaszewicz et al., 2013; Shodhan et al., 2014), chromosome cohesion (Howard-Till et al., 2013), and centromere evolution (Elde et al., 2011; Malik and Henikoff, 2002, 2009), all of which would benefit from better genetic and molecular definition of its centromeres. The full-length chromosomal super-assemblies described above make this possible. We demarcated Tetrahymena centromeric regions using germline, mitotically stable, chromosomal deletions isolated in a separate study (Cassidy-Hanley et al.; manuscript in preparation). Each deletion was mapped in relation to chromosome breakage sites along the length of each MIC chromosome (Figure 2—figure supplement 1; Cassidy-Hanley et al., 1994). We observed that chromosome arm deletions never extend into the central regions of MIC chromosome super-assemblies, presumably because they are essential for centromere function. Operationally (because of how the deletions were mapped), two unique Cbs’s flank each putative centromere region (see red hash marks in Figure 2). Cytologically, all five Tetrahymena MIC chromosomes appear metacentric and, as expected, the midpoints of the chromosomal super-assembly lie near the centromeric region midpoints (Table 1). We also note, as described in Supplementary file 1E, that MAC chromosomes derived from MIC centromeric regions tend to be unusually large.
The five putative centromeric regions range between 5.0 and 10.3 Mb and together comprise 37.8 Mb, or 24.7% of the assembled MIC genome. These estimates are subject to change in either direction for the following reasons. The centromere regions of the MIC assembly are highly fragmented (Table 1, column 5; Figure 2); missing sequence would increase their size. On the other hand, the precise endpoints of the deletions are unknown, and the complete region between flanking Cbs’s may not be required for centromere function.
Table 1. MIC centromere regions and centric MAC chromosomes. https://doi.org/10.7554/eLife.19090.006
MIC chromosome | L-Cbs location (Mb) | R-Cbs location (Mb) | Cen length (Mb) | # super-contigs in Cen | MIC chromosome length (Mb) | Cen midpoint (Mb) | Chromosome midpoint (Mb)
1 | 13.98 | 23.24 | 9.26 | 87 | 36.32 | 18.61 | 18.16
2 | 9.81 | 14.85 | 5.04 | 77 | 25.51 | 12.33 | 12.76
3 | 9.98 | 20.32 | 10.34 | 120 | 31.52 | 15.15 | 15.76
4 | 12.23 | 18.34 | 6.11 | 74 | 31.72 | 15.29 | 15.86
5 | 10.37 | 17.39 | 7.02 | 62 | 27.47 | 13.88 | 13.74
Total | - | - | 37.77 (24.7%) | - | 152.54 | - | -
L-Cbs and R-Cbs represent the most Cen-proximal Cbs on the left and right chromosome arms, respectively. Centromere locations were established by deletion mapping (see text for details). For chromosomes 2, 4, and 5, the L-1 and R-1 Cbs flank the putative centromere region. The remaining centromeres contain Cbs’s. Cbs 3L-3 and 3R-1 flank the chromosome 3 centromere, while Cbs 1L-6 and Cbs 1R-11 flank the centromere region of chromosome 1. Locations in Mb use the far (telomere) end of the left arm as the origin.
Centromeric and pericentromeric regions generally contain repetitive sequences, often consisting of large arrays of tandem repeats interspersed with transposable elements (TEs) (Buscaino et al., 2010; Hayden and Willard, 2012; López-Flores and Garrido-Ramos, 2012; Plohl et al., 2014).
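As a worked check, the derived columns of Table 1 (centromere length, centromere midpoint, chromosome midpoint) can be recomputed from the L-Cbs and R-Cbs coordinates and the chromosome lengths. The sketch below is an illustration only: the variable names are hypothetical, and it assumes half-up rounding to 0.01 Mb applied to the already rounded table entries, whereas the published values were presumably derived from unrounded coordinates.

```python
from decimal import Decimal, ROUND_HALF_UP

# (chromosome, L-Cbs, R-Cbs, MIC chromosome length), all in Mb, from Table 1.
TABLE1 = [
    (1, "13.98", "23.24", "36.32"),
    (2, "9.81", "14.85", "25.51"),
    (3, "9.98", "20.32", "31.52"),
    (4, "12.23", "18.34", "31.72"),
    (5, "10.37", "17.39", "27.47"),
]
CENT = Decimal("0.01")  # table precision: 0.01 Mb


def derive(rows):
    """Recompute the derived Table 1 columns with exact decimal arithmetic."""
    derived = []
    for chrom, left, right, length in rows:
        left, right, length = Decimal(left), Decimal(right), Decimal(length)
        derived.append({
            "chromosome": chrom,
            "cen_length": right - left,
            "cen_midpoint": ((left + right) / 2).quantize(CENT, rounding=ROUND_HALF_UP),
            "chrom_midpoint": (length / 2).quantize(CENT, rounding=ROUND_HALF_UP),
            "chrom_length": length,
        })
    return derived


derived = derive(TABLE1)
total_cen = sum(d["cen_length"] for d in derived)
total_len = sum(d["chrom_length"] for d in derived)

for d in derived:
    # Offset between centromere midpoint and chromosome midpoint, in Mb.
    offset = d["cen_midpoint"] - d["chrom_midpoint"]
    print(d["chromosome"], d["cen_length"], d["cen_midpoint"],
          d["chrom_midpoint"], offset)
print("total Cen (Mb):", total_cen, "of", total_len)
```

On these rounded inputs, every centromere midpoint falls within about 0.6 Mb of the corresponding chromosome midpoint, which is the sense in which the chromosomes appear metacentric. The centromeric fraction recomputed from rounded entries may differ in the last digit from the published 24.7%.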
We plotted the densities along each MIC chromosome of both simple sequence repeats (Figure 2, purple lines) and putative TEs and their remnants (blue lines; see below for a description of TE characterization) and found that both types of repetitive sequence are more prevalent in the putative centromeric regions than in the chromosome arms. These observations of large, repeat-rich centromeric regions are consistent with the 'meiotic drive' hypothesis (Elde et al., 2011; Malik and Henikoff, 2002, 2009)—that in organisms, such as Tetrahymena, that undergo exclusively female meiosis (in which only one of the four meiotic products becomes a gamete), competition between sister chromosomes for transmission during meiosis will result in rapid evolution and expansion of centromeric sequences. During formation of a new MAC in Tetrahymena, the centromeric histone H3 disappears from differentiating MACs, suggesting the programmed elimination of Cen-specific sequences (Cervantes et al., 2006; Cui and Gorovsky, 2006). The close, linear packing of MAC chromosome precursors along the entire length of MIC chromosomes and the presence of retained, macronuclear-destined sequences (MDSs) interspersed throughout the Tetrahymena centromere regions suggests that IES removal is sufficient to account for this centromere loss. In Paramecium, IESs found in MIC regions that give rise to MAC chromosomes are generally very short and non-repetitive (Arnaiz et al., 2012), thus not resembling typical centromeric DNA. However, these regions are separated by large (and as yet unassembled) blocks of repetitive DNA (Arnaiz et al., 2012; Le Mouël et al., 2003), which seem more likely to represent centromeres. 
Centromeric histone H3 also disappears during MAC differentiation in Paramecium, and this disappearance is dependent on factors required for IES excision (Lhuillier-Akakpo et al., 2016), suggesting that the centromeres of both organisms, despite their apparent dissimilarities, are eliminated as IESs.
Chromosome fragmentation
In contrast to most eukaryotes, programmed somatic chromosome breakage and de novo telomere addition are part of the normal life cycles of several groups, including ciliates (Coyne et al., 1996) and certain parasitic nematodes (Müller and Tobler, 2000). Among these organisms, many details of the process differ markedly (Amar, 1994; Baird and Klobutcher, 1989; Caron, 1992; Duret et al., 2008; Forney and Blackburn, 1988; Herrick et al., 1987; Le Mouël et al., 2003; Scott et al., 1993). Tetrahymena carries out chromosome breakage and telomere addition with high specificity and reliability. In T. thermophila and related species (Coyne and Yao, 1996), these processes are driven by the necessary and sufficient cis-acting DNA element, Cbs (Chromosome breakage sequence), a highly conserved 15-mer (Fan and Yao, 2000; Hamilton et al., 2006b; Yao et al., 1990). De novo telomere addition by telomerase occurs within a region ~5–25 bp on each side of a Cbs (Fan and Yao, 1996); the Cbs itself and its immediate flanking sequences are found only in the MIC. Thanks to our chromosome super-assemblies, we can now investigate chromosome breakage throughout the entire T. thermophila genome.
The chromosome breakage sequence (Cbs) family
We identified 225 Cbs’s in the MIC genome assembly (Supplementary file 2A), including those associated with the ends of every MAC chromosome (Supplementary file 2B); thus, the Cbs family is responsible for all developmentally programmed chromosome breakage in T. thermophila. Positioning this complete set of breakage signals on the MIC chromosome super-assemblies makes T.
thermophila the first ciliate in which the complete linear relationship between MIC and MAC chromosomes has been defined (see ‘Super-Asm’ schematic in Figure 2). As expected, the majority of MAC chromosomes are generated by cleavage at Cbs's that are consecutively spaced along MIC chromosomes. However, we identified seven complex MAC chromosomes that are generated not simply by conventional fragmentation, but also by the site-specific joining of non-contiguous segments of germline DNA. The non-contiguity has been experimentally confirmed for three cases, eliminating the possibility that they are genome assembly artifacts. The formation of these complex chromosomes is currently under investigation and will be reported in detail separately. The rearrangement events have been accounted for in the MIC/MAC comparative genome browser described above (http://www.jcvi.org/jbrowse/?data=tta2mic). Nearly half the 225 Cbs’s have the consensus C-rich strand sequence: 5'-TAAACCAACCTCTTT-3', and none has more than two substitutions to this sequence (Table 2). Confirming earlier studies (Hamilton et al., 2006b), 10 of the 15 nucleotide positions are completely conserved, while five show limited degeneracy, summarized as follows: 5’-WAAACCAACCYCNHW-3’ (W = A or T; Y = C or T; H = A, C or T; N = any nucleotide; Figure 3). Cbs’s identified in several related tetrahymenine species (Coyne and Yao, 1996, and below) fall within the same range of variability. All the positions occupied by T's in the consensus (found mostly toward the 3’ end), and only these positions, exhibit some degeneracy. Only at positions 13 and 14 have we observed more than one type of substitution (13T→A, C, or G, 14T→A or C).
Figure 3. Conservation of the 15 bp chromosome breakage sequence. Nucleotide conservation was calculated at every position, as described in Hamilton et al. (2006a), for the 225 Cbs’s and their 15 bp flanking sequences, aligned on the C-rich Cbs strand.
The Cbs element occupies positions 16 to 30. At any given position in the logo plot, two bits represent maximum conservation (only one nucleotide occupies that position), and 0 bits correspond to no conservation (all four nucleotides are equally frequent). https://doi.org/10.7554/eLife.19090.007
Table 2. Variation within the Cbs family. Pink and gray shading: single- and double-substituted variants, respectively. https://doi.org/10.7554/eLife.19090.008
Cbs designation | Count | Pos. 1 | Pos. 11 | Pos. 13 | Pos. 14 | Pos. 15 | Number of substitutions | Total substitutions per subset
canonical | 109 | - | - | - | - | - | 0 | 0: 109
1A | 53 | A | - | - | - | - | 1 | -
11C | 8 | - | C | - | - | - | 1 | -
13A | 7 | - | - | A | - | - | 1 | -
13C | 2 | - | - | C | - | - | 1 | -
14A | 9 | - | - | - | A | - | 1 | -
14C | 4 | - | - | - | C | - | 1 | -
15A | 10 | - | - | - | - | A | 1 | 1: 93
1A,11C | 5 | A | C | - | - | - | 2 | -
1A,13A | 2 | A | - | A | - | - | 2 | -
1A,13C | 1 | A | - | C | - | - | 2 | -
1A,14C | 2 | A | - | - | C | - | 2 | -
1A,15A | 8 | A | - | - | - | A | 2 | -
11C,13A | 1 | - | C | A | - | - | 2 | -
11C,13G | 1 | - | C | G | - | - | 2 | -
11C,14C | 1 | - | C | - | C | - | 2 | -
11C,15A | 1 | - | C | - | - | A | 2 | -
14A,15A | 1 | - | - | - | A | A | 2 | 2: 23
Total | 225 | 71 | 17 | 14 | 17 | 20 | - | -
The limited Cbs degeneracy may reflect the specificity of the yet to be identified trans-acting factor(s) that physically interact with the Cbs. Pot2p is the first factor shown to associate specifically with Cbs regions in vivo, at the time of chromosome breakage (Cranert et al., 2014). Pot2p is a paralog of Pot1p, which is required for telomere maintenance. Pot2p may recruit factor(s) required for chromosome breakage and/or de novo telomere addition. As previously noted for the consensus sequence (Yao et al., 1987), every functional Cbs contains a permuted copy (C2A2C2) of the T. thermophila telomeric repeat C4A2. More generally, the Cbs consensus shares with Tetrahymena telomeric repeats a striking C vs. G strand asymmetry; of the 117 non-consensus functional Cbs sequences, only one contains a substitution on the C-rich strand to a G (at position 13) whereas 27 contain a substitution to C (Table 2). The likelihood of this ratio being due to chance alone is low (probability of chi square << 0.01). Whether these sequence parallels between Cbs and telomeres are coincidental or related to Cbs function may be established when the mechanisms of chromosome breakage and telomere addition are better understood.
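The relationship between the consensus, the degenerate summary, and the Table 2 counts can be checked with a short script. This is a sketch for illustration, not the authors' method: the helper names (`parse_designation`, `apply_substitutions`, `summarize`) are hypothetical, and the variant sequences are rebuilt here from the Table 2 designations rather than read from the genome assembly.

```python
import re

CONSENSUS = "TAAACCAACCTCTTT"   # C-rich strand consensus quoted in the text
DEGENERATE = "WAAACCAACCYCNHW"  # degenerate summary quoted in the text

# IUPAC codes used in the degenerate summary, as regex character classes.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "W": "[AT]", "Y": "[CT]", "H": "[ACT]", "N": "[ACGT]"}
CBS_RE = re.compile("".join(IUPAC[code] for code in DEGENERATE))

# (designation, count) rows transcribed from Table 2.
TABLE2 = [
    ("canonical", 109),
    ("1A", 53), ("11C", 8), ("13A", 7), ("13C", 2),
    ("14A", 9), ("14C", 4), ("15A", 10),
    ("1A,11C", 5), ("1A,13A", 2), ("1A,13C", 1), ("1A,14C", 2),
    ("1A,15A", 8), ("11C,13A", 1), ("11C,13G", 1), ("11C,14C", 1),
    ("11C,15A", 1), ("14A,15A", 1),
]


def parse_designation(name):
    """'1A,13C' -> [(1, 'A'), (13, 'C')]; 'canonical' -> []."""
    if name == "canonical":
        return []
    return [(int(token[:-1]), token[-1]) for token in name.split(",")]


def apply_substitutions(substitutions):
    """Build the variant 15-mer from 1-based (position, base) substitutions."""
    bases = list(CONSENSUS)
    for position, base in substitutions:
        bases[position - 1] = base
    return "".join(bases)


def summarize(table):
    """Tally Cbs copies, substitutions per position, and substitutions per base."""
    total, per_position, per_base = 0, {}, {}
    for name, count in table:
        substitutions = parse_designation(name)
        variant = apply_substitutions(substitutions)
        # Every observed variant should fit the degenerate summary.
        if not CBS_RE.fullmatch(variant):
            raise ValueError(f"{name} ({variant}) does not match {DEGENERATE}")
        total += count
        for position, base in substitutions:
            per_position[position] = per_position.get(position, 0) + count
            per_base[base] = per_base.get(base, 0) + count
    return total, per_position, per_base


total, per_position, per_base = summarize(TABLE2)
print(total, per_position, per_base)
```

The per-position tallies reproduce the Total row of Table 2, and the per-base tallies give the C versus G substitution counts behind the strand asymmetry noted in the text.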
Many innovations in the realm of programmed genome rearrangement have resulted from the domestication of genes originally associated with mobile DNA elements; examples are found in multicellular organisms (Kapitonov and Koonin, 2015) and microbial eukaryotes (Barsoum et al., 2010; Koufopanou and Burt, 2005; Levin and Moran, 2011; Sinzelle et al., 2009), including ciliates (Baudry et al., 2009; Cheng et al., 2010; Vogt et al., 2013). The Cbs resembles the target site of a homing endonuclease, with its relatively long, non-palindromic sequence and limited degeneracy (Fan and Yao, 2000; Hamilton et al., 2006b); another superficial resemblance is to transposase binding sites found at transposon termini. It seems likely that Cbs and the yet unknown protein(s) that recognize it and initiate breakage had their origins in a mobile DNA element that invaded the germline genome and was subsequently domesticated.
Conservation of chromosome breakage sites across Tetrahymena species
Cbs-mediated chromosome breakage has only been found in tetrahymenine ciliates. Earlier studies of this group (Coyne and Yao, 1996) showed strong evolutionary conservation of the Cbs sequence, but only one or two Cbs's per species were sequenced. To examine the evolutionary conservation of Cbs sequences and their locations within the germline genome, we conducted a pilot study of 12 consecutive breakage site locations in T. thermophila and three other Tetrahymena species, using the strategy described in Materials and methods (a more comprehensive study will be published separately). Strikingly, MAC chromosome ends were highly conserved in all four species, indicating strong conservation of breakage sites. Indeed, with just one exception in T. borealis, the location of every chromosome breakage site in the four species has remained identical since their divergence, down to the MIC genome interval between the same two consecutive homologous genes (Supplementary file 2C).
The only detected differences are the deletion of DNA sequences surrounding T. borealis Cbs 3L-25 and a novel breakage site in T. malaccensis, between Cbs 3L-24 and 3L-25 (numbered according to T. thermophila). MAC chromosome lengths in this region are also strongly conserved among all four species (Figure 4A, Supplementary file 2D).
Figure 4. Conservation of chromosome breakage sites and Cbs in four Tetrahymena species. (A) Conservation of MAC chromosome lengths: X-axis: Cbs 3L-15 to 26 (evenly spaced). Y-axis: Length of the MAC scaffolds in each species whose ends are defined by the flanking Cbs’s. Circle: an extra Cbs site in T. malaccensis creates two MAC chromosomes in this region; length = sum of the two MAC chromosome lengths. (B) Summary of Cbs sequence data at nine chromosome breakage sites; filled in box = sequence available; if no text = single, consensus Cbs in same orientation as T. thermophila; Cbs sequence variants, duplications (DUP) and inversion (INV) indicated; final column = possible last common ancestor (LCA) Cbs, requiring a minimum number of mutations in the clade. (C) Inferred possible descent from Cbs of LCA at each of the nine chromosome breakage sites. Branch tips: Cbs consensus (Cns) or variant in T.the, T.mal., T.ell., and T.bor, in that order (colors consistent with parts A and B; missing branch = unsequenced Cbs). Terminally split branch = local Cbs duplication. Dots indicate minimal number of mutational events; placed in the longest branches when there is a choice. Reverse arrow (T. bor. 3L-22) indicates Cbs inversion. https://doi.org/10.7554/eLife.19090.009
We sequenced the MIC Cbs regions for 22 of the 27 novel species/breakage-site combinations (see Figure 4B). No previously unidentified Cbs variants were observed in the 26 sequenced Cbs's (which include four locally duplicated Cbs's, see below).
Importantly, there was consistency in the specific Cbs isoform found at a given breakage site in all four species, as expected if they represent a clade descended from a common ancestral Cbs at that site (see Figure 4C). This conclusion is further supported by the observation that Cbs’s at a given homologous breakage site display the same orientation with respect to MAC-retained flanking regions, with the single exception of T. borealis Cbs 3L-22 (Figure 4B and C). In contrast to the conservation of the Cbs itself, there is little or no conservation of the 200 bp of adjacent sequence (not shown). Assuming the most parsimonious number of mutations to explain the Cbs variants observed at these nine homologous breakage sites, the rate of fixation of functional Cbs mutations is low; 11 mutations ca