Kerrie Barry

Here are all the papers by Kerrie Barry that you can download and read on OA.mg.

DOI: 10.1038/nature08670
2010
Cited 3,715 times
Genome sequence of the palaeopolyploid soybean
Soybean (Glycine max) is one of the most important crop plants for seed protein and oil content, and for its capacity to fix atmospheric nitrogen through symbioses with soil-borne microorganisms. We sequenced the 1.1-gigabase genome by a whole-genome shotgun approach and integrated it with physical and high-density genetic maps to create a chromosome-scale draft sequence assembly. We predict 46,430 protein-coding genes, 70% more than Arabidopsis and similar to the poplar genome which, like soybean, is an ancient polyploid (palaeopolyploid). About 78% of the predicted genes occur in chromosome ends, which comprise less than one-half of the genome but account for nearly all of the genetic recombination. Genome duplications occurred at approximately 59 and 13 million years ago, resulting in a highly duplicated genome with nearly 75% of the genes present in multiple copies. The two duplication events were followed by gene diversification and loss, and numerous chromosome rearrangements. An accurate soybean genome sequence will facilitate the identification of the genetic basis of many soybean traits, and accelerate the creation of improved soybean varieties.
DOI: 10.1126/science.1221748
2012
Cited 1,424 times
The Paleozoic Origin of Enzymatic Lignin Decomposition Reconstructed from 31 Fungal Genomes
Dating Wood Rot. Specific lineages within the basidiomycete fungi, white rot species, have evolved the ability to break up a major structural component of woody plants, lignin, relative to their non–lignin-decaying brown rot relatives. Through the deep phylogenetic sampling of fungal genomes, Floudas et al. (p. 1715; see the Perspective by Hittinger) mapped the detailed evolution of wood-degrading enzymes. A key peroxidase and other enzymes involved in lignin decay were present in the common ancestor of the Agaricomycetes. These genes then expanded through gene duplications in parallel, giving rise to white rot lineages.
DOI: 10.1038/nature06269
2007
Cited 1,207 times
Metagenomic and functional analysis of hindgut microbiota of a wood-feeding higher termite
Wood-feeding higher termites are a very successful group, important in facilitating carbon turnover in the environment. It is not the termites themselves that perform the key reactions that make their lifestyle possible, but the lignocellulose-degrading symbiotic bacteria found in their hindgut. A metagenomic analysis of gut microbes from over 150 tree-living termites from a Costa Rican rainforest has revealed a diverse range of bacterial cellulase and xylan hydrolase genes, as well as genes important in other symbiotic functions. The data set includes about 1,000 bacterial lignocellulose hydrolase enzymes, some of them expressed in situ, in living termites. This work shows that termites are a rich reservoir of bacterial enzymes that might be used in the conversion of woody material into biofuels. Wood-feeding 'higher' termites rely on their hindgut symbionts for the initial steps in cellulose degradation. Metagenomic analysis of this microbial community reveals a diverse range of bacterial cellulase and hydrolase genes, as well as genes important in other metabolic functions, such as H2 metabolism, CO2-reductive acetogenesis and N2 fixation. From the standpoints of both basic research and biotechnology, there is considerable interest in reaching a clearer understanding of the diversity of biological mechanisms employed during lignocellulose degradation. Globally, termites are an extremely successful group of wood-degrading organisms [1] and are therefore important both for their roles in carbon turnover in the environment and as potential sources of biochemical catalysts for efforts aimed at converting wood into biofuels. Only recently have data supported any direct role for the symbiotic bacteria in the gut of the termite in cellulose and xylan hydrolysis [2].
Here we use a metagenomic analysis of the bacterial community resident in the hindgut paunch of a wood-feeding ‘higher’ Nasutitermes species (which do not contain cellulose-fermenting protozoa) to show the presence of a large, diverse set of bacterial genes for cellulose and xylan hydrolysis. Many of these genes were expressed in vivo or had cellulase activity in vitro, and further analyses implicate spirochete and fibrobacter species in gut lignocellulose degradation. New insights into other important symbiotic functions including H2 metabolism, CO2-reductive acetogenesis and N2 fixation are also provided by this first system-wide gene analysis of a microbial community specialized towards plant lignocellulose degradation. Our results underscore how complex even a 1-μl environment can be.
DOI: 10.1038/ng.3008
2014
Cited 1,045 times
A reference genome for common bean and genome-wide analysis of dual domestications
Common bean (Phaseolus vulgaris L.) is the most important grain legume for human consumption and has a role in sustainable agriculture owing to its ability to fix atmospheric nitrogen. We assembled 473 Mb of the 587-Mb genome and genetically anchored 98% of this sequence in 11 chromosome-scale pseudomolecules. We compared the genome for the common bean against the soybean genome to find changes in soybean resulting from polyploidy. Using resequencing of 60 wild individuals and 100 landraces from the genetically differentiated Mesoamerican and Andean gene pools, we confirmed 2 independent domestications from genetic pools that diverged before human colonization. Less than 10% of the 74 Mb of sequence putatively involved in domestication was shared by the two domestication events. We identified a set of genes linked with increased leaf and seed size and combined these results with quantitative trait locus data from Mesoamerican cultivars. Genes affected by domestication may be useful for genomics-enabled crop improvement.
DOI: 10.1016/j.cell.2017.09.030
2017
Cited 917 times
Insights into Land Plant Evolution Garnered from the Marchantia polymorpha Genome
The evolution of land flora transformed the terrestrial environment. Land plants evolved from an ancestral charophycean alga from which they inherited developmental, biochemical, and cell biological attributes. Additional biochemical and physiological adaptations to land, and a life cycle with an alternation between multicellular haploid and diploid generations that facilitated efficient dispersal of desiccation tolerant spores, evolved in the ancestral land plant. We analyzed the genome of the liverwort Marchantia polymorpha, a member of a basal land plant lineage. Relative to charophycean algae, land plant genomes are characterized by genes encoding novel biochemical pathways, new phytohormone signaling pathways (notably auxin), expanded repertoires of signaling pathways, and increased diversity in some transcription factor families. Compared with other sequenced land plants, M. polymorpha exhibits low genetic redundancy in most regulatory pathways, with this portion of its genome resembling that predicted for the ancestral land plant.
DOI: 10.1038/ng.3223
2015
Cited 836 times
Convergent losses of decay mechanisms and rapid turnover of symbiosis genes in mycorrhizal mutualists
Francis Martin and colleagues report genome sequences for 18 species of mycorrhizal fungi and a phylogenomic analysis including 32 other fungal genomes. The study identifies cell wall-degradation genes lost in all true ectomycorrhizal species and, using gene expression data, finds candidate genes for the establishment of symbiosis. To elucidate the genetic bases of mycorrhizal lifestyle evolution, we sequenced new fungal genomes, including 13 ectomycorrhizal (ECM), orchid (ORM) and ericoid (ERM) species, and five saprotrophs, which we analyzed along with other fungal genomes. Ectomycorrhizal fungi have a reduced complement of genes encoding plant cell wall–degrading enzymes (PCWDEs), as compared to their ancestral wood decayers. Nevertheless, they have retained a unique array of PCWDEs, thus suggesting that they possess diverse abilities to decompose lignocellulose. Similar functional categories of nonorthologous genes are induced in symbiosis. Of induced genes, 7–38% are orphan genes, including genes that encode secreted effector-like proteins. Convergent evolution of the mycorrhizal habit in fungi occurred via the repeated evolution of a 'symbiosis toolkit', with reduced numbers of PCWDEs and lineage-specific suites of mycorrhiza-induced genes.
DOI: 10.1038/nbt.2196
2012
Cited 766 times
Reference genome sequence of the model plant Setaria
We generated a high-quality reference genome sequence for foxtail millet (Setaria italica). The ∼400-Mb assembly covers ∼80% of the genome and >95% of the gene space. The assembly was anchored to a 992-locus genetic map and was annotated by comparison with >1.3 million expressed sequence tag reads. We produced more than 580 million RNA-Seq reads to facilitate expression analyses. We also sequenced Setaria viridis, the ancestral wild relative of S. italica, and identified regions of differential single-nucleotide polymorphism density, distribution of transposable elements, small RNA content, chromosomal rearrangement and segregation distortion. The genus Setaria includes natural and cultivated species that demonstrate a wide capacity for adaptation. The genetic basis of this adaptation was investigated by comparing five sequenced grass genomes. We also used the diploid Setaria genome to evaluate the ongoing genome assembly of a related polyploid, switchgrass (Panicum virgatum).
DOI: 10.1038/nature13308
2014
Cited 710 times
The genome of Eucalyptus grandis
Eucalypts are the world's most widely planted hardwood trees. Their outstanding diversity, adaptability and growth have made them a global renewable resource of fibre and energy. We sequenced and assembled >94% of the 640-megabase genome of Eucalyptus grandis. Of 36,376 predicted protein-coding genes, 34% occur in tandem duplications, the largest proportion thus far in plant genomes. Eucalyptus also shows the highest diversity of genes for specialized metabolites such as terpenes that act as chemical defence and provide unique pharmaceutical oils. Genome sequencing of the E. grandis sister species E. globulus and a set of inbred E. grandis tree genomes reveals dynamic genome evolution and hotspots of inbreeding depression. The E. grandis genome is the first reference for the eudicot order Myrtales and is placed here sister to the eurosids. This resource expands our understanding of the unique biology of large woody perennials and provides a powerful tool to accelerate comparative biology, breeding and biotechnology.
DOI: 10.1038/nbt1247
2006
Cited 657 times
Metagenomic analysis of two enhanced biological phosphorus removal (EBPR) sludge communities
Enhanced biological phosphorus removal (EBPR) is one of the best-studied microbially mediated industrial processes because of its ecological and economic relevance. Despite this, it is not well understood at the metabolic level. Here we present a metagenomic analysis of two lab-scale EBPR sludges dominated by the uncultured bacterium, “Candidatus Accumulibacter phosphatis.” The analysis sheds light on several controversies in EBPR metabolic models and provides hypotheses explaining the dominance of A. phosphatis in this habitat, its lifestyle outside EBPR and probable cultivation requirements. Comparison of the same species from different EBPR sludges highlights recent evolutionary dynamics in the A. phosphatis genome that could be linked to mechanisms for environmental adaptation. In spite of an apparent lack of phylogenetic overlap in the flanking communities of the two sludges studied, common functional themes were found, at least one of them complementary to the inferred metabolism of the dominant organism. The present study provides a much needed blueprint for a systems-level understanding of EBPR and illustrates that metagenomics enables detailed, often novel, insights into even well-studied biological systems.
DOI: 10.1126/science.1250092
2014
Cited 601 times
Ancient hybridizations among the ancestral genomes of bread wheat
The allohexaploid bread wheat genome consists of three closely related subgenomes (A, B, and D), but a clear understanding of their phylogenetic history has been lacking. We used genome assemblies of bread wheat and five diploid relatives to analyze genome-wide samples of gene trees, as well as to estimate evolutionary relatedness and divergence times. We show that the A and B genomes diverged from a common ancestor ~7 million years ago and that these genomes gave rise to the D genome through homoploid hybrid speciation 1 to 2 million years later. Our findings imply that the present-day bread wheat genome is a product of multiple rounds of hybrid speciation (homoploid and polyploid) and lay the foundation for a new framework for understanding the wheat genome as a multilevel phylogenetic mosaic.
DOI: 10.1371/journal.ppat.1003037
2012
Cited 519 times
Diverse Lifestyles and Strategies of Plant Pathogenesis Encoded in the Genomes of Eighteen Dothideomycetes Fungi
The class Dothideomycetes is one of the largest groups of fungi with a high level of ecological diversity including many plant pathogens infecting a broad range of hosts. Here, we compare genome features of 18 members of this class, including 6 necrotrophs, 9 (hemi)biotrophs and 3 saprotrophs, to analyze genome structure, evolution, and the diverse strategies of pathogenesis. The Dothideomycetes most likely evolved from a common ancestor more than 280 million years ago. The 18 genome sequences differ dramatically in size due to variation in repetitive content, but show much less variation in number of (core) genes. Gene order appears to have been rearranged mostly within chromosomal boundaries by multiple inversions, in extant genomes frequently demarcated by adjacent simple repeats. Several Dothideomycetes contain one or more gene-poor, transposable element (TE)-rich putatively dispensable chromosomes of unknown function. The 18 Dothideomycetes offer an extensive catalogue of genes involved in cellulose degradation, proteolysis, secondary metabolism, and cysteine-rich small secreted proteins. Ancestors of the two major orders of plant pathogens in the Dothideomycetes, the Capnodiales and Pleosporales, may have had different modes of pathogenesis, with the former having fewer of these genes than the latter. Many of these genes are enriched in proximity to transposable elements, suggesting faster evolution because of the effects of repeat induced point (RIP) mutations. A syntenic block of genes, including oxidoreductases, is conserved in most Dothideomycetes and upregulated during infection in L. maculans, suggesting a possible function in response to oxidative stress.
DOI: 10.1105/tpc.113.119982
2014
Cited 472 times
Insights into the Maize Pan-Genome and Pan-Transcriptome
Genomes at the species level are dynamic, with genes present in every individual (core) and genes in a subset of individuals (dispensable) that collectively constitute the pan-genome. Using transcriptome sequencing of seedling RNA from 503 maize (Zea mays) inbred lines to characterize the maize pan-genome, we identified 8681 representative transcript assemblies (RTAs) with 16.4% expressed in all lines and 82.7% expressed in subsets of the lines. Interestingly, with linkage disequilibrium mapping, 76.7% of the RTAs with at least one single nucleotide polymorphism (SNP) could be mapped to a single genetic position, distributed primarily throughout the nonpericentromeric portion of the genome. Stepwise iterative clustering of RTAs suggests, within the context of the genotypes used in this study, that the maize genome is restricted and further sampling of seedling RNA within this germplasm base will result in minimal discovery. Genome-wide association studies based on SNPs and transcript abundance in the pan-genome revealed loci associated with the timing of the juvenile-to-adult vegetative and vegetative-to-reproductive developmental transitions, two traits important for fitness and adaptation. This study revealed the dynamic nature of the maize pan-genome and demonstrated that a substantial portion of variation may lie outside the single reference genome for a species.
DOI: 10.1126/science.1122050
2006
Cited 427 times
Genomic Islands and the Ecology and Evolution of Prochlorococcus
Prochlorococcus ecotypes are a useful system for exploring the origin and function of diversity among closely related microbes. The genetic variability between phenotypically distinct strains that differ by less than 1% in 16S ribosomal RNA sequences occurs mostly in genomic islands. Island genes appear to have been acquired in part by phage-mediated lateral gene transfer, and some are differentially expressed under light and nutrient stress. Furthermore, genome fragments directly recovered from ocean ecosystems indicate that these islands are variable among co-occurring Prochlorococcus cells. Genomic islands in this free-living photoautotroph share features with pathogenicity islands of parasitic bacteria, suggesting a general mechanism for niche differentiation in microbial species.
DOI: 10.1038/nature05192
2006
Cited 424 times
Symbiosis insights through metagenomic analysis of a microbial consortium
Symbioses between bacteria and eukaryotes are ubiquitous, yet our understanding of the interactions driving these associations is hampered by our inability to cultivate most host-associated microbes. Here we use a metagenomic approach to describe four co-occurring symbionts from the marine oligochaete Olavius algarvensis, a worm lacking a mouth, gut and nephridia. Shotgun sequencing and metabolic pathway reconstruction revealed that the symbionts are sulphur-oxidizing and sulphate-reducing bacteria, all of which are capable of carbon fixation, thus providing the host with multiple sources of nutrition. Molecular evidence for the uptake and recycling of worm waste products by the symbionts suggests how the worm could eliminate its excretory system, an adaptation unique among annelid worms. We propose a model that describes how the versatile metabolism within this symbiotic consortium provides the host with an optimal energy supply as it shuttles between the upper oxic and lower anoxic coastal sediments that it inhabits.
DOI: 10.1093/molbev/msq029
2010
Cited 422 times
Insights into the Evolution of Mitochondrial Genome Size from Complete Sequences of Citrullus lanatus and Cucurbita pepo (Cucurbitaceae)
The mitochondrial genomes of seed plants are unusually large and vary in size by at least an order of magnitude. Much of this variation occurs within a single family, the Cucurbitaceae, whose genomes range from an estimated 390 to 2,900 kb in size. We sequenced the mitochondrial genomes of Citrullus lanatus (watermelon: 379,236 nt) and Cucurbita pepo (zucchini: 982,833 nt)—the two smallest characterized cucurbit mitochondrial genomes—and determined their RNA editing content. The relatively compact Citrullus mitochondrial genome actually contains more and longer genes and introns, longer segmental duplications, and more discernibly nuclear-derived DNA. The large size of the Cucurbita mitochondrial genome reflects the accumulation of unprecedented amounts of both chloroplast sequences (>113 kb) and short repeated sequences (>370 kb). A low mutation rate has been hypothesized to underlie increases in both genome size and RNA editing frequency in plant mitochondria. However, despite its much larger genome, Cucurbita has a significantly higher synonymous substitution rate (and presumably mutation rate) than Citrullus but comparable levels of RNA editing. The evolution of mutation rate, genome size, and RNA editing are apparently decoupled in Cucurbitaceae, reflecting either simple stochastic variation or governance by different factors.
DOI: 10.1186/s13059-017-1151-0
2017
Cited 397 times
Comparative genomics reveals high biological diversity and specific adaptations in the industrially and medically important fungal genus Aspergillus
The fungal genus Aspergillus is of critical importance to humankind. Species include those with industrial applications, important pathogens of humans, animals and crops, a source of potent carcinogenic contaminants of food, and an important genetic model. The genome sequences of eight aspergilli have already been explored to investigate aspects of fungal biology, raising questions about evolution and specialization within this genus. We have generated genome sequences for ten novel, highly diverse Aspergillus species and compared these in detail to sister and more distant genera. Comparative studies of key aspects of fungal biology, including primary and secondary metabolism, stress response, biomass degradation, and signal transduction, revealed both conservation and diversity among the species. Observed genomic differences were validated with experimental studies. This revealed several highlights, such as the potential for sex in asexual species, organic acid production genes being a key feature of black aspergilli, alternative approaches for degrading plant biomass, and indications for the genetic basis of stress response. A genome-wide phylogenetic analysis demonstrated in detail the relationship of the newly genome sequenced species with other aspergilli. Many aspects of biological differences between fungal species cannot be explained by current knowledge obtained from genome sequences. The comparative genomics and experimental study, presented here, allows for the first time a genus-wide view of the biological diversity of the aspergilli and in many, but not all, cases linked genome differences to phenotype. Insights gained could be exploited for biotechnological and medical applications of fungi.
DOI: 10.1038/nbt1214
2006
Cited 381 times
Sequencing genomes from single cells by polymerase cloning
DOI: 10.1038/nature11681
2012
Cited 357 times
Algal genomes reveal evolutionary mosaicism and the fate of nucleomorphs
Cryptophyte and chlorarachniophyte algae are transitional forms in the widespread secondary endosymbiotic acquisition of photosynthesis by engulfment of eukaryotic algae. Unlike most secondary plastid-bearing algae, miniaturized versions of the endosymbiont nuclei (nucleomorphs) persist in cryptophytes and chlorarachniophytes. To determine why, and to address other fundamental questions about eukaryote-eukaryote endosymbiosis, we sequenced the nuclear genomes of the cryptophyte Guillardia theta and the chlorarachniophyte Bigelowiella natans. Both genomes have >21,000 protein genes and are intron rich, and B. natans exhibits unprecedented alternative splicing for a single-celled organism. Phylogenomic analyses and subcellular targeting predictions reveal extensive genetic and biochemical mosaicism, with both host- and endosymbiont-derived genes servicing the mitochondrion, the host cell cytosol, the plastid and the remnant endosymbiont cytosol of both algae. Mitochondrion-to-nucleus gene transfer still occurs in both organisms but plastid-to-nucleus and nucleomorph-to-nucleus transfers do not, which explains why a small residue of essential genes remains locked in each nucleomorph.
DOI: 10.1038/nmeth1043
2007
Cited 327 times
Use of simulated data sets to evaluate the fidelity of metagenomic processing methods
Metagenomics is a rapidly emerging field of research for studying microbial communities. To evaluate methods presently used to process metagenomic sequences, we constructed three simulated data sets of varying complexity by combining sequencing reads randomly selected from 113 isolate genomes. These data sets were designed to model real metagenomes in terms of complexity and phylogenetic composition. We assembled sampled reads using three commonly used genome assemblers (Phrap, Arachne and JAZZ), and predicted genes using two popular gene-finding pipelines (fgenesb and CRITICA/GLIMMER). The phylogenetic origins of the assembled contigs were predicted using one sequence similarity-based (blast hit distribution) and two sequence composition-based (PhyloPythia, oligonucleotide frequencies) binning methods. We explored the effects of the simulated community structure and method combinations on the fidelity of each processing step by comparison to the corresponding isolate genomes. The simulated data sets are available online to facilitate standardized benchmarking of tools for metagenomic analysis.
DOI: 10.1126/science.1246275
2013
Cited 319 times
Horizontal Transfer of Entire Genomes via Mitochondrial Fusion in the Angiosperm Amborella
Shaping Plant Evolution. Amborella trichopoda is understood to be the most basal extant flowering plant and its genome is anticipated to provide insights into the evolution of plant life on Earth (see the Perspective by Adams). To validate and assemble the sequence, Chamala et al. (p. 1516) combined fluorescent in situ hybridization (FISH), genomic mapping, and next-generation sequencing. The Amborella Genome Project (p. 10.1126/science.1241089) was able to infer that a whole-genome duplication event preceded the evolution of this ancestral angiosperm, and Rice et al. (p. 1468) found that numerous genes in the mitochondrion were acquired by horizontal gene transfer from other plants, including almost four entire mitochondrial genomes from mosses and algae.
DOI: 10.1038/ismej.2009.154
2010
Cited 313 times
Metagenomic insights into evolution of a heavy metal-contaminated groundwater microbial community
Understanding adaptation of biological communities to environmental change is a central issue in ecology and evolution. Metagenomic analysis of a stressed groundwater microbial community reveals that prolonged exposure to high concentrations of heavy metals, nitric acid and organic solvents (∼50 years) has resulted in a massive decrease in species and allelic diversity as well as a significant loss of metabolic diversity. Although the surviving microbial community possesses all metabolic pathways necessary for survival and growth in such an extreme environment, its structure is very simple, primarily composed of clonal denitrifying γ- and β-proteobacterial populations. The resulting community is overabundant in key genes conferring resistance to specific stresses including nitrate, heavy metals and acetone. Evolutionary analysis indicates that lateral gene transfer could have a key function in rapid response and adaptation to environmental contamination. The results presented in this study have important implications in understanding, assessing and predicting the impacts of human-induced activities on microbial communities ranging from human health to agriculture to environmental management, and their responses to environmental changes.
DOI: 10.1038/nature20803
2017
Cited 306 times
Evolutionary genomics of the cold-adapted diatom Fragilariopsis cylindrus
The Southern Ocean houses a diverse and productive community of organisms. Unicellular eukaryotic diatoms are the main primary producers in this environment, where photosynthesis is limited by low concentrations of dissolved iron and large seasonal fluctuations in light, temperature and the extent of sea ice. How diatoms have adapted to this extreme environment is largely unknown. Here we present insights into the genome evolution of a cold-adapted diatom from the Southern Ocean, Fragilariopsis cylindrus, based on a comparison with temperate diatoms. We find that approximately 24.7 per cent of the diploid F. cylindrus genome consists of genetic loci with alleles that are highly divergent (15.1 megabases of the total genome size of 61.1 megabases). These divergent alleles were differentially expressed across environmental conditions, including darkness, low iron, freezing, elevated temperature and increased CO2. Alleles with the largest ratio of non-synonymous to synonymous nucleotide substitutions also show the most pronounced condition-dependent expression, suggesting a correlation between diversifying selection and allelic differentiation. Divergent alleles may be involved in adaptation to environmental fluctuations in the Southern Ocean.
DOI: 10.1073/pnas.1603941113
2016
Cited 284 times
Comparative genomics of biotechnologically important yeasts
Ascomycete yeasts are metabolically diverse, with great potential for biotechnology. Here, we report the comparative genome analysis of 29 taxonomically and biotechnologically important yeasts, including 16 newly sequenced. We identify a genetic code change, CUG-Ala, in Pachysolen tannophilus in the clade sister to the known CUG-Ser clade. Our well-resolved yeast phylogeny shows that some traits, such as methylotrophy, are restricted to single clades, whereas others, such as l-rhamnose utilization, have patchy phylogenetic distributions. Gene clusters, with variable organization and distribution, encode many pathways of interest. Genomics can predict some biochemical traits precisely, but the genomic basis of others, such as xylose utilization, remains unresolved. Our data also provide insight into early evolution of ascomycetes. We document the loss of H3K9me2/3 heterochromatin, the origin of ascomycete mating-type switching, and panascomycete synteny at the MAT locus. These data and analyses will facilitate the engineering of efficient biosynthetic and degradative pathways and gateways for genomic manipulation.
DOI: 10.1038/s41467-020-18795-w
2020
Cited 279 times
Large-scale genome sequencing of mycorrhizal fungi provides insights into the early evolution of symbiotic traits
Mycorrhizal fungi are mutualists that play crucial roles in nutrient acquisition in terrestrial ecosystems. Mycorrhizal symbioses arose repeatedly across multiple lineages of Mucoromycotina, Ascomycota, and Basidiomycota. Considerable variation exists in the capacity of mycorrhizal fungi to acquire carbon from soil organic matter. Here, we present a combined analysis of 135 fungal genomes from 73 saprotrophic, endophytic and pathogenic species, and 62 mycorrhizal species, including 29 new mycorrhizal genomes. This study samples ecologically dominant fungal guilds for which there were previously no symbiotic genomes available, including ectomycorrhizal Russulales, Thelephorales and Cantharellales. Our analyses show that transitions from saprotrophy to symbiosis involve (1) widespread losses of degrading enzymes acting on lignin and cellulose, (2) co-option of genes present in saprotrophic ancestors to fulfill new symbiotic functions, (3) diversification of novel, lineage-specific symbiosis-induced genes, (4) proliferation of transposable elements and (5) divergent genetic innovations underlying the convergent origins of the ectomycorrhizal guild.
DOI: 10.1038/s41467-017-02292-8
2017
Cited 275 times
Extensive gene content variation in the Brachypodium distachyon pan-genome correlates with population structure
While prokaryotic pan-genomes have been shown to contain many more genes than any individual organism, the prevalence and functional significance of differentially present genes in eukaryotes remains poorly understood. Whole-genome de novo assembly and annotation of 54 lines of the grass Brachypodium distachyon yield a pan-genome containing nearly twice the number of genes found in any individual genome. Genes present in all lines are enriched for essential biological functions, while genes present in only some lines are enriched for conditionally beneficial functions (e.g., defense and development), display faster evolutionary rates, lie closer to transposable elements and are less likely to be syntenic with orthologous genes in other grasses. Our data suggest that differentially present genes contribute substantially to phenotypic variation within a eukaryote species, these genes have a major influence in population genetics, and transposable elements play a key role in pan-genome evolution.
DOI: 10.1105/tpc.111.087189
2011
Cited 269 times
Origins and Recombination of the Bacterial-Sized Multichromosomal Mitochondrial Genome of Cucumber
Members of the flowering plant family Cucurbitaceae harbor the largest known mitochondrial genomes. Here, we report the 1685-kb mitochondrial genome of cucumber (Cucumis sativus). We help solve a 30-year mystery about the origins of its large size by showing that it mainly reflects the proliferation of dispersed repeats, expansions of existing introns, and the acquisition of sequences from diverse sources, including the cucumber nuclear and chloroplast genomes, viruses, and bacteria. The cucumber genome has a novel structure for plant mitochondria, mapping as three entirely or largely autonomous circular chromosomes (lengths 1556, 84, and 45 kb) that vary in relative abundance over a twofold range. These properties suggest that the three chromosomes replicate independently of one another. The two smaller chromosomes are devoid of known functional genes but nonetheless contain diagnostic mitochondrial features. Paired-end sequencing conflicts reveal differences in recombination dynamics among chromosomes, for which an explanatory model is developed, as well as a large pool of low-frequency genome conformations, many of which may result from asymmetric recombination across intermediate-sized and sometimes highly divergent repeats. These findings highlight the promise of genome sequencing for elucidating the recombinational dynamics of plant mitochondrial genomes.
DOI: 10.1073/pnas.1119912109
2012
Cited 263 times
Comparative genomics of Ceriporiopsis subvermispora and Phanerochaete chrysosporium provide insight into selective ligninolysis
Efficient lignin depolymerization is unique to the wood decay basidiomycetes, collectively referred to as white rot fungi. Phanerochaete chrysosporium simultaneously degrades lignin and cellulose, whereas the closely related species, Ceriporiopsis subvermispora, also depolymerizes lignin but may do so with relatively little cellulose degradation. To investigate the basis for selective ligninolysis, we conducted comparative genome analysis of C. subvermispora and P. chrysosporium. Genes encoding manganese peroxidase numbered 13 and five in C. subvermispora and P. chrysosporium, respectively. In addition, the C. subvermispora genome contains at least seven genes predicted to encode laccases, whereas the P. chrysosporium genome contains none. We also observed expansion of the number of C. subvermispora desaturase-encoding genes putatively involved in lipid metabolism. Microarray-based transcriptome analysis showed substantial up-regulation of several desaturase and MnP genes in wood-containing medium. MS identified MnP proteins in C. subvermispora culture filtrates, but none in P. chrysosporium cultures. These results support the importance of MnP and a lignin degradation mechanism whereby cleavage of the dominant nonphenolic structures is mediated by lipid peroxidation products. Two C. subvermispora genes were predicted to encode peroxidases structurally similar to P. chrysosporium lignin peroxidase and, following heterologous expression in Escherichia coli, the enzymes were shown to oxidize high redox potential substrates, but not Mn(2+). Apart from oxidative lignin degradation, we also examined cellulolytic and hemicellulolytic systems in both fungi. In summary, the C. subvermispora genetic inventory and expression patterns exhibit increased oxidoreductase potential and diminished cellulolytic capability relative to P. chrysosporium.
DOI: 10.1186/1471-2164-15-549
2014
Cited 262 times
Genome sequencing of four Aureobasidium pullulans varieties: biotechnological potential, stress tolerance, and description of new species
Aureobasidium pullulans is a black-yeast-like fungus used for production of the polysaccharide pullulan and the antimycotic aureobasidin A, and as a biocontrol agent in agriculture. It can cause opportunistic human infections, and it inhabits various extreme environments. To promote the understanding of these traits, we performed de-novo genome sequencing of the four varieties of A. pullulans. The 25.43-29.62 Mb genomes of these four varieties of A. pullulans encode between 10266 and 11866 predicted proteins. Their genomes encode most of the enzyme families involved in degradation of plant material and many sugar transporters, and they have genes possibly associated with degradation of plastic and aromatic compounds. Proteins believed to be involved in the synthesis of pullulan and siderophores, but not of aureobasidin A, are predicted. Putative stress-tolerance genes include several aquaporins and aquaglyceroporins, large numbers of alkali-metal cation transporters, genes for the synthesis of compatible solutes and melanin, all of the components of the high-osmolarity glycerol pathway, and bacteriorhodopsin-like proteins. All of these genomes contain a homothallic mating-type locus. The differences between these four varieties of A. pullulans are large enough to justify their redefinition as separate species: A. pullulans, A. melanogenum, A. subglaciale and A. namibiae. The redundancy observed in several gene families can be linked to the nutritional versatility of these species and their particular stress tolerance. The availability of the genome sequences of the four Aureobasidium species should improve their biotechnological exploitation and promote our understanding of their stress-tolerance mechanisms, diverse lifestyles, and pathogenic potential.
DOI: 10.1073/pnas.0801980105
2008
Cited 260 times
A korarchaeal genome reveals insights into the evolution of the Archaea
The candidate division Korarchaeota comprises a group of uncultivated microorganisms that, by their small subunit rRNA phylogeny, may have diverged early from the major archaeal phyla Crenarchaeota and Euryarchaeota. Here, we report the initial characterization of a member of the Korarchaeota with the proposed name, "Candidatus Korarchaeum cryptofilum," which exhibits an ultrathin filamentous morphology. To investigate possible ancestral relationships between deep-branching Korarchaeota and other phyla, we used whole-genome shotgun sequencing to construct a complete composite korarchaeal genome from enriched cells. The genome was assembled into a single contig 1.59 Mb in length with a G + C content of 49%. Of the 1,617 predicted protein-coding genes, 1,382 (85%) could be assigned to a revised set of archaeal Clusters of Orthologous Groups (COGs). The predicted gene functions suggest that the organism relies on a simple mode of peptide fermentation for carbon and energy and lacks the ability to synthesize de novo purines, CoA, and several other cofactors. Phylogenetic analyses based on conserved single genes and concatenated protein sequences positioned the korarchaeote as a deep archaeal lineage with an apparent affinity to the Crenarchaeota. However, the predicted gene content revealed that several conserved cellular systems, such as cell division, DNA replication, and tRNA maturation, resemble the counterparts in the Euryarchaeota. In light of the known composition of archaeal genomes, the Korarchaeota might have retained a set of cellular features that represents the ancestral archaeal form.
DOI: 10.1111/tpj.12319
2013
Cited 254 times
Anchoring and ordering NGS contig assemblies by population sequencing (POPSEQ)
Next‐generation whole‐genome shotgun assemblies of complex genomes are highly useful, but fail to link nearby sequence contigs with each other or provide a linear order of contigs along individual chromosomes. Here, we introduce a strategy based on sequencing progeny of a segregating population that allows de novo production of a genetically anchored linear assembly of the gene space of an organism. We demonstrate the power of the approach by reconstructing the chromosomal organization of the gene space of barley, a large, complex and highly repetitive 5.1 Gb genome. We evaluate the robustness of the new assembly by comparison to a recently released physical and genetic framework of the barley genome, and to various genetically ordered sequence‐based genotypic datasets. The method is independent of the need for any prior sequence resources, and will enable rapid and cost‐efficient establishment of powerful genomic information for many species.
DOI: 10.1371/journal.pgen.1001129
2010
Cited 238 times
An Insect Herbivore Microbiome with High Plant Biomass-Degrading Capacity
Herbivores can gain indirect access to recalcitrant carbon present in plant cell walls through symbiotic associations with lignocellulolytic microbes. A paradigmatic example is the leaf-cutter ant (Tribe: Attini), which uses fresh leaves to cultivate a fungus for food in specialized gardens. Using a combination of sugar composition analyses, metagenomics, and whole-genome sequencing, we reveal that the fungus garden microbiome of leaf-cutter ants is composed of a diverse community of bacteria with high plant biomass-degrading capacity. Comparison of this microbiome's predicted carbohydrate-degrading enzyme profile with other metagenomes shows closest similarity to the bovine rumen, indicating evolutionary convergence of plant biomass degrading potential between two important herbivorous animals. Genomic and physiological characterization of two dominant bacteria in the fungus garden microbiome provides evidence of their capacity to degrade cellulose. Given the recent interest in cellulosic biofuels, understanding how large-scale and rapid plant biomass degradation occurs in a highly evolved insect herbivore is of particular relevance for bioenergy.
DOI: 10.1186/s13059-015-0582-8
2015
Cited 238 times
A whole-genome shotgun approach for assembling and anchoring the hexaploid bread wheat genome
Polyploid species have long been thought to be recalcitrant to whole-genome assembly. By combining high-throughput sequencing, recent developments in parallel computing, and genetic mapping, we derive, de novo, a sequence assembly representing 9.1 Gbp of the highly repetitive 16 Gbp genome of hexaploid wheat, Triticum aestivum, and assign 7.1 Gb of this assembly to chromosomal locations. The genome representation and accuracy of our assembly is comparable or even exceeds that of a chromosome-by-chromosome shotgun assembly. Our assembly and mapping strategy uses only short read sequencing technology and is applicable to any species where it is possible to construct a mapping population.
DOI: 10.1038/s41467-020-14981-y
2020
Cited 219 times
Marker-free carotenoid-enriched rice generated through targeted gene insertion using CRISPR-Cas9
Targeted insertion of transgenes at pre-determined plant genomic safe harbors provides a desirable alternative to insertions at random sites achieved through conventional methods. Most existing cases of targeted gene insertion in plants have either relied on the presence of a selectable marker gene in the insertion cassette or occurred at low frequency with relatively small DNA fragments (<1.8 kb). Here, we report the use of an optimized CRISPR-Cas9-based method to achieve the targeted insertion of a 5.2 kb carotenoid biosynthesis cassette at two genomic safe harbors in rice. We obtain marker-free rice plants with high carotenoid content in the seeds and no detectable penalty in morphology or yield. Whole-genome sequencing reveals the absence of off-target mutations by Cas9 in the engineered plants. These results demonstrate targeted gene insertion of marker-free DNA in rice using CRISPR-Cas9 genome editing, and offer a promising strategy for genetic improvement of rice and other crops.
DOI: 10.1073/pnas.1703088114
2017
Cited 202 times
Insights into the red algae and eukaryotic evolution from the genome of Porphyra umbilicalis (Bangiophyceae, Rhodophyta)
Porphyra umbilicalis (laver) belongs to an ancient group of red algae (Bangiophyceae), is harvested for human food, and thrives in the harsh conditions of the upper intertidal zone. Here we present the 87.7-Mbp haploid Porphyra genome (65.8% G + C content, 13,125 gene loci) and elucidate traits that inform our understanding of the biology of red algae as one of the few multicellular eukaryotic lineages. Novel features of the Porphyra genome shared by other red algae relate to the cytoskeleton, calcium signaling, the cell cycle, and stress-tolerance mechanisms including photoprotection. Cytoskeletal motor proteins in Porphyra are restricted to a small set of kinesins that appear to be the only universal cytoskeletal motors within the red algae. Dynein motors are absent, and most red algae, including Porphyra, lack myosin. This surprisingly minimal cytoskeleton offers a potential explanation for why red algal cells and multicellular structures are more limited in size than in most multicellular lineages. Additional discoveries further relating to the stress tolerance of bangiophytes include ancestral enzymes for sulfation of the hydrophilic galactan-rich cell wall, evidence for mannan synthesis that originated before the divergence of green and red algae, and a high capacity for nutrient uptake. Our analyses provide a comprehensive understanding of the red algae, which are both commercially important and have played a major role in the evolution of other algal groups through secondary endosymbioses.
DOI: 10.1371/journal.pgen.1003233
2013
Cited 201 times
Comparative Genome Structure, Secondary Metabolite, and Effector Coding Capacity across Cochliobolus Pathogens
The genomes of five Cochliobolus heterostrophus strains, two Cochliobolus sativus strains, three additional Cochliobolus species (Cochliobolus victoriae, Cochliobolus carbonum, Cochliobolus miyabeanus), and closely related Setosphaeria turcica were sequenced at the Joint Genome Institute (JGI). The datasets were used to identify SNPs between strains and species, unique genomic regions, core secondary metabolism genes, and small secreted protein (SSP) candidate effector encoding genes with a view towards pinpointing structural elements and gene content associated with specificity of these closely related fungi to different cereal hosts. Whole-genome alignment shows that three to five percent of each genome differs between strains of the same species, while a quarter of each genome differs between species. On average, SNP counts among field isolates of the same C. heterostrophus species are more than 25× higher than those between inbred lines and 50× lower than SNPs between Cochliobolus species. The suites of nonribosomal peptide synthetase (NRPS), polyketide synthase (PKS), and SSP–encoding genes are astoundingly diverse among species but remarkably conserved among isolates of the same species, whether inbred or field strains, except for defining examples that map to unique genomic regions. Functional analysis of several strain-unique PKSs and NRPSs reveal a strong correlation with a role in virulence.
DOI: 10.1038/s41559-019-0834-1
2019
Cited 197 times
Megaphylogeny resolves global patterns of mushroom evolution
Mushroom-forming fungi (Agaricomycetes) have the greatest morphological diversity and complexity of any group of fungi. They have radiated into most niches and fulfil diverse roles in the ecosystem, including wood decomposers, pathogens or mycorrhizal mutualists. Despite the importance of mushroom-forming fungi, large-scale patterns of their evolutionary history are poorly known, in part due to the lack of a comprehensive and dated molecular phylogeny. Here, using multigene and genome-based data, we assemble a 5,284-species phylogenetic tree and infer ages and broad patterns of speciation/extinction and morphological innovation in mushroom-forming fungi. Agaricomycetes started a rapid class-wide radiation in the Jurassic, coinciding with the spread of (sub)tropical coniferous forests and a warming climate. A possible mass extinction, several clade-specific adaptive radiations and morphological diversification of fruiting bodies followed during the Cretaceous and the Paleogene, convergently giving rise to the classic toadstool morphology, with a cap, stalk and gills (pileate-stipitate morphology). This morphology is associated with increased rates of lineage diversification, suggesting it represents a key innovation in the evolution of mushroom-forming fungi. The increase in mushroom diversity started during the Mesozoic-Cenozoic radiation event, an era of humid climate when terrestrial communities dominated by gymnosperms and reptiles were also expanding.
DOI: 10.1093/molbev/msv337
2015
Cited 190 times
Comparative Genomics of Early-Diverging Mushroom-Forming Fungi Provides Insights into the Origins of Lignocellulose Decay Capabilities
Evolution of lignocellulose decomposition was one of the most ecologically important innovations in fungi. White-rot fungi in the Agaricomycetes (mushrooms and relatives) are the most effective microorganisms in degrading both cellulose and lignin components of woody plant cell walls (PCW). However, the precise evolutionary origins of lignocellulose decomposition are poorly understood, largely because certain early-diverging clades of Agaricomycetes and its sister group, the Dacrymycetes, have yet to be sampled, or have been undersampled, in comparative genomic studies. Here, we present new genome sequences of ten saprotrophic fungi, including members of the Dacrymycetes and early-diverging clades of Agaricomycetes (Cantharellales, Sebacinales, Auriculariales, and Trechisporales), which we use to refine the origins and evolutionary history of the enzymatic toolkit of lignocellulose decomposition. We reconstructed the origin of ligninolytic enzymes, focusing on class II peroxidases (AA2), as well as enzymes that attack crystalline cellulose. Despite previous reports of white rot appearing as early as the Dacrymycetes, our results suggest that white-rot fungi evolved later in the Agaricomycetes, with the first class II peroxidases reconstructed in the ancestor of the Auriculariales and residual Agaricomycetes. The exemplars of the most ancient clades of Agaricomycetes that we sampled all lack class II peroxidases, and are thus concluded to use a combination of plesiomorphic and derived PCW degrading enzymes that predate the evolution of white rot.
DOI: 10.1038/nmicrobiol.2017.87
2017
Cited 168 times
A parts list for fungal cellulosomes revealed by comparative genomics
Cellulosomes are large, multiprotein complexes that tether plant biomass-degrading enzymes together for improved hydrolysis [1]. These complexes were first described in anaerobic bacteria, where species-specific dockerin domains mediate the assembly of enzymes onto cohesin motifs interspersed within protein scaffolds [1]. The versatile protein assembly mechanism conferred by the bacterial cohesin–dockerin interaction is now a standard design principle for synthetic biology [2,3]. For decades, analogous structures have been reported in anaerobic fungi, which are known to assemble by sequence-divergent non-catalytic dockerin domains (NCDDs) [4]. However, the components, modular assembly mechanism and functional role of fungal cellulosomes remain unknown [5,6]. Here, we describe a comprehensive set of proteins critical to fungal cellulosome assembly, including conserved scaffolding proteins unique to the Neocallimastigomycota. High-quality genomes of the anaerobic fungi Anaeromyces robustus, Neocallimastix californiae and Piromyces finnis were assembled with long-read, single-molecule technology. Genomic analysis coupled with proteomic validation revealed an average of 312 NCDD-containing proteins per fungal strain, which were overwhelmingly carbohydrate active enzymes (CAZymes), with 95 large fungal scaffoldins identified across four genera that bind to NCDDs. Fungal dockerin and scaffoldin domains have no similarity to their bacterial counterparts, yet several catalytic domains originated via horizontal gene transfer with gut bacteria. However, the biocatalytic activity of anaerobic fungal cellulosomes is expanded by the inclusion of GH3, GH6 and GH45 enzymes. These findings suggest that the fungal cellulosome is an evolutionarily chimaeric structure: an independently evolved fungal complex that co-opted useful activities from bacterial neighbours within the gut microbiome.
This study identifies the proteins critical to fungal cellulosome assembly and characterizes the complex as evolutionarily chimeric: an independently evolved fungal complex that co-opted catalytic activities from bacteria coexisting within the gut.
DOI: 10.1111/nph.14974
2018
Cited 167 times
Comparative genomics and transcriptomics depict ericoid mycorrhizal fungi as versatile saprotrophs and plant mutualists
Some soil fungi in the Leotiomycetes form ericoid mycorrhizal (ERM) symbioses with Ericaceae. In the harsh habitats in which they occur, ERM plant survival relies on nutrient mobilization from soil organic matter (SOM) by their fungal partners. The characterization of the fungal genetic machinery underpinning both the symbiotic lifestyle and SOM degradation is needed to understand ERM symbiosis functioning and evolution, and its impact on soil carbon (C) turnover. We sequenced the genomes of the ERM fungi Meliniomyces bicolor, M. variabilis, Oidiodendron maius and Rhizoscyphus ericae, and compared their gene repertoires with those of fungi with different lifestyles (ecto- and orchid mycorrhiza, endophytes, saprotrophs, pathogens). We also identified fungal transcripts induced in symbiosis. The ERM fungal gene contents for polysaccharide-degrading enzymes, lipases, proteases and enzymes involved in secondary metabolism are closer to those of saprotrophs and pathogens than to those of ectomycorrhizal symbionts. The fungal genes most highly upregulated in symbiosis are those coding for fungal and plant cell wall-degrading enzymes (CWDEs), lipases, proteases, transporters and mycorrhiza-induced small secreted proteins (MiSSPs). The ERM fungal gene repertoire reveals a capacity for a dual saprotrophic and biotrophic lifestyle. This may reflect an incomplete transition from saprotrophy to the mycorrhizal habit, or a versatile life strategy similar to fungal endophytes.
DOI: 10.1038/s41588-018-0246-1
2018
Cited 156 times
Investigation of inter- and intraspecies variation through genome sequencing of Aspergillus section Nigri
Aspergillus section Nigri comprises filamentous fungi relevant to biomedicine, bioenergy, health, and biotechnology. To learn more about what genetically sets these species apart, as well as about potential applications in biotechnology and biomedicine, we sequenced 23 genomes de novo, forming a full genome compendium for the section (26 species), as well as 6 Aspergillus niger isolates. This allowed us to quantify both inter- and intraspecies genomic variation. We further predicted 17,903 carbohydrate-active enzymes and 2,717 secondary metabolite gene clusters, which we condensed into 455 distinct families corresponding to compound classes, 49% of which are only found in single species. We performed metabolomics and genetic engineering to correlate genotypes to phenotypes, as demonstrated for the metabolite aurasperone, and by heterologous transfer of citrate production to Aspergillus nidulans. Experimental and computational analyses showed that both secondary metabolism and regulation are key factors that are significant in the delineation of Aspergillus species. De novo assembly of 23 Aspergillus section Nigri and 6 Aspergillus niger genome sequences allows for inter- and intraspecies comparisons and prediction of secondary metabolite gene clusters.
DOI: 10.1016/j.simyco.2020.01.003
2020
Cited 146 times
101 Dothideomycetes genomes: A test case for predicting lifestyles and emergence of pathogens
Dothideomycetes is the largest class of kingdom Fungi and comprises an incredible diversity of lifestyles, many of which have evolved multiple times. Plant pathogens represent a major ecological niche of the class Dothideomycetes and they are known to infect most major food crops and feedstocks for biomass and biofuel production. Studying the ecology and evolution of Dothideomycetes has significant implications for our fundamental understanding of fungal evolution, their adaptation to stress and host specificity, and practical implications with regard to the effects of climate change and on the food, feed, and livestock elements of the agro-economy. In this study, we present the first large-scale, whole-genome comparison of 101 Dothideomycetes introducing 55 newly sequenced species. The availability of whole-genome data produced a high-confidence phylogeny leading to reclassification of 25 organisms, provided a clearer picture of the relationships among the various families, and indicated that pathogenicity evolved multiple times within this class. We also identified gene family expansions and contractions across the Dothideomycetes phylogeny linked to ecological niches providing insights into genome evolution and adaptation across this group. Using machine-learning methods we classified fungi into lifestyle classes with >95 % accuracy and identified a small number of gene families that positively correlated with these distinctions. This can become a valuable tool for genome-based prediction of species lifestyle, especially for rarely seen and poorly studied species.
DOI: 10.1038/s41598-018-24686-4
2018
Cited 145 times
Comparative genomics provides insights into the lifestyle and reveals functional heterogeneity of dark septate endophytic fungi
Dark septate endophytes (DSE) are a form-group of root endophytic fungi with elusive functions. Here, the genomes of two common DSE of semiarid areas, Cadophora sp. and Periconia macrospinosa were sequenced and analyzed with another 32 ascomycetes of different lifestyles. Cadophora sp. (Helotiales) and P. macrospinosa (Pleosporales) have genomes of 70.46 Mb and 54.99 Mb with 22,766 and 18,750 gene models, respectively. The majority of DSE-specific protein clusters lack functional annotation with no similarity to characterized proteins, implying that they have evolved unique genetic innovations. Both DSE possess an expanded number of carbohydrate active enzymes (CAZymes), including plant cell wall degrading enzymes (PCWDEs). Those were similar in three other DSE, and contributed a signal for the separation of root endophytes in principal component analyses of CAZymes, indicating shared genomic traits of DSE fungi. Number of secreted proteases and lipases, aquaporins, and genes linked to melanin synthesis were also relatively high in our fungi. In spite of certain similarities between our two DSE, we observed low levels of convergence in their gene family evolution. This suggests that, despite originating from the same habitat, these two fungi evolved along different evolutionary trajectories and display considerable functional differences within the endophytic lifestyle.
DOI: 10.1038/s41586-020-03127-1
2021
Cited 145 times
Genomic mechanisms of climate adaptation in polyploid bioenergy switchgrass
Long-term climate change and periodic environmental extremes threaten food and fuel security [1] and global crop productivity [2–4]. Although molecular and adaptive breeding strategies can buffer the effects of climatic stress and improve crop resilience [5], these approaches require sufficient knowledge of the genes that underlie productivity and adaptation [6], knowledge that has been limited to a small number of well-studied model systems. Here we present the assembly and annotation of the large and complex genome of the polyploid bioenergy crop switchgrass (Panicum virgatum). Analysis of biomass and survival among 732 resequenced genotypes, which were grown across 10 common gardens that span 1,800 km of latitude, jointly revealed extensive genomic evidence of climate adaptation. Climate–gene–biomass associations were abundant but varied considerably among deeply diverged gene pools. Furthermore, we found that gene flow accelerated climate adaptation during the postglacial colonization of northern habitats through introgression of alleles from a pre-adapted northern gene pool. The polyploid nature of switchgrass also enhanced adaptive potential through the fractionation of gene function, as there was an increased level of heritable genetic diversity on the nondominant subgenome. In addition to investigating patterns of climate adaptation, the genome resources and gene–trait associations developed here provide breeders with the necessary tools to increase switchgrass yield for the sustainable production of bioenergy.
DOI: 10.1111/1462-2920.13669
2017
Cited 143 times
Comparative genomics of Mortierella elongata and its bacterial endosymbiont Mycoavidus cysteinexigens
Endosymbiosis of bacteria by eukaryotes is a defining feature of cellular evolution. In addition to well-known bacterial origins for mitochondria and chloroplasts, multiple origins of bacterial endosymbiosis are known within the cells of diverse animals, plants and fungi. Early-diverging lineages of terrestrial fungi harbor endosymbiotic bacteria belonging to the Burkholderiaceae. We sequenced the metagenome of the soil-inhabiting fungus Mortierella elongata and assembled the complete circular chromosome of its endosymbiont, Mycoavidus cysteinexigens, which we place within a lineage of endofungal symbionts that are sister clade to Burkholderia. The genome of M. elongata strain AG77 features a core set of primary metabolic pathways for degradation of simple carbohydrates and lipid biosynthesis, while the M. cysteinexigens (AG77) genome is reduced in size and function. Experiments using antibiotics to cure the endobacterium from the host demonstrate that the fungal host metabolism is highly modulated by presence/absence of M. cysteinexigens. Independent comparative phylogenomic analyses of fungal and bacterial genomes are consistent with an ancient origin for M. elongata - M. cysteinexigens symbiosis, most likely over 350 million years ago and concomitant with the terrestrialization of Earth and diversification of land fungi and plants.
DOI: 10.1038/ncomms12662
2016
Cited 140 times
Ectomycorrhizal ecology is imprinted in the genome of the dominant symbiotic fungus Cenococcum geophilum
The most frequently encountered symbiont on tree roots is the ascomycete Cenococcum geophilum, the only mycorrhizal species within the largest fungal class Dothideomycetes, a class known for devastating plant pathogens. Here we show that the symbiotic genomic idiosyncrasies of ectomycorrhizal basidiomycetes are also present in C. geophilum with symbiosis-induced, taxon-specific genes of unknown function and reduced numbers of plant cell wall-degrading enzymes. C. geophilum still holds a significant set of genes in categories known to be involved in pathogenesis and shows an increased genome size due to transposable elements proliferation. Transcript profiling revealed a striking upregulation of membrane transporters, including aquaporin water channels and sugar transporters, and mycorrhiza-induced small secreted proteins (MiSSPs) in ectomycorrhiza compared with free-living mycelium. The frequency with which this symbiont is found on tree roots and its possible role in water and nutrient transport in symbiosis calls for further studies on mechanisms of host and environmental adaptation.
DOI: 10.1038/s41559-017-0347-8
2017
Cited 140 times
Genome expansion and lineage-specific genetic innovations in the forest pathogenic fungi Armillaria
Armillaria species are both devastating forest pathogens and some of the largest terrestrial organisms on Earth. They forage for hosts and achieve immense colony sizes via rhizomorphs, root-like multicellular structures of clonal dispersal. Here, we sequenced and analysed the genomes of four Armillaria species and performed RNA sequencing and quantitative proteomic analysis on the invasive and reproductive developmental stages of A. ostoyae. Comparison with 22 related fungi revealed a significant genome expansion in Armillaria, affecting several pathogenicity-related genes, lignocellulose-degrading enzymes and lineage-specific genes expressed during rhizomorph development. Rhizomorphs express an evolutionarily young transcriptome that shares features with the transcriptomes of both fruiting bodies and vegetative mycelia. Several genes show concomitant upregulation in rhizomorphs and fruiting bodies and share cis-regulatory signatures in their promoters, providing genetic and regulatory insights into complex multicellularity in fungi. Our results suggest that the evolution of the unique dispersal and pathogenicity mechanisms of Armillaria might have drawn upon ancestral genetic toolkits for wood-decay, morphogenesis and complex multicellularity.
DOI: 10.1038/s41467-019-14051-y
2020
Cited 130 times
A comparative genomics study of 23 Aspergillus species from section Flavi
Section Flavi encompasses both harmful and beneficial Aspergillus species, such as Aspergillus oryzae, used in food fermentation and enzyme production, and Aspergillus flavus, food spoiler and mycotoxin producer. Here, we sequence 19 genomes spanning section Flavi and compare 31 fungal genomes including 23 Flavi species. We reassess their phylogenetic relationships and show that the closest relative of A. oryzae is not A. flavus, but A. minisclerotigenes or A. aflatoxiformans and identify high genome diversity, especially in sub-telomeric regions. We predict abundant CAZymes (598 per species) and prolific secondary metabolite gene clusters (73 per species) in section Flavi. However, the observed phenotypes (growth characteristics, polysaccharide degradation) do not necessarily correlate with inferences made from the predicted CAZyme content. Our work, including genomic analyses, phenotypic assays, and identification of secondary metabolites, highlights the genetic and metabolic diversity within section Flavi.
DOI: 10.1038/s41564-020-00861-0
2021
Cited 123 times
Genomic and functional analyses of fungal and bacterial consortia that enable lignocellulose breakdown in goat gut microbiomes
The herbivore digestive tract is home to a complex community of anaerobic microbes that work together to break down lignocellulose. These microbiota are an untapped resource of strains, pathways and enzymes that could be applied to convert plant waste into sugar substrates for green biotechnology. We carried out more than 400 parallel enrichment experiments from goat faeces to determine how substrate and antibiotic selection influence membership, activity, stability and chemical productivity of herbivore gut communities. We assembled 719 high-quality metagenome-assembled genomes (MAGs) that are unique at the species level. More than 90% of these MAGs are from previously unidentified herbivore gut microorganisms. Microbial consortia dominated by anaerobic fungi outperformed bacterially dominated consortia in terms of both methane production and extent of cellulose degradation, which indicates that fungi have an important role in methane release. Metabolic pathway reconstructions from MAGs of 737 bacteria, archaea and fungi suggest that cross-domain partnerships between fungi and methanogens enabled production of acetate, formate and methane, whereas bacterially dominated consortia mainly produced short-chain fatty acids, including propionate and butyrate. Analyses of carbohydrate-active enzyme domains present in each anaerobic consortium suggest that anaerobic bacteria and fungi employ mostly complementary hydrolytic strategies. The division of labour among herbivore anaerobes to degrade plant biomass could be harnessed for industrial bioprocessing.
DOI: 10.1111/tpj.14500
2019
Cited 117 times
Construction and comparison of three reference‐quality genome assemblies for soybean
We report reference-quality genome assemblies and annotations for two accessions of soybean (Glycine max) and for one accession of Glycine soja, the closest wild relative of G. max. The G. max assemblies provided are for widely used US cultivars: the northern line Williams 82 (Wm82) and the southern line Lee. The Wm82 assembly improves the prior published assembly, and the Lee and G. soja assemblies are new for these accessions. Comparisons among the three accessions show generally high structural conservation, but nucleotide difference of 1.7 single-nucleotide polymorphisms (SNPs) per kb between Wm82 and Lee, and 4.7 SNPs per kb between these lines and G. soja. SNP distributions and comparisons with genotypes of the Lee and Wm82 parents highlight patterns of introgression and haplotype structure. Comparisons against the US germplasm collection show placement of the sequenced accessions relative to global soybean diversity. Analysis of a pan-gene collection shows generally high conservation, with variation occurring primarily in genomically clustered gene families. We found approximately 40-42 inversions per chromosome between either Lee or Wm82v4 and G. soja, and approximately 32 inversions per chromosome between Wm82 and Lee. We also investigated five domestication loci. For each locus, we found two different alleles with functional differences between G. soja and the two domesticated accessions. The genome assemblies for multiple cultivated accessions and for the closest wild ancestor of soybean provide a valuable set of resources for identifying causal variants that underlie traits for the domestication and improvement of soybean, serving as a basis for future research and crop improvement efforts for this important crop species.
DOI: 10.1073/pnas.1817822116
2019
Cited 116 times
Transcriptomic atlas of mushroom development reveals conserved genes behind complex multicellularity in fungi
The evolution of complex multicellularity has been one of the major transitions in the history of life. In contrast to simple multicellular aggregates of cells, it has evolved only in a handful of lineages, including animals, embryophytes, red and brown algae, and fungi. Despite being a key step toward the evolution of complex organisms, the evolutionary origins and the genetic underpinnings of complex multicellularity are incompletely known. The development of fungal fruiting bodies from a hyphal thallus represents a transition from simple to complex multicellularity that is inducible under laboratory conditions. We constructed a reference atlas of mushroom formation based on developmental transcriptome data of six species and comparisons of >200 whole genomes, to elucidate the core genetic program of complex multicellularity and fruiting body development in mushroom-forming fungi (Agaricomycetes). Nearly 300 conserved gene families and >70 functional groups contained developmentally regulated genes from five to six species, covering functions related to fungal cell wall remodeling, targeted protein degradation, signal transduction, adhesion, and small secreted proteins (including effector-like orphan genes). Several of these families, including F-box proteins, expansin-like proteins, protein kinases, and transcription factors, showed expansions in Agaricomycetes, many of which convergently expanded in multicellular plants and/or animals too, reflecting convergent solutions to genetic hurdles imposed by complex multicellularity among independently evolved lineages. This study provides an entry point to studying mushroom development and complex multicellularity in one of the largest clades of complex eukaryotic organisms.
DOI: 10.1038/s41587-020-0681-2
2020
Cited 107 times
A genome resource for green millet Setaria viridis enables discovery of agronomically valuable loci
Wild and weedy relatives of domesticated crops harbor genetic variants that can advance agricultural biotechnology. Here we provide a genome resource for the wild plant green millet (Setaria viridis), a model species for studies of C4 grasses, and use the resource to probe domestication genes in the close crop relative foxtail millet (Setaria italica). We produced a platinum-quality genome assembly of S. viridis and de novo assemblies for 598 wild accessions and exploited these assemblies to identify loci underlying three traits: response to climate, a ‘loss of shattering’ trait that permits mechanical harvest, and leaf angle, a predictor of yield in many grass crops. With CRISPR–Cas9 genome editing, we validated Less Shattering1 (SvLes1) as a gene whose product controls seed shattering. In S. italica, this gene was rendered nonfunctional by a retrotransposon insertion in the domesticated loss-of-shattering allele SiLes1-TE (transposable element). This resource will enhance the utility of S. viridis for dissection of complex traits and biotechnological improvement of panicoid crops.
DOI: 10.1101/2023.01.31.526407
2023
Cited 19 times
Chromosome-level genomes of multicellular algal sisters to land plants illuminate signaling network evolution
The filamentous and unicellular algae of the class Zygnematophyceae are the closest algal relatives of land plants. Inferring the properties of the last common ancestor shared by these algae and land plants allows us to identify decisive traits that enabled the conquest of land by plants. We sequenced four genomes of filamentous Zygnematophyceae (three strains of Zygnema circumcarinatum and one strain of Z. cylindricum) and generated chromosome-scale assemblies for all strains of the emerging model system Z. circumcarinatum. Comparative genomic analyses reveal expanded genes for signaling cascades, environmental response, and intracellular trafficking that we associate with multicellularity. Gene family analyses suggest that Zygnematophyceae share all the major enzymes with land plants for cell wall polysaccharide synthesis, degradation, and modifications; most of the enzymes for cell wall innovations, especially for polysaccharide backbone synthesis, were gained more than 700 million years ago. In Zygnematophyceae, these enzyme families expanded, forming co-expressed modules. Transcriptomic profiling of over 19 growth conditions combined with co-expression network analyses uncover cohorts of genes that unite environmental signaling with multicellular developmental programs. Our data shed light on a molecular chassis that balances environmental response and growth modulation across more than 600 million years of streptophyte evolution.
DOI: 10.1073/pnas.1005297107
2010
Cited 200 times
Adaptation to herbivory by the Tammar wallaby includes bacterial and glycoside hydrolase profiles different from other herbivores
Metagenomic and bioinformatic approaches were used to characterize plant biomass conversion within the foregut microbiome of Australia's "model" marsupial, the Tammar wallaby (Macropus eugenii). Like the termite hindgut and bovine rumen, key enzymes and modular structures characteristic of the "free enzyme" and "cellulosome" paradigms of cellulose solubilization remain either poorly represented or elusive to capture by shotgun sequencing methods. Instead, multigene polysaccharide utilization loci-like systems coupled with genes encoding beta-1,4-endoglucanases and beta-1,4-endoxylanases (which have not been previously encountered in metagenomic datasets) were identified, as were a diverse set of glycoside hydrolases targeting noncellulosic polysaccharides. Furthermore, both rrs gene and other phylogenetic analyses confirmed that unique clades of the Lachnospiraceae, Bacteroidales, and Gammaproteobacteria are predominant in the Tammar foregut microbiome. Nucleotide composition-based sequence binning facilitated the assemblage of more than two megabase pairs of genomic sequence for one of the novel Lachnospiraceae clades (WG-2). These analyses show that WG-2 possesses numerous glycoside hydrolases targeting noncellulosic polysaccharides. These collective data demonstrate that Australian macropods not only harbor unique bacterial lineages underpinning plant biomass conversion, but their repertoire of glycoside hydrolases is distinct from those of the microbiomes of higher termites and the bovine rumen.
DOI: 10.1126/science.1138438
2007
Cited 192 times
The Calyptogena magnifica Chemoautotrophic Symbiont Genome
Chemoautotrophic endosymbionts are the metabolic cornerstone of hydrothermal vent communities, providing invertebrate hosts with nearly all of their nutrition. The Calyptogena magnifica (Bivalvia: Vesicomyidae) symbiont, Candidatus Ruthia magnifica, is the first intracellular sulfur-oxidizing endosymbiont to have its genome sequenced, revealing a suite of metabolic capabilities. The genome encodes major chemoautotrophic pathways as well as pathways for biosynthesis of vitamins, cofactors, and all 20 amino acids required by the clam.
DOI: 10.1073/pnas.1103039108
2011
Cited 180 times
Comparative genomics of xylose-fermenting fungi for enhanced biofuel production
Cellulosic biomass is an abundant and underused substrate for biofuel production. The inability of many microbes to metabolize the pentose sugars abundant within hemicellulose creates specific challenges for microbial biofuel production from cellulosic material. Although engineered strains of Saccharomyces cerevisiae can use the pentose xylose, the fermentative capacity pales in comparison with glucose, limiting the economic feasibility of industrial fermentations. To better understand xylose utilization for subsequent microbial engineering, we sequenced the genomes of two xylose-fermenting, beetle-associated fungi, Spathaspora passalidarum and Candida tenuis. To identify genes involved in xylose metabolism, we applied a comparative genomic approach across 14 Ascomycete genomes, mapping phenotypes and genotypes onto the fungal phylogeny, and measured genomic expression across five Hemiascomycete species with different xylose-consumption phenotypes. This approach implicated many genes and processes involved in xylose assimilation. Several of these genes significantly improved xylose utilization when engineered into S. cerevisiae, demonstrating the power of comparative methods in rapidly identifying genes for biomass conversion while reflecting on fungal ecology.
DOI: 10.1186/gb-2011-12-2-r20
2011
Cited 144 times
Comparative genomics of the social amoebae Dictyostelium discoideum and Dictyostelium purpureum
The social amoebae (Dictyostelia) are a diverse group of Amoebozoa that achieve multicellularity by aggregation and undergo morphogenesis into fruiting bodies with terminally differentiated spores and stalk cells. There are four groups of dictyostelids, with the most derived being a group that contains the model species Dictyostelium discoideum. We have produced a draft genome sequence of another group dictyostelid, Dictyostelium purpureum, and compare it to the D. discoideum genome. The assembly (8.41× coverage) comprises 799 scaffolds totaling 33.0 Mb, comparable to the D. discoideum genome size. Sequence comparisons suggest that these two dictyostelids shared a common ancestor approximately 400 million years ago. In spite of this divergence, most orthologs reside in small clusters of conserved synteny. Comparative analyses revealed a core set of orthologous genes that illuminate dictyostelid physiology, as well as differences in gene family content. Interesting patterns of gene conservation and divergence are also evident, suggesting function differences; some protein families, such as the histidine kinases, have undergone little functional change, whereas others, such as the polyketide synthases, have undergone extensive diversification. The abundant amino acid homopolymers encoded in both genomes are generally not found in homologous positions within proteins, so they are unlikely to derive from ancestral DNA triplet repeats. Genes involved in the social stage evolved more rapidly than others, consistent with either relaxed selection or accelerated evolution due to social conflict. The findings from this new genome sequence and comparative analysis shed light on the biology and evolution of the Dictyostelia.
DOI: 10.1105/tpc.17.00154
2017
Cited 135 times
The Sequences of 1504 Mutants in the Model Rice Variety Kitaake Facilitate Rapid Functional Genomic Studies
The availability of a whole-genome sequenced mutant population and the cataloging of mutations of each line at a single-nucleotide resolution facilitate functional genomic analysis. To this end, we generated and sequenced a fast-neutron-induced mutant population in the model rice cultivar Kitaake (Oryza sativa ssp japonica), which completes its life cycle in 9 weeks. We sequenced 1504 mutant lines at 45-fold coverage and identified 91,513 mutations affecting 32,307 genes, i.e., 58% of all rice genes. We detected an average of 61 mutations per line. Mutation types include single-base substitutions, deletions, insertions, inversions, translocations, and tandem duplications. We observed a high proportion of loss-of-function mutations. We identified an inversion affecting a single gene as the causative mutation for the short-grain phenotype in one mutant line. This result reveals the usefulness of the resource for efficient, cost-effective identification of genes conferring specific phenotypes. To facilitate public access to this genetic resource, we established an open access database called KitBase that provides access to sequence data and seed stocks. This population complements other available mutant collections and gene-editing technologies. This work demonstrates how inexpensive next-generation sequencing can be applied to generate a high-density catalog of mutations.
DOI: 10.1371/journal.pgen.1007322
2018
Cited 130 times
Massive lateral transfer of genes encoding plant cell wall-degrading enzymes to the mycoparasitic fungus Trichoderma from its plant-associated hosts
Unlike most other fungi, molds of the genus Trichoderma (Hypocreales, Ascomycota) are aggressive parasites of other fungi and efficient decomposers of plant biomass. Although nutritional shifts are common among hypocrealean fungi, there are no examples of such broad substrate versatility as that observed in Trichoderma. A phylogenomic analysis of 23 hypocrealean fungi (including nine Trichoderma spp. and the related Escovopsis weberi) revealed that the genus Trichoderma has evolved from an ancestor with limited cellulolytic capability that fed on either fungi or arthropods. The evolutionary analysis of Trichoderma genes encoding plant cell wall-degrading carbohydrate-active enzymes and auxiliary proteins (pcwdCAZome, 122 gene families) based on a gene tree / species tree reconciliation demonstrated that the formation of the genus was accompanied by an unprecedented extent of lateral gene transfer (LGT). Nearly one-half of the genes in Trichoderma pcwdCAZome (41%) were obtained via LGT from plant-associated filamentous fungi belonging to different classes of Ascomycota, while no LGT was observed from other potential donors. In addition to the ability to feed on unrelated fungi (such as Basidiomycota), we also showed that Trichoderma is capable of endoparasitism on a broad range of Ascomycota, including extant LGT donors. This phenomenon was not observed in E. weberi and rarely in other mycoparasitic hypocrealean fungi. Thus, our study suggests that LGT is linked to the ability of Trichoderma to parasitize taxonomically related fungi (up to adelphoparasitism in strict sense). This may have allowed primarily mycotrophic Trichoderma fungi to evolve into decomposers of plant biomass.
DOI: 10.1073/pnas.1418963111
2014
Cited 129 times
Analysis of clock-regulated genes in Neurospora reveals widespread posttranscriptional control of metabolic potential
Neurospora crassa has been for decades a principal model for filamentous fungal genetics and physiology as well as for understanding the mechanism of circadian clocks. Eukaryotic fungal and animal clocks comprise transcription-translation-based feedback loops that control rhythmic transcription of a substantial fraction of these transcriptomes, yielding the changes in protein abundance that mediate circadian regulation of physiology and metabolism: Understanding circadian control of gene expression is key to understanding eukaryotic, including fungal, physiology. Indeed, the isolation of clock-controlled genes (ccgs) was pioneered in Neurospora where circadian output begins with binding of the core circadian transcription factor WCC to a subset of ccg promoters, including those of many transcription factors. High temporal resolution (2-h) sampling over 48 h using RNA sequencing (RNA-Seq) identified circadianly expressed genes in Neurospora, revealing that from ∼10% to as much as 40% of the transcriptome can be expressed under circadian control. Functional classifications of these genes revealed strong enrichment in pathways involving metabolism, protein synthesis, and stress responses; in broad terms, daytime metabolic potential favors catabolism, energy production, and precursor assembly, whereas night activities favor biosynthesis of cellular components and growth. Discriminative regular expression motif elicitation (DREME) identified key promoter motifs highly correlated with the temporal regulation of ccgs. Correlations between ccg abundance from RNA-Seq, the degree of ccg-promoter activation as reported by ccg-promoter-luciferase fusions, and binding of WCC as measured by ChIP-Seq, are not strong. Therefore, although circadian activation is critical to ccg rhythmicity, posttranscriptional regulation plays a major role in determining rhythmicity at the mRNA level.
DOI: 10.1186/1471-2164-13-444
2012
Cited 124 times
Comparative genomics of the white-rot fungi, Phanerochaete carnosa and P. chrysosporium, to elucidate the genetic basis of the distinct wood types they colonize
Softwood is the predominant form of land plant biomass in the Northern hemisphere, and is among the most recalcitrant biomass resources to bioprocess technologies. The white rot fungus, Phanerochaete carnosa, has been isolated almost exclusively from softwoods, while most other known white-rot species, including Phanerochaete chrysosporium, were mainly isolated from hardwoods. Accordingly, it is anticipated that P. carnosa encodes a distinct set of enzymes and proteins that promote softwood decomposition. To elucidate the genetic basis of softwood bioconversion by a white-rot fungus, the present study reports the P. carnosa genome sequence and its comparative analysis with the previously reported P. chrysosporium genome. P. carnosa encodes a complete set of lignocellulose-active enzymes. Comparative genomic analysis revealed that P. carnosa is enriched with genes encoding manganese peroxidase, and that the most divergent glycoside hydrolase families were predicted to encode hemicellulases and glycoprotein degrading enzymes. Most remarkably, P. carnosa possesses one of the largest P450 contingents (266 P450s) among the sequenced and annotated wood-rotting basidiomycetes, nearly double that of P. chrysosporium. Along with metabolic pathway modeling, comparative growth studies on model compounds and chemical analyses of decomposed wood components showed greater tolerance of P. carnosa to various substrates including coniferous heartwood. The P. carnosa genome is enriched with genes that encode P450 monooxygenases that can participate in extractives degradation, and manganese peroxidases involved in lignin degradation. The significant expansion of P450s in P. carnosa, along with differences in carbohydrate- and lignin-degrading enzymes, could be correlated to the utilization of heartwood and sapwood preparations from both coniferous and hardwood species.
DOI: 10.1371/journal.pone.0073827
2013
Cited 120 times
Metagenomic Profiling Reveals Lignocellulose Degrading System in a Microbial Community Associated with a Wood-Feeding Beetle
The Asian longhorned beetle (Anoplophora glabripennis) is an invasive, wood-boring pest that thrives in the heartwood of deciduous tree species. A large impediment faced by A. glabripennis as it feeds on woody tissue is lignin, a highly recalcitrant biopolymer that reduces access to sugars and other nutrients locked in cellulose and hemicellulose. We previously demonstrated that lignin, cellulose, and hemicellulose are actively deconstructed in the beetle gut and that the gut harbors an assemblage of microbes hypothesized to make significant contributions to these processes. While lignin degrading mechanisms have been well characterized in pure cultures of white rot basidiomycetes, little is known about such processes in microbial communities associated with wood-feeding insects. The goals of this study were to develop a taxonomic and functional profile of a gut community derived from an invasive population of larval A. glabripennis collected from infested host trees and to identify genes that could be relevant for the digestion of woody tissue and nutrient acquisition. To accomplish this goal, we taxonomically and functionally characterized the A. glabripennis midgut microbiota through amplicon and shotgun metagenome sequencing and conducted a large-scale comparison with the metagenomes from a variety of other herbivore-associated communities. This analysis distinguished the A. glabripennis larval gut metagenome from the gut communities of other herbivores, including previously sequenced termite hindgut metagenomes. Genes encoding enzymes were identified in the A. glabripennis gut metagenome that could have key roles in woody tissue digestion including candidate lignin degrading genes (laccases, dye-decolorizing peroxidases, novel peroxidases and β-etherases), 36 families of glycoside hydrolases (such as cellulases and xylanases), and genes that could facilitate nutrient recovery, essential nutrient synthesis, and detoxification. This community could serve as a reservoir of novel enzymes to enhance industrial cellulosic biofuels production or targets for novel control methods for this invasive and highly destructive insect.
DOI: 10.1038/ismej.2012.10
2012
Cited 118 times
Metagenomic and metaproteomic insights into bacterial communities in leaf-cutter ant fungus gardens
Herbivores gain access to nutrients stored in plant biomass largely by harnessing the metabolic activities of microbes. Leaf-cutter ants of the genus Atta are a hallmark example; these dominant neotropical herbivores cultivate symbiotic fungus gardens on large quantities of fresh plant forage. As the external digestive system of the ants, fungus gardens facilitate the production and sustenance of millions of workers. Using metagenomic and metaproteomic techniques, we characterize the bacterial diversity and physiological potential of fungus gardens from two species of Atta. Our analysis of over 1.2 Gbp of community metagenomic sequence and three 16S pyrotag libraries reveals that in addition to harboring the dominant fungal crop, these ecosystems contain abundant populations of Enterobacteriaceae, including the genera Enterobacter, Pantoea, Klebsiella, Citrobacter and Escherichia. We show that these bacterial communities possess genes associated with lignocellulose degradation and diverse biosynthetic pathways, suggesting that they play a role in nutrient cycling by converting the nitrogen-poor forage of the ants into B-vitamins, amino acids and other cellular components. Our metaproteomic analysis confirms that bacterial glycosyl hydrolases and proteins with putative biosynthetic functions are produced in both field-collected and laboratory-reared colonies. These results are consistent with the hypothesis that fungus gardens are specialized fungus-bacteria communities that convert plant material into energy for their ant hosts. Together with recent investigations into the microbial symbionts of vertebrates, our work underscores the importance of microbial communities in the ecology and evolution of herbivorous metazoans.
DOI: 10.1111/nph.14279
2016
Cited 118 times
Fungal and plant gene expression in the Tulasnella calospora–Serapias vomeracea symbiosis provides clues about nitrogen pathways in orchid mycorrhizas
Orchids are highly dependent on their mycorrhizal fungal partners for nutrient supply, especially during early developmental stages. In addition to organic carbon, nitrogen (N) is probably a major nutrient transferred to the plant because orchid tissues are highly N‐enriched. We know almost nothing about the N form preferentially transferred to the plant or about the key molecular determinants required for N uptake and transfer. We identified, in the genome of the orchid mycorrhizal fungus Tulasnella calospora, two functional ammonium transporters and several amino acid transporters but found no evidence of a nitrate assimilation system, in agreement with the N preference of the free‐living mycelium grown on different N sources. Differential expression in symbiosis of a repertoire of fungal and plant genes involved in the transport and metabolism of N compounds suggested that organic N may be the main form transferred to the orchid host and that ammonium is taken up by the intracellular fungus from the apoplastic symbiotic interface. This is the first study addressing the genetic determinants of N uptake and transport in orchid mycorrhizas, and provides a model for nutrient exchanges at the symbiotic interface, which may guide future experiments.
DOI: 10.1038/ismej.2013.13
2013
Cited 117 times
Community-wide plasmid gene mobilization and selection
Plasmids have long been recognized as an important driver of DNA exchange and genetic innovation in prokaryotes. The success of plasmids has been attributed to their independent replication from the host's chromosome and their frequent self-transfer. It is thought that plasmids accumulate, rearrange and distribute nonessential genes, which may provide an advantage for host proliferation under selective conditions. In order to test this hypothesis independently of biases from culture selection, we study the plasmid metagenome from microbial communities in two activated sludge systems, one of which receives mostly household and the other chemical industry wastewater. We find that plasmids from activated sludge microbial communities carry among the largest proportion of unknown gene pools so far detected in metagenomic DNA, confirming their presumed role as DNA innovators. At a system level both plasmid metagenomes were dominated by functions associated with replication and transposition, and contained a wide variety of antibiotic and heavy metal resistances. Plasmid families were very different in the two metagenomes and grouped in deep-branching new families compared with known plasmid replicons. A number of abundant plasmid replicons could be completely assembled directly from the metagenome, providing insight into plasmid composition without culturing bias. Functionally, the two metagenomes strongly differed in several ways, including a greater abundance of genes for carbohydrate metabolism in the industrial and of general defense factors in the household activated sludge plasmid metagenome. This suggests that plasmids not only contribute to the adaptation of single individual prokaryotic species, but of the prokaryotic community as a whole under local selective conditions.
DOI: 10.1073/pnas.1715954115
2018
Cited 117 times
Linking secondary metabolites to gene clusters through genome sequencing of six diverse Aspergillus species
The fungal genus of Aspergillus is highly interesting, containing everything from industrial cell factories, model organisms, and human pathogens. In particular, this group has a prolific production of bioactive secondary metabolites (SMs). In this work, four diverse Aspergillus species (A. campestris, A. novofumigatus, A. ochraceoroseus, and A. steynii) have been whole-genome PacBio sequenced to provide genetic references in three Aspergillus sections. A. taichungensis and A. candidus also were sequenced for SM elucidation. Thirteen Aspergillus genomes were analyzed with comparative genomics to determine phylogeny and genetic diversity, showing that each presented genome contains 15-27% genes not found in other sequenced Aspergilli. In particular, A. novofumigatus was compared with the pathogenic species A. fumigatus. This suggests that A. novofumigatus can produce most of the same allergens, virulence, and pathogenicity factors as A. fumigatus, suggesting that A. novofumigatus could be as pathogenic as A. fumigatus. Furthermore, SMs were linked to gene clusters based on biological and chemical knowledge and analysis, genome sequences, and predictive algorithms. We thus identify putative SM clusters for aflatoxin, chlorflavonin, and ochrindol in A. ochraceoroseus, A. campestris, and A. steynii, respectively, and novofumigatonin, ent-cycloechinulin, and epi-aszonalenins in A. novofumigatus. Our study delivers six fungal genomes, showing the large diversity found in the Aspergillus genus; highlights the potential for discovery of beneficial or harmful SMs; and supports reports of A. novofumigatus pathogenicity. It also shows how biological, biochemical, and genomic information can be combined to identify genes involved in the biosynthesis of specific SMs.
DOI: 10.1111/nph.15297
2018
Cited 115 times
Genome‐wide association studies and expression‐based quantitative trait loci analyses reveal roles of HCT2 in caffeoylquinic acid biosynthesis and its regulation by defense‐responsive transcription factors in Populus
3-O-caffeoylquinic acid, also known as chlorogenic acid (CGA), functions as an intermediate in lignin biosynthesis in the phenylpropanoid pathway. It is widely distributed among numerous plant species and acts as an antioxidant in both plants and animals. Using GC-MS, we discovered consistent and extreme variation in CGA content across a population of 739 4-yr-old Populus trichocarpa accessions. We performed genome-wide association studies (GWAS) from 917 P. trichocarpa accessions and expression-based quantitative trait loci (eQTL) analyses to identify key regulators. The GWAS and eQTL analyses resolved an overlapped interval encompassing a hydroxycinnamoyl-CoA:shikimate hydroxycinnamoyl transferase 2 (PtHCT2) that was significantly associated with CGA and partially characterized metabolite abundances. PtHCT2 leaf expression was significantly correlated with CGA abundance and it was regulated by cis-eQTLs containing W-box for WRKY binding. Among all nine PtHCT homologs, PtHCT2 is the only one that responds to infection by the fungal pathogen Sphaerulina musiva (a Populus pathogen). Validation using protoplast-based transient expression system suggests that PtHCT2 is regulated by the defense-responsive WRKY. These results are consistent with reports of CGA functioning as an antioxidant in response to biotic stress. This study provides insights into data-driven and omics-based inference of gene function in woody species.
DOI: 10.7554/elife.36426
2018
Cited 115 times
The Aquilegia genome provides insight into adaptive radiation and reveals an extraordinarily polymorphic chromosome with a unique history
The columbine genus Aquilegia is a classic example of an adaptive radiation, involving a wide variety of pollinators and habitats. Here we present the genome assembly of A. coerulea ‘Goldsmith’, complemented by high-coverage sequencing data from 10 wild species covering the world-wide distribution. Our analyses reveal extensive allele sharing among species and demonstrate that introgression and selection played a role in the Aquilegia radiation. We also present the remarkable discovery that the evolutionary history of an entire chromosome differs from that of the rest of the genome – a phenomenon that we do not fully understand, but which highlights the need to consider chromosomes in an evolutionary context.
DOI: 10.1016/j.ijfoodmicro.2012.05.008
2012
Cited 106 times
The genome of wine yeast Dekkera bruxellensis provides a tool to explore its food-related properties
The yeast Dekkera/Brettanomyces bruxellensis can cause enormous economic losses in wine industry due to production of phenolic off-flavor compounds. D. bruxellensis is a distant relative of baker's yeast Saccharomyces cerevisiae. Nevertheless, these two yeasts are often found in the same habitats and share several food-related traits, such as production of high ethanol levels and ability to grow without oxygen. In some food products, like lambic beer, D. bruxellensis can importantly contribute to flavor development. We determined the 13.4 Mb genome sequence of the D. bruxellensis strain Y879 (CBS2499) and deduced the genetic background of several "food-relevant" properties and evolutionary history of this yeast. Surprisingly, we find that this yeast is phylogenetically distant to other food-related yeasts and most related to Pichia (Komagataella) pastoris, which is an aerobic poor ethanol producer. We further show that the D. bruxellensis genome does not contain an excess of lineage specific duplicated genes nor a horizontally transferred URA1 gene, two crucial events that promoted the evolution of the food relevant traits in the S. cerevisiae lineage. However, D. bruxellensis has several independently duplicated ADH and ADH-like genes, which are likely responsible for metabolism of alcohols, including ethanol, and also a range of aromatic compounds.
DOI: 10.1038/s41559-017-0119
2017
Cited 103 times
Young inversion with multiple linked QTLs under selection in a hybrid zone
Fixed chromosomal inversions can reduce gene flow and promote speciation in two ways: by suppressing recombination and by carrying locally favoured alleles at multiple loci. However, it is unknown whether favoured mutations slowly accumulate on older inversions or if young inversions spread because they capture pre-existing adaptive quantitative trait loci (QTLs). By genetic mapping, chromosome painting and genome sequencing, we have identified a major inversion controlling ecologically important traits in Boechera stricta. The inversion arose since the last glaciation and subsequently reached local high frequency in a hybrid speciation zone. Furthermore, the inversion shows signs of positive directional selection. To test whether the inversion could have captured existing, linked QTLs, we crossed standard, collinear haplotypes from the hybrid zone and found multiple linked phenology QTLs within the inversion region. These findings provide the first direct evidence that linked, locally adapted QTLs may be captured by young inversions during incipient speciation.
DOI: 10.1186/1471-2164-12-334
2011
Cited 100 times
Complete genome sequence of the filamentous anoxygenic phototrophic bacterium Chloroflexus aurantiacus
Chloroflexus aurantiacus is a thermophilic filamentous anoxygenic phototrophic (FAP) bacterium, and can grow phototrophically under anaerobic conditions or chemotrophically under aerobic and dark conditions. According to 16S rRNA analysis, Chloroflexi species are the earliest branching bacteria capable of photosynthesis, and Cfl. aurantiacus has been long regarded as a key organism to resolve the obscurity of the origin and early evolution of photosynthesis. Cfl. aurantiacus contains a chimeric photosystem that comprises some characters of green sulfur bacteria and purple photosynthetic bacteria, and also has some unique electron transport proteins compared to other photosynthetic bacteria. The complete genomic sequence of Cfl. aurantiacus has been determined, analyzed and compared to the genomes of other photosynthetic bacteria. Abundant genomic evidence suggests that there have been numerous gene adaptations/replacements in Cfl. aurantiacus to facilitate life under both anaerobic and aerobic conditions, including duplicate genes and gene clusters for the alternative complex III (ACIII), auracyanin and NADH:quinone oxidoreductase; and several aerobic/anaerobic enzyme pairs in central carbon metabolism and tetrapyrroles and nucleic acids biosynthesis. Overall, genomic information is consistent with a high tolerance for oxygen that has been reported in the growth of Cfl. aurantiacus. Genes for the chimeric photosystem, photosynthetic electron transport chain, the 3-hydroxypropionate autotrophic carbon fixation cycle, CO2-anaplerotic pathways, glyoxylate cycle, and sulfur reduction pathway are present. The central carbon metabolism and sulfur assimilation pathways in Cfl. aurantiacus are discussed. Some features of the Cfl. aurantiacus genome are compared with those of the Roseiflexus castenholzii genome. Roseiflexus castenholzii is a recently characterized FAP bacterium and phylogenetically closely related to Cfl. aurantiacus. According to previous reports and the genomic information, perspectives of Cfl. aurantiacus in the evolution of photosynthesis are also discussed. The genomic analyses presented in this report, along with previous physiological, ecological and biochemical studies, indicate that the anoxygenic phototroph Cfl. aurantiacus has many interesting and certain unique features in its metabolic pathways. The complete genome may also shed light on possible evolutionary connections of photosynthesis.
DOI: 10.1038/s41559-018-0710-4
2018
Cited 97 times
Pezizomycetes genomes reveal the molecular basis of ectomycorrhizal truffle lifestyle
Tuberaceae is one of the most diverse lineages of symbiotic truffle-forming fungi. To understand the molecular underpinning of the ectomycorrhizal truffle lifestyle, we compared the genomes of Piedmont white truffle (Tuber magnatum), Périgord black truffle (Tuber melanosporum), Burgundy truffle (Tuber aestivum), pig truffle (Choiromyces venosus) and desert truffle (Terfezia boudieri) to saprotrophic Pezizomycetes. Reconstructed gene duplication/loss histories along a time-calibrated phylogeny of Ascomycetes revealed that Tuberaceae-specific traits may be related to a higher gene diversification rate. Genomic features in Tuber species appear to be very similar, with high transposon content, few genes coding lignocellulose-degrading enzymes, a substantial set of lineage-specific fruiting-body-upregulated genes and high expression of genes involved in volatile organic compound metabolism. Developmental and metabolic pathways expressed in ectomycorrhizae and fruiting bodies of T. magnatum and T. melanosporum are unexpectedly very similar, owing to the fact that they diverged ~100 Ma. Volatile organic compounds from pungent truffle odours are not the products of Tuber-specific gene innovations, but rely on the differential expression of an existing gene repertoire. These genomic resources will help to address fundamental questions in the evolution of the truffle lifestyle and the ecology of fungi that have been praised as food delicacies for centuries.
DOI: 10.1128/aem.03833-12
2013
Cited 96 times
Leucoagaricus gongylophorus Produces Diverse Enzymes for the Degradation of Recalcitrant Plant Polymers in Leaf-Cutter Ant Fungus Gardens
Plants represent a large reservoir of organic carbon comprised primarily of recalcitrant polymers that most metazoans are unable to deconstruct. Many herbivores gain access to nutrients in this material indirectly by associating with microbial symbionts, and leaf-cutter ants are a paradigmatic example. These ants use fresh foliar biomass as manure to cultivate gardens composed primarily of Leucoagaricus gongylophorus, a basidiomycetous fungus that produces specialized hyphal swellings that serve as a food source for the host ant colony. Although leaf-cutter ants are conspicuous herbivores that contribute substantially to carbon turnover in Neotropical ecosystems, the process through which plant biomass is degraded in their fungus gardens is not well understood. Here we present the first draft genome of L. gongylophorus, and, using genomic and metaproteomic tools, we investigate its role in lignocellulose degradation in the gardens of both Atta cephalotes and Acromyrmex echinatior leaf-cutter ants. We show that L. gongylophorus produces a diversity of lignocellulases in ant gardens and is likely the primary driver of plant biomass degradation in these ecosystems. We also show that this fungus produces distinct sets of lignocellulases throughout the different stages of biomass degradation, including numerous cellulases and laccases that likely play an important role in lignocellulose degradation. Our study provides a detailed analysis of plant biomass degradation in leaf-cutter ant fungus gardens and insight into the enzymes underlying the symbiosis between these dominant herbivores and their obligate fungal cultivar.
DOI: 10.1371/journal.pgen.1004759
2014
Cited 93 times
Analysis of the Phlebiopsis gigantea Genome, Transcriptome and Secretome Provides Insight into Its Pioneer Colonization Strategies of Wood
Collectively classified as white-rot fungi, certain basidiomycetes efficiently degrade the major structural polymers of wood cell walls. A small subset of these Agaricomycetes, exemplified by Phlebiopsis gigantea, is capable of colonizing freshly exposed conifer sapwood despite its high content of extractives, which retards the establishment of other fungal species. The mechanism(s) by which P. gigantea tolerates and metabolizes resinous compounds have not been explored. Here, we report the annotated P. gigantea genome and compare profiles of its transcriptome and secretome when cultured on fresh-cut versus solvent-extracted loblolly pine wood. The P. gigantea genome contains a conventional repertoire of hydrolase genes involved in cellulose/hemicellulose degradation, whose patterns of expression were relatively unperturbed by the absence of extractives. The expression of genes typically ascribed to lignin degradation was also largely unaffected. In contrast, genes likely involved in the transformation and detoxification of wood extractives were highly induced in its presence. Their products included an ABC transporter, lipases, cytochrome P450s, glutathione S-transferase and aldehyde dehydrogenase. Other regulated genes of unknown function and several constitutively expressed genes are also likely involved in P. gigantea's extractives metabolism. These results contribute to our fundamental understanding of pioneer colonization of conifer wood and provide insight into the diverse chemistries employed by fungi in carbon cycling processes.
DOI: 10.1038/s41467-018-07669-x
2018
Cited 90 times
The genomic landscape of molecular responses to natural drought stress in Panicum hallii
Environmental stress is a major driver of ecological community dynamics and agricultural productivity. This is especially true for soil water availability, because drought is the greatest abiotic inhibitor of worldwide crop yields. Here, we test the genetic basis of drought responses in the genetic model for C4 perennial grasses, Panicum hallii, through population genomics, field-scale gene-expression (eQTL) analysis, and comparison of two complete genomes. While gene expression networks are dominated by local cis-regulatory elements, we observe three genomic hotspots of unlinked trans-regulatory loci. These regulatory hubs are four times more drought responsive than the genome-wide average. Additionally, cis- and trans-regulatory networks are more likely to have opposing effects than expected under neutral evolution, supporting a strong influence of compensatory evolution and stabilizing selection. These results implicate trans-regulatory evolution as a driver of drought responses and demonstrate the potential for crop improvement in drought-prone regions through modification of gene regulatory networks.
DOI: 10.1111/tpj.13940
2018
Cited 89 times
The Physcomitrella patens gene atlas project: large‐scale RNA‐seq based expression data
High‐throughput RNA sequencing (RNA‐seq) has recently become the method of choice to define and analyze transcriptomes. For the model moss Physcomitrella patens, although this method has been used to help analyze specific perturbations, no overall reference dataset has yet been established. In the framework of the Gene Atlas project, the Joint Genome Institute selected P. patens as a flagship genome, opening the way to generate the first comprehensive transcriptome dataset for this moss. The first round of sequencing described here is composed of 99 independent libraries spanning 34 different developmental stages and conditions. Upon dataset quality control and processing through read mapping, 28 509 of the 34 361 v3.3 gene models (83%) were detected to be expressed across the samples. Differentially expressed genes (DEGs) were calculated across the dataset to permit perturbation comparisons between conditions. The analysis of the three most distinct and abundant P. patens growth stages – protonema, gametophore and sporophyte – allowed us to define both general transcriptional patterns and stage‐specific transcripts. As an example of variation of physico‐chemical growth conditions, we detail here the impact of ammonium supplementation under standard growth conditions on the protonemal transcriptome. Finally, the cooperative nature of this project allowed us to analyze inter‐laboratory variation, as 13 different laboratories around the world provided samples. We compare differences in the replication of experiments in a single laboratory and between different laboratories.
DOI: 10.1371/journal.pone.0141586
2015
Cited 88 times
Strand-Specific RNA-Seq Analyses of Fruiting Body Development in Coprinopsis cinerea
The basidiomycete fungus Coprinopsis cinerea is an important model system for multicellular development. Fruiting bodies of C. cinerea are typical mushrooms, which can be produced synchronously on defined media in the laboratory. To investigate the transcriptome in detail during fruiting body development, high-throughput sequencing (RNA-seq) was performed using cDNA libraries strand-specifically constructed from 13 points (stages/tissues) with two biological replicates. The reads were aligned to 14,245 predicted transcripts, and counted for forward and reverse transcripts. Differentially expressed genes (DEGs) between two adjacent points and between vegetative mycelium and each point were detected by Tag Count Comparison (TCC). To validate RNA-seq data, expression levels of selected genes were compared using RPKM values in RNA-seq data and qRT-PCR data, and DEGs detected in microarray data were examined in MA plots of RNA-seq data by TCC. We discuss events deduced from GO analysis of DEGs. In addition, we uncovered both transcription factor candidates and antisense transcripts that are likely to be involved in developmental regulation for fruiting.
DOI: 10.1186/s13059-020-1952-4
2020
Cited 83 times
A willow sex chromosome reveals convergent evolution of complex palindromic repeats
Background: Sex chromosomes have arisen independently in a wide variety of species, yet they share common characteristics, including the presence of suppressed recombination surrounding sex determination loci. Mammalian sex chromosomes contain multiple palindromic repeats across the non-recombining region that show sequence conservation through gene conversion and contain genes that are crucial for sexual reproduction. In plants, it is not clear if palindromic repeats play a role in maintaining sequence conservation in the absence of homologous recombination. Results: Here we present the first evidence of large palindromic structures in a plant sex chromosome, based on a highly contiguous assembly of the W chromosome of the dioecious shrub Salix purpurea. The W chromosome has an expanded number of genes due to transpositions from autosomes. It also contains two consecutive palindromes that span a region of 200 kb, with conspicuous 20-kb stretches of highly conserved sequences among the four arms that show evidence of gene conversion. Four genes in the palindrome are homologous to genes in the sex determination regions of the closely related genus Populus, which is located on a different chromosome. These genes show distinct, floral-biased expression patterns compared to paralogous copies on autosomes. Conclusion: The presence of palindromes in sex chromosomes of mammals and plants highlights the intrinsic importance of these features in adaptive evolution in the absence of recombination. Convergent evolution is driving both the independent establishment of sex chromosomes as well as their fine-scale sequence structure.
DOI: 10.1128/genomea.01105-17
2017
Cited 81 times
Draft Nuclear Genome Sequence of the Halophilic and Beta-Carotene-Accumulating Green Alga Dunaliella salina Strain CCAP19/18
The halotolerant alga Dunaliella salina is a model for stress tolerance and is used commercially for production of beta-carotene (=pro-vitamin A). The presented draft genome of the genuine strain CCAP19/18 will allow investigations into metabolic processes involved in regulation of stress responses, including carotenogenesis and adaptations to life in high-salinity environments.
DOI: 10.1016/j.molp.2016.03.009
2016
Cited 78 times
Genome-Wide Sequencing of 41 Rice (Oryza sativa L.) Mutated Lines Reveals Diverse Mutations Induced by Fast-Neutron Irradiation
Fast-neutron (FN) irradiation has been used to create mutagenized collections of many plant species (Bolon et al., 2014). FN-induced mutagenesis has clear advantages: it is an efficient means of saturating the genome, and it does not involve time-consuming plant transformation or tissue culture. In rice, most mutant collections, although highly valuable, were generated using either T-DNA insertion or transposon tagging approaches that often induce mutations unlinked to the insertion and complicate analysis (Wang et al., 2013).
DOI: 10.1186/s12870-019-1653-x
2019
Cited 77 times
Genome-wide association analysis of stalk biomass and anatomical traits in maize
Maize stover is an important source of crop residues and a promising sustainable energy source in the United States. Stalk is the main component of stover, representing about half of stover dry weight. Characterization of genetic determinants of stalk traits provides a foundation to optimize maize stover as a biofuel feedstock. We investigated maize natural genetic variation in genome-wide association studies (GWAS) to detect candidate genes associated with traits related to stalk biomass (stalk diameter and plant height) and stalk anatomy (rind thickness, vascular bundle density and area). Using a panel of 942 diverse inbred lines, 899,784 RNA-Seq derived single nucleotide polymorphism (SNP) markers were identified. Stalk traits were measured on 800 members of the panel in replicated field trials across years. GWAS revealed 16 candidate genes associated with four stalk traits. Most of the detected candidate genes were involved in fundamental cellular functions, such as regulation of gene expression and cell cycle progression. Two of the regulatory genes (Zmm22 and an ortholog of Fpa) that were associated with plant height were previously shown to be involved in regulating the vegetative to floral transition. The association of Zmm22 with plant height was confirmed using a transgenic approach. Transgenic lines with increased expression of Zmm22 showed a significant decrease in plant height as well as tassel branch number, indicating a pleiotropic effect of Zmm22. Substantial heritable variation was observed in the association panel for stalk traits, indicating a large potential for improving useful stalk traits in breeding programs. Genome-wide association analyses detected several candidate genes associated with multiple traits, suggesting common regulatory elements underlie various stalk traits. Results of this study provide insights into the genetic control of maize stalk anatomy and biomass.
DOI: 10.1111/nph.15613
2019
Cited 75 times
Phylogenomics of Endogonaceae and evolution of mycorrhizas within Mucoromycota
Endogonales (Mucoromycotina), composed of Endogonaceae and Densosporaceae, is the only known non-Dikarya order with ectomycorrhizal members. They also form mycorrhizal-like associations with some nonspermatophyte plants. It has been recently proposed that Endogonales were among the earliest mycorrhizal partners with land plants. It remains unknown whether Endogonales possess genomes with mycorrhizal-lifestyle signatures and whether Endogonales originated around the same time as land plants did. We sampled sporocarp tissue from four Endogonaceae collections and performed shotgun genome sequencing. After binning the metagenome data, we assembled and annotated the Endogonaceae genomes. We performed comparative analysis on plant-cell-wall-degrading enzymes (PCWDEs) and small secreted proteins (SSPs). We inferred phylogenetic placement of Endogonaceae and estimated the ages of Endogonaceae and Endogonales with expanded taxon sampling. Endogonaceae have large genomes with high repeat content, low diversity of PCWDEs, but without elevated SSP/secretome ratios. Dating analysis estimated that Endogonaceae originated in the Permian-Triassic boundary and Endogonales originated in the mid-late Silurian. Mycoplasma-related endobacterium sequences were identified in three Endogonaceae genomes. Endogonaceae genomes possess typical signatures of mycorrhizal lifestyle. The early origin of Endogonales suggests that the mycorrhizal association between Endogonales and plants might have played an important role during the colonization of land by plants.
DOI: 10.1073/pnas.1821543116
2019
Cited 75 times
QTL × environment interactions underlie adaptive divergence in switchgrass across a large latitudinal gradient
Local adaptation is the process by which natural selection drives adaptive phenotypic divergence across environmental gradients. Theory suggests that local adaptation results from genetic trade-offs at individual genetic loci, where adaptation to one set of environmental conditions results in a cost to fitness in alternative environments. However, the degree to which there are costs associated with local adaptation is poorly understood because most of these experiments rely on two-site reciprocal transplant experiments. Here, we quantify the benefits and costs of locally adaptive loci across 17° of latitude in a four-grandparent outbred mapping population in outcrossing switchgrass (Panicum virgatum L.), an emerging biofuel crop and dominant tallgrass species. We conducted quantitative trait locus (QTL) mapping across 10 sites, ranging from Texas to South Dakota. This analysis revealed that beneficial biomass (fitness) QTL generally incur minimal costs when transplanted to other field sites distributed over a large climatic gradient over the 2 y of our study. Therefore, locally advantageous alleles could potentially be combined across multiple loci through breeding to create high-yielding regionally adapted cultivars.
DOI: 10.1186/s13059-020-02162-5
2020
Cited 74 times
A genome assembly and the somatic genetic and epigenetic mutation rate in a wild long-lived perennial Populus trichocarpa
Plants can transmit somatic mutations and epimutations to offspring, which in turn can affect fitness. Knowledge of the rate at which these variations arise is necessary to understand how plant development contributes to local adaption in an ecoevolutionary context, particularly in long-lived perennials. Here, we generate a new high-quality reference genome from the oldest branch of a wild Populus trichocarpa tree with two dominant stems which have been evolving independently for 330 years. By sampling multiple, age-estimated branches of this tree, we use a multi-omics approach to quantify age-related somatic changes at the genetic, epigenetic, and transcriptional level. We show that the per-year somatic mutation and epimutation rates are lower than in annuals and that transcriptional variation is mainly independent of age divergence and cytosine methylation. Furthermore, a detailed analysis of the somatic epimutation spectrum indicates that transgenerationally heritable epimutations originate mainly from DNA methylation maintenance errors during mitotic rather than during meiotic cell divisions. Taken together, our study provides unprecedented insights into the origin of nucleotide and functional variation in a long-lived perennial plant.
DOI: 10.1038/s41467-020-18923-6
2020
Cited 73 times
Genome biology of the paleotetraploid perennial biomass crop Miscanthus
Miscanthus is a perennial wild grass that is of global importance for paper production, roofing, horticultural plantings, and an emerging highly productive temperate biomass crop. We report a chromosome-scale assembly of the paleotetraploid M. sinensis genome, providing a resource for Miscanthus that links its chromosomes to the related diploid Sorghum and complex polyploid sugarcanes. The asymmetric distribution of transposons across the two homoeologous subgenomes proves Miscanthus paleo-allotetraploidy and identifies several balanced reciprocal homoeologous exchanges. Analysis of M. sinensis and M. sacchariflorus populations demonstrates extensive interspecific admixture and hybridization, and documents the origin of the highly productive triploid bioenergy crop M. × giganteus. Transcriptional profiling of leaves, stem, and rhizomes over growing seasons provides insight into rhizome development and nutrient recycling, processes critical for sustainable biomass accumulation in a perennial temperate grass. The Miscanthus genome expands the power of comparative genomics to understand traits of importance to Andropogoneae grasses.
DOI: 10.1371/journal.pgen.1007267
2018
Cited 71 times
Preferential retention of genes from one parental genome after polyploidy illustrates the nature and scope of the genomic conflicts induced by hybridization
Polyploidy is increasingly seen as a driver of both evolutionary innovation and ecological success. One source of polyploid organisms' successes may be their origins in the merging and mixing of genomes from two different species (e.g., allopolyploidy). Using POInT (the Polyploid Orthology Inference Tool), we model the resolution of three allopolyploidy events, one from the bakers' yeast (Saccharomyces cerevisiae), one from the thale cress (Arabidopsis thaliana) and one from grasses including Sorghum bicolor. Analyzing a total of 21 genomes, we assign to every gene a probability for having come from each parental subgenome (i.e., derived from the diploid progenitor species), yielding orthologous segments across all genomes. Our model detects statistically robust evidence for the existence of biased fractionation in all three lineages, whereby genes from one of the two subgenomes were more likely to be lost than those from the other subgenome. We further find that a driver of this pattern of biased losses is the co-retention of genes from the same parental genome that share functional interactions. The pattern of biased fractionation after the Arabidopsis and grass allopolyploid events was surprisingly constant in time, with the same parental genome favored throughout the lineages' history. In strong contrast, the yeast allopolyploid event shows evidence of biased fractionation only immediately after the event, with balanced gene losses more recently. The rapid loss of functionally associated genes from a single subgenome is difficult to reconcile with the action of genetic drift and suggests that selection may favor the removal of specific duplicates. Coupled to the evidence for continuing, functionally-associated biased fractionation after the A. thaliana At-α event, we suggest that, after allopolyploidy, there are functional conflicts between interacting genes encoded in different subgenomes that are ultimately resolved through preferential duplicate loss.
DOI: 10.1038/s41467-020-17302-5
2020
Cited 71 times
Gradual polyploid genome evolution revealed by pan-genomic analysis of Brachypodium hybridum and its diploid progenitors
Our understanding of polyploid genome evolution is constrained because we cannot know the exact founders of a particular polyploid. To differentiate between founder effects and post polyploidization evolution, we use a pan-genomic approach to study the allotetraploid Brachypodium hybridum and its diploid progenitors. Comparative analysis suggests that most B. hybridum whole gene presence/absence variation is part of the standing variation in its diploid progenitors. Analysis of nuclear single nucleotide variants, plastomes and k-mers associated with retrotransposons reveals two independent origins for B. hybridum, ~1.4 and ~0.14 million years ago. Examination of gene expression in the younger B. hybridum lineage reveals no bias in overall subgenome expression. Our results are consistent with a gradual accumulation of genomic changes after polyploidization and a lack of subgenome expression dominance. Significantly, if we did not use a pan-genomic approach, we would grossly overestimate the number of genomic changes attributable to post polyploidization evolution.
DOI: 10.1111/tpj.14607
2020
Cited 66 times
PEATmoss (<i>Physcomitrella</i> Expression Atlas Tool): a unified gene expression atlas for the model plant <i>Physcomitrella patens</i>
Physcomitrella patens is a bryophyte model plant that is often used to study plant evolution and development. Its resources are of great importance for comparative genomics and evo‐devo approaches. However, expression data from Physcomitrella patens have so far been generated using different gene annotation versions and three different platforms: CombiMatrix and NimbleGen expression microarrays and RNA sequencing. The currently available P. patens expression data are distributed across three tools with different visualization methods to access the data. Here, we introduce an interactive expression atlas, Physcomitrella Expression Atlas Tool (PEATmoss), that unifies publicly available expression data for P. patens and provides multiple visualization methods to query the data in a single web‐based tool. Moreover, PEATmoss includes 35 expression experiments not previously available in any other expression atlas. To facilitate gene expression queries across different gene annotation versions, and to access P. patens annotations and related resources, a lookup database and web tool linked to PEATmoss were implemented. PEATmoss can be accessed at https://peatmoss.online.uni-marburg.de
DOI: 10.1007/s13225-020-00455-5
2020
Cited 66 times
Resolving the Mortierellaceae phylogeny through synthesis of multi-gene phylogenetics and phylogenomics
Early efforts to classify Mortierellaceae were based on macro- and micromorphology, but sequencing and phylogenetic studies with ribosomal DNA (rDNA) markers have demonstrated conflicting taxonomic groupings and polyphyletic genera. Although some taxonomic confusion in the family has been clarified, rDNA data alone is unable to resolve higher level phylogenetic relationships within Mortierellaceae. In this study, we applied two parallel approaches to resolve the Mortierellaceae phylogeny: low coverage genome (LCG) sequencing and high-throughput, multiplexed targeted amplicon sequencing to generate sequence data for multi-gene phylogenetics. We then combined our datasets to provide a well-supported genome-based phylogeny having broad sampling depth from the amplicon dataset. Resolving the Mortierellaceae phylogeny into monophyletic genera resulted in 13 genera, 7 of which are newly proposed. Low-coverage genome sequencing proved to be a relatively cost-effective means of generating a high-confidence phylogeny. The multi-gene phylogenetics approach enabled much greater sampling depth and breadth than the LCG approach, but has limitations too. We present this work to resolve some of the taxonomic confusion and provide a genus-level framework to empower future studies on Mortierellaceae diversity and evolution.
DOI: 10.1038/s41467-021-27479-y
2021
Cited 62 times
Genetic determinants of endophytism in the Arabidopsis root mycobiome
The roots of Arabidopsis thaliana host diverse fungal communities that affect plant health and disease states. Here, we sequence the genomes of 41 fungal isolates representative of the A. thaliana root mycobiota for comparative analysis with 79 other plant-associated fungi. Our analyses indicate that root mycobiota members evolved from ancestors with diverse lifestyles and retain large repertoires of plant cell wall-degrading enzymes (PCWDEs) and effector-like small secreted proteins. We identify a set of 84 gene families associated with endophytism, including genes encoding PCWDEs acting on xylan (family GH10) and cellulose (family AA9). Transcripts encoding these enzymes are also part of a conserved transcriptional program activated by phylogenetically distant mycobiota members upon host contact. Recolonization experiments with individual fungi indicate that strains with detrimental effects in mono-association with the host colonize roots more aggressively than those with beneficial activities, and dominate in natural root samples. Furthermore, we show that the pectin-degrading enzyme family PL1_7 links aggressiveness of endophytic colonization to plant health.
DOI: 10.1186/s12864-019-6262-4
2019
Cited 60 times
Genome sequence of the model rice variety KitaakeX
Background: The availability of thousands of complete rice genome sequences from diverse varieties and accessions has laid the foundation for in-depth exploration of the rice genome. One drawback to these collections is that most of these rice varieties have long life cycles and/or low transformation efficiencies, which limits their usefulness as model organisms for functional genomics studies. In contrast, the rice variety Kitaake has a rapid life cycle (9 weeks seed to seed) and is easy to transform and propagate. For these reasons, Kitaake has emerged as a model for studies of diverse monocotyledonous species. Results: Here, we report the de novo genome sequencing and analysis of Oryza sativa ssp. japonica variety KitaakeX, a Kitaake plant carrying the rice XA21 immune receptor. Our KitaakeX sequence assembly contains 377.6 Mb, consisting of 33 scaffolds (476 contigs) with a contig N50 of 1.4 Mb. Complementing the assembly are detailed gene annotations of 35,594 protein-coding genes. We identified 331,335 genomic variations between KitaakeX and Nipponbare (ssp. japonica), and 2,785,991 variations between KitaakeX and Zhenshan97 (ssp. indica). We also compared Kitaake resequencing reads to the KitaakeX assembly and identified 219 small variations. The high-quality genome of the model rice plant KitaakeX will accelerate rice functional genomics. Conclusions: The high-quality, de novo assembly of the KitaakeX genome will serve as a useful reference genome for rice and will accelerate functional genomics studies of rice and other species.
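The contig N50 quoted in the abstract above is a standard contiguity statistic: the length L such that contigs of length L or longer together contain at least half of all assembled bases. A minimal sketch of the calculation follows; the function name and the toy contig lengths are illustrative and are not taken from the KitaakeX assembly.

```python
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Return the N50 of a set of contig (or scaffold) lengths."""
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("need at least one contig length")
    total = sum(ordered)
    running = 0
    for length in ordered:
        running += length
        # Stop at the first contig that brings the cumulative length
        # to at least half of the total assembly size.
        if 2 * running >= total:
            return length
    raise AssertionError("unreachable for non-empty input")


if __name__ == "__main__":
    # Toy lengths in base pairs, for illustration only.
    print(n50([2, 3, 4, 5, 6, 7, 8, 9, 10]))
```

Because N50 depends only on the length distribution, it measures contiguity, not correctness; an assembly with misjoined contigs can still report a high N50.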
DOI: 10.1126/sciadv.abh2488
2021
Cited 55 times
Gene-rich UV sex chromosomes harbor conserved regulators of sexual development
Nonrecombining sex chromosomes, like the mammalian Y, often lose genes and accumulate transposable elements, a process termed degeneration. The correlation between suppressed recombination and degeneration is clear in animal XY systems, but the absence of recombination is confounded with other asymmetries between the X and Y. In contrast, UV sex chromosomes, like those found in bryophytes, experience symmetrical population genetic conditions. Here, we generate nearly gapless female and male chromosome-scale reference genomes of the moss Ceratodon purpureus to test for degeneration in the bryophyte UV sex chromosomes. We show that the moss sex chromosomes evolved over 300 million years ago and expanded via two chromosomal fusions. Although the sex chromosomes exhibit weaker purifying selection than autosomes, we find that suppressed recombination alone is insufficient to drive degeneration. Instead, the U and V sex chromosomes harbor thousands of broadly expressed genes, including numerous key regulators of sexual development across land plants.
DOI: 10.1111/1462-2920.15423
2021
Cited 47 times
Gene family expansions and transcriptome signatures uncover fungal adaptations to wood decay
Because they comprise some of the most efficient wood‐decayers, Polyporales fungi impact carbon cycling in forest environments. Despite continuous discoveries on the enzymatic machinery involved in wood decomposition, the picture of their evolutionary adaptation to wood decay and genome diversity remains incomplete. We combined the genome sequence information from 50 Polyporales species, including 26 newly sequenced genomes, and searched for genomic and functional adaptations to wood decay through the analysis of genome composition and transcriptome responses to different carbon sources. The genomes of Polyporales from different phylogenetic clades showed poor conservation in macrosynteny, indicative of genome rearrangements. We observed different gene family expansion/contraction histories for plant cell wall degrading enzymes in core polyporoids and phlebioids and captured expansions for genes involved in signalling and regulation in the lineages of white rotters. Furthermore, we identified conserved cupredoxins, thaumatin‐like proteins and lytic polysaccharide monooxygenases with a yet uncharacterized appended module as new candidate players in wood decomposition. Given the current need for enzymatic toolkits dedicated to the transformation of renewable carbon sources, the observed genomic diversity among Polyporales strengthens the relevance of mining Polyporales biodiversity to understand the molecular mechanisms of wood decay.
DOI: 10.1111/nph.17160
2021
Cited 42 times
Comparative genomics reveals dynamic genome evolution in host specialist ectomycorrhizal fungi
While there has been significant progress characterizing the 'symbiotic toolkit' of ectomycorrhizal (ECM) fungi, how host specificity may be encoded into ECM fungal genomes remains poorly understood. We conducted a comparative genomic analysis of ECM fungal host specialists and generalists, focusing on the specialist genus Suillus. Global analyses of genome dynamics across 46 species were assessed, along with targeted analyses of three classes of molecules previously identified as important determinants of host specificity: small secreted proteins (SSPs), secondary metabolites (SMs) and G-protein coupled receptors (GPCRs). Relative to other ECM fungi, including other host specialists, Suillus had highly dynamic genomes including numerous rapidly evolving gene families and many domain expansions and contractions. Targeted analyses supported a role for SMs but not SSPs or GPCRs in Suillus host specificity. Phylogenomic-based ancestral state reconstruction identified Larix as the ancestral host of Suillus, with multiple independent switches between white and red pine hosts. These results suggest that like other defining characteristics of the ECM lifestyle, host specificity is a dynamic process at the genome level. In the case of Suillus, both SMs and pathways involved in the deactivation of reactive oxygen species appear to be strongly associated with enhanced host specificity.
DOI: 10.1111/nph.17892
2022
Cited 25 times
Evolutionary transition to the ectomycorrhizal habit in the genomes of a hyperdiverse lineage of mushroom‐forming fungi
The ectomycorrhizal (ECM) symbiosis has independently evolved from diverse types of saprotrophic ancestors. In this study, we seek to identify genomic signatures of the transition to the ECM habit within the hyperdiverse Russulaceae. We present comparative analyses of the genomic architecture and the total and secreted gene repertoires of 18 species across the order Russulales, of which 13 are newly sequenced, including a representative of a saprotrophic member of Russulaceae, Gloeopeniophorella convolvens. The genomes of ECM Russulaceae are characterized by a loss of genes for plant cell wall-degrading enzymes (PCWDEs), an expansion of genome size through increased transposable element (TE) content, a reduction in secondary metabolism clusters, and an association of small secreted proteins (SSPs) with TE 'nests', or dense aggregations of TEs. Some PCWDEs have been retained or even expanded, mostly in a species-specific manner. The genome of G. convolvens possesses some characteristics of ECM genomes (e.g. loss of some PCWDEs, TE expansion, reduction in secondary metabolism clusters). Functional specialization in ECM decomposition may drive diversification. Accelerated gene evolution predates the evolution of the ECM habit, indicating that changes in genome architecture and gene content may be necessary to prime the evolutionary switch.
DOI: 10.1093/nar/gkad616
2023
Cited 13 times
JGI Plant Gene Atlas: an updateable transcriptome resource to improve functional gene descriptions across the plant kingdom
Gene functional descriptions offer a crucial line of evidence for candidate genes underlying trait variation. Conversely, plant responses to environmental cues represent important resources to decipher gene function and subsequently provide molecular targets for plant improvement through gene editing. However, biological roles of large proportions of genes across the plant phylogeny are poorly annotated. Here we describe the Joint Genome Institute (JGI) Plant Gene Atlas, an updateable data resource consisting of transcript abundance assays spanning 18 diverse species. To integrate across these diverse genotypes, we analyzed expression profiles, built gene clusters that exhibited tissue/condition-specific expression, and tested for transcriptional response to environmental cues. We discovered extensive phylogenetically constrained and condition-specific expression profiles for genes without any previously documented functional annotation. Such conserved expression patterns and tightly co-expressed gene clusters let us assign expression-derived additional biological information to 64,495 genes with otherwise unknown functions. The ever-expanding Gene Atlas resource is available at JGI Plant Gene Atlas (https://plantgeneatlas.jgi.doe.gov) and Phytozome (https://phytozome.jgi.doe.gov/), providing bulk access to data and user-specified queries of gene sets. Combined, these web interfaces let users access differentially expressed genes, track orthologs across the Gene Atlas plants, graphically represent co-expressed genes, and visualize gene ontology and pathway enrichments.
DOI: 10.1038/s41564-023-01448-1
2023
Cited 10 times
Vertical and horizontal gene transfer shaped plant colonization and biomass degradation in the fungal genus Armillaria
The fungal genus Armillaria contains necrotrophic pathogens and some of the largest terrestrial organisms that cause tremendous losses in diverse ecosystems, yet how they evolved pathogenicity in a clade of dominantly non-pathogenic wood degraders remains elusive. Here we show that Armillaria species, in addition to gene duplications and de novo gene origins, acquired at least 1,025 genes via 124 horizontal gene transfer events, primarily from Ascomycota. Horizontal gene transfer might have affected plant biomass degrading and virulence abilities of Armillaria, and provides an explanation for their unusual, soft rot-like wood decay strategy. Combined multi-species expression data revealed extensive regulation of horizontally acquired and wood-decay related genes, putative virulence factors and two novel conserved pathogenicity-induced small secreted proteins, which induced necrosis in planta. Overall, this study details how evolution knitted together horizontally and vertically inherited genes in complex adaptive traits of plant biomass degradation and pathogenicity in important fungal pathogens.
DOI: 10.1073/pnas.2312607121
2024
Cited 4 times
Extraordinary preservation of gene collinearity over three hundred million years revealed in homosporous lycophytes
Homosporous lycophytes (Lycopodiaceae) are a deeply diverged lineage in the plant tree of life, having split from heterosporous lycophytes (Selaginella and Isoetes) ~400 Mya. Compared to the heterosporous lineage, Lycopodiaceae has markedly larger genome sizes and remains the last major plant clade for which no chromosome-level assembly has been available. Here, we present chromosomal genome assemblies for two homosporous lycophyte species, the allotetraploid Huperzia asiatica and the diploid Diphasiastrum complanatum. Remarkably, although the two species diverged ~350 Mya, around 30% of the genes are still in syntenic blocks. Furthermore, both genomes had undergone independent whole genome duplications, and the resulting intragenomic syntenies have likewise been preserved relatively well. Such slow genome evolution over deep time is in stark contrast to heterosporous lycophytes and is correlated with a decelerated rate of nucleotide substitution. Together, the genomes of H. asiatica and D. complanatum not only fill a crucial gap in the plant genomic landscape but also highlight a potentially meaningful genomic contrast between homosporous and heterosporous species.
DOI: 10.1038/s41477-023-01608-5
2024
Seagrass genomes reveal ancient polyploidy and adaptations to the marine environment
DOI: 10.1016/s1088-3371(96)00004-6
1997
Cited 116 times
DOI: 10.1111/tpj.12569
2014
Cited 70 times
Genome diversity in Brachypodium distachyon: deep sequencing of highly diverse inbred lines
Brachypodium distachyon is a small annual grass that has been adopted as a model for the grasses. Its small genome, high-quality reference genome, large germplasm collection, and selfing nature make it an excellent subject for studies of natural variation. We sequenced six divergent lines to identify a comprehensive set of polymorphisms and analyze their distribution and concordance with gene expression. Multiple methods and controls were utilized to identify polymorphisms and validate their quality. mRNA-Seq experiments under control and simulated drought-stress conditions identified 300 genes with a genotype-dependent treatment response. We showed that large-scale sequence variants had extremely high concordance with altered expression of hundreds of genes, including many with genotype-dependent treatment responses. We generated a deep mRNA-Seq dataset for the most divergent line and created a de novo transcriptome assembly. This led to the discovery of >2400 previously unannotated transcripts and hundreds of genes not present in the reference genome. We built a public database for visualization and investigation of sequence variants among these widely used inbred lines.