
Xiaofeng Zhu

Here are all the papers by Xiaofeng Zhu that you can download and read on OA.mg.

DOI: 10.1038/ng.2770
2013
Cited 1,224 times
Analysis of immune-related loci identifies 48 new susceptibility variants for multiple sclerosis
Using the ImmunoChip custom genotyping array, we analyzed 14,498 subjects with multiple sclerosis and 24,091 healthy controls for 161,311 autosomal variants and identified 135 potentially associated regions (P < 1.0 × 10⁻⁴). In a replication phase, we combined these data with previous genome-wide association study (GWAS) data from an independent 14,802 subjects with multiple sclerosis and 26,703 healthy controls. In these 80,094 individuals of European ancestry, we identified 48 new susceptibility variants (P < 5.0 × 10⁻⁸), 3 of which we found after conditioning on previously identified variants. Thus, there are now 110 established multiple sclerosis risk variants at 103 discrete loci outside of the major histocompatibility complex. With high-resolution Bayesian fine mapping, we identified five regions where one variant accounted for more than 50% of the posterior probability of association. This study enhances the catalog of multiple sclerosis risk variants and illustrates the value of fine mapping in the resolution of GWAS signals.
DOI: 10.1038/ng2142
2007
Cited 1,095 times
Population genomics of human gene expression
Genetic variation influences gene expression, and this variation in gene expression can be efficiently mapped to specific genomic regions and variants. Here we have used gene expression profiling of Epstein-Barr virus–transformed lymphoblastoid cell lines of all 270 individuals genotyped in the HapMap Consortium to elucidate the detailed features of genetic variation underlying gene expression variation. We find that gene expression is heritable and that differentiation between populations is in agreement with earlier small-scale studies. A detailed association analysis of over 2.2 million common SNPs per population (5% frequency in HapMap) with gene expression identified at least 1,348 genes with association signals in cis and at least 180 in trans. Replication in at least one independent population was achieved for 37% of cis signals and 15% of trans signals, respectively. Our results strongly support an abundance of cis-regulatory variation in the human genome. Detection of trans effects is limited but suggests that regulatory variation may be the key primary effect contributing to phenotypic variation in humans. We also explore several methodologies that improve the current state of analysis of gene expression variation.
DOI: 10.1126/science.1124779
2006
Cited 708 times
A Common Genetic Variant Is Associated with Adult and Childhood Obesity
Obesity is a heritable trait and a risk factor for many common diseases such as type 2 diabetes, heart disease, and hypertension. We used a dense whole-genome scan of DNA samples from the Framingham Heart Study participants to identify a common genetic variant near the INSIG2 gene associated with obesity. We have replicated the finding in four separate samples composed of individuals of Western European ancestry, African Americans, and children. The obesity-predisposing genotype is present in 10% of individuals. Our study suggests that common genetic polymorphisms are important determinants of obesity.
DOI: 10.1086/427888
2005
Cited 543 times
Genetic Structure, Self-Identified Race/Ethnicity, and Confounding in Case-Control Association Studies
We have analyzed genetic data for 326 microsatellite markers that were typed uniformly in a large multiethnic population-based sample of individuals as part of a study of the genetics of hypertension (Family Blood Pressure Program). Subjects identified themselves as belonging to one of four major racial/ethnic groups (white, African American, East Asian, and Hispanic) and were recruited from 15 different geographic locales within the United States and Taiwan. Genetic cluster analysis of the microsatellite markers produced four major clusters, which showed near-perfect correspondence with the four self-reported race/ethnicity categories. Of 3,636 subjects of varying race/ethnicity, only 5 (0.14%) showed genetic cluster membership different from their self-identified race/ethnicity. On the other hand, we detected only modest genetic differentiation between different current geographic locales within each race/ethnicity group. Thus, ancient geographic ancestry, which is highly correlated with self-identified race/ethnicity--as opposed to current residence--is the major determinant of genetic structure in the U.S. population. Implications of this genetic structure for case-control association studies are discussed.
DOI: 10.1056/nejmoa1405386
2014
Cited 400 times
Inactivating Mutations in NPC1L1 and Protection from Coronary Heart Disease
Ezetimibe lowers plasma levels of low-density lipoprotein (LDL) cholesterol by inhibiting the activity of the Niemann-Pick C1-like 1 (NPC1L1) protein. However, whether such inhibition reduces the risk of coronary heart disease is not known. Human mutations that inactivate a gene encoding a drug target can mimic the action of an inhibitory drug and thus can be used to infer potential effects of that drug. We sequenced the exons of NPC1L1 in 7364 patients with coronary heart disease and in 14,728 controls without such disease who were of European, African, or South Asian ancestry. We identified carriers of inactivating mutations (nonsense, splice-site, or frameshift mutations). In addition, we genotyped a specific inactivating mutation (p.Arg406X) in 22,590 patients with coronary heart disease and in 68,412 controls. We tested the association between the presence of an inactivating mutation and both plasma lipid levels and the risk of coronary heart disease. With sequencing, we identified 15 distinct NPC1L1 inactivating mutations; approximately 1 in every 650 persons was a heterozygous carrier for 1 of these mutations. Heterozygous carriers of NPC1L1 inactivating mutations had a mean LDL cholesterol level that was 12 mg per deciliter (0.31 mmol per liter) lower than that in noncarriers (P=0.04). Carrier status was associated with a relative reduction of 53% in the risk of coronary heart disease (odds ratio for carriers, 0.47; 95% confidence interval, 0.25 to 0.87; P=0.008). In total, only 11 of 29,954 patients with coronary heart disease had an inactivating mutation (carrier frequency, 0.04%) in contrast to 71 of 83,140 controls (carrier frequency, 0.09%). Naturally occurring mutations that disrupt NPC1L1 function were found to be associated with reduced plasma LDL cholesterol levels and a reduced risk of coronary heart disease. (Funded by the National Institutes of Health and others.)
DOI: 10.1038/ng.1081
2012
Cited 374 times
Genome-wide association study identifies a variant in HDAC9 associated with large vessel ischemic stroke
Genetic factors have been implicated in stroke risk, but few replicated associations have been reported. We conducted a genome-wide association study (GWAS) for ischemic stroke and its subtypes in 3,548 affected individuals and 5,972 controls, all of European ancestry. Replication of potential signals was performed in 5,859 affected individuals and 6,281 controls. We replicated previous associations for cardioembolic stroke near PITX2 and ZFHX3 and for large vessel stroke at a 9p21 locus. We identified a new association for large vessel stroke within HDAC9 (encoding histone deacetylase 9) on chromosome 7p21.1 (including further replication in an additional 735 affected individuals and 28,583 controls) (rs11984041; combined P = 1.87 × 10⁻¹¹; odds ratio (OR) = 1.42, 95% confidence interval (CI) = 1.28-1.57). All four loci exhibited evidence for heterogeneity of effect across the stroke subtypes, with some and possibly all affecting risk for only one subtype. This suggests distinct genetic architectures for different stroke subtypes.
DOI: 10.1016/s0140-6736(12)60681-3
2012
Cited 352 times
Identification of new susceptibility loci for osteoarthritis (arcOGEN): a genome-wide association study
Osteoarthritis is the most common form of arthritis worldwide and is a major cause of pain and disability in elderly people. The health economic burden of osteoarthritis is increasing commensurate with obesity prevalence and longevity. Osteoarthritis has a strong genetic component but the success of previous genetic studies has been restricted due to insufficient sample sizes and phenotype heterogeneity. We undertook a large genome-wide association study (GWAS) in 7410 unrelated and retrospectively and prospectively selected patients with severe osteoarthritis in the arcOGEN study, 80% of whom had undergone total joint replacement, and 11,009 unrelated controls from the UK. We replicated the most promising signals in an independent set of up to 7473 cases and 42,938 controls, from studies in Iceland, Estonia, the Netherlands, and the UK. All patients and controls were of European descent. We identified five genome-wide significant loci (binomial test p≤5·0×10⁻⁸) for association with osteoarthritis and three loci just below this threshold. The strongest association was on chromosome 3 with rs6976 (odds ratio 1·12 [95% CI 1·08-1·16]; p=7·24×10⁻¹¹), which is in perfect linkage disequilibrium with rs11177. This SNP encodes a missense polymorphism within the nucleostemin-encoding gene GNL3. Levels of nucleostemin were raised in chondrocytes from patients with osteoarthritis in functional studies. Other significant loci were on chromosome 9 close to ASTN2, chromosome 6 between FILIP1 and SENP6, chromosome 12 close to KLHDC5 and PTHLH, and in another region of chromosome 12 close to CHST11. One of the signals close to genome-wide significance was within the FTO gene, which is involved in regulation of bodyweight, a strong risk factor for osteoarthritis. All risk variants were common in frequency and exerted small effects. Our findings provide insight into the genetics of arthritis and identify new pathways that might be amenable to future therapeutic intervention. arcOGEN was funded by a special purpose grant from Arthritis Research UK.
DOI: 10.1038/nature10336
2011
Cited 329 times
The landscape of recombination in African Americans
Recombination, together with mutation, gives rise to genetic variation in populations. Here we leverage the recent mixture of people of African and European ancestry in the Americas to build a genetic map measuring the probability of crossing over at each position in the genome, based on about 2.1 million crossovers in 30,000 unrelated African Americans. At intervals of more than three megabases it is nearly identical to a map built in Europeans. At finer scales it differs significantly, and we identify about 2,500 recombination hotspots that are active in people of West African ancestry but nearly inactive in Europeans. The probability of a crossover at these hotspots is almost fully controlled by the alleles an individual carries at PRDM9 (P value < 10⁻²⁴⁵). We identify a 17-base-pair DNA sequence motif that is enriched in these hotspots, and is an excellent match to the predicted binding target of PRDM9 alleles common in West Africans and rare in Europeans. Sites of this motif are predicted to be risk loci for disease-causing genomic rearrangements in individuals carrying these alleles. More generally, this map provides a resource for research in human genetic variation and evolution. Genetic maps measure the probability of crossovers at each position in a genome and are valuable tools for the study of variation in populations. A genetic map has now been constructed using data from 18,000 African American individuals. Comparison with European genetic maps reveals more than 2,000 recombination hot spots that are active in people of West African ancestry but inactive in most Europeans. The probability of crossover at these hot spots is controlled at the PRDM9 locus. A 17-base-pair DNA sequence motif is enriched at these hot spots, a source of risk for disease-causing genomic rearrangements.
DOI: 10.1007/s00018-007-7218-4
2007
Cited 314 times
Causes of oxidative stress in Alzheimer disease
DOI: 10.1016/j.ajhg.2014.11.011
2015
Cited 307 times
Meta-analysis of Correlated Traits via Summary Statistics from GWASs with an Application in Hypertension
Genome-wide association studies (GWASs) have identified many genetic variants underlying complex traits. Many detected genetic loci harbor variants that associate with multiple, even distinct, traits. Most current analysis approaches focus on single traits, even though the final results from multiple traits are evaluated together. Such approaches miss the opportunity to systemically integrate the phenome-wide data available for genetic association analysis. In this study, we propose a general approach that can integrate association evidence from summary statistics of multiple traits, either correlated, independent, continuous, or binary traits, which might come from the same or different studies. We allow for trait heterogeneity effects. Population structure and cryptic relatedness can also be controlled. Our simulations suggest that the proposed method has improved statistical power over single-trait analysis in most of the cases we studied. We applied our method to the Continental Origins and Genetic Epidemiology Network (COGENT) African ancestry samples for three blood pressure traits and identified four loci (CHIC2, HOXA-EVX1, IGFBP1/IGFBP3, and CDH17; p < 5.0 × 10⁻⁸) associated with hypertension-related traits that were missed by a single-trait analysis in the original report. Six additional loci with suggestive association evidence (p < 5.0 × 10⁻⁷) were also observed, including CACNA1D and WNT3. Our study strongly suggests that analyzing multiple phenotypes can improve statistical power and that such analysis can be executed with the summary statistics from GWASs. Our method also provides a way to study a cross phenotype (CP) association by using summary statistics from GWASs of multiple phenotypes.
DOI: 10.1038/ng.3749
2016
Cited 291 times
Genome-wide association analyses of sleep disturbance traits identify new loci and highlight shared genetics with neuropsychiatric and metabolic traits
Chronic sleep disturbances, associated with cardiometabolic diseases, psychiatric disorders and all-cause mortality, affect 25-30% of adults worldwide. Although environmental factors contribute substantially to self-reported habitual sleep duration and disruption, these traits are heritable and identification of the genes involved should improve understanding of sleep, mechanisms linking sleep to disease and development of new therapies. We report single- and multiple-trait genome-wide association analyses of self-reported sleep duration, insomnia symptoms and excessive daytime sleepiness in the UK Biobank (n = 112,586). We discover loci associated with insomnia symptoms (near MEIS1, TMEM132E, CYCL1 and TGFBI in females and WDR27 in males), excessive daytime sleepiness (near AR-OPHN1) and a composite sleep trait (near PATJ (INADL) and HCRTR2) and replicate a locus associated with sleep duration (at PAX8). We also observe genetic correlation between longer sleep duration and schizophrenia risk (rg = 0.29, P = 1.90 × 10⁻¹³) and between increased levels of excessive daytime sleepiness and increased measures for adiposity traits (body mass index (BMI): rg = 0.20, P = 3.12 × 10⁻⁹; waist circumference: rg = 0.20, P = 2.12 × 10⁻⁷).
DOI: 10.1038/s41562-016-0016
2017
Cited 258 times
Genetic evidence of assortative mating in humans
DOI: 10.1038/ng.567
2010
Cited 240 times
Variants in ADCY5 and near CCNL1 are associated with fetal growth and birth weight
To identify genetic variants associated with birth weight, we meta-analyzed six genome-wide association (GWA) studies (n = 10,623 Europeans from pregnancy/birth cohorts) and followed up two lead signals in 13 replication studies (n = 27,591). rs900400 near LEKR1 and CCNL1 (P = 2 × 10⁻³⁵) and rs9883204 in ADCY5 (P = 7 × 10⁻¹⁵) were robustly associated with birth weight. Correlated SNPs in ADCY5 were recently implicated in regulation of glucose levels and susceptibility to type 2 diabetes, providing evidence that the well-described association between lower birth weight and subsequent type 2 diabetes has a genetic component, distinct from the proposed role of programming by maternal nutrition. Using data from both SNPs, we found that the 9% of Europeans carrying four birth weight-lowering alleles were, on average, 113 g (95% CI 89-137 g) lighter at birth than the 24% with zero or one alleles (P(trend) = 7 × 10⁻³⁰). The impact on birth weight is similar to that of a mother smoking 4-5 cigarettes per day in the third trimester of pregnancy.
DOI: 10.1038/ncomms9609
2015
Cited 175 times
Genetic discovery for oil production and quality in sesame
Oilseed crops are used to produce vegetable oil. Sesame (Sesamum indicum), an oilseed crop grown worldwide, has high oil content and a small diploid genome, but the genetic basis of oil production and quality is unclear. Here we sequence 705 diverse sesame varieties to construct a haplotype map of the sesame genome and de novo assemble two representative varieties to identify sequence variations. We investigate 56 agronomic traits in four environments and identify 549 associated loci. Examination of the major loci identifies 46 candidate causative genes, including genes related to oil content, fatty acid biosynthesis and yield. Several of the candidate genes for oil content encode enzymes involved in oil metabolism. Two major genes associated with lignification and black pigmentation in the seed coat are also associated with large variation in oil content. These findings may inform breeding and improvement strategies for a broad range of oilseed crops.
DOI: 10.1038/s41467-019-11456-7
2019
Cited 136 times
Genome-wide association analysis of self-reported daytime sleepiness identifies 42 loci that suggest biological subtypes
Excessive daytime sleepiness (EDS) affects 10-20% of the population and is associated with substantial functional deficits. Here, we identify 42 loci for self-reported daytime sleepiness in GWAS of 452,071 individuals from the UK Biobank, with enrichment for genes expressed in brain tissues and in neuronal transmission pathways. We confirm the aggregate effect of a genetic risk score of 42 SNPs on daytime sleepiness in independent Scandinavian cohorts and on other sleep disorders (restless legs syndrome, insomnia) and sleep traits (duration, chronotype, accelerometer-derived sleep efficiency and daytime naps or inactivity). However, individual daytime sleepiness signals vary in their associations with objective short vs long sleep, and with markers of sleep continuity. The 42 sleepiness variants primarily cluster into two predominant composite biological subtypes - sleep propensity and sleep fragmentation. Shared genetic links are also seen with obesity, coronary heart disease, psychiatric diseases, cognitive traits and reproductive ageing.
DOI: 10.1038/ng1510
2005
Cited 266 times
Admixture mapping for hypertension loci with genome-scan markers
DOI: 10.1086/504302
2006
Cited 265 times
Reconstructing Genetic Ancestry Blocks in Admixed Individuals
A chromosome in an individual of recently admixed ancestry resembles a mosaic of chromosomal segments, or ancestry blocks, each derived from a particular ancestral population. We consider the problem of inferring ancestry along the chromosomes in an admixed individual and thereby delineating the ancestry blocks. Using a simple population model, we infer gene-flow history in each individual. Compared with existing methods, which are based on a hidden Markov model, the Markov-hidden Markov model (MHMM) we propose has the advantage of accounting for the background linkage disequilibrium (LD) that exists in ancestral populations. When there are more than two ancestral groups, we allow each ancestral population to admix at a different time in history. We use simulations to illustrate the accuracy of the inferred ancestry as well as the importance of modeling the background LD; not accounting for background LD between markers may mislead us to false inferences about mixed ancestry in an indigenous population. The MHMM makes it possible to identify genomic blocks of a particular ancestry by use of any high-density single-nucleotide-polymorphism panel. One application of our method is to perform admixture mapping without genotyping special ancestry-informative-marker panels.
DOI: 10.1086/320104
2001
Cited 245 times
Linkage and Association Analysis of Angiotensin I–Converting Enzyme (ACE)–Gene Polymorphisms with ACE Concentration and Blood Pressure
Considerable effort has been expended to determine whether the gene for angiotensin I-converting enzyme (ACE) confers susceptibility to cardiovascular disease. In this study, we genotyped 13 polymorphisms in the ACE gene in 1,343 Nigerians from 332 families. To localize the genetic effect, we first performed linkage and association analysis of all the markers with ACE concentration. In multipoint variance-component analysis, this region was strongly linked to ACE concentration (maximum LOD score 7.5). Likewise, most of the polymorphisms in the ACE gene were significantly associated with ACE (P<.0013). The two most highly associated polymorphisms, ACE4 and ACE8, accounted for 6% and 19% of the variance in ACE, respectively. A two-locus additive model with an additive x additive interaction of these polymorphisms explained most of the ACE variation associated with this region. We next analyzed the relationship between these two polymorphisms (ACE4 and ACE8) and blood pressure (BP). Although no evidence of linkage was detected, significant association was found for both systolic and diastolic BP when a two-locus additive model developed for ACE concentration was used. Further analyses demonstrated that an epistasis model provided the best fit to the BP variation. In conclusion, we found that the two polymorphisms explaining the greatest variation in ACE concentration are significantly associated with BP, through interaction, in this African population sample. Our study also demonstrates that greater statistical power can be anticipated with association analysis versus linkage, when markers in strong linkage disequilibrium with a trait locus have been identified. Furthermore, allelic interaction may play an important role in the dissection of complex traits such as BP.
DOI: 10.1038/ng1899
2006
Cited 235 times
Transferability of tag SNPs in genetic association studies in multiple populations
DOI: 10.1161/circulationaha.105.568881
2005
Cited 188 times
Corin Gene Minor Allele Defined by 2 Missense Mutations Is Common in Blacks and Associated With High Blood Pressure and Hypertension
Background: The natriuretic peptide system contributes to blood pressure regulation. Atrial and brain natriuretic peptides are cleaved into smaller biologically active molecules by corin, a transmembrane serine protease expressed in cardiomyocytes. Method and Results: This genotype-phenotype genetic association study included replication samples and genomic control to correct for population stratification. Sequencing of the human corin gene identified 2 nonsynonymous, nonconservative single nucleotide polymorphisms (Q568P and T555I) in near-complete linkage disequilibrium, thus describing a single minor I555 (P568) corin gene allele. This allele was present in the heterozygote state in ≈12% of blacks but was extremely rare in whites (<0.5% were homozygous for the minor allele). In our primary population sample, the Dallas Heart Study, after adjustment for potential confounders, including population stratification, the corin I555 (P568) allele remained independently associated with increased risk for prevalent hypertension (odds ratio, 1.63; 95% CI, 1.11 to 2.38; P=0.013). The corin I555 (P568) allele also was associated with higher systolic blood pressure in subjects not using antihypertensive medication in unadjusted (133.7±20.7 versus 129.4±17.4 mm Hg; P=0.029) and adjusted (132.5±1.6 versus 128.9±0.6 mm Hg; P=0.029) analyses. The independent association of the minor corin allele with increased risk for prevalent hypertension was confirmed in the Multi-Ethnic Study of Atherosclerosis (odds ratio, 1.50; 95% CI, 1.09 to 2.06; P=0.014). In addition, the association of the minor corin I555 (P568) allele with higher systolic blood pressure was confirmed in adjusted analysis in the Chicago Genetics of Hypertension Study (125.8±1.9 versus 121.4±0.7 mm Hg; P=0.03). Conclusions: The corin I555 (P568) allele is common in blacks and is associated with higher blood pressure and an increased risk for prevalent hypertension.
DOI: 10.1053/j.gastro.2007.02.051
2007
Cited 181 times
IL23R Variation Determines Susceptibility But Not Disease Phenotype in Inflammatory Bowel Disease
Background & Aims: Identification of inflammatory bowel disease (IBD) susceptibility genes is key to understanding pathogenic mechanisms. Recently, the North American IBD Genetics Consortium provided compelling evidence for an association between ileal Crohn’s disease (CD) and the IL23R gene using genome-wide association scanning. External replication is a priority, both to confirm this finding in other populations and to validate this new technique. We tested for association between IL23R and IBD in a large independent UK panel to determine the size of the effect and explore subphenotype correlation and interaction with CARD15. Methods: Eight single nucleotide polymorphism markers in IL23R tested in the North American study were genotyped in 1902 cases of Crohn’s disease (CD), 975 cases of ulcerative colitis (UC), and 1345 controls using MassARRAY. Data were analyzed using χ2 statistics, and subgroup association was sought. Results: A highly significant association with CD was observed, with the strongest signal at coding variant Arg381Gln (allele frequency, 2.5% in CD vs 6.2% in controls [P = 1.1 × 10−12]; odds ratio, 0.38; 95% confidence interval, 0.29–0.50). A weaker effect was seen in UC (allele frequency, 4.6%; odds ratio, 0.73; 95% confidence interval, 0.55–0.96). Analysis accounting for Arg381Gln suggested that other loci within IL23R also influence IBD susceptibility. Within CD, there were no subphenotype associations or evidence of interaction with CARD15. Conclusions: This study shows an association between IL23R and all subphenotypes of CD with a smaller effect on UC. This extends the findings of the North American study, providing clear evidence that genome-wide association scanning can successfully identify true complex disease genes. Background & Aims: Identification of inflammatory bowel disease (IBD) susceptibility genes is key to understanding pathogenic mechanisms. 
Recently, the North American IBD Genetics Consortium provided compelling evidence for an association between ileal Crohn’s disease (CD) and the IL23R gene using genome-wide association scanning. External replication is a priority, both to confirm this finding in other populations and to validate this new technique. We tested for association between IL23R and IBD in a large independent UK panel to determine the size of the effect and explore subphenotype correlation and interaction with CARD15. Methods: Eight single nucleotide polymorphism markers in IL23R tested in the North American study were genotyped in 1902 cases of Crohn’s disease (CD), 975 cases of ulcerative colitis (UC), and 1345 controls using MassARRAY. Data were analyzed using χ2 statistics, and subgroup association was sought. Results: A highly significant association with CD was observed, with the strongest signal at coding variant Arg381Gln (allele frequency, 2.5% in CD vs 6.2% in controls [P = 1.1 × 10−12]; odds ratio, 0.38; 95% confidence interval, 0.29–0.50). A weaker effect was seen in UC (allele frequency, 4.6%; odds ratio, 0.73; 95% confidence interval, 0.55–0.96). Analysis accounting for Arg381Gln suggested that other loci within IL23R also influence IBD susceptibility. Within CD, there were no subphenotype associations or evidence of interaction with CARD15. Conclusions: This study shows an association between IL23R and all subphenotypes of CD with a smaller effect on UC. This extends the findings of the North American study, providing clear evidence that genome-wide association scanning can successfully identify true complex disease genes. See editorial on page 2045; CME quiz on page 1999. It is widely recognized that knowledge regarding the genetic basis of inflammatory bowel disease (IBD) and other complex diseases will provide key insights into pathogenic mechanisms. It is this fact that has spurred efforts to identify disease susceptibility genes. 
Of the many complex diseases investigated using molecular genetic techniques, Crohn’s disease (CD) is exceptional in that specific genetic variants unequivocally associated with disease susceptibility have been successfully identified.1Ogura Y. Bonen D.K. Inohara N. Nicolae D.L. Chen F.F. Ramos R. Britton H. Moran T. Karaliuskas R. Duerr R.H. Achkar J.P. Brant S.R. Bayless T.M. Kirschner B.S. Hanauer S.B. Nunez G. Cho J.H. A frameshift mutation in NOD2 associated with susceptibility to Crohn’s disease.Nature. 2001; 411: 603-606Google Scholar, 2Hugot J.P. Chamaillard M. Zouali H. Lesage S. Cezard J.P. Belaiche J. Almer S. Tysk C. O’Morain C.A. Gassull M. Binder V. Finkel Y. Cortot A. Modigliani R. Laurent-Puig P. Gower-Rousseau C. Macry J. Colombel J.F. Sahbatou M. Thomas G. Association of NOD2 leucine-rich repeat variants with susceptibility to Crohn’s disease.Nature. 2001; 411: 599-603Google Scholar Nonetheless, characterization of the unknown number of remaining CD genes is required to complete the picture and remains a priority. CD is one of the 2 common and related forms of IBD, the other being ulcerative colitis (UC). Within the United Kingdom, they have a combined prevalence of approximately 4/1000.3Stone M.A. Mayberry J.F. Baker R. Prevalence and management of inflammatory bowel disease: a cross-sectional study from central England.Eur J Gastroenterol Hepatol. 2003; 15: 1275-1280Google Scholar Both are known to have a significant genetic contribution to their etiology, but this is stronger for CD than UC.4Tysk C. Lindberg E. Jarnerot G. Floderus-Myrhed B. Ulcerative colitis and Crohn’s disease in an unselected population of monozygotic and dizygotic twins A study of heritability and the influence of smoking.Gut. 1988; 29: 990-996Google Scholar The epidemiologic evidence also suggests that CD and UC share some susceptibility genes. 
In 2001, fine mapping of a widely replicated linkage region on chromosome 16 led to the identification of CARD15 as a major CD susceptibility gene, with mutations leading to dysregulation of innate immune pathways [1, 2]. CARD15 genes have subsequently been shown in meta-analysis to predominantly determine susceptibility to ileal CD. Variants within a number of other genes have been associated with CD, UC, or both [5: Reinhard and Rioux, Inflamm Bowel Dis 2006;12:227-238; 6: Yamazaki et al., Hum Mol Genet 2005;14:3499-3506; 7: Ho et al., Gastroenterology 2005;128:288-296; 8: Franchimont et al., Gut 2004;53:987-992; 9: Stoll et al., Nat Genet 2004;36:476-480], although their exact roles in IBD susceptibility require clarification and, in some cases, replication.
To date, pinpointing of disease genes has depended on detailed evaluation of candidates implicated by their function or patterns of expression or by fine mapping within large regions identified in the course of genome-wide linkage scans. Across the range of common diseases, productivity of such approaches has been limited. Most complex disease genetic studies, including many in IBD, have been beset by poor reproducibility of results and slow progress in identifying disease genes. This has been attributed to a range of factors, some of the most important being the low resolution of sib-pair linkage analysis, use of inappropriate statistical thresholds for significance, and poor matching of controls due to population admixture [10: Cardon and Bell, Nat Rev Genet 2001;2:91-99].
One powerful new method for the identification of complex disease genes is genome-wide association scanning, genotyping large panels of affected individuals and appropriately matched population controls for hundreds of thousands of polymorphic markers across the genome and using appropriately stringent statistical thresholds for significance [11: Risch and Merikangas, Science 1996;273:1516-1517]. Within the past year, such studies have become technically and financially possible using sets of markers that capture most of the common variation across the genome using knowledge regarding human haplotype structure available from the International HapMap Project (http://www.hapmap.org) [12: Barrett and Cardon, Nat Genet 2006;38:659-662]. Systematic whole-genome association studies, in comparison with the previous gold standard of linkage analysis, should provide substantially increased power and resolution for detection of complex disease susceptibility genes [13: Hirschhorn and Daly, Nat Rev Genet 2005;6:95-108].
Recently, the results of a 308,332-marker genome scan in a North American panel of 547 non-Jewish case patients with CD and 548 controls were reported. Case patients were selected as having ileal CD to reduce heterogeneity [14: Duerr et al., Science 2006;314:1461-1463]. Three markers showed a highly significant association with CD, 2 of which were in CARD15. The third marker was a rare coding variant, rs11209026 (c.1142G→A; Arg381Gln), found in the interleukin 23 receptor (IL23R) gene on chromosome 1 (P = 5.05 × 10⁻⁹). Nine other markers showed association with P < .0001 either within IL23R or in the intergenic area with the adjacent IL12RB2 gene.
Internal replication was achieved in the index study using both a Jewish CD case-control cohort (peak P value, 3.36 × 10⁻¹³) and family-based methodologies, the latter in addition suggesting association with UC in a small non-Jewish cohort. This finding indicates that IL23R may have a general role in the etiology of IBD [14].
The aims of the current study were to seek replication of the association between IL23R and IBD in a large independent North European cohort representing the full range of CD and UC phenotypes, examine in detail genotype-phenotype relationships, explore evidence for epistasis with the known CD susceptibility gene CARD15, and provide accurate estimates of disease risk for associated variants. Replication of the association in an independent cohort would serve 2 important purposes. First, it is key to confirming the veracity of the original finding and the applicability of these findings in populations outside North America. Further, strong independent replication of the key finding of one of the first published genome-wide association scans would provide proof of principle that this novel methodology can be used to identify risk variants for complex diseases.
A total of 2877 individuals with IBD (1902 with CD and 975 with UC) were recruited in 5 centers across England and Scotland. The study was approved by the research ethics committees at each center. Standard clinical, radiologic, and histologic diagnostic criteria were applied [15: Lennard-Jones, Scand J Gastroenterol Suppl 1989;170:2-6]. Phenotypic details were obtained by retrospective case notes review. CD phenotype was classified by age at diagnosis, location, and behavior of disease. Only one member of multiply affected families was included. A total of 1.75% were of Jewish origin, and 2.25% were nonwhite. Demographic and subphenotype data are presented in Table 1.
Table 1. Demographic Details of 2877 Individuals With IBD Used in Case-Control Panel
| | CD (n = 1902) | UC (n = 975) |
| --- | --- | --- |
| Median age at diagnosis (y) | 26 | 38.9 |
| Gender (F/M) | 1153/745 | 480/495 |
| Smoking at diagnosis (%): never | 58.4 | 55.0 |
| Smoking at diagnosis (%): ex | 9.4 | 30.3 |
| Smoking at diagnosis (%): current | 32.2 | 14.7 |
| Jewish ancestry (%) | 1.75 | 1.9 |
| Nonwhite (%) | 2.25 | 3.25 |
| Surgery (%) | 61.8 | |
| Location/extent (%): ileal | 32.7 | |
| Location/extent (%): colonic | 31.8 | |
| Location/extent (%): ileocolonic | 35.5 | |
| Location/extent (%): perianal | 27.1 | |
| Location/extent (%): rectum only | | 16.5 |
| Location/extent (%): distal to splenic flexure | | 35.0 |
| Location/extent (%): proximal to splenic flexure | | 48.5 |
| Behavior (%): stenosing | 36.5 | |
| Behavior (%): penetrating | 17.15 | |
Control allele frequencies were obtained from 1345 individuals recruited across Britain as part of the 1958 British birth cohort [16: Power and Elliott, Int J Epidemiol 2006;35:34-41]. Cases and controls were categorized into 12 broad geographical regions within Great Britain to minimize confounding due to variation in allele frequencies across the country [17: Clayton et al., Nat Genet 2005;37:1243-1246]. Genotyping of cases was undertaken with iPLEX chemistry on a matrix-assisted laser desorption/ionization time-of-flight MassARRAY platform (Sequenom, San Diego, CA).
Cases were genotyped for 8 IL23R markers reported in the index study, including the nonsynonymous single nucleotide polymorphism (SNP) rs11209026 encoding amino acid change Arg381Gln (primer sequences in Supplementary Table 1; see supplemental material online at www.gastrojournal.org). Two of the North American markers (rs7517847, rs2201841) were omitted due to their location within a sequence of interspersed low-complexity repeats. Genotyping of controls was undertaken at the Wellcome Trust Sanger Institute using the Illumina 550K chip (Illumina, San Diego, CA). Concordance of genotype calls between the different platforms was confirmed by genotyping 87 control DNAs for all 8 markers using the MassARRAY platform, with strong concordance of calls between technologies: 98.99% for the 8 markers overall. There was 100% concordance for 3 markers, including the coding variant Arg381Gln (Supplementary Table 2; see supplemental material online at www.gastrojournal.org). The data for 1594 cases of CD genotyped for CARD15 mutations in earlier studies were used to undertake analysis for evidence of interaction between CARD15 and IL23R [18: Waller et al., Gut 2006;55:809-814; 19: Pearce et al., Int J Colorectal Dis 2007;22:419-424; 20: Arnott et al., Genes Immun 2004;5:417-425; 21: Ahmad et al., Gastroenterology 2004;126:1533-1549].
Allele frequencies were compared between cases and controls and between phenotypic subgroups using χ² tests of 2 × 2 tables. Odds ratios were calculated for the minor allele at each SNP; confidence intervals (CIs) were calculated using Woolf’s method [22: Woolf, Ann Hum Genet 1955;19:251-253]. Pairwise SNP linkage disequilibrium coefficients were estimated using Haploview [23: Barrett et al., Bioinformatics 2005;21:263-265]. Conditional association analysis was implemented using COCAPHASE, a module of the UNPHASED program [24: Dudbridge, Genet Epidemiol 2003;25:115-121]. This method tests for equality of odds ratios for haplotypes identical at conditioning loci. The Mantel–Haenszel test for association conditioning on geographical region was implemented using PLINK (http://pngu.mgh.harvard.edu/~purcell/plink/). Median age at disease diagnosis between groups was compared using the Wilcoxon rank sum test. Age at diagnosis was dichotomized according to the Montreal classification [25: Silverberg et al., Can J Gastroenterol 2005;19:5-36]. Unless specified otherwise, all analyses were performed using R version 2.2 for Windows (http://www.R-project.org).
All genotypes were in Hardy–Weinberg equilibrium in both cases and controls (P > .05). A highly significant association with CD was observed across the region (Table 2). The strongest association was observed at the nonsynonymous SNP Arg381Gln, where the frequency of the A allele was 2.5% in CD compared with 6.2% in controls (P = 1.1 × 10⁻¹²). The odds ratio for this protective allele was 0.38 (95% CI, 0.29–0.50). Alternatively, the common wild-type homozygous GG genotype can be considered as the risk genotype with an odds ratio of 2.70. To minimize potential confounding from regional differences in allele frequencies, a Mantel–Haenszel test was performed across 12 regional strata. Mantel–Haenszel odds ratios were very similar to those obtained from pooled data for all SNPs. For example, the Mantel–Haenszel odds ratio was 0.36 (95% CI, 0.25–0.51) for Arg381Gln.
Table 2. Case-Control Allele Frequencies and Disease Odds Ratios (95% Confidence Intervals) for CD and UC
| SNP | Allele | Controls | CD | P (CD) | Odds ratio (95% CI), CD | UC | P (UC) | Odds ratio (95% CI), UC |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| rs1004819 | T | 0.307 | 0.383 | 1.1 × 10⁻⁸ | 1.41 (1.23–1.56) | 0.348 | .00713 | 1.20 (1.05–1.37) |
| rs10489629 | G | 0.448 | 0.372 | 1.8 × 10⁻⁸ | 0.73 (0.66–0.82) | 0.43 | .26 | 0.93 (0.82–1.05) |
| rs11465804 | G | 0.058 | 0.025 | 7.2 × 10⁻¹¹ | 0.41 (0.31–0.53) | 0.046 | .081 | 0.77 (0.58–1.02) |
| rs11209026 | A | 0.062 | 0.025 | 1.1 × 10⁻¹² | 0.38 (0.29–0.50) | 0.046 | .0291 | 0.73 (0.55–0.96) |
| rs1343151 | T | 0.332 | 0.266 | 1.1 × 10⁻⁷ | 0.73 (0.65–0.82) | 0.315 | .26 | 0.92 (0.81–1.06) |
| rs10889677 | A | 0.315 | 0.398 | 3.4 × 10⁻¹⁰ | 1.45 (1.28–1.61) | 0.358 | .0042 | 1.22 (1.07–1.39) |
| rs11209032 | A | 0.320 | 0.390 | 1.3 × 10⁻⁷ | 1.35 (1.22–1.52) | 0.3524 | .032 | 1.16 (1.01–1.32) |
| rs1495965 | G | 0.447 | 0.517 | 3.4 × 10⁻⁷ | 1.32 (1.19–1.47) | 0.457 | .57 | 1.04 (0.92–1.18) |
Several SNPs also showed significant association with UC (Table 2). The strongest signal was observed with common SNPs rs1004819 (P = .0071) and rs10889677 (P = .0042).
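The allelic odds ratio and Woolf confidence interval reported for Arg381Gln in CD can be approximated with a short calculation. The sketch below is illustrative only and is not the authors' code (they report using R). The case counts are pooled from the male and female rows of Table 3; the control counts are not given in the text, so they are an assumption here, back-calculated from the reported 6.2% allele frequency as if all 1345 controls had been genotyped. The function name `woolf_odds_ratio` is hypothetical.

```python
import math


def woolf_odds_ratio(case_minor, case_major, ctrl_minor, ctrl_major, z=1.96):
    """Allelic odds ratio with a Woolf (log-based) confidence interval."""
    odds_ratio = (case_minor * ctrl_major) / (case_major * ctrl_minor)
    # Woolf: the standard error of ln(OR) is the square root of the summed
    # reciprocals of the four cell counts in the 2 x 2 table.
    se_log_or = math.sqrt(
        1 / case_minor + 1 / case_major + 1 / ctrl_minor + 1 / ctrl_major
    )
    log_or = math.log(odds_ratio)
    lower = math.exp(log_or - z * se_log_or)
    upper = math.exp(log_or + z * se_log_or)
    return odds_ratio, lower, upper


# CD cases: genotype counts (AA, AG, GG) pooled from the male and female
# rows of Table 3.
n_aa, n_ag, n_gg = 1 + 2, 36 + 49, 690 + 1068
case_minor = 2 * n_aa + n_ag  # A alleles in cases
case_major = 2 * n_gg + n_ag  # G alleles in cases

# Controls: ASSUMED counts. The text gives only the allele frequency (0.062)
# and the cohort size (1345), so complete genotyping is assumed here.
ctrl_alleles = 2 * 1345
ctrl_minor = round(0.062 * ctrl_alleles)
ctrl_major = ctrl_alleles - ctrl_minor

or_cd, ci_low, ci_high = woolf_odds_ratio(
    case_minor, case_major, ctrl_minor, ctrl_major
)
print(f"OR = {or_cd:.2f} (95% CI, {ci_low:.2f}-{ci_high:.2f})")
```

Under these assumptions the result rounds to the values given for rs11209026 in Table 2; different control counts would shift the interval slightly.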
The frequency of Arg381Gln was only marginally different between cases and controls (UC, 0.046; controls, 0.062; P = .029), with an odds ratio of 0.73 (95% CI, 0.55–0.96).
The nonsynonymous SNP Arg381Gln was in tight linkage disequilibrium with one other SNP (rs11465804, r² = 0.85) but weak linkage disequilibrium with all 6 other SNPs (r² = 0.03–0.1). A separate test for CD association was performed for each SNP conditioning on Arg381Gln by conditional regression modeling. This showed a significant association at all SNPs (P < .001) except rs11465804, with the strongest residual association detected at rs10889677 (P = 4.6 × 10⁻⁸). Hence, the nonsynonymous SNP does not account for all the association signal at this locus.
Data were then analyzed for evidence of significant genotype-phenotype correlations based on age at onset of CD, disease location, and disease behavior (Table 3). No significant subgroup association was observed. In particular, the subgroup of subjects with CD affecting the colon only without small bowel disease (n = 539) appeared to be as strongly associated as those with exclusively ileal/small bowel involvement (n = 668) (minor allele frequencies, 2.3% and 2.0%, respectively). The age at disease onset ranged from 12 to 67 years in patients with CD who carried the A allele of Arg381Gln and from 0 to 80 years in wild-type GG cases. There was no difference in the median age of onset between these 2 groups (AA/AG: median, 28 years [n = 85]; GG: median, 26 years [n = 1650]; P = .26). Stratification of cases by age at diagnosis according to the Montreal classification [25] revealed similar genotype frequencies in all groups (Table 3).
For UC, subgroup analysis by disease extent, smoking history, and sex also revealed no significant subgroup association. Age at onset of UC ranged from 14 to 79 years in cases who carried the A allele of Arg381Gln and from 2 to 81 years in wild-type GG cases, with no difference in the median age of onset between the 2 groups (AA/AG: median, 34 years [n = 72]; GG: median, 33 years [n = 708]; P = .14) (Table 4).
A total of 1540 subjects with CD were fully genotyped for the 3 CARD15 mutations (G908R, L1007fs, R702W) (Table 3). The frequency of Arg381Gln in 460 cases carrying at least one CARD15 mutation (2.2%) was not significantly different from that in 1081 cases who carried none (2.7%; P = .47).
None of the 3 cases who were homozygous for the rare A allele also carried a CARD15 mutation.
Table 3. Arg381Gln Genotype and Allele Frequencies in CD Cases Stratified by Known Phenotypic Subgroups and CARD15 Status
| Subgroup | AA | AG | GG | Total | Freq(A) |
| --- | --- | --- | --- | --- | --- |
| Sex: male | 1 | 36 | 690 | 727 | 0.026 |
| Sex: female | 2 | 49 | 1068 | 1119 | 0.024 |
| Smoking history: no | 1 | 28 | 729 | 758 | 0.020 |
| Smoking history: yes | 1 | 21 | 414 | 436 | 0.026 |
| Smoking history: ex | 0 | 1 | 127 | 128 | 0.004 |
| Disease location: pure colorectal disease | 1 | 23 | 515 | 539 | 0.023 |
| Disease location: pure ileal disease | 0 | 31 | 533 | 564 | 0.027 |
| Disease location: ileocolonic disease | 1 | 25 | 642 | 668 | 0.020 |
| Disease location: any colorectal disease | 2 | 46 | 1101 | 1149 | 0.022 |
| Disease location: any ileal disease | 1 | 54 | 1115 | 1170 | 0.024 |
| Perianal disease: yes | 2 | 20 | 454 | 476 | 0.025 |
| Perianal disease: no | 1 | 59 | 1205 | 1265 | 0.024 |
| Disease behavior: stenosing | 1 | 32 | 604 | 637 | 0.027 |
| Disease behavior: penetrating | 2 | 15 | 248 | 265 | 0.036 |
| Disease behavior: inflammatory only | 0 | 30 | 714 | 744 | 0.020 |
| Surgery: yes | 2 | 51 | 1035 | 1088 | 0.025 |
| Surgery: no | 1 | 31 | 645 | 677 | 0.024 |
| Age at diagnosis (y): 16 or younger | 1 | 8 | 197 | 206 | 0.024 |
| Age at diagnosis (y): 17–40 | 2 | 60 | 1129 | 1191 | 0.027 |
| Age at diagnosis (y): older than 40 | 0 | 14 | 324 | 338 | 0.021 |
| CARD15 status (a): −/− | 3 | 52 | 1025 | 1080 | 0.027 |
| CARD15 status (a): −/+ | 0 | 15 | 342 | 357 | 0.021 |
| CARD15 status (a): +/+ | 0 | 5 | 98 | 103 | 0.024 |
(a) Samples are subdivided by CARD15 status into those homozygous wild-type (−/−), those heterozygous for CD-associated variants (−/+), and those homozygous or compound heterozygous for CD-associated variants (+/+).
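The Freq(A) column of Table 3 follows directly from the genotype counts, and the same counts allow a rough Hardy–Weinberg check. The sketch below is a minimal illustration, not the authors' procedure: the text does not say which Hardy–Weinberg test was used, and with an allele this rare an exact test would usually be preferred over the chi-square approximation shown. The function names are hypothetical, and pooling the two sex rows assumes they cover all genotyped CD cases.

```python
import math


def minor_allele_frequency(n_aa, n_ag, n_gg):
    """Frequency of the A allele from AA, AG, and GG genotype counts."""
    total_alleles = 2 * (n_aa + n_ag + n_gg)
    return (2 * n_aa + n_ag) / total_alleles


def hwe_chi_square(n_aa, n_ag, n_gg):
    """Chi-square goodness-of-fit test (1 df) for Hardy-Weinberg proportions."""
    n = n_aa + n_ag + n_gg
    p = minor_allele_frequency(n_aa, n_ag, n_gg)
    q = 1 - p
    observed = (n_aa, n_ag, n_gg)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Genotype counts (AA, AG, GG) taken from the sex rows of Table 3.
table3_sex = {"Male": (1, 36, 690), "Female": (2, 49, 1068)}

for label, counts in table3_sex.items():
    print(f"{label}: Freq(A) = {minor_allele_frequency(*counts):.3f}")

# Pool both rows to approximate the full set of genotyped CD cases.
pooled = tuple(sum(column) for column in zip(*table3_sex.values()))
chi2, p_value = hwe_chi_square(*pooled)
print(f"Pooled CD cases: chi-square = {chi2:.2f}, P = {p_value:.3f}")
```

The per-row frequencies match the Freq(A) values in the table, and the pooled test gives P > .05, in line with the statement that genotypes were in Hardy–Weinberg equilibrium.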
Table 4. Arg381Gln Genotype and Allele Frequencies in UC Cases Stratified by Known Phenotypic Subgroups
| Subgroup | AA | AG | GG | Total | Freq(A) |
| --- | --- | --- | --- | --- | --- |
| Sex: male | 1 | 41 | 447 | 489 | 0.044 |
| Sex: female | 2 | 43 | 436 | 481 | 0.049 |
| Smoking history: no | 0 | 28 | 301 | 329 | 0.043 |
| Smoking history: yes | 0 | 7 | 81 | 88 | 0.040 |
| Smoking history: ex | 1 | 20 | 160 | 181 | 0.061 |
| Disease extent: rectum only | 1 | 12 | 134 | 147 | 0.048 |
| Disease extent: distal to splenic flexure | 1 | 24 | 286 | 311 | 0.042 |
| Disease extent: proximal to splenic flexure | 1 | 43 | 387 | 431 | 0.052 |
| Age at diagnosis (y): 16 or younger | 0 | 2 | 39 | 41 | 0.025 |
| Age at diagnosis (y): 17–40 | 1 | 41 | 443 | 485 | 0.044 |
| Age at diagnosis (y): older than 40 | 0 | 28 | 228 | 256 | 0.055 |
This study provides unequivocal confirmation of association between variants in the IL23R gene and IBD, suggesting a major effect on overall susceptibility to CD and a more modest effect on UC. Importantly, this study also shows the association at IL23R for the first time in a non-American population. The strength of this association at IL23R and the fact that it reaches such a magnitude in 2 independent data sets leave no doubt that it is a true finding. In addition, this is one of the first instances of highly significant, independent replication of data derived from a genome-wide association scan and provides important validation of this technique as a hypothesis-free method for the identification of complex disease genes.
As with the North American genome-wide scan, the strongest evidence for association was seen at the nonsynonymous SNP Arg381Gln, where the frequency of the A allele was 2.5% in CD compared with 6.2% in controls (P = 1.1 × 10⁻¹²). These allele frequencies are similar to those seen in the North American panel [14]. There was no evidence that IL23R variants associate with any particular subphenotype of CD based on disease behavior or location. Hence, there was no difference in minor allele frequency even between the extremes of pure ileal/small bowel CD and pure colonic CD (2.7% and 2.3%, respectively). Likewise, analysis based on disease behavior did not show any specific subgroup associations (Table 3). This negative result is interesting because it contrasts with the other confirmed CD susceptibility locus CARD15, which seems to have definite associations with ileal disease [26: Economou et al., Am J Gastroenterol 2004;99:2393-2404]. These findings are extended by the observation of association with UC overall but not with any known UC subphenotype group, suggesting that IL23R variants may exert a rather generic effect on chronic intestinal inflammation, although the effect size in UC does appear to be smaller than in CD. It is noteworthy that the odds ratio confidence interval at Arg381Gln for UC (0.73 [95% CI, 0.55–0.96]) does not overlap with that for CD (0.38 [95% CI, 0.29–0.50]), suggesting a significantly less marked protective effect of the rare allele for UC compared with CD.
Based on data from our large, independent panel of CD cases, it is possible to provide an accurate estimate of the size of the effect conferred by IL23R variants with regard to the risk of CD. We estimated an odds ratio of 0.38 (95% CI, 0.29–0.50) for Arg381Gln. This is likely to be a more accurate estimate than that provided in the index report from the North American study (odds ratio, 0.26; 95% CI, 0.15–0.43) due to the well-recognized bias of the so-called “winner’s curse,” which leads to overestimation of effect size in discovery panels [27: Lohmueller et al., Nat Genet 2003;33:177-182]. Characterizing the exact effect size is important to permit sample size calculation for any further attempts at replication. Where the effect size is overestimated, there is a risk that apparently appropriately powered studies will fail to observe the effect and erroneously conclude that it is a fa
DOI: 10.1371/journal.pgen.0030061
2007
Cited 147 times
The Association of a SNP Upstream of INSIG2 with Body Mass Index is Reproduced in Several but Not All Cohorts
A SNP upstream of the INSIG2 gene, rs7566605, was recently found to be associated with obesity as measured by body mass index (BMI) by Herbert and colleagues. The association between increased BMI and homozygosity for the minor allele was first observed in data from a genome-wide association scan of 86,604 SNPs in 923 related individuals from the Framingham Heart Study offspring cohort. The association was reproduced in four additional cohorts, but was not seen in a fifth cohort. To further assess the general reproducibility of this association, we genotyped rs7566605 in nine large cohorts from eight populations across multiple ethnicities (total n = 16,969). We tested this variant for association with BMI in each sample under a recessive model using family-based, population-based, and case-control designs. We observed a significant (p < 0.05) association in five cohorts but saw no association in three other cohorts. There was variability in the strength of association evidence across examination cycles in longitudinal data from unrelated individuals in the Framingham Heart Study Offspring cohort. A combined analysis revealed significant independent validation of this association in both unrelated (p = 0.046) and family-based (p = 0.004) samples. The estimated risk conferred by this allele is small, and could easily be masked by small sample size, population stratification, or other confounders. These validation studies suggest that the original association is less likely to be spurious, but the failure to observe an association in every data set suggests that the effect of SNP rs7566605 on BMI may be heterogeneous across population samples.
DOI: 10.1016/j.ajhg.2007.10.009
2008
Cited 127 times
A Unified Association Analysis Approach for Family and Unrelated Samples Correcting for Stratification
There are two common designs for association mapping of complex diseases: case-control and family-based designs. A case-control sample is more powerful to detect genetic effects than a family-based sample that contains the same numbers of affected and unaffected persons, although additional markers may be required to control for spurious association. When family and unrelated samples are available, statistical analyses are often performed in the family and unrelated samples separately, conditioning on parental information for the former, thus resulting in reduced power. In this report, we propose a unified approach that can incorporate both family and case-control samples and, provided the additional markers are available, at the same time corrects for population stratification. We apply the principal components of a marker matrix to adjust for the effect of population stratification. This unified approach makes it unnecessary to perform a conditional analysis of the family data and is more powerful than the separate analyses of unrelated and family samples, or a meta-analysis performed by combining the results of the usual separate analyses. This property is demonstrated in both a variety of simulation models and empirical data. The proposed approach can be equally applied to the analysis of both qualitative and quantitative traits.
DOI: 10.1159/000321967
2010
Cited 124 times
The Meaning of Interaction
Although recent studies have attempted to dispel the confusion that exists in regard to the definition, analysis and interpretation of interaction in genetics, there still remain aspects that are poorly understood by non-statisticians. After a brief discussion of the definition of gene-gene interaction, the main part of this study addresses the fundamental meaning of statistical interaction and its relationship to measurement scale, disproportionate sample sizes in the cells of a two-way table and gametic phase disequilibrium.
DOI: 10.1016/j.ajhg.2011.08.001
2011
Cited 123 times
A Variant in MCF2L Is Associated with Osteoarthritis
Osteoarthritis (OA) is a prevalent, heritable degenerative joint disease with a substantial public health impact. We used a 1000-Genomes-Project-based imputation in a genome-wide association scan for osteoarthritis (3177 OA cases and 4894 controls) to detect a previously unidentified risk locus. We discovered a small disease-associated set of variants on chromosome 13. Through large-scale replication, we establish a robust association with SNPs in MCF2L (rs11842874, combined odds ratio [95% confidence interval] 1.17 [1.11–1.23], p = 2.1 × 10⁻⁸) across a total of 19,041 OA cases and 24,504 controls of European descent. This risk locus represents the third established signal for OA overall. MCF2L regulates a nerve growth factor (NGF), and treatment with a humanized monoclonal antibody against NGF is associated with reduction in pain and improvement in function for knee OA patients.
DOI: 10.1371/journal.pgen.1001371
2011
Cited 123 times
Enhanced Statistical Tests for GWAS in Admixed Populations: Assessment using African Americans from CARe and a Breast Cancer Consortium
While genome-wide association studies (GWAS) have primarily examined populations of European ancestry, more recent studies often involve additional populations, including admixed populations such as African Americans and Latinos. In admixed populations, linkage disequilibrium (LD) exists both at a fine scale in ancestral populations and at a coarse scale (admixture-LD) due to chromosomal segments of distinct ancestry. Disease association statistics in admixed populations have previously considered SNP association (LD mapping) or admixture association (mapping by admixture-LD), but not both. Here, we introduce a new statistical framework for combining SNP and admixture association in case-control studies, as well as methods for local ancestry-aware imputation. We illustrate the gain in statistical power achieved by these methods by analyzing data of 6,209 unrelated African Americans from the CARe project genotyped on the Affymetrix 6.0 chip, in conjunction with both simulated and real phenotypes, as well as by analyzing the FGFR2 locus using breast cancer GWAS data from 5,761 African-American women. We show that, at typed SNPs, our method yields an 8% increase in statistical power for finding disease risk loci compared to the power achieved by standard methods in case-control studies. At imputed SNPs, we observe an 11% increase in statistical power for mapping disease loci when our local ancestry-aware imputation framework and the new scoring statistic are jointly employed. Finally, we show that our method increases statistical power in regions harboring the causal SNP in the case when the causal SNP is untyped and cannot be imputed. Our methods and our publicly available software are broadly applicable to GWAS in admixed populations.
DOI: 10.1167/iovs.16-20243
2016
Cited 121 times
Age-Specific Prevalence of Visual Impairment and Refractive Error in Children Aged 3–10 Years in Shanghai, China
We assessed changes in age-specific prevalence of refractive error at the time of starting school, by comparing preschool and school age cohorts in Shanghai, China. A cross-sectional study was done in Jiading District, Shanghai during November and December 2013. We randomly selected 7 kindergartens and 7 primary schools, with probability proportionate to size. Chinese children (n = 8398) aged 3 to 10 years were enumerated, and 8267 (98.4%) were included. Children underwent distance visual acuity assessment and refraction measurement by cycloplegic autorefraction and subjective refraction. The prevalence of uncorrected visual acuity (UCVA), presenting visual acuity, and best-corrected visual acuity in the better eye of ≤20/40 was 19.8%, 15.5%, and 1.7%, respectively. Among those with UCVA ≤ 20/40, 93.2% could achieve visual acuity of ≥20/32 with refraction. Only 28.7% (n = 465) of children with UCVA in the better eye of ≤20/40 wore glasses. Prevalence of myopia (spherical equivalent ≤-0.5 diopters [D] in at least one eye) increased from 1.78% in 3-year-olds to 52.2% in 10-year-olds, while prevalence of hyperopia (spherical equivalent ≥+2.0 D) decreased from 17.8% among 3-year-olds to 2.6% by 10 years of age. After adjusting for age, attending elite "high-level" school was statistically associated with greater myopia prevalence. The prevalence of myopia was lower or comparable to that reported in other populations from age 3 to 5 years, but increased dramatically after 6 years, consistent with a strong environmental role of schooling on myopia development.
DOI: 10.1002/gepi.20449
2009
Cited 118 times
Detecting rare variants for complex traits using family and unrelated data
Large genome-wide association studies (GWAS) have been performed to detect common genetic variants involved in common diseases, but most of the variants found this way account for only a small portion of the trait variance. Furthermore, candidate gene-based resequencing suggests that many rare genetic variants contribute to the trait variance of common diseases. Here we propose two designs, sibpair and unrelated-case designs, to detect rare genetic variants in either a candidate gene-based or genome-wide association analysis. First we show that we can detect and classify together rare risk haplotypes using a relatively small sample with either of these designs, and then have increased power to test association in a larger case-control sample. This method can also be applied to resequencing data. Next we apply the method to the Wellcome Trust Case Control Consortium (WTCCC) coronary artery disease (CAD) and hypertension (HT) data, the latter being the only trait for which no genome-wide association evidence was reported in the original WTCCC study, and identify one interesting gene associated with HT and four associated with CAD at a genome-wide significance level of 5%. These results suggest that searching for rare genetic variants is feasible and can be fruitful in current GWAS, candidate gene studies or resequencing studies.
DOI: 10.1016/j.ajhg.2013.04.025
2013
Cited 116 times
Genome-wide Characterization of Shared and Distinct Genetic Components that Influence Blood Lipid Levels in Ethnically Diverse Human Populations
Blood lipid concentrations are heritable risk factors associated with atherosclerosis and cardiovascular diseases. Lipid traits exhibit considerable variation among populations of distinct ancestral origin as well as between individuals within a population. We performed association analyses to identify genetic loci influencing lipid concentrations in African American and Hispanic American women in the Women's Health Initiative SNP Health Association Resource. We validated one African-specific high-density lipoprotein cholesterol locus at CD36 as well as 14 known lipid loci that have been previously implicated in studies of European populations. Moreover, we demonstrate striking similarities in genetic architecture (loci influencing the trait, direction and magnitude of genetic effects, and proportions of phenotypic variation explained) of lipid traits across populations. In particular, we found that a disproportionate fraction of lipid variation in African Americans and Hispanic Americans can be attributed to genomic loci exhibiting statistical evidence of association in Europeans, even though the precise genes and variants remain unknown. At the same time, we found substantial allelic heterogeneity within shared loci, characterized both by population-specific rare variants and variants shared among multiple populations that occur at disparate frequencies. The allelic heterogeneity emphasizes the importance of including diverse populations in future genetic association studies of complex traits such as lipids; furthermore, the overlap in lipid loci across populations of diverse ancestral origin argues that additional knowledge can be gleaned from multiple populations.
DOI: 10.1016/j.jff.2014.06.009
2014
Cited 114 times
Optimizing soaking and germination conditions to improve gamma-aminobutyric acid content in japonica and indica germinated brown rice
Germinated brown rice is a well-known functional food due to its high content of gamma-aminobutyric acid (GABA). This study was designed to test the difference in GABA production between two domesticated rice genotypes (indica and japonica rice), and the effects of adding exogenous glutamic acid or gibberellin and of processing conditions. Soaking at 30 °C and germination at 35 °C for 36 h resulted in the highest GABA content in distilled soaking water with pH 7. The indica rice showed higher GABA levels than japonica rice. GABA was increased under acidic soaking conditions or by adding L-glutamic acid (L-Glu) at the optimal concentration of 1.0 g L−1 and gibberellin A3 (GA3) at the optimal concentration of 0.25 mg L−1. The lower accumulation of GABA in japonica rice could be remedied by adding exogenous L-Glu and GA3, and providing acidic soaking conditions. The results help to efficiently produce GABA-enriched functional food.
DOI: 10.1038/ncomms9658
2015
Cited 114 times
Sixteen new lung function signals identified through 1000 Genomes Project reference panel imputation
Lung function measures are used in the diagnosis of chronic obstructive pulmonary disease. In 38,199 European ancestry individuals, we studied genome-wide association of forced expiratory volume in 1 s (FEV1), forced vital capacity (FVC) and FEV1/FVC with 1000 Genomes Project (phase 1)-imputed genotypes and followed up top associations in 54,550 Europeans. We identify 14 novel loci (P < 5 × 10⁻⁸) in or near ENSA, RNU5F-1, KCNS3, AK097794, ASTN2, LHX3, CCDC91, TBX3, TRIP11, RIN3, TEKT5, LTBP4, MN1 and AP1S2, and two novel signals at known loci NPNT and GPR126, providing a basis for new understanding of the genetic determinants of these traits and pulmonary diseases in which they are altered.
DOI: 10.1164/rccm.201512-2431oc
2016
Cited 111 times
Genetic Associations with Obstructive Sleep Apnea Traits in Hispanic/Latino Americans
Obstructive sleep apnea is a common disorder associated with increased risk for cardiovascular disease, diabetes, and premature mortality. Although there is strong clinical and epidemiologic evidence supporting the importance of genetic factors in influencing obstructive sleep apnea, its genetic basis is still largely unknown. Prior genetic studies focused on traits defined using the apnea-hypopnea index, which contains limited information on potentially important genetically determined physiologic factors, such as propensity for hypoxemia and respiratory arousability. To define novel genetic risk loci for obstructive sleep apnea, we conducted genome-wide association studies of quantitative traits in Hispanic/Latino Americans from three cohorts. Genome-wide data from as many as 12,558 participants in the Hispanic Community Health Study/Study of Latinos, Multi-Ethnic Study of Atherosclerosis, and Starr County Health Studies population-based cohorts were meta-analyzed for association with the apnea-hypopnea index, average oxygen saturation during sleep, and average respiratory event duration. Two novel loci were identified at genome-level significance (rs11691765, GPR83, P = 1.90 × 10⁻⁸ for the apnea-hypopnea index, and rs35424364, C6ORF183/CCDC162P, P = 4.88 × 10⁻⁸ for respiratory event duration) and seven additional loci were identified with suggestive significance (P < 5 × 10⁻⁷). Secondary sex-stratified analyses also identified one significant and several suggestive associations. Multiple loci overlapped genes with biologic plausibility. These are the first genome-level significant findings reported for obstructive sleep apnea-related physiologic traits in any population. These findings identify novel associations in inflammatory, hypoxia signaling, and sleep pathways.
DOI: 10.1371/journal.pgen.1006719
2017
Cited 103 times
Discovery and fine-mapping of adiposity loci using high density imputation of genome-wide association studies in individuals of African ancestry: African Ancestry Anthropometry Genetics Consortium
Genome-wide association studies (GWAS) have identified >300 loci associated with measures of adiposity including body mass index (BMI) and waist-to-hip ratio (adjusted for BMI, WHRadjBMI), but few have been identified through screening of the African ancestry genomes. We performed large scale meta-analyses and replications in up to 52,895 individuals for BMI and up to 23,095 individuals for WHRadjBMI from the African Ancestry Anthropometry Genetics Consortium (AAAGC) using 1000 Genomes phase 1 imputed GWAS to improve coverage of both common and low frequency variants in the low linkage disequilibrium African ancestry genomes. In the sex-combined analyses, we identified one novel locus (TCF7L2/HABP2) for WHRadjBMI and eight previously established loci at P < 5 × 10⁻⁸: seven for BMI, and one for WHRadjBMI in African ancestry individuals. An additional novel locus (SPRYD7/DLEU2) was identified for WHRadjBMI when combined with European GWAS. In the sex-stratified analyses, we identified three novel loci for BMI (INTS10/LPL and MLC1 in men, IRX4/IRX2 in women) and four for WHRadjBMI (SSX2IP, CASC8, PDE3B and ZDHHC1/HSD11B2 in women) in individuals of African ancestry or both African and European ancestry. For four of the novel variants, the minor allele frequency was low (<5%). In the trans-ethnic fine mapping of 47 BMI loci and 27 WHRadjBMI loci that were locus-wide significant (P < 0.05 adjusted for effective number of variants per locus) from the African ancestry sex-combined and sex-stratified analyses, 26 BMI loci and 17 WHRadjBMI loci contained ≤ 20 variants in the credible sets that jointly account for 99% posterior probability of driving the associations. The lead variants in 13 of these loci had a high probability of being causal.
As compared to our previous HapMap imputed GWAS for BMI and WHRadjBMI including up to 71,412 and 27,350 African ancestry individuals, respectively, our results suggest that 1000 Genomes imputation showed modest improvement in identifying GWAS loci including low frequency variants. Trans-ethnic meta-analyses further improved fine mapping of putative causal variants in loci shared between the African and European ancestry populations.
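The 99% credible sets mentioned in the abstract above can be illustrated with a minimal sketch. It assumes per-variant posterior probabilities are already available for a locus; the function name, variant labels, and probabilities below are hypothetical and are not taken from the paper, whose fine-mapping method is not described here in enough detail to reproduce.

```python
def credible_set(posteriors, coverage=0.99):
    """Smallest set of variants whose posterior probabilities sum to >= coverage.

    posteriors: dict mapping a variant label to its posterior probability
    of driving the association (values assumed to sum to about 1).
    """
    # Rank variants from most to least probable.
    ranked = sorted(posteriors.items(), key=lambda kv: kv[1], reverse=True)
    chosen, total = [], 0.0
    for variant, prob in ranked:
        chosen.append(variant)
        total += prob
        if total >= coverage:
            break
    return chosen


# Hypothetical locus with four variants (illustrative numbers only).
example = {"v1": 0.90, "v2": 0.07, "v3": 0.025, "v4": 0.005}
selected = credible_set(example)
print(selected)
```

A smaller credible set means the association signal is localized to fewer candidate variants, which is why the abstract reports how many loci have sets of 20 variants or fewer.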
DOI: 10.1126/scitranslmed.aad3744
2016
Cited 99 times
A genomic approach to therapeutic target validation identifies a glucose-lowering GLP1R variant protective for coronary heart disease
A missense variant in GLP1R associated with lower fasting glucose levels and protective against T2D is associated with lower risk of coronary heart disease, suggesting that GLP1R agonists are not associated with an unacceptable increase in cardiovascular risk.
DOI: 10.1212/wnl.0000000000001606
2015
Cited 93 times
Shared genetic basis for migraine and ischemic stroke
To quantify genetic overlap between migraine and ischemic stroke (IS) with respect to common genetic variation. We applied 4 different approaches to large-scale meta-analyses of genome-wide data on migraine (23,285 cases and 95,425 controls) and IS (12,389 cases and 62,004 controls). First, we queried known genome-wide significant loci for both disorders, looking for potential overlap of signals. We then analyzed the overall shared genetic load using polygenic scores and estimated the genetic correlation between disease subtypes using data derived from these models. We further interrogated genomic regions of shared risk using analysis of covariance patterns between the 2 phenotypes using cross-phenotype spatial mapping. We found substantial genetic overlap between migraine and IS using all 4 approaches. Migraine without aura (MO) showed much stronger overlap with IS and its subtypes than migraine with aura (MA). The strongest overlap existed between MO and large artery stroke (LAS; p = 6.4 × 10⁻²⁸ for the LAS polygenic score in MO) and between MO and cardioembolic stroke (CE; p = 2.7 × 10⁻²⁰ for the CE score in MO). Our findings indicate shared genetic susceptibility to migraine and IS, with a particularly strong overlap between MO and both LAS and CE pointing towards shared mechanisms. Our observations on MA are consistent with a limited role of common genetic variants in this subtype.
DOI: 10.1186/s12864-015-2316-4
2016
Cited 91 times
Updated sesame genome assembly and fine mapping of plant height and seed coat color QTLs using a new high-density genetic map
Sesame is an important high-quality oil seed crop. The sesame genome was de novo sequenced and assembled in 2014 (version 1.0); however, the number of anchored pseudomolecules was higher than the chromosome number (2n = 2x = 26) due to the lack of a high-density genetic map with 13 linkage groups. We resequenced a permanent population consisting of 430 recombinant inbred lines and constructed a genetic map to improve the sesame genome assembly. We successfully anchored 327 scaffolds onto 13 pseudomolecules. The new genome assembly (version 2.0) included 97.5 % of the scaffolds greater than 150 kb in size present in assembly version 1.0 and increased the total pseudomolecule length from 233.7 to 258.4 Mb with 94.3 % of the genome assembled and 97.2 % of the predicted gene models anchored. Based on the new genome assembly, a bin map including 1,522 bins spanning 1090.99 cM was generated and used to identify 41 quantitative trait loci (QTLs) for sesame plant height and 9 for seed coat color. The plant height-related QTLs explained 3-24 % of the phenotypic variation (mean value, 8 %), and 29 of them were detected in at least two field trials. Two major loci (qPH-8.2 and qPH-3.3) that contributed 23 and 18 % of the plant height were located in 350 and 928-kb spaces on Chr8 and Chr3, respectively. qPH-3.3 is predicted to be responsible for the semi-dwarf sesame plant phenotype and contains 102 candidate genes. This is the first report of a sesame semi-dwarf locus and provides an interesting opportunity for a plant architecture study of the sesame. For the sesame seed coat color, the QTLs of the color spaces L*, a*, and b* were detected with contribution rates of 3-46 %. qSCb-4.1 contributed approximately 39 % of the b* value and was located on Chr4 in a 199.9-kb space.
A list of 32 candidate genes for the locus, including a predicted black seed coat-related gene, was determined by screening the newly anchored genome. This study offers a high-density genetic map and an improved assembly of the sesame genome. The number of linkage groups and pseudomolecules in this assembly equals the number of sesame chromosomes for the first time. The map and updated genome assembly are expected to serve as a platform for future comparative genomics and genetic studies.
DOI: 10.1371/journal.pgen.1006728
2017
Cited 89 times
Single-trait and multi-trait genome-wide association analyses identify novel loci for blood pressure in African-ancestry populations
Hypertension is a leading cause of global disease, mortality, and disability. While individuals of African descent suffer a disproportionate burden of hypertension and its complications, they have been underrepresented in genetic studies. To identify novel susceptibility loci for blood pressure and hypertension in people of African ancestry, we performed both single and multiple-trait genome-wide association analyses. We analyzed 21 genome-wide association studies comprised of 31,968 individuals of African ancestry, and validated our results with an additional 54,395 individuals from multi-ethnic studies. These analyses identified nine loci with eleven independent variants which reached genome-wide significance (P < 1.25 × 10⁻⁸) for systolic or diastolic blood pressure, hypertension, or combined traits. Single-trait analyses identified two loci (TARID/TCF21 and LLPH/TMBIM4) and multiple-trait analyses identified one novel locus (FRMD3) for blood pressure. At these three loci, as well as at GRP20/CDH17, associated variants had alleles common only in African-ancestry populations. Functional annotation showed enrichment for genes expressed in immune and kidney cells, as well as in heart and vascular cells/tissues. Experiments driven by these findings and using angiotensin-II induced hypertension in mice showed altered kidney mRNA expression of six genes, suggesting their potential role in hypertension. Our study provides new evidence for genes related to hypertension susceptibility, and the need to study African-ancestry populations in order to identify biologic factors contributing to hypertension.
DOI: 10.1165/rcmb.2017-0237oc
2018
Cited 69 times
Multiethnic Meta-Analysis Identifies RAI1 as a Possible Obstructive Sleep Apnea–related Quantitative Trait Locus in Men
Obstructive sleep apnea (OSA) is a common heritable disorder displaying marked sexual dimorphism in disease prevalence and progression. Previous genetic association studies have identified a few genetic loci associated with OSA and related quantitative traits, but they have only focused on single ethnic groups, and a large proportion of the heritability remains unexplained. The apnea-hypopnea index (AHI) is a commonly used quantitative measure characterizing OSA severity. Because OSA differs by sex, and the pathophysiology of obstructive events differs in rapid eye movement (REM) and non-REM (NREM) sleep, we hypothesized that additional genetic association signals would be identified by analyzing the NREM/REM-specific AHI and by conducting sex-specific analyses in multiethnic samples. We performed genome-wide association tests for up to 19,733 participants of African, Asian, European, and Hispanic/Latino American ancestry in 7 studies. We identified rs12936587 on chromosome 17 as a possible quantitative trait locus for NREM AHI in men (N = 6,737; P = 1.7 × 10⁻⁸) but not in women (P = 0.77). The association with NREM AHI was replicated in a physiological research study (N = 67; P = 0.047). This locus overlapping the RAI1 gene and encompassing genes PEMT1, SREBF1, and RASD1 was previously reported to be associated with coronary artery disease, lipid metabolism, and implicated in Potocki-Lupski syndrome and Smith-Magenis syndrome, which are characterized by abnormal sleep phenotypes. We also identified gene-by-sex interactions in suggestive association regions, suggesting that genetic variants for AHI appear to vary by sex, consistent with the clinical observations of strong sexual dimorphism.
DOI: 10.1212/wnl.0000000000013120
2022
Cited 30 times
Cardiovascular Risk Factors and MRI Markers of Cerebral Small Vessel Disease
Cardiovascular risk factors have been implicated in the etiology of cerebral small vessel disease (CSVD); however, whether the associations are causal remains unclear in part due to the susceptibility of observational studies to reverse causation and confounding. Here, we use Mendelian randomization (MR) to determine which cardiovascular risk factors are likely to be involved in the etiology of CSVD. We used data from large-scale genome-wide association studies of European ancestry to identify genetic proxies for blood pressure, blood lipids, body mass index (BMI), type 2 diabetes, smoking initiation, cigarettes per day, and alcohol consumption. MR was performed to assess their association with 3 neuroimaging features that are altered in CSVD (white matter hyperintensities [WMH], fractional anisotropy [FA], and mean diffusivity [MD]) using genetic summary data from the UK Biobank (N = 31,855). Our primary analysis used inverse-weighted median MR, with validation using weighted median, MR-Egger, and a pleiotropy-minimizing approach. Finally, multivariable MR was performed to study the effects of multiple risk factors jointly. MR analysis showed consistent associations across all methods for higher genetically proxied systolic and diastolic blood pressures with WMH, FA, and MD and for higher genetically proxied BMI with WMH. There was weaker evidence for associations between total cholesterol, low-density lipoprotein, smoking initiation, pulse pressure, and type 2 diabetes liability and at least 1 CSVD imaging feature, but these associations were not reproducible across all validation methods used. Multivariable MR analysis for blood pressure traits found that the effect was primarily through genetically proxied diastolic blood pressure across all CSVD traits. Genetic predisposition to higher blood pressure, primarily diastolic blood pressure, and to higher BMI is associated with a higher burden of CSVD, suggesting a causal role.
Improved management and treatment of these risk factors could reduce the burden of CSVD.
DOI: 10.1086/340362
2002
Cited 145 times
A Combined Analysis of Genomewide Linkage Scans for Body Mass Index, from the National Heart, Lung, and Blood Institute Family Blood Pressure Program
A combined analysis of genome scans for obesity was undertaken using the interim results from the National Heart, Lung, and Blood Institute Family Blood Pressure Program. In this research project, four multicenter networks of investigators conducted eight individual studies. Data were available on 6,849 individuals from four ethnic groups (white, black, Mexican American, and Asian). The sample represents the largest single collection of genomewide scan data that has been analyzed for obesity and provides a test of the reproducibility of linkage analysis for a complex phenotype. Body mass index (BMI) was used as the measure of adiposity. Genomewide linkage analyses were first performed separately in each of the eight ethnic groups in the four networks, through use of the variance-component method. Only one region in the analyses of the individual studies showed significant linkage with BMI: 3q22.1 (LOD 3.45, for the GENOA network black sample). Six additional regions were found with an associated LOD >2, including 3p24.1, 7p15.2, 7q22.3, 14q24.3, 16q12.2, and 17p11.2. Among these findings, the linkage at 7p15.2, 7q22.3, and 17p11.2 has been reported elsewhere. A modified Fisher's omnibus procedure was then used to combine the P values from each of the eight genome scans. A complementary approach to the meta-analysis was undertaken, combining the average allele-sharing identity by descent (pi) for whites, blacks, and Mexican Americans. Using this approach, we found strong linkage evidence for a quantitative-trait locus at 3q27 (marker D3S2427; LOD 3.40, P=.03). The same location has been shown to be linked with obesity-related traits and diabetes in at least two other studies.
These results (1) confirm the previously reported obesity-susceptibility loci on chromosomes 3, 7, and 17 and (2) demonstrate that combining samples from different studies can increase the power to detect common genes with a small-to-moderate effect, so long as the same gene has an effect in all samples considered.
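The abstract above combines P values across eight genome scans with a modified Fisher's omnibus procedure. The modification is not described here, so the sketch below shows only the standard, unmodified Fisher method as an illustration: the statistic −2 Σ ln p follows a chi-square distribution with 2k degrees of freedom under the null, assuming k independent tests. The function name is hypothetical.

```python
import math


def fisher_combined_p(p_values):
    """Combine independent P values with the standard Fisher method.

    Returns the combined P value. For an even number of degrees of
    freedom (2k), the chi-square survival function has a closed form,
    so no external statistics library is needed.
    """
    if not p_values:
        raise ValueError("at least one P value is required")
    if any(p <= 0.0 or p > 1.0 for p in p_values):
        raise ValueError("P values must lie in (0, 1]")
    k = len(p_values)
    half_stat = -sum(math.log(p) for p in p_values)  # equals X / 2
    # P(chi2 with 2k df > X) = exp(-X/2) * sum_{i<k} (X/2)^i / i!
    tail = sum(half_stat ** i / math.factorial(i) for i in range(k))
    return math.exp(-half_stat) * tail


# Illustrative input only: two scans that are each nominally significant.
print(fisher_combined_p([0.05, 0.05]))
```

With a single input the function returns that P value unchanged, which is a quick sanity check on the closed form.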
DOI: 10.1002/1098-2272(200101)20:1<57::aid-gepi6>3.0.co;2-5
2000
Cited 144 times
Transmission/disequilibrium tests for quantitative traits
Spielman et al. [1993] proposed a transmission-disequilibrium test (TDT), based on marker data collected on affected offspring and their parents, to test for linkage between a genetic marker and a binary trait provided there is allelic association. It has been shown that this TDT is powerful and is not affected by allelic association due to population stratification in the absence of linkage. For quantitative traits, George and Elston [1987] proposed a likelihood method to detect the effect of a candidate gene in pedigree data when familial correlations are present. This test will detect allelic association but will do so in the absence of linkage. In this paper, we investigate two new likelihood-ratio test statistics for multi-generational quantitative traits to test either for linkage in the presence of allelic association or for allelic association in the presence of linkage, such as may be due to linkage disequilibrium. We compare these two tests analytically and by simulation with respect to 1) the sample size required for the asymptotic null distributions to be valid and 2) their power to detect association in those cases in which they are not sensitive to population stratification unless linkage is present. In general, 80 nuclear families with two children each and at least one heterozygous parent, or the equivalent number of children in large pedigrees, are enough for the asymptotic null distribution of the proposed conditional and TDT methods to be valid. The theoretical power is close to the simulated power except for the case of a recessive allele with low frequency. A sampling strategy is proposed that dramatically improves power. Genet. Epidemiol. 20:57–74, 2001. © 2001 Wiley-Liss, Inc.
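For orientation, the original binary-trait TDT of Spielman et al. that the abstract above builds on can be sketched as follows. This is not the likelihood-ratio test for quantitative traits proposed in the paper. It assumes counts from heterozygous parents only, where b is the number of times the allele of interest was transmitted to an affected child and c the number of times it was not; the function name and the counts are hypothetical.

```python
import math


def tdt(transmitted, untransmitted):
    """McNemar-type TDT statistic and its 1-df chi-square P value.

    transmitted / untransmitted: counts of heterozygous parents who did
    or did not transmit the allele of interest to an affected child.
    """
    b, c = transmitted, untransmitted
    if b + c == 0:
        raise ValueError("no informative (heterozygous) transmissions")
    stat = (b - c) ** 2 / (b + c)
    # Survival function of chi-square with 1 df: erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Illustrative counts only.
stat, p_value = tdt(60, 40)
print(stat, p_value)
```

Because only transmissions from heterozygous parents enter the statistic, the test is protected against population stratification, which is the property the abstract highlights.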
DOI: 10.1002/gepi.210
2002
Cited 123 times
Association mapping, using a mixture model for complex traits
Association mapping for complex diseases using unrelated individuals can be more powerful than family-based analysis in many settings. In addition, this approach has major practical advantages, including greater efficiency in sample recruitment. Association mapping may lead to false-positive findings, however, if population stratification is not properly considered. In this paper, we propose a method that makes it possible to infer the number of subpopulations by a mixture model, using a set of independent genetic markers and then testing the association between a genetic marker and a trait. The proposed method can be effectively applied in the analysis of both qualitative and quantitative traits. Extensive simulations demonstrate that the method is valid in the presence of a population structure.
DOI: 10.1161/01.hyp.0000068681.69874.cb
2003
Cited 119 times
Associations Between Hypertension and Genes in the Renin-Angiotensin System
The genes of the renin-angiotensin system have been subjected to intense molecular scrutiny in cardiovascular disease studies, but their contribution to risk is still uncertain. In this study, we sampled 192 African American and 153 European American families (602 and 608 individuals, respectively) to evaluate the contribution of variations in genes that encode renin-angiotensin system components to susceptibility to hypertension. We genotyped 25 single-nucleotide polymorphisms in the renin-angiotensin system genes ACE, AGT, AGTR1, and REN. The family-based transmission/disequilibrium test was performed with each single-nucleotide polymorphism and with the multilocus haplotypes. Two individual single-nucleotide polymorphisms were significantly associated with hypertension among African Americans, and this result persisted when both groups were combined. The associations were confirmed in haplotype analysis for REN, AGTR1, and ACE in African Americans. Consistent but less significant evidence was found in European Americans. We also randomly sampled unrelated individuals across families to obtain 84 cases and 108 controls among the African Americans and 41 cases and 113 controls in the European Americans. Single-nucleotide polymorphism and haplotype analyses again showed consistent, albeit weaker, results. Thus, in this biracial population sample, we find evidence that interindividual variation in the renin-angiotensin system genes contributes to hypertension risk.
DOI: 10.1038/sj.jid.5700302
2006
Cited 106 times
Diminished Induction of Skin Fibrosis in Mice with MCP-1 Deficiency
Scar and fibrosis are often the end result of mechanical injury and inflammatory diseases. One chemokine that is repeatedly linked to fibrotic responses is monocyte chemoattractant protein-1 (MCP-1). We utilized a murine fibrosis model that produces dermal lesions similar to scleroderma to evaluate collagen fibrillogenesis in the absence of MCP-1. Dermal fibrosis was induced by subcutaneous injection of bleomycin into the dorsal skin of MCP-1−/− and wild-type C57BL/6 mice. After 4 weeks of daily injections, bleomycin treatment led to thickened collagen bundles with robust inflammation in the lesional dermis of wild-type mice. In contrast, the lesional skin of MCP-1−/− mice exhibited a dermal architecture similar to phosphate-buffered saline (PBS)-injected control and normal skin, with few inflammatory cells. Ultrastructural analysis of the lesional dermis from bleomycin-injected wild-type mice revealed markedly abnormal arrangement of collagen fibrils, with normal large diameter collagen fibrils replaced by small collagen fibrils of 41.5 nm. In comparison, the dermis of bleomycin-injected MCP-1−/− mice displayed a uniform pattern of fibril diameters that was similar to normal skin (average diameter 76.7 nm). The findings implicate MCP-1 as a key determinant in the development of skin fibrosis induced by bleomycin, and suggest that MCP-1 may influence collagen fiber formation in vivo.
DOI: 10.1164/rccm.201002-0192oc
2010
Cited 99 times
A Candidate Gene Study of Obstructive Sleep Apnea in European Americans and African Americans
Obstructive sleep apnea (OSA) is hypothesized to be influenced by genes within pathways involved with obesity, craniofacial development, inflammation, and ventilatory control. We conducted the first candidate gene study of OSA using family data from European Americans and African Americans, selecting biologically plausible genes from within these pathways. A total of 1,080 single nucleotide polymorphisms (SNPs) were genotyped in 729 African Americans and 505 SNPs were genotyped in 694 European Americans. Coding for SNPs additively, association testing on the apnea-hypopnea index (AHI) as a continuous trait, and OSA as a dichotomous trait (AHI ≥15) was conducted using methods that account for familial correlations in models adjusted for age, age-squared, and sex, with and without body mass index. In European Americans, variants within C-reactive protein (CRP) and glial cell line-derived neurotrophic factor (GDNF) were associated with AHI (CRP: β = 4.6; SE = 1.1; P = 0.0000402) (GDNF: β = 4.3; SE = 1; P = 0.0000201) and with the dichotomous OSA trait (CRP: odds ratio = 2.4; 95% confidence interval, 1.5-3.9; P = 0.000170) (GDNF: odds ratio = 2; 95% confidence interval, 1.4-2.89; P = 0.0000433). In African Americans, rs9526240 within serotonin receptor 2a (HTR2A: odds ratio = 2.1; 95% confidence interval, 1.5-2.9; P = 0.00005233) was associated with OSA. This candidate gene analysis identified the potential role of genes operating through intermediate disease pathways to influence sleep apnea phenotypes, providing a framework for focusing future replication studies.
DOI: 10.1371/journal.pgen.1002298
2011
Cited 95 times
Identification, Replication, and Fine-Mapping of Loci Associated with Adult Height in Individuals of African Ancestry
Adult height is a classic polygenic trait of high heritability (h² ≈ 0.8). More than 180 single nucleotide polymorphisms (SNPs), identified mostly in populations of European descent, are associated with height. These variants convey modest effects and explain approximately 10% of the variance in height. Discovery efforts in other populations, while limited, have revealed loci for height not previously implicated in individuals of European ancestry. Here, we performed a meta-analysis of genome-wide association (GWA) results for adult height in 20,427 individuals of African ancestry with replication in up to 16,436 African Americans. We found two novel height loci (Xp22-rs12393627, P = 3.4 × 10⁻¹² and 2p14-rs4315565, P = 1.2 × 10⁻⁸). As a group, height associations discovered in European-ancestry samples replicate in individuals of African ancestry (P = 1.7 × 10⁻⁴ for overall replication). Fine-mapping of the European height loci in African-ancestry individuals showed an enrichment of SNPs that are associated with expression of nearby genes when compared to the index European height SNPs (P < 0.01). Our results highlight the utility of genetic studies in non-European populations to understand the etiology of complex human diseases and traits.
DOI: 10.1093/hmg/ddq178
2010
Cited 86 times
Fine mapping of the association with obesity at the FTO locus in African-derived populations
Genome-wide association studies have identified many common genetic variants that are associated with polygenic traits, and have typically been performed with individuals of recent European ancestry. In these populations, many common variants are tightly correlated, with the perfect or near-perfect proxies for the functional or true variant showing equivalent evidence of association, considerably limiting the resolution of fine mapping. Populations with recent African ancestry often have less extensive and/or different patterns of linkage disequilibrium (LD), and have been proposed to be useful in fine-mapping studies. Here, we strongly replicate and fine map in populations of predominantly African ancestry the association between variation at the FTO locus and body mass index (BMI) that is well established in populations of European ancestry. We genotyped single nucleotide polymorphisms that are correlated with the signal of association in individuals of European ancestry but that have varying degrees of correlation in African-derived individuals. Most of the variants, including one previously proposed as functionally important, have no significant association with BMI, but two variants, rs3751812 and rs9941349, show strong evidence of association (P = 2.58 x 10(-6) and 3.61 x 10(-6) in a meta-analysis of 9881 individuals). Thus, we have both strongly replicated this association in African-ancestry populations and narrowed the list of potentially causal variants to those that are correlated with rs3751812 and rs9941349 in African-derived populations. This study illustrates the potential of using populations with different LD patterns to fine map associations and helps pave the way for genetically guided functional studies at the FTO locus.
DOI: 10.1016/j.ajhg.2011.07.025
2011
Cited 84 times
Genome-wide Comparison of African-Ancestry Populations from CARe and Other Cohorts Reveals Signals of Natural Selection
The study of recent natural selection in human populations has important applications to human history and medicine. Positive natural selection drives the increase in beneficial alleles and plays a role in explaining diversity across human populations. By discovering traits subject to positive selection, we can better understand the population level response to environmental pressures including infectious disease. Our study examines unusual population differentiation between three large data sets to detect natural selection. The populations examined, African Americans, Nigerians, and Gambians, are genetically close to one another (F(ST) < 0.01 for all pairs), allowing us to detect selection even with moderate changes in allele frequency. We also develop a tree-based method to pinpoint the population in which selection occurred, incorporating information across populations. Our genome-wide significant results corroborate loci previously reported to be under selection in Africans including HBB and CD36. At the HLA locus on chromosome 6, results suggest the existence of multiple, independent targets of population-specific selective pressure. In addition, we report a genome-wide significant (p = 1.36 × 10(-11)) signal of selection in the prostate stem cell antigen (PSCA) gene. The most significantly differentiated marker in our analysis, rs2920283, is highly differentiated in both Africa and East Asia and has prior genome-wide significant associations to bladder and gastric cancers.
DOI: 10.1093/hmg/ddr113
2011
Cited 82 times
Combined admixture mapping and association analysis identifies a novel blood pressure genetic locus on 5p13: contributions from the CARe consortium
Admixture mapping based on recently admixed populations is a powerful method to detect disease variants with substantial allele frequency differences in ancestral populations. We performed admixture mapping analysis for systolic blood pressure (SBP) and diastolic blood pressure (DBP), followed by trait-marker association analysis, in 6303 unrelated African-American participants of the Candidate Gene Association Resource (CARe) consortium. We identified five genomic regions (P< 0.001) harboring genetic variants contributing to inter-individual BP variation. In follow-up association analyses, correcting for all tests performed in this study, three loci were significantly associated with SBP and one significantly associated with DBP (P< 10(-5)). Further analyses suggested that six independent single-nucleotide polymorphisms (SNPs) contributed to the phenotypic variation observed in the admixture mapping analysis. These six SNPs were examined for replication in multiple, large, independent studies of African-Americans [Women's Health Initiative (WHI), Maywood, Genetic Epidemiology Network of Arteriopathy (GENOA) and Howard University Family Study (HUFS)] as well as one native African sample (Nigerian study), with a total replication sample size of 11 882. Meta-analysis of the replication set identified a novel variant (rs7726475) on chromosome 5 between the SUB1 and NPR3 genes, as being associated with SBP and DBP (P< 0.0015 for both); in meta-analyses combining the CARe samples with the replication data, we observed P-values of 4.45 × 10(-7) for SBP and 7.52 × 10(-7) for DBP for rs7726475 that were significant after accounting for all the tests performed. Our study highlights that admixture mapping analysis can help identify genetic variants missed by genome-wide association studies because of drastically reduced number of tests in the whole genome.
DOI: 10.1007/s00439-011-1009-6
2011
Cited 79 times
Two-marker association tests yield new disease associations for coronary artery disease and hypertension
It has been postulated that multiple-marker methods may have added ability, over single-marker methods, to detect genetic variants associated with disease. The Wellcome Trust Case Control Consortium (WTCCC) provided the first successful large genome-wide association studies (GWAS) which included single-marker association analyses for seven common complex diseases. Of those signals detected, only one was associated with coronary artery disease (CAD), and none were identified for hypertension (HTN). Our objective was to find additional genetic associations and pathways for cardiovascular disease by examining the WTCCC data for variants associated with CAD and HTN using two-marker testing methods. We applied two-marker association testing to the WTCCC dataset, which includes ~2,000 affected individuals with each disorder, and a shared pool of ~3,000 controls, all genotyped using Affymetrix GeneChip 500 K arrays. For CAD, we detected single nucleotide polymorphisms (SNP) pairs in three genes showing genome-wide significance: HFE2, STK32B, and DIPC2. The most notable SNP pairs in a non-protein-coding region were at 9p21, a known major CAD-associated region. For HTN, we detected SNP pairs in five genes: GPR39, XRCC4, MYO6, ZFAT, and MACROD2. Four further associated SNP pair regions were at least 70 kb from any known gene. We have shown that novel, multiple-marker, statistical methods can be of use in finding variants in GWAS. We describe many new, associated variants for both CAD and HTN and describe their known genetic mechanisms.
DOI: 10.1371/journal.pone.0048836
2012
Cited 66 times
Association of Genetic Loci with Sleep Apnea in European Americans and African-Americans: The Candidate Gene Association Resource (CARe)
Although obstructive sleep apnea (OSA) is known to have a strong familial basis, no genetic polymorphisms influencing apnea risk have been identified in cross-cohort analyses. We utilized the National Heart, Lung, and Blood Institute (NHLBI) Candidate Gene Association Resource (CARe) to identify sleep apnea susceptibility loci. Using a panel of 46,449 polymorphisms from roughly 2,100 candidate genes on a customized Illumina iSelect chip, we tested for association with the apnea hypopnea index (AHI) as well as moderate to severe OSA (AHI≥15) in 3,551 participants of the Cleveland Family Study and two cohorts participating in the Sleep Heart Health Study. Among 647 African-Americans, rs11126184 in the pleckstrin (PLEK) gene was associated with OSA while rs7030789 in the lysophosphatidic acid receptor 1 (LPAR1) gene was associated with AHI using a chip-wide significance threshold of p-value<2×10(-6). Among 2,904 individuals of European ancestry, rs1409986 in the prostaglandin E2 receptor (PTGER3) gene was significantly associated with OSA. Consistency of effects between rs7030789 and rs1409986 in LPAR1 and PTGER3 and apnea phenotypes were observed in independent clinic-based cohorts. Novel genetic loci for apnea phenotypes were identified through the use of customized gene chips and meta-analyses of cohort data with replication in clinic-based samples. The identified SNPs all lie in genes associated with inflammation suggesting inflammation may play a role in OSA pathogenesis.
DOI: 10.1007/s11032-016-0449-z
2016
Cited 60 times
Characterization of an IAA-glucose hydrolase gene TaTGW6 associated with grain weight in common wheat (Triticum aestivum L.)
DOI: 10.1016/j.ajhg.2016.05.006
2016
Cited 53 times
Trans-ethnic Meta-analysis and Functional Annotation Illuminates the Genetic Architecture of Fasting Glucose and Insulin
Knowledge of the genetic basis of the type 2 diabetes (T2D)-related quantitative traits fasting glucose (FG) and insulin (FI) in African ancestry (AA) individuals has been limited. In non-diabetic subjects of AA (n = 20,209) and European ancestry (EA; n = 57,292), we performed trans-ethnic (AA+EA) fine-mapping of 54 established EA FG or FI loci with detailed functional annotation, assessed their relevance in AA individuals, and sought previously undescribed loci through trans-ethnic (AA+EA) meta-analysis. We narrowed credible sets of variants driving association signals for 22/54 EA-associated loci; 18/22 credible sets overlapped with active islet-specific enhancers or transcription factor (TF) binding sites, and 21/22 contained at least one TF motif. Of the 54 EA-associated loci, 23 were shared between EA and AA. Replication with an additional 10,096 AA individuals identified two previously undescribed FI loci, chrX FAM133A (rs213676) and chr5 PELO (rs6450057). Trans-ethnic analyses with regulatory annotation illuminate the genetic architecture of glycemic traits and suggest gene regulation as a target to advance precision medicine for T2D. Our approach to utilize state-of-the-art functional annotation and implement trans-ethnic association analysis for discovery and fine-mapping offers a framework for further follow-up and characterization of GWAS signals of complex trait loci.
DOI: 10.1007/s40484-020-0216-3
2021
Cited 32 times
Mendelian randomization and pleiotropy analysis
Background: Mendelian randomization (MR) analysis has become popular in inferring and estimating the causality of an exposure on an outcome due to the success of genome wide association studies. Many statistical approaches have been developed and each of these methods requires specific assumptions. Results: In this article, we review the pros and cons of these methods. We use an example of high‐density lipoprotein cholesterol on coronary artery disease to illuminate the challenges in Mendelian randomization investigation. Conclusion: The currently available MR approaches allow us to study causality among risk factors and outcomes. However, novel approaches are desirable for overcoming multiple source confounding of risk factors and an outcome in MR analysis.
DOI: 10.1371/journal.pgen.1011037
2024
Searching across-cohort relatives in 54,092 GWAS samples via encrypted genotype regression
Explicitly sharing individual level data in genomics studies has many merits compared to sharing summary statistics, including more strict QCs, common statistical analyses, relative identification and improved statistical power in GWAS, but it is hampered by privacy or ethical constraints. In this study, we developed encG-reg, a regression approach that can detect relatives of various degrees based on encrypted genomic data, which is immune to ethical constraints. The encryption properties of encG-reg are based on random matrix theory, masking the original genotypic matrix without sacrificing precision of individual-level genotype data. We established a connection between the dimension of the random matrix that masks the genotype matrices and the required precision of a study for encrypted genotype data. encG-reg has false positive and false negative rates equivalent to sharing original individual level data, and is computationally efficient when searching for relatives. We split the UK Biobank into its respective centers, and then encrypted the genotype data. We observed that the relatives estimated using encG-reg were as accurate as those estimated by KING, which is widely used software but requires original genotype data. In a more complex application, we launched a finely devised multi-center collaboration across 5 research institutes in China, covering 9 cohorts of 54,092 GWAS samples. encG-reg again identified true relatives existing across the cohorts, even with different ethnic backgrounds and genotypic qualities. Our study clearly demonstrates that encrypted genomic data can be used for data sharing without loss of information or data sharing barriers.
DOI: 10.1016/s0002-9297(07)62945-0
2000
Cited 115 times
Localization of a Small Genomic Region Associated with Elevated ACE
Defining the relationship between multiple polymorphisms in a small genomic region and an underlying quantitative trait locus (QTL) represents a major challenge in human genetics. Pedigree analyses have shown that angiotensin I-converting enzyme (ACE) levels are influenced by a QTL located within or close to the ACE gene and most likely resides in the 3' region of this locus. We genotyped seven polymorphisms spanning 13 kb in the 3' end of ACE in 159 Afro-Caribbean subjects to evaluate the linkage disequilibrium between these sites and to narrow the genomic region associated with an elevated ACE level using a cladistic analysis. The linkage disequilibrium measurement D' and a haplotype tree revealed three distinct haplotype segments, presumably because of recombination. The value of the linkage disequilibrium parameter p(excess) was highest for site 22982, which is located in the middle segment. A series of nested, cladistic analyses confirmed that the other two regions are unlikely to be the ACE-linked QTL and that the variant resides in the middle region. Analyses of the same polymorphisms in 98 unrelated Europeans in the Monitoring Trends and Determinants in Cardiovascular Diseases (MONICA) study resulted in fewer haplotypes than were observed among the Afro-Caribbean subjects, suggesting that populations with greater genetic diversity may be especially informative for fine-scale mapping.
DOI: 10.1038/sj.ijo.0801650
2001
Cited 103 times
Heritability of obesity-related traits among Nigerians, Jamaicans and US black people
OBJECTIVE: The mean values for anthropometric traits vary across population groups and this variation is clearly determined for the most part by the environment. The familiality of anthropometric traits also varies in reports from different populations, although this variation has not been shown to follow a consistent pattern. To examine whether heritability is influenced by socio-cultural factors, we conducted a cross-cultural study of populations of the African diaspora. PARTICIPANTS: Data were collected on 1868 family members from Nigeria, 623 from Jamaica and 2132 from metropolitan Chicago, IL, USA. MEASUREMENTS: Height and weight were measured and body mass index (kg/m²) calculated. Fat-free mass, fat mass and percentage body fat were estimated using bioelectrical impedance analysis. Plasma leptin concentrations were also measured. The proportion of variance attributable to additive genetic and non-shared environmental components was estimated with the maximum likelihood variance decomposition method. RESULTS: Mean values for all anthropometric traits increased along the socio-cultural gradient, and obesity increased from 5% in Nigeria to 23% in Jamaica and 39% in the USA. Within populations the relationships among traits both within individuals and within families were highly consistent. Heritability estimates for weight, body mass index, fat mass and percentage body fat were approximately 50% for all groups. Heritability for height was lower in Nigeria (62%) than in Jamaica (74%) or the US (87%). CONCLUSION: The familial patterns of body size and energy storage appear to be consistent in these genetically related populations across a wide range of environmental conditions.
DOI: 10.1002/gepi.10196
2002
Cited 95 times
On a semiparametric test to detect associations between quantitative traits and candidate genes using unrelated individuals
Although genetic association studies using unrelated individuals may be subject to bias caused by population stratification, alternative methods that are robust to population stratification such as family-based association designs may be less powerful. Recently, various statistical methods robust to population stratification were proposed for association studies, using unrelated individuals to identify associations between candidate markers and traits of interest (both qualitative and quantitative). Here, we propose a semiparametric test for association (SPTA). SPTA controls for population stratification through a set of genomic markers by first deriving a genetic background variable for each sampled individual through his/her genotypes at a series of independent markers, and then modeling the relationship between trait values, genotypic scores at the candidate marker, and genetic background variables through a semiparametric model. We assume that the exact form of relationship between the trait value and the genetic background variable is unknown and estimated through smoothing techniques. We evaluate the performance of SPTA through simulations both with discrete subpopulation models and with continuous admixture population models. The simulation results suggest that our procedure has a correct type I error rate in the presence of population stratification and is more powerful than statistical association tests for family-based association designs in all the cases considered. Moreover, SPTA is more powerful than the Quantitative Similarity-Based Association Test (QSAT) developed by us under continuous admixture populations, and the number of independent markers needed by SPTA to control for population stratification is substantially fewer than that required by QSAT.
DOI: 10.1161/01.hyp.0000035708.02789.39
2002
Cited 91 times
Genome Scan Among Nigerians Linking Blood Pressure to Chromosomes 2, 3, and 19
An understanding of the genetic influences on hypertension would help unravel the pathophysiology of this complex disorder and improve our understanding of causal mechanisms. Contemporary technology makes it possible to examine enough genetic markers to support a generalized search across the entire genome for candidate regions. In the present study, a family set was recruited from southwest Nigeria, and 378 microsatellite markers were typed on 792 individuals in 196 families. Multipoint variance component analysis identified linkage signals (logarithm of the odds [LOD] 1.74, P<0.0023) for systolic blood pressure on 19p (D19S714) and 19q (D19S246), whereas for diastolic blood pressure, linkage was observed on 2p (D2S1790), 3p (D3S1304), 5q (D5S1462), 7p (D7S3046), 7q (D7S821), and 10q (D10S1221). Other regions of interest (1.18<LOD<1.74, 0.0023<P<0.01) were found on chromosomes 1, 6, 8, 9, and 11. These results provide additional evidence of linkage between blood pressure and several genomic regions reported in previous studies. Some of these regions additionally harbor hypertension candidate genes. Although evidence of linkage for blood pressure has been very slow to accumulate, even in comparison to other complex traits, the sum of current evidence appears to implicate, in particular, 2p, 3p, and 19p. Study designs that make it possible to confirm these results with association analysis and narrow the genomic interval are needed in order to make progress in this field.
DOI: 10.1007/s00439-006-0175-4
2006
Cited 84 times
Racial admixture and its impact on BMI and blood pressure in African and Mexican Americans
DOI: 10.1093/nar/gkm595
2007
Cited 81 times
A facilitated tracking and transcription mechanism of long-range enhancer function
In the human epsilon-globin gene locus, the HS2 enhancer in the Locus Control Region regulates transcription of the embryonic epsilon-globin gene located over 10 kb away. The mechanism of long-range HS2 enhancer function was not fully established. Here we show that the HS2 enhancer complex containing the enhancer DNA together with RNA polymerase II (pol II) and TBP tracks along the intervening DNA, synthesizing short, polyadenylated, intergenic RNAs to ultimately loop with the epsilon-globin promoter. Guided by this facilitated tracking and transcription mechanism, the HS2 enhancer delivers pol II and TBP to the cis-linked globin promoter to activate mRNA synthesis from the target gene. An insulator inserted in the intervening DNA between the enhancer and the promoter traps the enhancer DNA and the associated pol II and TBP at the insulator site, blocking mid-stream the facilitated tracking and transcription mechanism of the enhancer complex, thereby blocking long-range enhancer function.
DOI: 10.2337/db06-0407
2006
Cited 79 times
Common Variants in the ENPP1 Gene Are Not Reproducibly Associated With Diabetes or Obesity
The common missense single nucleotide polymorphism (SNP) K121Q in the ectoenzyme nucleotide pyrophosphate phosphodiesterase (ENPP1) gene has recently been associated with type 2 diabetes in Italian, U.S., and South-Asian populations. A three-SNP haplotype, including K121Q, has also been associated with obesity and type 2 diabetes in French and Austrian populations. We set out to confirm these findings in several large samples. We genotyped the haplotype K121Q (rs1044498), rs1799774, and rs7754561 in 8,676 individuals of European ancestry with and without type 2 diabetes, in 1,900 obese and 930 lean individuals of European ancestry from the U.S. and Poland, and in 1,101 African-American individuals. Neither the K121Q missense polymorphism nor the putative risk haplotype were significantly associated with type 2 diabetes or BMI. Two SNPs showed suggestive evidence of association in a meta-analysis of our European ancestry samples. These SNPs were rs7754561 with type 2 diabetes (odds ratio for the G-allele, 0.85 [95% CI 0.78-0.92], P = 0.00003) and rs1799774 with BMI (homozygotes of the delT-allele, 0.6 [0.42-0.88], P = 0.007). However, these findings are not supported by other studies. We did not observe a reproducible association between these three ENPP1 variants and BMI or type 2 diabetes.
DOI: 10.1093/bioinformatics/btq560
2010
Cited 68 times
Interrogating local population structure for fine mapping in genome-wide association studies
Adjustment for population structure is necessary to avoid bias in genetic association studies of susceptibility variants for complex diseases. Population structure may differ from one genomic region to another due to the variability of individual ancestry associated with migration, random genetic drift or natural selection. Current association methods for correcting population stratification usually involve adjustment of global ancestry between study subjects. We suggest interrogating local population structure for fine mapping to more accurately locate true causal genes by better adjusting the confounding effect due to local ancestry. By extensive simulations on genome-wide datasets, we show that adjusting global ancestry may lead to false positives when local population structure is an important confounding factor. In contrast, adjusting local ancestry can effectively prevent false positives due to local population structure and thus can improve fine mapping for disease gene localization. We applied the local and global adjustments to the analysis of datasets from three genome-wide association studies, including European Americans, African Americans and Nigerians. Both European Americans and African Americans demonstrate greater variability in local ancestry than Nigerians. Adjusting local ancestry successfully eliminated the known spurious association between SNPs in the LCT gene and height due to the population structure that exists in European Americans. Contact: xiaofeng.zhu@case.edu. Supplementary information: Supplementary data are available at Bioinformatics online.
DOI: 10.1016/s0065-2660(07)00419-1
2008
Cited 67 times
Admixture Mapping and the Role of Population Structure for Localizing Disease Genes
Admixture mapping, or mapping by admixture linkage disequilibrium, is a disease mapping strategy that has gained considerable popularity in recent years. It exploits the long-range linkage disequilibrium generated by admixture between genetically distinct ancestral populations. Compared to case-control association designs, admixture mapping requires fewer markers, and is more robust to allelic heterogeneity. At the same time, admixture mapping can be more powerful, and can achieve higher mapping resolution than traditional linkage studies, provided that the underlying trait variants occur at sufficiently different frequencies in the ancestral populations. In this chapter, we describe the recent methodology and software development, review successful applications, and comment on the future of this approach.
DOI: 10.1073/pnas.1004139107
2010
Cited 67 times
Long-range function of an intergenic retrotransposon
Retrotransposons including endogenous retroviruses and their solitary long terminal repeats (LTRs) compose >40% of the human genome. Many of them are located in intergenic regions far from genes. Whether these intergenic retrotransposons serve beneficial host functions is not known. Here we show that an LTR retrotransposon of ERV-9 human endogenous retrovirus located 40-70 kb upstream of the human fetal gamma- and adult beta-globin genes serves a long-range, host function. The ERV-9 LTR contains multiple CCAAT and GATA motifs and competitively recruits a high concentration of NF-Y and GATA-2 present in low abundance in adult erythroid cells to assemble an LTR/RNA polymerase II complex. The LTR complex transcribes intergenic RNAs unidirectionally through the intervening DNA to loop with and modulate transcription factor occupancies at the far downstream globin promoters, thereby modulating globin gene switching by a competitive mechanism.
DOI: 10.1093/bioinformatics/btq709
2010
Cited 63 times
Adjustment for local ancestry in genetic association analysis of admixed populations
Motivation: Admixed populations offer a unique opportunity for mapping diseases that have large disease allele frequency differences between ancestral populations. However, association analysis in such populations is challenging because population stratification may lead to association with loci unlinked to the disease locus. Methods and results: We show that local ancestry at a test single nucleotide polymorphism (SNP) may confound with the association signal and ignoring it can lead to spurious association. We demonstrate theoretically that adjustment for local ancestry at the test SNP is sufficient to remove the spurious association regardless of the mechanism of population stratification, whether due to local or global ancestry differences among study subjects; however, global ancestry adjustment procedures may not be effective. We further develop two novel association tests that adjust for local ancestry. Our first test is based on a conditional likelihood framework which models the distribution of the test SNP given disease status and flanking marker genotypes. A key advantage of this test lies in its ability to incorporate different directions of association in the ancestral populations. Our second test, which is computationally simpler, is based on logistic regression, with adjustment for local ancestry proportion. We conducted extensive simulations and found that the Type I error rates of our tests are under control; however, the global adjustment procedures yielded inflated Type I error rates when stratification is due to local ancestry difference. Contact: mingyao@upenn.edu; chun.li@vanderbilt.edu. Supplementary information: Supplementary data are available at Bioinformatics online.
DOI: 10.1371/journal.pone.0019166
2011
Cited 58 times
Genetic Background of Patients from a University Medical Center in Manhattan: Implications for Personalized Medicine
The rapid progress currently being made in genomic science has created interest in potential clinical applications; however, formal translational research has been limited thus far. Studies of population genetics have demonstrated substantial variation in allele frequencies and haplotype structure at loci of medical relevance and the genetic background of patient cohorts may often be complex. To describe the heterogeneity in an unselected clinical sample we used the Affymetrix 6.0 gene array chip to genotype self-identified European Americans (N = 326), African Americans (N = 324) and Hispanics (N = 327) from the medical practice of Mount Sinai Medical Center in Manhattan, NY. Additional data from US minority groups and Brazil were used for external comparison. Substantial variation in ancestral origin was observed for both African Americans and Hispanics; data from the latter group overlapped with both Mexican Americans and Brazilians in the external data sets. A pooled analysis of the African Americans and Hispanics from NY demonstrated a broad continuum of ancestral origin making classification by race/ethnicity uninformative. Selected loci harboring variants associated with medical traits and drug response confirmed substantial within- and between-group heterogeneity. As a consequence of these complementary levels of heterogeneity group labels offered no guidance at the individual level. These findings demonstrate the complexity involved in clinical translation of the results from genome-wide association studies and suggest that in the genomic era conventional racial/ethnic labels are of little value.
DOI: 10.1002/gepi.21763
2013
Cited 53 times
GEE‐Based SNP Set Association Test for Continuous and Discrete Traits in Family‐Based Association Studies
Family‐based genetic association studies of related individuals provide opportunities to detect genetic variants that complement studies of unrelated individuals. Most statistical methods for family association studies for common variants are single marker based, which test one SNP a time. In this paper, we consider testing the effect of an SNP set, e.g., SNPs in a gene, in family studies, for both continuous and discrete traits. Specifically, we propose a generalized estimating equations (GEEs) based kernel association test, a variance component based testing method, to test for the association between a phenotype and multiple variants in an SNP set jointly using family samples. The proposed approach allows for both continuous and discrete traits, where the correlation among family members is taken into account through the use of an empirical covariance estimator. We derive the theoretical distribution of the proposed statistic under the null and develop analytical methods to calculate the P ‐values. We also propose an efficient resampling method for correcting for small sample size bias in family studies. The proposed method allows for easily incorporating covariates and SNP‐SNP interactions. Simulation studies show that the proposed method properly controls for type I error rates under both random and ascertained sampling schemes in family studies. We demonstrate through simulation studies that our approach has superior performance for association mapping compared to the single marker based minimum P ‐value GEE test for an SNP‐set effect over a range of scenarios. We illustrate the application of the proposed method using data from the Cleveland Family GWAS Study.
DOI: 10.1371/journal.pone.0033013
2012
Cited 52 times
Comparison of Blue Light-Filtering IOLs and UV Light-Filtering IOLs for Cataract Surgery: A Meta-Analysis
A number of published randomized controlled trials have been conducted to evaluate visual performance of blue light-filtering intraocular lenses (IOLs) and UV light-filtering IOLs after cataract phacoemulsification surgery. However, results have not always been consistent. Therefore, we carried out a meta-analysis to compare the effectiveness of blue light-filtering IOLs versus UV light-filtering IOLs in cataract surgery. Comprehensive searches of PubMed, Embase, Cochrane Library and the Chinese BioMedical literature databases were performed using web-based search engines. Fifteen trials (1690 eyes) were included for systematic review, and 11 of 15 studies were included in this meta-analysis. The results showed that there were no significant differences in postoperative mean best corrected visual acuity, contrast sensitivity, overall color vision, or in the blue light spectrum under photopic light conditions between blue light-filtering IOLs and UV light-filtering IOLs [WMD = -0.01, 95%CI (-0.03, 0.01), P = 0.46; WMD = 0.07, 95%CI (-0.04, 0.19), P = 0.20; SMD = 0.14, 95%CI (-0.33, 0.60), P = 0.566; SMD = 0.20, 95%CI (-0.04, 0.43), P = 0.099]. However, color vision with blue light-filtering IOLs was significantly reduced in the blue light spectrum under mesopic light conditions [SMD = 0.74, 95%CI (0.29, 1.18), P = 0.001]. This meta-analysis demonstrates that postoperative visual performance with blue light-filtering IOLs is approximately equal to that of UV light-filtering IOLs after cataract surgery, but color vision with blue light-filtering IOLs demonstrated some compromise in the blue light spectrum under mesopic light conditions.
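The pooled WMD values quoted in the abstract above are the kind of quantity produced by inverse-variance weighting. The abstract does not say which pooling model the authors used, so the sketch below assumes a fixed-effect model; the function name `pool_fixed_effect` and the three per-study values are hypothetical and are not taken from the paper.

```python
import math


def pool_fixed_effect(effects, std_errors, z=1.96):
    """Inverse-variance (fixed-effect) pooling of per-study effect sizes.

    Assumption: every study estimates one common effect, so each study is
    weighted by 1 / SE^2. This is an illustration, not the paper's method.
    """
    weights = [1.0 / se ** 2 for se in std_errors]
    total_weight = sum(weights)
    pooled = sum(w * e for w, e in zip(weights, effects)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return pooled, (pooled - z * pooled_se, pooled + z * pooled_se)


# Hypothetical per-study mean differences and standard errors (NOT from the paper).
study_effects = [-0.02, 0.00, -0.01]
study_std_errors = [0.02, 0.01, 0.02]

pooled_wmd, ci = pool_fixed_effect(study_effects, study_std_errors)
print(f"pooled WMD = {pooled_wmd:.3f}, 95% CI ({ci[0]:.3f}, {ci[1]:.3f})")
```

A confidence interval that spans zero, as in this made-up example, corresponds to the "no significant difference" wording used for most outcomes in the abstract.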
DOI: 10.1186/s12889-015-2146-y
2015
Cited 50 times
Determinants of hypertension in a young adult Ugandan population in epidemiological transition—the MEPI-CVD survey
High blood pressure is the principal risk factor for stroke, heart failure and kidney failure in the young population in Africa. Control of hypertension is associated with a larger reduction in morbidity and mortality in younger populations compared with the elderly; however, blood pressure control efforts in the young are hampered by scarcity of data on prevalence and factors influencing awareness, treatment and control of hypertension. We aimed to describe the prevalence of prehypertension and hypertension among young adults in a peri-urban district of Uganda and the factors associated with occurrence of hypertension in this population. This cross-sectional study was conducted between August 2012 and May 2013 in Wakiso district, a suburban district that encircles Kampala, Uganda's capital city. We collected data on socio-demographic characteristics and hypertension status using a modified STEPs questionnaire from 3685 subjects aged 18-40 years selected by multistage cluster sampling. Blood pressure and anthropometric measurements were performed using standardized protocols. Fasting blood sugar and HIV status were determined using a venous blood sample. Association between hypertension status and various biosocial factors was assessed using logistic regression. The overall prevalence of hypertension was 15% (95% CI 14.2 - 19.6) and 40% were pre-hypertensive. Among the 553 hypertensive participants, 76 (13.7%) were aware of their diagnosis and all these participants had initiated therapy with target blood pressure control attained in 20% of treated subjects. Hypertension was significantly associated with the older age-group, male sex and obesity. There was a significantly lower prevalence of hypertension among participants with HIV (OR 0.6; 95% CI 0.4-0.8, P = 0.007). There is a high prevalence of high blood pressure in this young periurban population of Uganda with sub-optimal diagnosis and control.
There is a previously undocumented high rate of treatment, a unique finding that may be exploited to drive efforts to control hypertension. Specific programs for early diagnosis and treatment of hypertension among the young should be developed to improve control of hypertension. The relationship between HIV infection and blood pressure requires further clarification by longitudinal studies.
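The headline percentages in this abstract follow directly from the counts it reports. The short check below is a sketch that uses only those counts; the variable names are illustrative, and the Wald interval assumes simple random sampling, which the survey did not use (it used multistage cluster sampling), so that interval is not expected to reproduce the published one.

```python
import math

# Counts quoted in the abstract.
n_participants = 3685   # subjects aged 18-40 years
n_hypertensive = 553    # hypertensive participants
n_aware = 76            # hypertensive participants aware of their diagnosis

prevalence = n_hypertensive / n_participants
awareness = n_aware / n_hypertensive

# Wald interval under simple random sampling (an assumption made here only
# for illustration; it ignores the clustering in the survey design).
se = math.sqrt(prevalence * (1 - prevalence) / n_participants)
wald_ci = (prevalence - 1.96 * se, prevalence + 1.96 * se)

print(f"prevalence = {100 * prevalence:.1f}%")
print(f"awareness among hypertensives = {100 * awareness:.1f}%")
print(f"naive 95% CI = ({100 * wald_ci[0]:.1f}%, {100 * wald_ci[1]:.1f}%)")
```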
DOI: 10.1093/hmg/ddy387
2018
Cited 44 times
Admixture mapping identifies novel loci for obstructive sleep apnea in Hispanic/Latino Americans
Obstructive sleep apnea (OSA) is a common disorder associated with increased risk of cardiovascular disease and mortality. Its prevalence and severity vary across ancestral background. Although OSA traits are heritable, few genetic associations have been identified. To identify genetic regions associated with OSA and improve statistical power, we applied admixture mapping on three primary OSA traits [the apnea hypopnea index (AHI), overnight average oxyhemoglobin saturation (SaO2) and percentage time SaO2 < 90%] and a secondary trait (respiratory event duration) in a Hispanic/Latino American population study of 11 575 individuals with significant variation in ancestral background. Linear mixed models were performed using previously inferred African, European and Amerindian local genetic ancestry markers. Global African ancestry was associated with a lower AHI, higher SaO2 and shorter event duration. Admixture mapping analysis of the primary OSA traits identified local African ancestry at the chromosomal region 2q37 as genome-wide significantly associated with AHI (P < 5.7 × 10^-5), and European and Amerindian ancestries at 18q21 suggestively associated with both AHI and percentage time SaO2 < 90% (P < 10^-3). Follow-up joint ancestry-SNP association analyses identified novel variants in ferrochelatase (FECH), significantly associated with AHI and percentage time SaO2 < 90% after adjusting for multiple tests (P < 8 × 10^-6). These signals contributed to the admixture mapping associations and were replicated in independent cohorts. In this first admixture mapping study of OSA, novel associations with variants in the iron/heme metabolism pathway suggest a role for iron in influencing respiratory traits underlying OSA.
DOI: 10.1016/j.exphem.2018.11.002
2019
Cited 41 times
MIR-144-mediated NRF2 gene silencing inhibits fetal hemoglobin expression in sickle cell disease
• Higher miR-144 gene expression was observed in peripheral blood reticulocytes of sickle cell disease (SCD) patients with low fetal hemoglobin levels.
• NRF2 protein levels are regulated by miR-144 as a mechanism of γ-globin gene silencing during erythropoiesis in SCD.
Inherited genetic modifiers and pharmacologic agents that enhance fetal hemoglobin (HbF) expression reverse the clinical severity of sickle cell disease (SCD). Recent efforts to develop novel strategies of HbF induction include discovery of molecular targets that regulate γ-globin gene transcription and translation. The purpose of this study was to perform genome-wide microRNA (miRNA) analysis to identify genes associated with HbF expression in patients with SCD. We isolated RNA from purified reticulocytes for microarray-based miRNA expression profiling. Using samples from patients with contrasting HbF levels, we observed an eightfold upregulation of miR-144-3p (miR-144) and miR-144-5p in the low-HbF group compared with those with high HbF. Additional analysis by reverse transcription quantitative polymerase chain reaction confirmed individual miR-144 expression levels of subjects in the two groups. Subsequent functional studies in normal and sickle erythroid progenitors showed NRF2 gene silencing by miR-144 and concomitant repression of γ-globin transcription; by contrast, treatment with miR-144 antagomir reversed its silencing effects in a dose-dependent manner. Because NRF2 regulates reactive oxygen species levels, additional studies investigated mechanisms of HbF regulation using a hemin-induced oxidative stress model. Treatment of KU812 cells with hemin produced an increase in NRF2 expression and HbF induction that reversed with miR-144 pretreatment. Chromatin immunoprecipitation assay confirmed NRF2 binding to the γ-globin antioxidant response element, which was inhibited by miR-144 mimic treatment.
The genome-wide miRNA microarray and primary erythroid progenitor data support a miR-144/NRF2-mediated mechanism of γ-globin gene regulation in SCD.
Sickle cell disease (SCD) is a genetic disorder caused by the βS-globin mutation leading to production of hemoglobin S, polymer formation under low oxygen conditions, and red blood cell sickling. The net outcome of this process is chronic hemolysis, oxidative stress, anemia, and vaso-occlusive episodes of pain and organ damage. The most effective treatment for SCD is fetal hemoglobin (HbF; α2γ2) induction, which inhibits sickle hemoglobin polymerization through the formation of hybrid molecules [1: Poillon WN, Kim BC, Rodgers GP, Noguchi CT, Schechter AN. Sparing effect of hemoglobin F and hemoglobin A2 on the polymerization of hemoglobin S at physiologic ligand saturations. Proc Natl Acad Sci U S A. 1993; 90: 5039-5043]. Hydroxyurea is the only Food and Drug Administration-approved drug that ameliorates the clinical symptoms of SCD through HbF induction and other beneficial properties such as increasing nitric oxide levels and anti-inflammatory effects [2: Platt OS, Brambilla DJ, Rosse WF, et al. Mortality in sickle cell disease: Life expectancy and risk factors for early death. N Engl J Med. 1994; 330: 1639-1644; 3: Steinberg MH, McCarthy WF, Castro O, et al. The risks and benefits of long-term use of hydroxyurea in sickle cell anemia: A 17.5 year follow-up. Am J Hematol. 2010; 85: 403-408]. Not all individuals respond to hydroxyurea therapy, so understanding the molecular mechanisms involved in γ-globin regulation to develop strategies for HbF induction is critical to the discovery of additional effective therapeutic options for SCD. With completion of genome-wide association studies, single nucleotide polymorphisms (SNPs) associated with HbF levels in SCD and thalassemia patients [4: Menzel S, Garner C, Gut I, et al. A QTL influencing F cell production maps to a gene encoding a zinc-finger protein on chromosome 2p15. Nat Genet. 2007; 39: 1197-1199; 5: Thein SL, Menzel S, Peng X, Best S, et al. Intergenic variants of HBS1L-MYB are responsible for a major quantitative trait locus on chromosome 6q23 influencing fetal hemoglobin levels in adults. Proc Natl Acad Sci U S A. 2007; 104: 11346-11351; 6: Uda M, Galanello R, Sanna S, Lettre G, et al. Genome-wide association study shows BCL11A associated with persistent fetal hemoglobin and amelioration of the phenotype of beta-thalassemia. Proc Natl Acad Sci U S A. 2008; 105: 1620-1625; 7: Bae HT, Baldwin CT, Sebastiani P, et al. Meta-analysis of 2040 sickle cell anemia patients: BCL11A and HBS1L-MYB are the major modifiers of HbF in African Americans. Blood. 2012; 120: 1961-1962; 8: Mtatiro SN, Singh T, Rooks H, et al. Genome wide association study of fetal hemoglobin in sickle cell anemia in Tanzania. PLoS One. 2014; 9: e111464] were discovered. Three genetic loci, including –158 Xmn1-HBG2, BCL11A at 2p15 and the HBS1L-MYB region, account for 30–50% of inherited variations in HbF levels in several populations [4, 5, 6, 7]. The Xmn1-HBG2 locus contributes 13% of HbF variance in β-thalassemia populations, but this effect did not replicate in African American [7] or Tanzanian [8] people. The greatest effect on HbF expression is mediated by SNPs in the second intron of BCL11A leading to gene silencing [4, 6, 7, 9: Sankaran VG, Menne TF, Xu J, et al. Human fetal hemoglobin expression is regulated by the developmental stage-specific repressor BCL11A. Science. 2008; 322: 1839-1842].
Subsequent gene knockout confirmed a major repressor role of BCL11A in γ-globin gene silencing during hemoglobin switching [10: Xu J, Peng C, Sankaran VG, et al. Correction of sickle cell disease in adult mice by interference with fetal hemoglobin silencing. Science. 2011; 334: 993-996] through KLF1 activation [11: Xu J, Bauer DE, Kerenyi MA, Vo TD, Hou S, Hsu YJ, Yao H, Trowbridge JJ, Mandel G, Orkin SH. Corepressor-dependent silencing of fetal hemoglobin expression by BCL11A. Proc Natl Acad Sci U S A. 2013; 110: 6518-6523; 12: Roosjen M, McColl B, Kao B, Gearing LJ, Blewitt ME, Vadolas J. Transcriptional regulators Myb and BCL11A interplay with DNA methyltransferase 1 in developmental silencing of embryonic and fetal β-like globin genes. FASEB J. 2014; 28: 1610-1620] and interaction with the co-repressor SOX6 [13: Xu J, Sankaran VG, Ni M, et al. Transcriptional silencing of γ-globin by BCL11A involves long-range interactions and cooperation with SOX6. Genes Dev. 2010; 24: 783-798; 14: Zhou D, Liu K, Sun CW, Pawlik KM, Townes TM. KLF1 regulates BCL11A expression and gamma- to beta-globin gene switching. Nat Genet. 2010; 42: 742-744]. Furthermore, haploinsufficiency of KLF1 caused by SNPs in coding and noncoding DNA regions causes high HbF levels in humans [15: Borg J, Papadopoulos P, Georgitsi M, et al. Haploinsufficiency for the erythroid transcription factor KLF1 causes hereditary persistence of fetal hemoglobin. Nat Genet. 2010; 42: 801-805]. Recent studies by Bauer et al. demonstrated that an erythroid-specific enhancer in the second intron of BCL11A [16: Bauer DE, Kamran SC, Lessard S, et al. An erythroid enhancer of BCL11A subject to genetic variation determines fetal hemoglobin level. Science. 2013; 342: 253-257], which regulates lineage-specific BCL11A activation, is an excellent target for the development of novel gene therapy for β-hemoglobinopathies. The third locus affecting HbF expression is located in the HBS1L-MYB region 5′ of the repressor oncogene MYB [5]. Studies in primary erythroid cultures demonstrated binding of the transcription factors LDB1, Tal1, and KLF1 in the HBS1L-MYB region to control MYB expression [17: Stadhouders R, Aktuna S, Thongjuea S, et al. HBS1L-MYB intergenic variants modulate fetal hemoglobin via long-range MYB enhancers. J Clin Invest. 2014; 124: 1699-1710]. Additional studies by Sankaran et al. demonstrated that the microRNAs (miRNAs) miR-15a and miR-16-1 enhance γ-globin expression through MYB silencing in a child with trisomy 13 [18: Sankaran VG, Xu J, Byron R, et al. A functional element necessary for fetal hemoglobin silencing. N Engl J Med. 2011; 365: 807-814]. Recent efforts have identified mechanisms of γ-globin gene expression that focus on posttranscriptional miRNA-mediated gene regulation. Azzouzi et al. demonstrated, in an miRNA screen of umbilical cord and peripheral blood reticulocytes, that miRNA-96 targets the open reading frame of the γ-globin mRNA molecule to silence γ-globin expression [19: Azzouzi I, Moest H, Winkler J, et al. MicroRNA-96 directly inhibits gamma-globin expression in human erythropoiesis. PLoS One. 2011; 6: 28]. Additional work by Miller et al.
[20: Lee YT, de Vasconcellos JF, Yuan J, et al. LIN28B-mediated expression of fetal hemoglobin and production of fetal-like erythrocytes from adult human erythroblasts ex vivo. Blood. 2013; 122: 1034-1041] verified the ability of LIN28B to repress let-7 miRNA expression as a mechanism of HbF induction in tissue culture systems. Recently, we published data to support a role of miR-34a in γ-globin activation [21: Ward CM, Li B, Pace BS. Original research: Stable expression of miR-34a mediates fetal hemoglobin induction in K562 cells. Exp Biol Med. 2016; 241: 719-729] through STAT3 gene silencing. Our group previously demonstrated a negative role of STAT3 in γ-globin expression [22: Foley HA, Ofori-Acquah SF, Yoshimura A, Critz S, Baliga BS, Pace BS. Stat3 beta inhibits gamma-globin gene expression in erythroid cells. J Biol Chem. 2002; 277: 16211-16219]. These studies expand the role of miRNA in γ-globin regulation, but additional targets remain to be discovered. To this end, we performed genome-wide miRNA expression analysis using RNA isolated from the reticulocytes of individuals with SCD and contrasting high- and low-HbF levels. We observed significant differences in miR-144 between the two groups, along with other miRNA genes. Subsequent functional studies in normal and sickle erythroid progenitors confirmed the ability of miRNA-144 antagomir to mediate HbF induction while increasing NRF2 expression.
After obtaining institutional review board approval and informed consent, blood samples were collected from patients with homozygous sickle cell anemia (HbSS) followed at Augusta University. None of the subjects received hydroxyurea therapy or transfusions before recruitment (Supplementary Table E1, online only, available at www.exphem.org).
Medical record review was completed to obtain complete blood counts with differential, reticulocyte count, and HbF levels determined by high-performance liquid chromatography. Blood samples were processed by Ficoll-Histopaque separation of peripheral blood mononuclear cells (PBMCs) stored in dimethyl sulfoxide for primary erythroid cultures. From the same samples, red blood cells were processed on a MACS column with CD71+ MicroBeads (MACS, Miltenyi Biotec, Auburn, CA) to isolate reticulocytes for total RNA extraction using TRIzol (ThermoFisher). The quality of RNA was assessed using an Agilent 2100 Bioanalyzer followed by hybridization to the miRCURY LNA microRNA Array (Exiqon, Woburn, MA). Raw data were quantile normalized using a model-based correction algorithm (http://linus.nci.nih.gov/BRB-ArrayTools.html). miRNA gene expression profiling was conducted for SCD patients with HbF < 8.6% (low HbF) or HbF > 8.6% (high HbF) using principal component analysis (NIA Array Analysis Tool; https://lgsun.irp.nia.nih.gov/ANOVA/index.html). Microarray raw data were submitted to the Gene Expression Omnibus (GEO) under database accession number GSE111356. To quantify miR-144 levels, the miScript II RT and SYBRGreen PCR kit (Qiagen) were used and relative expression determined as described previously [21] to confirm miR-144 expression obtained by microarray for the 12 samples analyzed and in vitro functional studies. To quantify mRNA of γ-globin, β-globin, βS-globin, and glyceraldehyde-3-phosphate dehydrogenase (GAPDH), we generated standard curves as described previously [21].
Levels of NRF2, CD71, and CD235a were measured using the RT2-qPCR Primer Assay system (Qiagen, Valencia, CA) as described previously by our group [21]. All gene expression levels were normalized to GAPDH mRNA. Human erythroid progenitors were generated from adult CD34+ stem cells (STEMCELL Technologies, Vancouver, BC) or sickle PBMCs in a two-phase culture system established in our laboratory [23: Promsote W, Makala L, Li B, et al. Monomethylfumarate induces gamma-globin expression and fetal hemoglobin production in cultured human retinal pigment epithelial (RPE) and erythroid cells, and in intact retina. Invest Ophthalmol Vis Sci. 2014; 55: 5382-5393; 24: Zhu X, Li B, Pace BS. NRF2 mediates γ-globin gene regulation and fetal hemoglobin induction in human erythroid progenitors. Haematologica. 2017; 102: e285-e288]. During phase I, CD34+ stem cells were cultured in alpha minimum essential medium containing interleukin-3 (10 ng/mL), stem cell factor (10 ng/mL), and erythropoietin (2 IU/mL) to promote erythroid lineage commitment. On day 7, cells transitioned to phase II medium that was identical except that stem cell factor was removed. Erythroid progenitors were transfected on day 8 with human mature miR-144 or negative control mimics (Dharmacon, Lafayette, CO) by nucleofection using the Amaxa Human CD34+ Cell Nucleofector Kit. After 2 days, cells were harvested for flow cytometry, Western blot, and RT-qPCR analysis. In a second set of studies, we determined the effect of longer treatment using erythroid progenitors generated from normal CD34+ stem cells or sickle PBMCs treated on day 5 and then harvested on day 10 for gene expression and protein analysis.
KU812 cells maintained in Iscove's modified Dulbecco's medium supplemented with 10% fetal bovine serum were used for mechanistic studies. Cells were treated with 25–75 µmol/L hemin alone, or pretreated with 300 nmol/L miR-144 or negative control for 24 hours followed by 50 µmol/L hemin for 48 hours, after which flow cytometry, Western blot, and RT-qPCR analyses were completed. After the different treatments, cells were washed, fixed with 4% paraformaldehyde, and stained with fluorescein-isothiocyanate-conjugated anti-HbF (Thermo Fisher Scientific) or anti-CD235a and anti-CD71 antibodies (eBioscience, San Diego, CA); flow cytometry analysis was performed on an LSRII flow cytometer using gating parameters previously published by our group [23, 24]. We routinely acquire 10,000 erythroid cells to quantify HbF positive cells (F-cells) shown in histograms. To detect reactive oxygen species (ROS) levels, KU812 cells were incubated with 5 µmol/L dichlorodihydrofluorescein diacetate (DCF-DA) (Sigma-Aldrich) for 4 hours before harvest. The percentages of F-cells and DCF-positive cells were quantified using FACS Diva software. Total protein was isolated and Western blot performed with 10–30 µg of protein [23, 24] with HbF (sc-21756), HbS (sc-37-8 from Santa Cruz Biotechnology, Santa Cruz, CA), NRF2 (ab62352, Abcam), and tubulin (sc-53646; Santa Cruz) antibodies. The immunoblots were developed using SuperSignal West Pico Chemiluminescent Substrate (Thermo Fisher Scientific) and analyzed on a Fujifilm LAS-3000 gel imager (Stamford, CT) to acquire quantitative data. Sickle erythroid progenitors were used for chromatin immunoprecipitation (ChIP) assay as described previously by our group [24]. Immunoprecipitations with anti-NRF2 and anti-TATA-binding protein (TBP) antibodies, along with an immunoglobulin G (IgG) control, were completed and chromatin isolated for qPCR analysis to quantify chromatin enrichment compared with input DNA. The data are reported as the mean ± standard error of the mean of three to five replicates of independent experiments performed in triplicate. All data were analyzed by a two-tailed Student t test and p < 0.05 was considered statistically significant. Binary regression analysis determined the correlation between miR-144 levels obtained by microarray and RT-qPCR analysis.
Individuals with SCD and contrasting HbF levels show differentially expressed miRNA genes
The role of miRNA genes in normal erythropoiesis [25: Kim M, Tan YS, Cheng WC, Kingsbury TJ, Heimfeld S, Civin CI. MIR144 and MIR451 regulate human erythropoiesis via RAB14. Mol Med Rep. 2017; 15: 2495-2502; 26: Leecharoenkiat K, Tanaka Y, Harada Y, et al. Plasma microRNA-451 as a novel hemolytic marker for β0-thalassemia/HbE disease. Br J Haematol. 2015; 168: 583-597] and globin expression has been demonstrated [19, 20, 21, 27: Lulli V, Romania P, Morsilli O, et al. MicroRNA-486-3p regulates γ-globin expression in human erythroid cells by directly modulating BCL11A. PLoS One. 2013; 8: e60436; 28: Saki N, Abroun S, Soleimani M, et al. MicroRNA expression in β-thalassemia and sickle cell disease: A role in the induction of fetal hemoglobin. Cell J. 2016; 17: 583-592]. Therefore, the goal of the present study was to define novel miRNA genes differentially expressed in persons with SCD and contrasting high and low HbF levels. After obtaining informed consent, blood samples were collected from study subjects with confirmed HbSS genotype and clinical phenotype data (Supplementary Table E1, online only, available at www.exphem.org). We collected complete blood cell counts with differential, reticulocyte counts, and HbF levels (Table 1); HbF levels ranged from 0.1% to 30.6%.
None of the hematologic values except HbF was significantly different between the two groups, suggesting that the red blood cell turnover and hemolysis rates were similar between groups.

Table 1. Summary of clinical phenotype data for sickle cell patients used in the miRNA analysis (a)

| Group | Subject # | Hgb (g/dl) | Hct (%) | Plts (× 10³) | WBC (× 10³) | Neutrophils (× 10³) | Lymphocytes (× 10³) | NRBC (%) | Reticulocyte count (%) | HbF (%) |
|---|---|---|---|---|---|---|---|---|---|---|
| High-HbF | 001 | 9.5 | 28.6 | 269 | 8.1 | 5.8 | 1.3 | 0 | 5.0 | 30.6 |
| High-HbF | 003 | 10.5 | 30.9 | 563 | 11.4 | 5.0 | 4.3 | 0 | 5.7 | 21.8 |
| High-HbF | 08A | 7.4 | 21.7 | 192 | 4.2 | 2.1 | 1.3 | 3 | 8.8 | 25.0 |
| High-HbF | 014 | 8.1 | 22.9 | 469 | 15.1 | 5.7 | 7.7 | 0 | 9.8 | 16.8 |
| High-HbF | 015 | 8.7 | 25.4 | 416 | 14.1 | 5.1 | 7.5 | 0 | 8.8 | 27.5 |
| High-HbF | 016 | 7.6 | 27.2 | 499 | 17.5 | 7.5 | 7.4 | 1 | 12.7 | 19.2 |
| High-HbF | Mean ± SEM | 8.63 ± 0.49 | 26.11 ± 1.42 | 404.67 ± 60.17 | 11.7 ± 2.00 | 5.2 ± 0.72 | 4.917 ± 1.25 | 0.67 ± 0.49 | 8.467 ± 1.15 | 23.48 ± 2.12 |
| Low-HbF | 004 | 7.1 | 20.1 | 380 | 15.5 | 6.2 | 5.9 | 6 | 10.5 | 3.8 |
| Low-HbF | 008 | 7.0 | 20.1 | 693 | 11.1 | 4.2 | 6.2 | 2 | 13.4 | 3.0 |
| Low-HbF | 009 | 7.8 | 24.8 | 641 | 13.8 | 8.7 | 3.7 | 0 | 9.0 | 0.9 |
| Low-HbF | 011 | 7.6 | 22.8 | 570 | 15.60 | 8.2 | 5.70 | 2 | 12.0 | 6.2 |
| Low-HbF | 011A | 8.4 | 28.9 | 461 | 10.8 | 5.9 | 2.5 | 0 | 8.2 | 6.2 |
| Low-HbF | 012A | 8.9 | 28.1 | 384 | 10.9 | 4.9 | 4.5 | 1 | 5.6 | 0.1 |
| Low-HbF | Mean ± SEM | 7.8 ± 1.30 | 24.13 ± 1.56 | 521 ± 54.35 | 12.95 ± 0.94 | 6.35 ± 0.73 | 4.750 ± 0.59 | 1.83 ± 1.02 | 9.783 ± 1.42 | 3.37 ± 1.05 |
| p values (b) | | 0.1766 | 0.3698 | 0.1802 | 0.5943 | 0.2876 | 0.9067 | 0.1859 | 0.4353 | 0.0001* |

a Shown are the values obtained for the individual complete blood counts and differential and reticulocyte counts for the 12 children and adults (011A and 012A) with HbSS included in the miRNA analysis. The mean ± standard error of the mean (SEM) is shown for each parameter. The Student t test was used to determine significant difference between the two study groups; p < 0.05 was considered statistically significant.
b p values generated using the Student t test for data collected for each parameter for the two groups.
Hgb = hemoglobin, Hct = hematocrit, Plts = platelet count, WBC = white blood cell count, NRBC = nucleated red blood cells.

Twelve individuals were included in the miRNA microarray analysis (Figure 1A), including a high-HbF group (average HbF 23.48 ± 2.12) and a low-HbF group (average HbF 3.37 ± 1.05). Total RNA isolated from CD71+ reticulocytes was analyzed on the miRCURY LNA microRNA Array. After raw data normalization, principal component analysis identified 89 and 91 unique miRNA genes upregulated and downregulated, respectively, in the low-HbF group compared with the high-HbF group (Supplementary Table E2, online only, available at www.exphem.org). miR-144-3p (miR-144) and miR-144-5p expression was increased 7.96-fold (p = 0.0010) and 7.79-fold (p = 0.0037), respectively, in the low-HbF group compared with the high-HbF group. Other miRNA genes implicated in globin gene regulation, such as miR-96-5p and let-7b-5p [19. Azzouzi I, Moest H, Winkler J, et al. MicroRNA-96 directly inhibits gamma-globin expression in human erythropoiesis. PLoS One. 2011; 6: 28, 20. Lee YT, de Vasconcellos JF, Yuan J, et al. LIN28B-mediated expression of fetal hemoglobin and production of fetal-like erythrocytes from adult human erythroblasts ex vivo. Blood. 2013; 122: 1034-1041], showed enhanced expression in the low-HbF group, suggesting testable hypotheses that miRNA genes in this group might contribute to γ-globin transcription by silencing trans-activator DNA-binding proteins.
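The Mean ± SEM entries for HbF in Table 1 can be reproduced from the individual subject values. The following is a minimal sketch for illustration only, not the authors' analysis code: it uses Python's standard library, the function names are assumptions introduced here, and it stops at the pooled two-sample t statistic because the text does not say which variant of the Student t test (pooled or unequal variance) was used.

```python
import math
import statistics

# HbF (%) values for the six subjects in each group, read from Table 1.
HIGH_HBF = [30.6, 21.8, 25.0, 16.8, 27.5, 19.2]
LOW_HBF = [3.8, 3.0, 0.9, 6.2, 6.2, 0.1]


def mean_and_sem(values):
    """Return (mean, standard error of the mean) using the sample standard deviation."""
    mean = statistics.mean(values)
    sem = statistics.stdev(values) / math.sqrt(len(values))
    return mean, sem


def pooled_t_statistic(a, b):
    """Student t statistic for two independent samples, assuming equal variances."""
    na, nb = len(a), len(b)
    pooled_var = (
        (na - 1) * statistics.variance(a) + (nb - 1) * statistics.variance(b)
    ) / (na + nb - 2)
    return (statistics.mean(a) - statistics.mean(b)) / math.sqrt(
        pooled_var * (1 / na + 1 / nb)
    )


high_mean, high_sem = mean_and_sem(HIGH_HBF)
low_mean, low_sem = mean_and_sem(LOW_HBF)
t_stat = pooled_t_statistic(HIGH_HBF, LOW_HBF)

print(f"High-HbF: {high_mean:.2f} ± {high_sem:.2f}")
print(f"Low-HbF:  {low_mean:.2f} ± {low_sem:.2f}")
print(f"t = {t_stat:.2f} on {len(HIGH_HBF) + len(LOW_HBF) - 2} degrees of freedom")
```

Under these assumptions the group summaries come out as 23.48 ± 2.12 and 3.37 ± 1.05, which agrees with the HbF column of Table 1. Converting the t statistic to a p value would need a t-distribution CDF, which the standard library does not provide.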
Likewise, we identified the top miRNA genes downregulated in the low-HbF group, such as miR-1, miR-5701, and miR-2116-3p (Supplementary Table E2, online only, available at www.exphem.org), that might silence repressors of γ-globin expression. The oxidative stress conditions observed in sickle cell patients are associated with high miR-144 levels and severe anemia [29. Sangokoya C, Telen MJ, Chi JT. MicroRNA miR-144 modulates oxidative stress tolerance and associates with anemia severity in sickle cell disease. Blood. 2010; 116: 4338-4348]; furthermore, NRF2 is a direct target of miR-144-mediated gene silencing. We and others [23. Promsote W, Makala L, Li B, et al. Monomethylfumarate induces gamma-globin expression and fetal hemoglobin production in cultured human retinal pigment epithelial (RPE) and erythroid cells, and in intact retina. Invest Ophthalmol Vis Sci. 2014; 55: 5382-5393, 24. Zhu X, Li B, Pace BS. NRF2 mediates γ-globin gene regulation and fetal hemoglobin induction in human erythroid progenitors. Haematologica. 2017; 102: e285-e288, 30. Macari ER, Lowrey CH. Induction of human fetal hemoglobin via the NRF2 antioxidant response signaling pathway. Blood. 2011; 117: 5987-5997] have demonstrated that NRF2 activates γ-globin gene t
DOI: 10.1016/j.bone.2020.115247
2020
Cited 37 times
Identification of PIEZO1 polymorphisms for human bone mineral density
Bone mineral density (BMD) is a key indicator for the diagnosis and treatment of osteoporosis; a reduction in BMD can increase the risk of osteoporotic fracture. It was very recently found that Piezo1 mediates mechanically evoked responses in bone and participates in bone formation in mice. Here, we performed a cross-phenotype meta-analysis for human BMD at the lumbar spine (LS), femoral neck (FN), distal radius/forearm (FA) and heel and screened out 14 top SNPs for PIEZO1; these SNPs overlapped with putative enhancers, DNase-I hypersensitive sites and active promoter flanking regions. We found that the signal of the best SNP, rs62048221, came mainly from heel ultrasound estimated BMD (-0.02 SD per T allele, P = 8.50E-09); the calcaneus supports most of the mechanical force of the body when standing, walking and doing physical exercise. Each copy of the effect allele T of SNP rs62048221 was associated with a decrease of 0.0035 g/cm² in BMD (P = 4.6E-27, SE = 0.0003) in UK Biobank data from 477,760 samples. SNP rs62048221 is located in the enhancer region (HEDD enhancer ID 2331049) of the gene PIEZO1. Site-directed ChIP assays in human mesenchymal stem cells (hMSCs) showed significant enrichment of H3K4me1 and H3K27ac in this region, and luciferase assays showed that rs62048221 could significantly affect the activity of the enhancer where it resides. Our results suggest, for the first time, that SNP rs62048221 might mediate the PIEZO1 expression level by modulating the activity of cis-regulatory elements and thereby affect BMD.
DOI: 10.3233/jad-220787
2023
Cited 6 times
Aspirin Use and Risk of Alzheimer’s Disease: A 2-Sample Mendelian Randomization Study
Background: Observational studies have shown inconsistent findings on the relationship between aspirin use and the risk of Alzheimer’s disease (AD). Objective: Because residual confounding and reverse causality are challenging issues inherent in observational studies, we conducted a 2-sample Mendelian randomization (MR) analysis to investigate whether aspirin use was causally associated with the risk of AD. Methods: We conducted 2-sample MR analyses utilizing summary genetic association statistics to estimate the potential causal relationship between aspirin use and AD. Single-nucleotide variants associated with aspirin use in a genome-wide association study (GWAS) of UK Biobank were considered as genetic proxies for aspirin use. The GWAS summary-level data of AD were derived from a meta-analysis of GWAS data from the International Genomics of Alzheimer’s Project (IGAP) stage I. Results: Univariable MR analysis based on these two large GWAS data sources showed that genetically proxied aspirin use was associated with a decreased risk of AD (Odds Ratio (OR): 0.87; 95%CI: 0.77–0.99). In multivariate MR analyses, the causal estimates remained significant after adjusting for chronic pain, inflammation, heart failure (OR = 0.88, 95%CI = 0.78–0.98), or stroke (OR = 0.87, 95%CI = 0.77–0.99), but were attenuated when adjusting for coronary heart disease, blood pressure, and blood lipids. Conclusion: Findings from this MR analysis suggest a genetic protective effect of aspirin use on AD, possibly influenced by coronary heart disease, blood pressure, and lipid levels.
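MR estimates are usually computed on the log-odds scale and reported as odds ratios. The sketch below is illustrative only and is not taken from the paper: it back-calculates an approximate log-odds estimate and standard error from the reported univariable result (OR 0.87, 95% CI 0.77–0.99), assuming a normal approximation with a 1.96 multiplier; the function names are assumptions, and because the published figures are rounded the recovered values are approximate.

```python
import math


def log_or_and_se(or_point, ci_low, ci_high, z=1.96):
    """Approximate log-odds estimate and its standard error from an OR and its CI."""
    beta = math.log(or_point)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z)
    return beta, se


def or_with_ci(beta, se, z=1.96):
    """Map a log-odds estimate and standard error back to (OR, lower, upper)."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


def two_sided_p(beta, se):
    """Two-sided p value under a standard normal approximation."""
    return math.erfc(abs(beta / se) / math.sqrt(2))


# Reported univariable MR result from the abstract above.
beta, se = log_or_and_se(0.87, 0.77, 0.99)
or_point, lower, upper = or_with_ci(beta, se)
p_value = two_sided_p(beta, se)

print(f"beta = {beta:.4f}, se = {se:.4f}")
print(f"OR = {or_point:.2f} ({lower:.2f}, {upper:.2f}), p = {p_value:.3f}")
```

The round trip returns the same OR and interval to two decimals, which is a quick consistency check when reading summary results of this kind.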
DOI: 10.1016/j.bcmd.2023.102792
2024
Bach1 inhibitor HPP-D mediates γ-globin gene activation in sickle erythroid progenitors
Sickle cell disease (SCD) is the most common β-hemoglobinopathy caused by various mutations in the adult β-globin gene, resulting in sickle hemoglobin production, chronic hemolytic anemia, pain, and progressive organ damage. One of the best therapeutic strategies for managing the clinical symptoms of SCD is the induction of fetal hemoglobin (HbF) using chemical agents. At present, among the Food and Drug Administration-approved drugs to treat SCD, hydroxyurea is the only one proven to induce HbF protein synthesis; however, it is not effective in all people. Therefore, we evaluated the ability of the novel Bach1 inhibitor HPP-D to induce HbF in KU812 cells and primary sickle erythroid progenitors. HPP-D increased HbF and decreased Bach1 protein levels in both cell types. Furthermore, chromatin immunoprecipitation assay showed reduced Bach1 and increased NRF2 binding to the γ-globin promoter antioxidant response elements. We also observed increased levels of the active histone marks H3K4Me1 and H3K4Me3, supporting an open chromatin configuration. In primary sickle erythroid progenitors, HPP-D increased γ-globin transcription and HbF-positive cells and reduced sickled erythroid progenitors under hypoxic conditions. Collectively, our data demonstrate that HPP-D induces γ-globin gene transcription through Bach1 inhibition and enhanced NRF2 binding in the γ-globin promoter antioxidant response elements.
DOI: 10.1101/2024.02.02.24302211
2024
Genome-wide association analysis of composite sleep health scores in 413,904 individuals
Recent genome-wide association studies (GWASs) of several individual sleep traits have identified hundreds of genetic loci, suggesting diverse mechanisms. Moreover, sleep traits are moderately correlated, and together may provide a more complete picture of sleep health, while also illuminating distinct domains. Here we construct novel sleep health scores (SHSs) incorporating five core self-report measures: sleep duration, insomnia symptoms, chronotype, snoring, and daytime sleepiness, using additive (SHS-ADD) and five principal components-based (SHS-PCs) approaches. GWASs of these six SHSs identify 28 significant novel loci adjusting for multiple testing on six traits (p < 8.3e-9), along with 341 previously reported loci (p < 5e-08). The heritability of the first three SHS-PCs equals or exceeds that of SHS-ADD (SNP-h² = 0.094), while revealing sleep-domain-specific genetic discoveries. Significant loci enrich in multiple brain tissues and in metabolic and neuronal pathways. Post GWAS analyses uncover novel genetic mechanisms underlying sleep health and reveal connections to behavioral, psychological, and cardiometabolic traits.
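The stricter threshold quoted for novel loci is consistent with dividing the conventional genome-wide level by the number of scores analyzed. A minimal sketch of that arithmetic follows; reading the adjustment as a Bonferroni-style division is an assumption made here for illustration (the abstract only says "adjusting for multiple testing on six traits"), and the function name is hypothetical.

```python
def adjusted_threshold(alpha, n_traits):
    """Divide a significance level equally across n_traits (Bonferroni-style)."""
    if n_traits < 1:
        raise ValueError("n_traits must be at least 1")
    return alpha / n_traits


GENOME_WIDE_ALPHA = 5e-8   # conventional genome-wide significance level
N_SLEEP_SCORES = 6         # SHS-ADD plus five SHS-PCs

threshold = adjusted_threshold(GENOME_WIDE_ALPHA, N_SLEEP_SCORES)
print(f"{threshold:.2e}")
```

The result, about 8.3e-9, matches the cutoff quoted in the abstract.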
DOI: 10.1086/302444
1999
Cited 85 times
A Test of Transmission/Disequilibrium for Quantitative Traits in Pedigree Data, by Multiple Regression
The transmission/disequilibrium (TD) test (TDT), proposed by Spielman et al. for binary traits, is a powerful method for detection of linkage between a marker locus and a disease locus in the presence of allelic association. As a test for linkage disequilibrium, the TDT makes the assumption that any allelic association present is due to linkage. Allison proposed a series of TD-type tests for quantitative traits and calculated their power, assuming that the marker locus is the disease locus. All these tests assume that the observations are independent, and therefore they are applicable, as a test for linkage, only for nuclear-family data. In this report, we propose a regression-based TD-type test for linkage between a marker locus and a quantitative trait locus, using information on the parent-to-offspring transmission status of the associated allele at the marker locus. This method does not require independence of observations, thus allowing for analysis of pedigree data as well, and allows adjustment for covariates. We investigate the statistical power and validity of the test by simulating markers at various recombination fractions from the disease locus.
DOI: 10.1101/gr.302003
2003
Cited 79 times
Linkage Disequilibrium and Haplotype Diversity in the Genes of the Renin–Angiotensin System: Findings From the Family Blood Pressure Program
Association studies of candidate genes with complex traits have generally used one or a few single nucleotide polymorphisms (SNPs), although variation in the extent of linkage disequilibrium (LD) within genes markedly influences the sensitivity and precision of association studies. The extent of LD and the underlying haplotype structure for most candidate genes are still unavailable. We sampled 193 blacks (African-Americans) and 160 whites (European-Americans) and estimated the intragenic LD and the haplotype structure in four genes of the renin-angiotensin system. We genotyped 25 SNPs, with all but one of the pairs spaced between 1 and 20 kb, thus providing resolution at small scale. The pattern of LD within a gene was very heterogeneous. Using a robust method to define haplotype blocks, blocks of limited haplotype diversity were identified at each locus; between these blocks, LD was lost owing to the history of recombination events. As anticipated, there was less LD among blacks, the number of haplotypes was substantially larger, and shorter haplotype segments were found, compared with whites. These findings have implications for candidate-gene association studies and indicate that variation between populations of European and African origin in haplotype diversity is characteristic of most genes.
DOI: 10.1086/421329
2004
Cited 73 times
Linkage Analysis of a Complex Disease through Use of Admixed Populations
Linkage disequilibrium arising from the recent admixture of genetically distinct populations can be potentially useful in mapping genes for complex diseases. McKeigue has proposed a method that conditions on parental admixture to detect linkage. We show that this method tests for linkage only under specific assumptions, such as equal admixture in the parental generation and admixture that occurs in a single generation. In practice, these assumptions are unlikely to hold for natural populations, resulting in an inflation of the type I error rate when testing for linkage by this method. In this article, we generalize McKeigue's approach of testing for linkage to allow two different admixture models: (1) intermixture admixture and (2) continuous gene flow. We calculate the sample size required for a genomewide search by this method under different disease models: multiplicative, additive, recessive, and dominant. Our results show that the sample size required to obtain 90% power to detect a putative mutant allele at a genomewide significance level of 5% can usually be achieved in practice if informative markers are available at a density of 2 cM.
DOI: 10.1046/j.1469-1809.2003.00036.x
2003
Cited 70 times
Qualitative Semi‐Parametric Test for Genetic Associations in Case‐Control Designs Under Structured Populations
Recently, statistical methods have been proposed that use genomic markers to control for population stratification in genetic association studies. However, these methods either have unacceptably low power when population stratification becomes strong or cannot control for population stratification well under admixture population models. In this paper, we propose a semiparametric association test to detect genetic association between a candidate marker and a qualitative trait of interest in case‐control designs. The performance of the test is compared to other existing methods through simulations. The results show that our method gives the correct type I error rate under both discrete population models and admixture population models, and that it is robust to the extent of the population stratification. In most of the cases we considered, our method has higher power and, in some cases, substantially higher power than existing methods.
DOI: 10.1007/s00439-010-0849-9
2010
Cited 53 times
Genome-wide searching of rare genetic variants in WTCCC data
Although they have demonstrated success in searching for common variants for complex diseases, genome-wide association (GWA) studies are less successful in detecting rare genetic variants because of the poor statistical power of most current methods. We developed a two-stage method that can be applied to GWA studies for detecting rare variants. Here we report the results of applying this two-stage method to the Wellcome Trust Case Control Consortium (WTCCC) dataset, which includes seven complex diseases: bipolar disorder, cardiovascular disease, hypertension (HT), rheumatoid arthritis, Crohn’s disease, type 1 diabetes and type 2 diabetes (T2D). We identified 24 genes or regions that reach genome-wide significance. Eight of them are novel and were not reported in the WTCCC study. The cumulative risk (or protective) haplotype frequency for each of the 8 genes or regions is small, being at most 11%. For each of the novel genes, the risk (or protective) haplotype set cannot be tagged by the common SNPs available in chips (r² < 0.32). The gene identified in HT was further replicated in the Framingham Heart Study, and is also significantly associated with T2D. Our analysis suggests that searching for rare genetic variants is feasible in current GWA studies and candidate gene studies, and the results can serve as guides to future resequencing studies to identify the underlying rare functional variants.
DOI: 10.1002/gepi.20532
2010
Cited 52 times
Pathway‐based analysis for genome‐wide association studies using supervised principal components
Many complex diseases are influenced by genetic variations in multiple genes, each with only a small marginal effect on disease susceptibility. Pathway analysis, which identifies biological pathways associated with disease outcome, has become increasingly popular for genome-wide association studies (GWAS). In addition to combining weak signals from a number of SNPs in the same pathway, results from pathway analysis also shed light on the biological processes underlying disease. We propose a new pathway-based analysis method for GWAS, the supervised principal component analysis (SPCA) model. In the proposed SPCA model, a selected subset of SNPs most associated with disease outcome is used to estimate the latent variable for a pathway. The estimated latent variable for each pathway is an optimal linear combination of a selected subset of SNPs; therefore, the proposed SPCA model provides the ability to borrow strength across the SNPs in a pathway. In addition to identifying pathways associated with disease outcome, SPCA also carries out additional within-category selection to identify the most important SNPs within each gene set. The proposed model operates in a well-established statistical framework and can handle design information such as covariate adjustment and matching information in GWAS. We compare the proposed method with currently available methods using data with realistic linkage disequilibrium structures, and we illustrate the SPCA method using the Wellcome Trust Case-Control Consortium Crohn Disease (CD) data set.
DOI: 10.1002/gepi.20588
2011
Cited 49 times
Detecting rare and common variants for complex traits: sibpair and odds ratio weighted sum statistics (SPWSS, ORWSS)
It is generally known that risk variants segregate together with a disease within families, but this information has not been used in the existing statistical methods for detecting rare variants. Here we introduce two weighted sum statistics that can be applied to either genome-wide association data or resequencing data for identifying rare disease variants, with weights calculated based on sibpairs and odds ratios, respectively. We evaluated the two methods via extensive simulations under different disease models. We compared the proposed methods with the weighted sum statistic (WSS) proposed by Madsen and Browning, keeping the same genotyping or resequencing cost. Our methods clearly demonstrate more statistical power than the WSS. In addition, we found that using sibpair information can increase power over using only unrelated samples by more than 40%. We applied our methods to the Framingham Heart Study (FHS) and Wellcome Trust Case Control Consortium (WTCCC) hypertension datasets. Although we did not identify any genes as reaching a genome-wide significance level, we found variants in the candidate gene angiotensinogen significantly associated with hypertension at P = 6.9 × 10⁻⁴, whereas the most significant single SNP association evidence is P = 0.063. We further applied the odds ratio weighted method to the IFIH1 gene for type-1 diabetes in the WTCCC data. Our method yielded a P-value of 4.82 × 10⁻⁴, much more significant than that obtained by haplotype-based methods. We demonstrated that family data are extremely informative in searching for rare variants underlying complex traits, and the odds ratio weighted sum statistic is more efficient than currently existing methods. Genet. Epidemiol. 35:398-409, 2011. © 2011 Wiley-Liss, Inc.
DOI: 10.1371/journal.pgen.1000866
2010
Cited 47 times
Rapid Assessment of Genetic Ancestry in Populations of Unknown Origin by Genome-Wide Genotyping of Pooled Samples
As we move forward from the current generation of genome-wide association (GWA) studies, additional cohorts of different ancestries will be studied to increase power, fine map association signals, and generalize association results to additional populations. Knowledge of genetic ancestry as well as population substructure will become increasingly important for GWA studies in populations of unknown ancestry. Here we propose genotyping pooled DNA samples using genome-wide SNP arrays as a viable option to efficiently and inexpensively estimate admixture proportion and identify ancestry informative markers (AIMs) in populations of unknown origin. We constructed DNA pools from African American, Native Hawaiian, Latina, and Jamaican samples and genotyped them using the Affymetrix 6.0 array. Aided by individual genotype data from the African American cohort, we established quality control filters to remove poorly performing SNPs and estimated allele frequencies for the remaining SNPs in each panel. We then applied a regression-based method to estimate the proportion of admixture in each cohort using the allele frequencies estimated from pooling and populations from the International HapMap Consortium as reference panels, and identified AIMs unique to each population. In this study, we demonstrated that genotyping pooled DNA samples yields estimates of admixture proportion that are both consistent with our knowledge of population history and similar to those obtained by genotyping known AIMs. Furthermore, through validation by individual genotyping, we demonstrated that pooling is quite effective for identifying SNPs with large allele frequency differences (i.e., AIMs) and that these AIMs are able to differentiate two closely related populations (HapMap JPT and CHB).
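The regression idea behind estimating admixture from pooled allele frequencies can be illustrated with a two-source model in which each pooled frequency is a mixture of two reference-panel frequencies. The sketch below is a simplification under stated assumptions, not the authors' procedure: it assumes exactly two ancestral populations, no noise model or SNP weighting, and hypothetical function names and frequencies.

```python
def admixture_proportion(pooled, ref_a, ref_b):
    """Least-squares estimate of m in: pooled ≈ m * ref_a + (1 - m) * ref_b.

    Each argument is a list of allele frequencies for the same SNPs.
    """
    if not (len(pooled) == len(ref_a) == len(ref_b)):
        raise ValueError("all frequency lists must have the same length")
    # Regress (pooled - ref_b) on (ref_a - ref_b) through the origin.
    numerator = sum((p - b) * (a - b) for p, a, b in zip(pooled, ref_a, ref_b))
    denominator = sum((a - b) ** 2 for a, b in zip(ref_a, ref_b))
    if denominator == 0:
        raise ValueError("reference panels are identical at every SNP")
    return numerator / denominator


# Hypothetical reference frequencies and a pool simulated with m = 0.8.
ref_a = [0.90, 0.20, 0.60, 0.10]
ref_b = [0.10, 0.70, 0.30, 0.80]
true_m = 0.8
pooled = [true_m * a + (1 - true_m) * b for a, b in zip(ref_a, ref_b)]

estimate = admixture_proportion(pooled, ref_a, ref_b)
print(f"estimated admixture proportion: {estimate:.3f}")
```

SNPs with large frequency differences between the reference panels dominate the denominator, which is one way to see why ancestry informative markers carry most of the information in this kind of estimate.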
DOI: 10.1016/j.ajhg.2013.06.017
2013
Cited 43 times
What Is the Significance of Difference in Phenotypic Variability across SNP Genotypes?
We studied the general problem of interpreting and detecting differences in phenotypic variability among the genotypes at a locus, from both a biological and a statistical point of view. The scales on which we measure interval-scale quantitative traits are man-made and have little intrinsic biological relevance. Before claiming a biological interpretation for genotype differences in variance, we should be sure that no monotonic transformation of the data can reduce or eliminate these differences. We show theoretically that for an autosomal diallelic SNP, when the three corresponding means are distinct so that the variance can be expressed as a quadratic function of the mean, there implicitly exists a transformation that will tend to equalize the three variances; we also demonstrate how to find a transformation that will do this. We investigate the validity of Bartlett's test, Box's modification of it, and a modified Levene's test to test for differences in variances when normality does not hold. We find that, although they may detect differences in variability, these tests do not necessarily detect differences in variance. The same is true for permutation tests that use these three statistics.
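The existence of such a transformation can be motivated by the familiar delta-method argument for variance stabilization. The derivation below is a sketch under assumptions not stated in the abstract: the symbols \mu_g, \sigma_g^2, V, a, b, c, and h are introduced here for illustration, V is assumed positive over the range of the means, and the first-order approximation is assumed adequate. The paper's own construction may differ.

```latex
% Three genotype groups g = 0, 1, 2 with distinct means \mu_g and variances \sigma_g^2.
% A quadratic has three coefficients, so it can pass through the three points (\mu_g, \sigma_g^2):
V(\mu) = a + b\mu + c\mu^{2}, \qquad V(\mu_g) = \sigma_g^{2}, \quad g = 0, 1, 2.

% First-order (delta-method) approximation for a smooth monotonic transformation h:
\operatorname{Var}\bigl[h(Y) \mid g\bigr] \approx \bigl[h'(\mu_g)\bigr]^{2}\, V(\mu_g).

% Choosing h'(\mu) = 1/\sqrt{V(\mu)} makes the right-hand side approximately 1 for every genotype:
h(y) = \int^{y} \frac{d\mu}{\sqrt{a + b\mu + c\mu^{2}}}.

% Familiar special cases: V(\mu) = b\mu gives h(y) \propto \sqrt{y};
% V(\mu) = c\mu^{2} (with \mu > 0) gives h(y) \propto \log y.
```

Because h is monotonic, it changes the scale of measurement without reordering observations, which is the sense in which a variance difference that disappears under h has no intrinsic biological interpretation.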
DOI: 10.1093/hmg/ddv434
2015
Cited 41 times
Common variants in DRD2 are associated with sleep duration: the CARe consortium
Sleep duration is implicated in the etiologies of chronic diseases and premature mortality. However, the genetic basis for sleep duration is poorly defined. We sought to identify novel genetic components influencing sleep duration in a multi-ethnic sample. Meta-analyses were conducted of genetic associations with self-reported, habitual sleep duration from seven Candidate Gene Association Resource (CARe) cohorts of over 25 000 individuals of African, Asian, European and Hispanic American ancestry. All individuals were genotyped for ∼50 000 SNPs from 2000 candidate heart, lung, blood and sleep genes. African-Americans had additional genome-wide genotypes. Four cohorts provided replication. A SNP (rs17601612) in the dopamine D2 receptor gene (DRD2) was significantly associated with sleep duration (P = 9.8 × 10⁻⁷). Conditional analysis identified a second DRD2 signal with opposite effects on sleep duration. In exploratory analysis, suggestive association was observed for rs17601612 with polysomnographically determined sleep latency (P = 0.002). The lead DRD2 signal was recently identified in a schizophrenia GWAS, and a genetic risk score of 11 additional schizophrenia GWAS loci genotyped on the IBC array was also associated with longer sleep duration (P = 0.03). These findings support a role for DRD2 in influencing sleep duration. Our work motivates future pharmacogenetics research on alerting agents such as caffeine and modafinil that interact with the dopaminergic pathway and further investigation of genetic overlap between sleep and neuro-psychiatric traits.
DOI: 10.1002/gepi.21957
2016
Cited 33 times
Comparison of Heritability Estimation and Linkage Analysis for Multiple Traits Using Principal Component Analyses
A disease trait often can be characterized by multiple phenotypic measurements that can provide complementary information on disease etiology, physiology, or clinical manifestations. Given that multiple phenotypes may be correlated and reflect common underlying genetic mechanisms, the use of multivariate analysis of multiple traits may improve statistical power to detect genes and variants underlying complex traits. The literature, however, has been unclear as to the optimal approach for analyzing multiple correlated traits. In this study, heritability and linkage analysis was performed for six obstructive sleep apnea hypopnea syndrome (OSAHS) related phenotypes, as well as principal components of the phenotypes and principal components of the heritability (PCHs) using the data from Cleveland Family Study, which include both African and European American families. Our study demonstrates that principal components generally result in higher heritability and linkage evidence than individual traits. Furthermore, the PCHs can be transferred across populations, strongly suggesting that these PCHs reflect traits with common underlying genetic mechanisms for OSAHS across populations. Thus, PCHs can provide useful traits for using data on multiple phenotypes and for genetic studies of trans-ethnic populations.
DOI: 10.1038/s41598-017-08858-2
2017
Cited 32 times
Development of an SSR-based genetic map in sesame and identification of quantitative trait loci associated with charcoal rot resistance
Sesame is prized for its oil. Genetic improvement of sesame can be enhanced through marker-assisted breeding. However, few simple sequence repeat (SSR) markers and SSR-based genetic maps were available in sesame. In this study, 7,357 SSR markers were developed from the sesame genome and transcriptomes, and a genetic map was constructed by generating 424 novel polymorphic markers and using a cross population with 548 recombinant inbred lines (RIL). The genetic map had 13 linkage groups, equalling the number of sesame chromosomes. The linkage groups ranged in size from 113.6 to 179.9 centimorgans (cM), with a mean value of 143.8 cM over a total length of 1869.8 cM. Fourteen quantitative trait loci (QTL) for sesame charcoal rot disease resistance were detected, with contribution rates of 3–14.16% in four field environments; ~60% of the QTL were located within 5 cM at 95% confidence interval. The QTL with the highest phenotype contribution rate (qCRR12.2) and those detected in different environments (qCRR8.2 and qCRR8.3) were used to predict candidate disease response genes. The new SSR-based genetic map and 14 novel QTLs for charcoal rot disease resistance will facilitate the mapping of agronomic traits and marker-assisted selection breeding in sesame.
DOI: 10.1182/blood-2017-10-810531
2018
Cited 31 times
Loss of NRF2 function exacerbates the pathophysiology of sickle cell disease in a transgenic mouse model
Key Points: NRF2 knockout inhibits fetal hemoglobin expression during gestational erythropoiesis in SCD mice. Loss of the cellular antioxidant response mediated by NRF2 exacerbates spleen damage, inflammation, and oxidative stress in SCD mice.
DOI: 10.1371/journal.pgen.1007739
2019
Cited 30 times
Associations of variants in the hexokinase 1 and interleukin 18 receptor regions with oxyhemoglobin saturation during sleep
Sleep disordered breathing (SDB)-related overnight hypoxemia is associated with cardiometabolic disease and other comorbidities. Understanding the genetic bases for variations in nocturnal hypoxemia may help understand mechanisms influencing oxygenation and SDB-related mortality. We conducted genome-wide association tests across 10 cohorts and 4 populations to identify genetic variants associated with three correlated measures of overnight oxyhemoglobin saturation: average and minimum oxyhemoglobin saturation during sleep and the percent of sleep with oxyhemoglobin saturation under 90%. The discovery sample consisted of 8,326 individuals. Variants with p < 1 × 10⁻⁶ were analyzed in a replication group of 14,410 individuals. We identified 3 significantly associated regions, including 2 regions in multi-ethnic analyses (2q12, 10q22). SNPs in the 2q12 region were associated with minimum SpO2 (rs78136548 p = 2.70 × 10⁻¹⁰). SNPs at 10q22 were associated with all three traits including average SpO2 (rs72805692 p = 4.58 × 10⁻⁸). SNPs in both regions were associated in over 20,000 individuals and are supported by prior associations or functional evidence. Four additional significant regions were detected in secondary sex-stratified and combined discovery and replication analyses, including a region overlapping Reelin, a known marker of respiratory complex neurons. These are the first genome-wide significant findings reported for oxyhemoglobin saturation during sleep, a phenotype of high clinical interest. Our replicated associations with HK1 and IL18R1 suggest that variants in inflammatory pathways, such as the biologically plausible NLRP3 inflammasome, may contribute to nocturnal hypoxemia.
DOI: 10.1093/sleep/zsz101
2019
Cited 30 times
Epigenome-wide association analysis of daytime sleepiness in the Multi-Ethnic Study of Atherosclerosis reveals African-American-specific associations
Daytime sleepiness is a consequence of inadequate sleep, sleep-wake control disorder, or other medical conditions. Population variability in prevalence of daytime sleepiness is likely due to genetic and biological factors as well as social and environmental influences. DNA methylation (DNAm) potentially influences multiple health outcomes. Here, we explored the association between DNAm and daytime sleepiness quantified by the Epworth Sleepiness Scale (ESS). We performed multi-ethnic and ethnic-specific epigenome-wide association studies for DNAm and ESS in the Multi-Ethnic Study of Atherosclerosis (MESA; n = 619) and the Cardiovascular Health Study (CHS; n = 483), with cross-study replication and meta-analysis. Genetic variants near ESS-associated DNAm were analyzed for methylation quantitative trait loci and followed with replication of genotype-sleepiness associations in the UK Biobank. In MESA only, we detected four DNAm-ESS associations: one across all race/ethnic groups; three in African-Americans (AA) only. Two of the MESA AA associations, in genes KCTD5 and RXRA, nominally replicated in CHS (p-value < 0.05). In the AA meta-analysis, we detected 14 DNAm-ESS associations (FDR q-value < 0.05, top association p-value = 4.26 × 10⁻⁸). Three DNAm sites mapped to genes (CPLX3, GFAP, and C7orf50) with biological relevance. We also found evidence for associations with DNAm sites in RAI1, a gene associated with sleep and circadian phenotypes. UK Biobank follow-up analyses detected SNPs in RAI1, RXRA, and CPLX3 with nominal sleepiness associations. We identified methylation sites in multiple genes possibly implicated in daytime sleepiness. Most significant DNAm-ESS associations were specific to AA. Future work is needed to identify mechanisms driving ancestry-specific methylation effects.
DOI: 10.1093/bioinformatics/btaa985
2020
Cited 25 times
An iterative approach to detect pleiotropy and perform Mendelian Randomization analysis using GWAS summary statistics
Motivation: The overall association evidence of a genetic variant with multiple traits can be evaluated by cross-phenotype association analysis using summary statistics from genome-wide association studies. Further dissecting the association pathways from a variant to multiple traits is important to understand the biological causal relationships among complex traits. Results: Here, we introduce a flexible and computationally efficient Iterative Mendelian Randomization and Pleiotropy (IMRP) approach to simultaneously search for horizontal pleiotropic variants and estimate causal effect. Extensive simulations and real data applications suggest that IMRP has similar or better performance than existing Mendelian Randomization methods for both causal effect estimation and pleiotropic variant detection. The developed pleiotropy test is further extended to detect colocalization for multiple variants at a locus. IMRP will greatly facilitate our understanding of causal relationships underlying complex traits, in particular, when a large number of genetic instrumental variables are used for evaluating multiple traits. Availability and implementation: The software IMRP is available at https://github.com/XiaofengZhuCase/IMRP. The simulation codes can be downloaded at http://hal.case.edu/~xxz10/zhu-web/ under the link: MR Simulations software. Supplementary information: Supplementary data are available at Bioinformatics online.
DOI: 10.1161/hypertensionaha.121.18513
2022
Cited 13 times
Rare Variants in Genes Encoding Subunits of the Epithelial Na+ Channel Are Associated With Blood Pressure and Kidney Function
The epithelial Na+ channel (ENaC) is intrinsically linked to fluid volume homeostasis and blood pressure. Specific rare mutations in SCNN1A, SCNN1B, and SCNN1G, genes encoding the α, β, and γ subunits of ENaC, respectively, are associated with extreme blood pressure phenotypes. No associations between blood pressure and SCNN1D, which encodes the δ subunit of ENaC, have been reported. A small number of sequence variants in ENaC subunits have been reported to affect functional transport in vitro or blood pressure. The effects of the vast majority of rare and low-frequency ENaC variants on blood pressure are not known. We explored the association of low-frequency and rare variants in the genes encoding ENaC subunits with systolic blood pressure, diastolic blood pressure, mean arterial pressure, and pulse pressure, using whole-genome sequencing data from 14 studies participating in the Trans-Omics in Precision Medicine Whole-Genome Sequencing Program and sequence kernel association tests. We found that variants in SCNN1A and SCNN1B were associated with diastolic blood pressure and mean arterial pressure (P<0.00625). Although SCNN1D is poorly expressed in human kidney tissue, SCNN1D variants were associated with systolic blood pressure, diastolic blood pressure, mean arterial pressure, and pulse pressure (P<0.00625). ENaC variants in 2 of the 4 subunits (SCNN1B and SCNN1D) were also associated with estimated glomerular filtration rate (P<0.00625), but not with stroke. Our results suggest that variants in extrarenal ENaCs, in addition to ENaCs expressed in kidneys, influence blood pressure and kidney function.
DOI: 10.1038/s41467-023-38990-9
2023
Cited 5 times
Evaluating the use of blood pressure polygenic risk scores across race/ethnic background groups
We assess performance and limitations of polygenic risk scores (PRSs) for multiple blood pressure (BP) phenotypes in diverse population groups. We compare “clumping-and-thresholding” (PRSice2) and LD-based (LDPred2) methods to construct PRSs from each of multiple GWAS, as well as multi-PRS approaches that sum PRSs with and without weights, including PRS-CSx. We use datasets from the MGB Biobank, TOPMed study, UK Biobank, and All of Us to train, assess, and validate PRSs in groups defined by self-reported race/ethnic background (Asian, Black, Hispanic/Latino, and White). For both SBP and DBP, the PRS-CSx-based PRS, constructed as a weighted sum of PRSs developed from multiple independent GWAS, performs best across all race/ethnic backgrounds. Stratified analysis in All of Us shows that PRSs are better predictive of BP in females compared to males, individuals without obesity, and middle-aged (40-60 years) compared to older and younger individuals.
DOI: 10.1016/s0197-4580(00)00217-7
2000
Cited 66 times
Neuronal CDK7 in hippocampus is related to aging and Alzheimer disease
Despite their supposedly terminally-differentiated quiescent status, many neurons in Alzheimer disease display an ectopic re-expression of cell-cycle related proteins. In the highly regulated process of cell cycle, cyclin-dependent kinase 7 (Cdk7) plays a crucial role as a Cdk-activating kinase and activates all of the major Cdk-cyclin substrates. In this study, we demonstrate that Cdk7 immunoreactivity is significantly elevated in susceptible hippocampal neurons of Alzheimer disease patients in comparison with age-matched controls. Notably, the expression of Cdk7 is age-dependent, with decreased levels between the ages of 54 and 65 years and after the age of 78. While the Cdk7 levels in Alzheimer disease patients are higher than controls within each age group, the difference is greatest between ages 54-65 where disease susceptibility and/or progression is likely more related to genetic factors.
DOI: 10.1038/oby.2003.40
2003
Cited 65 times
A Genome‐Wide Scan for Body Mass Index among Nigerian Families
Objective: Interest in mapping genetic variants that are associated with obesity remains high because of the increasing prevalence of obesity and its complications worldwide. Data on genetic determinants of obesity in African populations are rare. Research Methods and Procedures: We have undertaken a genome‐wide scan for body mass index (BMI) in 182 Nigerian families that included 769 individuals. Results: The prevalence of obesity was only 5%, yet polygenic heritability for BMI was in the expected range (0.46 ± 0.07). Tandem repeat markers (402) were typed across the genome with an average map density of 9 cM. Pedigree‐based analysis using a variance components linkage model demonstrated evidence for linkage on chromosome 7 (near marker D7S817 at 7p14) with a logarithm of odds (LOD) score of 3.8 and on chromosome 11 (marker D11S2000 at 11q22) with an LOD score of 3.3. Weaker evidence for linkage was found on chromosomes 1 (1q21, LOD = 2.2) and 8 (8p22, LOD = 2.3). Several candidate genes, including neuropeptide Y, DRD2, APOA4, lamin A/C, and lipoprotein lipase, lie in or close to the chromosomal regions where strong linkage signals were found. Discussion: The findings of this study suggest that, as in other populations with higher prevalences of obesity, positive linkage signals can be found on genome scans for obesity‐related traits. Follow‐up studies may be warranted to investigate these linkages, especially the one on chromosome 11, which has been reported in a population at the opposite end of the BMI distribution.
DOI: 10.2337/diabetes.51.2.541
2002
Cited 59 times
A Genome-Wide Scan for Obesity in African-Americans
A genome-wide scan using 387 short tandem repeat markers was conducted for obesity among 618 black individuals from 202 families residing in a suburb of Chicago. Evidence for linkage was evaluated with BMI and percent body fat (PBF) using a variance component analysis approach. Suggestive evidence for linkage was found for BMI on chromosome 5 (logarithm of odds [LOD] score = 1.9) and PBF on chromosome 6 (LOD score = 2.7). One additional region on chromosome 3 was linked to these phenotypes at a lower level of significance (LOD score = 1.8 and 0.95 for BMI and PBF, respectively); the linked marker on this chromosome lies in the same region implicated as harboring obesity genes in a previous study of a white population. The replication of linkage evidence using different ethnic groups reinforces the potential significance of this latter candidate region.
DOI: 10.1371/journal.pone.0001244
2007
Cited 51 times
Admixture Mapping Provides Evidence of Association of the VNN1 Gene with Hypertension
Migration patterns in modern societies have created the opportunity to use population admixture as a strategy to identify susceptibility genes. To implement this strategy, we genotyped a highly informative ancestry marker panel of 2270 single nucleotide polymorphisms in a random population sample of African Americans (N = 1743), European Americans (N = 1000) and Mexican Americans (N = 581). We then examined the evidence for over-transmission of specific loci to cases from one of the two ancestral populations. Hypertension cases and controls were defined based on standard clinical criteria. Both case-only and case-control analyses were performed among African Americans. With the genome-wide markers we replicated the findings identified in our previous admixture mapping study on chromosomes 6 and 21 [1]. For case-control analysis we then genotyped 51 missense SNPs in 36 genes spaced across an 18.3 Mb region. Further analyses demonstrated that the missense SNP rs2272996 (or N131S) in the VNN1 gene was significantly associated with hypertension in African Americans and the association was replicated in Mexican Americans; a non-significant opposite association was observed in European Americans. This SNP also accounted for most of the evidence observed in the admixture analysis on chromosome 6. Despite these encouraging results, susceptibility loci for hypertension have been exceptionally difficult to localize and confirmation by independent studies will be necessary to establish these findings.
DOI: 10.2337/db06-1051
2007
Cited 49 times
Association Studies of BMI and Type 2 Diabetes in the Neuropeptide Y Pathway
The neuropeptide Y (NPY) family of peptides and receptors regulate food intake. Inherited variation in this pathway could influence susceptibility to obesity and its complications, including type 2 diabetes. We genotyped a set of 71 single nucleotide polymorphisms (SNPs) that capture the most common variation in NPY, PPY, PYY, NPY1R, NPY2R, and NPY5R in 2,800 individuals of recent European ancestry drawn from the near extremes of BMI distribution. Five SNPs located upstream of NPY2R were nominally associated with BMI in men (P values = 0.001–0.009, odds ratios [ORs] 1.27–1.34). No association with BMI was observed in women, and no consistent associations were observed for other genes in this pathway. We attempted to replicate the association with BMI in 2,500 men and tested these SNPs for association with type 2 diabetes in 8,000 samples. We observed association with BMI in men in only one replication sample and saw no association in the combined replication samples (P = 0.154, OR = 1.09). Finally, a 9% haplotype was associated with type 2 diabetes in men (P = 1.73 × 10⁻⁴, OR = 1.36) and not in women. Variation in this pathway likely does not have a major influence on BMI, although small effects cannot be ruled out; NPY2R should be considered a candidate gene for type 2 diabetes in men.
DOI: 10.1002/bio.1250
2010
Cited 43 times
Aqueous synthesis of CdTe/CdS/ZnS quantum dots and their optical and chemical properties
In this paper, we describe a strategy for the synthesis of thiol‐coated CdTe/CdS/ZnS (core–shell–shell) quantum dots (QDs) via an aqueous synthesis approach. The synthesis conditions, including the size of the CdTe core, the refluxing time, the number of monolayers and the ligands, were systematically optimized, and the chemical and optical properties of the as‐prepared products were then investigated. We found that the mercaptopropionic acid (MPA)‐coated CdTe/CdS/ZnS QDs presented high photoluminescence quantum yields (PL QYs), good photostability and chemical stability, good salt tolerance and pH tolerance and favorable biocompatibility. Characterization by high‐resolution transmission electron microscopy (HRTEM), X‐ray powder diffraction (XRD) and fluorescence correlation spectroscopy (FCS) showed that the CdTe/CdS/ZnS QDs had good monodispersity and crystal structure. The fluorescence lifetime spectra demonstrated that CdTe/CdS/ZnS QDs had a longer lifetime than fluorescent dyes and CdTe QDs. Furthermore, the MPA‐stabilized CdTe/CdS/ZnS QDs were applied to the imaging of cells. Compared with current synthesis methods, our synthesis approach was reproducible and simple, and the reaction conditions were mild. More importantly, our method was cost‐effective and very suitable for large‐scale synthesis of CdTe/CdS/ZnS QDs for future applications. Copyright © 2010 John Wiley & Sons, Ltd.