Faculty Grants


  • PI Amanda Seyerle

    Whole Genome Analysis of Venous Thromboembolism

    Venous thromboembolism (VTE), comprising deep vein thrombosis (DVT) and pulmonary embolism (PE), is of major public health importance. The annual incidence rate of VTE in adults is approximately 1 per 1,000, but the rate is 3-5 per 1,000 by ages 70-79. It is estimated that between 350,000 and 600,000 Americans are affected by DVT/PE each year with at least 100,000 related deaths. About 10%-30% of all patients die within 30 days, mainly due to PE. The U.S. Surgeon General has issued a nationwide Call to Action to raise awareness of VTE and improve its prevention and treatment. However, the etiology of VTE is not well understood despite its clinical and public health significance. Twin and family studies suggest a strong genetic influence on VTE risk (heritability = 0.5 to 0.6) with possible contribution from gene-by-environment interactions. More than a dozen genetic variants, mostly common ones, have been identified from genome-wide association studies (GWAS), but collectively they explain only a small proportion of heritability. Of the nine genes/loci that have been reported for VTE in GWAS (Table A.1), the most significant variants in FGG, F11, TSPAN15, PROCR, and F8 were intronic, intergenic, or in coding regions but not translated, bearing no known functional relevance in influencing protein structure. Most of the variants identified from GWAS are not likely the underlying causal ones, because the GWAS approach is an indirect mapping strategy that uses linkage disequilibrium (LD) to localize disease-associated regions. The high pass whole genome sequencing (WGS) data in the TOPMed VTE consortium provide an excellent opportunity to identify causal variants and improve understanding of the genetic architecture underlying the susceptibility loci identified in previously published VTE GWAS. We also anticipate that additional genetic variants, probably of low frequency, will be identified for VTE by a WGS approach. 
Given that relatively few research groups around the world have focused their efforts on genomic discovery for VTE, the extensive TOPMed sequence data puts us in a unique position to publish what will be the first WGS analysis of VTE, the third most common life-threatening cardiovascular disease. We will leverage the considerable WGS investment to extend the resource with genetic association studies of VTE and investigations of underlying mechanisms in the etiology of VTE. We will conduct association analyses of WGS data with VTE in ~3,914 cases (86% European Americans [EAs]; 13% African Americans [AAs]) and 10,378 controls (88% EAs) in the TOPMed VTE Consortium through agnostic interrogation of the genome to identify common and rare genetic variation associated with VTE risk. We will conduct analyses of annotated features of the WGS data based on protein coding and regulatory functions, as well as analyses of structural variants, sliding windows, noncoding RNA genes, and single nucleotide variants (SNVs).
  • PI Christy Avery

    Inflammatory Mediators of Cardiometabolic Risk in Latinos
    PI: Christy Avery
    Project Number: R01-HL147853

    Cardiometabolic risk factors and type 2 diabetes (T2D) impart a substantial and growing morbidity and mortality burden that disproportionately affects racial/ethnic minorities, including Hispanic/Latinos (H/L). The population burden, established disparities, and limited availability of T2D treatments to reverse progression or prevent long-term complications underscore an urgent need to clarify mechanistic pathways that may serve as novel targets for prevention and treatment. Chronic low-grade inflammation is a widely recognized common pathological feature underlying cardiometabolic risk factors and T2D, particularly in H/L when compared to other racial/ethnic groups; identifying specific mediators of chronic low-grade inflammation could greatly enhance efforts to tailor existing agents or develop novel therapies, especially in populations at highest risk. Prior attempts to examine specific mediators of chronic low-grade inflammation have been limited by a focus on downstream markers, including C-reactive protein, which are less likely to be causal or are difficult to reliably measure. Upstream regulation of systemic inflammation is in turn mediated by fatty acid derived lipid mediators termed eicosanoids. Although select eicosanoids have been associated with cardiometabolic risk factors and T2D, prior studies have only assessed a handful of the most abundant eicosanoids in humans. We propose to address this major research gap by leveraging advances in analytical mass spectrometry (MS) that now enable the rapid and accurate quantification of >150 eicosanoids spanning major biosynthetic pathways. Eicosanoids will be assayed in the deeply phenotyped population-based Hispanic Community Health Study/Study of Latinos (SOL) cohort, enabling cost-effective testing of study hypotheses in a H/L population with established cardiometabolic risk factor and T2D disparities. 
Specifically, we will identify known and novel eicosanoids associated with cardiometabolic risk factors and T2D, as well as leverage existing genomics data to conduct causal inference studies and evaluate mechanistic frameworks for key eicosanoids. This work will shed insight into the mechanisms underlying cardiometabolic disease in H/L, identify potential sources of health disparities in a genetically admixed cohort, and provide an essential foundation for future studies of inflammatory-modulating therapies aimed at reducing the burden of cardiometabolic disease in the population at large.

    Leveraging Multi-omics Approaches to Examine Metabolic Challenges of Obesity in Relation to Cardiovascular Diseases
    MPI: Christy Avery, Penny Gordon-Larsen, Kari North and Susan Sumner
    Project Number: R01-HL143885

    Cardiovascular diseases (CVD) remain leading causes of morbidity, mortality, and early disability, and are exacerbated by obesity. It is well known that obesity stresses metabolic pathways, thereby accelerating CVD risk. Yet, the specific biologic mechanisms remain poorly understood. Metabolites are biologically active small-molecule intermediates and byproducts of metabolism that lie along pathways linking genetic susceptibility with CVD and are responsive to obesity, related health behaviors, and CVD risk factors. Thus, metabolites can be powerful disease biomarkers and therapeutic targets and may provide targetable “mechanistic bridges” linking genome-wide association study (GWAS) findings with CVD risk factors and clinical disease. We hypothesize that: (1) genetic susceptibility influences CVD risk along specific metabolic pathways; (2) that metabolites on these pathways (i) affect and (ii) are affected by CVD risk factors to (3) increase clinical disease risk; and that (4) obesity modifies a subset of metabolite effects. Yet, the majority of metabolomics studies to date have been largely cross-sectional or clinical efforts in older, European-ancestry populations, with inconsistent control of confounders, including diet, and they have ignored plausible modifiers, including obesity. To address these major research gaps, we will generate longitudinal untargeted and targeted metabolomics profiles in the biracial (47% African American) CARDIA study (n=5,115; 18-30 years in 1985-86; n~3,270 in 2020-21). The CARDIA study is uniquely suited to test the proposed study hypotheses, with 35 years of longitudinal data collected over the key young adult lifecycle period when CVD risk accelerates in concert with increasing obesity. We will develop and employ cutting-edge metabolomics and statistical methods to characterize known and unknown metabolite signals. 
Longitudinal data, Mendelian randomization, and pathway-based modeling enable assessment of (i) metabolic perturbations that influence CVD and (ii) CVD risk factors that influence metabolic perturbations, both overall and in the context of a growing obesity burden. We address the following specific aims: 1) identify metabolites and major metabolic pathways that influence metabolic CVD risk factors (cholesterol, blood pressure, and glycemic phenotypes); 2) identify metabolic CVD risk factors that influence metabolites and major metabolic pathways; 3) leverage statistical innovations and existing 'omics, phenotype, and covariate data for causal inference, to evaluate mechanistic frameworks, and characterize novel metabolites; and 4) test metabolites identified in the CARDIA study for evidence of association with CVD risk factors and clinical endpoints (coronary heart disease, heart failure, and stroke) in the biracial Atherosclerosis Risk in Communities (ARIC) study. We anticipate that the proposed project, prepared by a multi-disciplinary team with expertise in CVD and metabolic epidemiology, nutritional biochemistry, metabolomics, bioinformatics, biostatistics, and genetics, will inform disease mechanisms, with strong potential for identifying biomarkers of CVD risk. Together, our innovations will help identify novel therapeutic and nutritional targets to reduce the global burden of CVD.

    Add Health GWAS Data: User Support and Research Tools Enable Widespread Access
    Project Number: R03-HD097630

    Genetic studies leveraging large-scale genotyping (i.e., “GWAS data”) are increasingly ubiquitous, as demonstrated by the >56,000 unique single nucleotide polymorphism (SNP)-trait associations identified to date by genome-wide association studies (GWAS). GWAS data also are being used to understand social-genetic effects, control for genetic predispositions in population health and social science studies, and examine genetic correlation between traits. Despite growing adoption, studies leveraging GWAS data remain largely limited to adult populations of European ancestry and tend to ignore the physical and social environment. Studies with GWAS data combined with rich, longitudinal environmental and phenotype data are therefore needed to permit dynamic, multilevel, integrative research approaches to health that capture bidirectional biological and contextual contributions and their interactions over time. The National Longitudinal Study of Adolescent to Adult Health (Add Health) is an ongoing, nationally representative, multiethnic longitudinal study of the social, behavioral, and biological linkages in health and developmental trajectories from early adolescence into adulthood. As the only nationally representative longitudinal study of young adults that contains multilevel social, behavioral, environmental, and biological data (including recently available GWAS data through dbGaP, an NIH-sanctioned repository), Add Health is well positioned to address these research gaps. However, numerous and persistent challenges prevent broad usage of GWAS data by the research community. Specifically, Add Health users may be ill-prepared for conceptualizing, accessing, storing, understanding, analyzing, and interpreting high-dimensional (i.e., >30 million SNPs) GWAS data, an impression supported by our recent survey of users. 
To enable widespread use of this valuable resource, this application aims to: (1) develop resources to aid users in accessing, understanding, analyzing, and interpreting Add Health GWAS data; and (2) initiate and support a scientific community of Add Health GWAS data users. The proposed application builds upon a 25-year commitment of Add Health investigators to user support and data dissemination, which has resulted in prolific research production with unparalleled disciplinary breadth. We are confident that the proposed resources, which will not be developed without dedicated funding, will expedite access to and facilitate high-quality studies of the Add Health GWAS data by a new group of investigators who may have little-to-no experience with GWAS data. Ultimately, we anticipate that this application will multiply the impact of Add Health sociogenomics research throughout the scientific community and provide a stimulus for new scientific discovery.

    Characterizing Pleiotropy in Cardiometabolic Phenotypes Among Diverse Populations
    Project Number: R01-HL142825

    Genetic susceptibility underlies a majority of cardiovascular diseases (CVD) and their antecedents, underscored by genome-wide association studies (GWAS) that have identified >1,500 loci to date. Each GWAS-identified locus potentially provides novel mechanistic insight, yet translation of study findings remains largely incomplete, representing a critical barrier to progress. Pleiotropy, in which a variant affects multiple phenotypes, is a long-described and pervasive but largely uncharacterized avenue to advance genomic medicine. Specifically, studies of pleiotropy have the potential to clarify molecular functions, identify mechanistic “common denominators”, inform diagnosis and treatment, and prioritize variants for functional interrogation. Systematic and comprehensive interrogation of pleiotropy is particularly relevant for CVD phenotypes, as decades of human and animal studies support a shared genetic architecture that collectively affects downstream clinical disease. Yet, few studies have comprehensively and systematically evaluated pleiotropy within or across cardiovascular phenotypes or extended investigations to examine how pleiotropic variants affect clinical disease. Further, many CVDs and their antecedents disproportionately affect African Americans (AA) and Hispanic/Latinos (HL). However, the majority (>80%) of participants included in GWAS to date are of European (EU) ancestry. This research disparity creates a biased view of human variation, fails to leverage the unique genetic architecture of AAs and HLs for fine-mapping, and hinders translation of genetic findings into clinical and public health applications relevant for broad populations. 
We respond to these gaps by leveraging high-quality, harmonized, and centrally available phenotype and genotype data from the Population Architecture Using Genomics in Epidemiology (PAGE) consortium and the Reasons for Geographic and Racial Differences in Stroke (REGARDS) study (n=100,917; 35% AA; 32% EU; 24% HL) as well as cutting edge statistical methods to comprehensively identify loci with potential evidence of pleiotropy within and across blood pressure, cholesterol, cardiac conduction, glycemic, inflammatory, and obesity cardiovascular domains as well as incident MI and stroke (Aim 1). At known and novel loci with strong evidence of potential pleiotropy, we will leverage population structure, haplotypic architecture, and phenotype correlation through multi-ethnic, multi-phenotype fine-mapping to prioritize variants for further interrogation (Aim 2). Finally, we will leverage longitudinal data and pathway models to disaggregate variants displaying evidence of biological pleiotropy (i.e. variant affects multiple phenotypes due to shared biology) from variants displaying evidence of mediated pleiotropy (e.g. variant influences one phenotype and this phenotype influences a second phenotype) (Aim 3). We hypothesize that CVD phenotypes and clinical disease may be more accurately characterized as variations in clinical expression, with common biological mechanisms. By investigating pleiotropy, we hope to clarify these mechanisms, which has the potential to inform phenotype classification, drug development and repurposing, and CVD prevention.

  • PI Eric Whitsel

    Modification of PM-Mediated Arrhythmogenesis in Populations
    Project Number: R01-ES017794

    The proposed study examines susceptibility to the arrhythmogenic effects of particulate matter (PM) air pollution contributed by common genetic variation. Its rationale derives from the established, but heterogeneous association between ambient concentrations of PM air pollution and acute coronary heart disease (CHD) events, a widespread, but poorly understood threat to public health. Its focus is on resting, standard twelve-lead ECG measures that have been linked to ambient PM concentrations on the one hand, and to acute CHD events on the other. It will optimally leverage the genomic, environmental and electrocardiographic data from the Women’s Health Initiative clinical trial (WHI CT), The SNP Health Association Resource project (SHARe, NHLBI-PB-2006-091), The Environmental Epidemiology of Arrhythmogenesis in WHI (5-R01-ES012238), the Atherosclerosis Risk in Communities (ARIC) study and the Population Architecture Using Genomics in Epidemiology (PAGE) consortium. Specifically, it will examine measures of heart rate variability, ventricular repolarization, myocardial ischemia and ventricular ectopy within seven distinct subpopulations evaluated between 1987 and 2004. The seven subpopulations include: (1) 5148 black women, (2) 2002 Hispanic women, and (3) 1507 white women with and 1507 without ventricular ectopy in the WHI CT; and in the ARIC study, (4) 2615 black women, (5) 5989 white women, (6) 1621 black men, and (7) 5369 white men. Subpopulations 1-7 will be used to independently identify gene-by-environment interactions between approximately 10^6 single nucleotide polymorphisms (SNPs) genotyped on the Affymetrix 6.0 array and daily mean ambient PM concentrations spatially interpolated at geocoded participant addresses. SNP main effects will also be investigated. The intramurally co-funded analyses will be well-powered and appropriately adjusted for multiple comparisons. 
Anticipated findings will provide a foundation for examining the consistency of associations across race and gender. Collectively, they will advance understanding of genetic susceptibility to and the pathophysiological mechanisms underlying PM-mediated arrhythmogenesis in an ethnically and geographically representative population of 25,758 uniformly well-characterized participants living in U.S. Environmental Protection Agency (EPA) Regions 1-10. The advance will provide insight into the proportion of PM-attributable ECG abnormalities that could be reduced by establishing and complying with stricter National Ambient Air Quality Standards. PUBLIC HEALTH RELEVANCE: The proposed study will efficiently advance understanding of genetic susceptibility to and the pathophysiological mechanisms underlying PM-mediated arrhythmogenesis in an ethnically and geographically representative population of 25,758 uniformly well-characterized participants living in U.S. Environmental Protection Agency (EPA) Regions 1-10. Its rationale derives from the established, but heterogeneous association between ambient levels of particulate matter air pollution and acute coronary heart disease events, a widespread, but poorly understood threat to public health.
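
As a quick consistency check, the short Python sketch below sums the seven subpopulations listed in the abstract and compares the result with the stated total of 25,758 participants. The dictionary labels are illustrative only and are not identifiers from the study; subpopulation 3 is entered as two groups of 1,507 (with and without ventricular ectopy).

```python
# Subpopulation sizes as listed in the abstract; labels are illustrative only.
subpopulations = {
    "WHI CT black women": 5148,
    "WHI CT Hispanic women": 2002,
    "WHI CT white women with ventricular ectopy": 1507,
    "WHI CT white women without ventricular ectopy": 1507,
    "ARIC black women": 2615,
    "ARIC white women": 5989,
    "ARIC black men": 1621,
    "ARIC white men": 5369,
}

# Subtotals by parent study, then the overall total.
whi_total = sum(n for name, n in subpopulations.items() if name.startswith("WHI"))
aric_total = sum(n for name, n in subpopulations.items() if name.startswith("ARIC"))
total = whi_total + aric_total

print(f"WHI CT: {whi_total:,}  ARIC: {aric_total:,}  Total: {total:,}")
```

The subtotals come to 10,164 (WHI CT) and 15,594 (ARIC), which together match the 25,758 participants cited in the abstract.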

    Epigenetic Mechanisms of PM-Mediated CVD Risk
    Project Number: R01-ES020836

    Epidemiologic studies have linked exposure to ambient particulate matter (PM) air pollution with sub-clinical and clinical cardiovascular disease (CVD). Although PM inhalation also has been linked with increases in inflammatory, oxidative, endothelial, metabolic, and coagulation biomarkers in blood, biomarker discovery and understanding of the molecular mechanisms producing such effects in human populations remain incomplete. We and others have consistently found that PM exposure is associated with altered global and gene-specific methylation measured in peripheral leukocyte DNA, an environmentally inducible and dynamic epigenetic mechanism that controls gene expression. DNA methylation in leukocytes and other tissues also has been associated with CVD, implicating it as a primary molecular mechanism mediating the cardiovascular effects of environmental exposures. However, few studies have examined the epigenetics of PM air pollution. Those that have tend to be underpowered, unreplicated, cross-sectional analyses of candidate gene DNA methylation conducted within environmentally homogeneous, single-city populations of white men, without attention to other factors capable of affecting methylation and in turn, CVD risk. We therefore propose to conduct a two-stage, longitudinal study of associations between PM air pollution, DNA methylation, and CVD risk factors among independent subsets of the exam site- and race-stratified, randomly selected 6% minority oversample of approximately 4,300 Women’s Health Initiative clinical trial (WHI CT) participants who had fasting blood draws and resting, standard, twelve-lead electrocardiograms (ECGs) repeated at three-year intervals from 1993 to 2004. Stage 1 will focus on the interrogation, discovery and ranking of >450,000 DNA methylation sites potentially sensitive to PM in 1999-2001 blood samples from 800 of the participants. 
In up to three blood samples collected serially from the remaining 3,500 participants in 1993-2004, Stage 2 will focus on the longitudinal validation of the ten most PM-sensitive DNA methylation sites identified by Stage 1, the temporal relationship between PM and DNA methylation at those sites, and that between site-specific DNA methylation, CVD risk factors, and CVD. The proposed epigenetic data analyses will be conducted within a phenomics framework, well-powered and appropriately adjusted for both ancestral admixture and multiple comparisons. Findings will be externally validated in the ARIC and NAS cohorts. Generalizable findings will advance understanding of epigenetic mechanisms underlying, and biomarkers identifying susceptibility to PM-mediated CVD risk in pre- and post-menopausal women, younger black and older white men. At the same time, they will support inference to the larger, dynamic population of WHI CT participants from which the study’s minority oversample was drawn, one living in U.S. Environmental Protection Agency (EPA) Regions 1-10 and potentially benefitted by the science-based establishment of and compliance with stricter National Ambient Air Quality Standards.

    Clonal Hematopoiesis in the Women’s Health Initiative
    Project Number: R01-HL148565

    Clonal hematopoiesis of indeterminate potential (CHIP) is a common, age-related condition in which hematopoietic stem cells in the bone marrow undergo somatic mutations that lead to overgrowth (“clones”) of a genetically distinct subpopulation of blood cells. Evidence is mounting that CHIP has major implications for human health as a risk factor for mortality and chronic diseases including hematologic cancers and atherosclerotic cardiovascular disease (CVD). Prior studies of CHIP were cross-sectional, and limited information is available on behavioral/lifestyle, environmental, and heritable risk factors for the occurrence, development, and progression of CHIP, or on the relationship of CHIP to risk of specific CVD subtypes (coronary heart disease, stroke, and venous thromboembolic disease), pre-malignant blood diseases, and dementia over long-term follow-up. The large (N~161,000), multi-ethnic, prospective Women’s Health Initiative (WHI), which enrolled post-menopausal women during 1993-1998, is particularly well-suited to address these limitations because of its longitudinal design, availability of extensive exposure and phenotype data, and ongoing surveillance of incident disease/mortality among aging women. In particular, a subset of ~7,800 of the original WHI cohort underwent a subsequent examination and blood sampling in 2012 (ranging from 14 to 19 years after WHI enrollment) as part of the WHI Long Life Study (LLS). Through the NHLBI Trans-Omics for Precision Medicine (TOPMed) Project, 11,000 original WHI participants (including approximately 1,400 WHI-LLS participants) have undergone deep-coverage (30x) whole genome sequencing of their baseline genomic DNA and are currently undergoing somatic variant genotype calling for assessment of CHIP. 
Through the current R01 proposal, we will additionally perform CHIP genotyping and detection by targeted hematopoiesis gene sequencing in the remaining 6,400 WHI-LLS samples (at baseline) and the full set of 7,800 WHI-LLS participants using peripheral blood genomic DNA extracted at the LLS exam. In Aim 1, we will estimate associations between prevalent CHIP (at baseline), incidence or progression of CHIP (between baseline and LLS), and putative socio-demographic, cardiometabolic, behavioral, pharmacologic, environmental, genetic, and aging-related risk factors for CHIP. In Aim 2, we will estimate CHIP-outcome associations using the LLS cohort and TOPMed baseline CHIP data (total N=17,000) with incident clinical cardiovascular, hematologic, neurocognitive, and mortality outcomes. In Aim 3, informed by results from Aims 1 and 2, we will use Mendelian randomization approaches, mediation analyses, and polygenic risk scores to assess causal mediation of exposure-outcome associations by CHIP and the mechanisms by which heritable germline variants contribute to CHIP.

  • PI Kari North

    Leveraging Multi-omics Approaches to Examine Metabolic Challenges of Obesity in Relation to Cardiovascular Diseases
    MPI: Christy Avery, Penny Gordon-Larsen, Kari North and Susan Sumner
    Project Number: R01-HL143885

    Cardiovascular diseases (CVD) remain leading causes of morbidity, mortality, and early disability, and are exacerbated by obesity. It is well known that obesity stresses metabolic pathways, thereby accelerating CVD risk. Yet, the specific biologic mechanisms remain poorly understood. Metabolites are biologically active small-molecule intermediates and byproducts of metabolism that lie along pathways linking genetic susceptibility with CVD and are responsive to obesity, related health behaviors, and CVD risk factors. Thus, metabolites can be powerful disease biomarkers and therapeutic targets and may provide targetable “mechanistic bridges” linking genome-wide association study (GWAS) findings with CVD risk factors and clinical disease. We hypothesize that: (1) genetic susceptibility influences CVD risk along specific metabolic pathways; (2) that metabolites on these pathways (i) affect and (ii) are affected by CVD risk factors to (3) increase clinical disease risk; and that (4) obesity modifies a subset of metabolite effects. Yet, the majority of metabolomics studies to date have been largely cross-sectional or clinical efforts in older, European-ancestry populations, with inconsistent control of confounders, including diet, and they have ignored plausible modifiers, including obesity. To address these major research gaps, we will generate longitudinal untargeted and targeted metabolomics profiles in the biracial (47% African American) CARDIA study (n=5,115; 18-30 years in 1985-86; n~3,270 in 2020-21). The CARDIA study is uniquely suited to test the proposed study hypotheses, with 35 years of longitudinal data collected over the key young adult lifecycle period when CVD risk accelerates in concert with increasing obesity. We will develop and employ cutting-edge metabolomics and statistical methods to characterize known and unknown metabolite signals. 
Longitudinal data, Mendelian randomization, and pathway-based modeling enable assessment of (i) metabolic perturbations that influence CVD and (ii) CVD risk factors that influence metabolic perturbations, both overall and in the context of a growing obesity burden. We address the following specific aims: 1) identify metabolites and major metabolic pathways that influence metabolic CVD risk factors (cholesterol, blood pressure, and glycemic phenotypes); 2) identify metabolic CVD risk factors that influence metabolites and major metabolic pathways; 3) leverage statistical innovations and existing 'omics, phenotype, and covariate data for causal inference, to evaluate mechanistic frameworks, and characterize novel metabolites; and 4) test metabolites identified in the CARDIA study for evidence of association with CVD risk factors and clinical endpoints (coronary heart disease, heart failure, and stroke) in the biracial Atherosclerosis Risk in Communities (ARIC) study. We anticipate that the proposed project, prepared by a multi-disciplinary team with expertise in CVD and metabolic epidemiology, nutritional biochemistry, metabolomics, bioinformatics, biostatistics, and genetics, will inform disease mechanisms, with strong potential for identifying biomarkers of CVD risk. Together, our innovations will help identify novel therapeutic and nutritional targets to reduce the global burden of CVD.

    Leveraging Ancestral Diversity to Map Adiposity Loci in Hispanics
    Project Number: R01-DK101855

    Obesity is a leading risk factor for metabolic and cardiovascular diseases and its prevalence has more than doubled since the 1980s, with the greatest burden carried by minority populations. Large-scale genome-wide association studies (GWAS) have identified >70 genetic loci that are unequivocally associated with obesity-related traits, primarily in European-descent populations. So far, no large-scale GWAS for any obesity-related traits have been performed in Hispanic/Latino (HL) populations, despite their increased prevalence of obesity. Although classified under one ‘ethnic label’, HL populations are incredibly diverse and genetically highly admixed with recent origins from Europe, Africa and the Americas. Hence, genome-wide association will necessitate a large collaborative effort and the use of advanced statistical methods (that go far beyond standard GWAS analyses) to account for and leverage their high degree of genetic diversity. Here, we propose to perform the first large-scale genomic study in search of obesity-susceptibility loci in HL populations. For aim 1, we have assembled the world’s GWAS studies in HL populations, including >50,000 HL men and women with high-density SNP array data. Genome-wide imputation to multiethnic reference panels from the 1000 Genomes Project and other unique Amerindian resources will allow for comprehensive testing of common and low frequency variation present in HL populations as well as provide a rich resource for addressing genetic risk heterogeneity at obesity-related loci across HL sub-populations. To elucidate racial/ethnic transferability and fine-map association signals in aim 2, we will leverage data from large-scale GWAS of obesity-related traits in AA and EA populations that are available to us through our work with AA (n>50,000) and EA (GIANT consortium, n>200,000) consortia. 
In aim 3, we will employ functional analyses in Drosophila and bioinformatic data-mining tools to identify and characterize the target genes and functional alleles, and link associations with biological pathways. We are uniquely positioned and experienced to establish a large-scale collaboration to study the genomics of obesity in HLs. Our proposal is also unique and innovative for taking a GWAS study to the next translational stage, with an experimental research aim for further characterization of obesity specific genetic effects. Our study may improve the understanding of the genomic etiology of obesity, knowledge which may be used to reduce the burden of disease in underserved and understudied minority populations.

    Hispanic Latino Lipid Consortium
    Project Number: R01-HL142302

    An estimated 53% of U.S. adults have dyslipidemia, putting a majority of the U.S. adult population at high risk for related chronic diseases such as cardiovascular diseases, non-alcoholic fatty liver disease, and gallbladder disease. US Hispanic/Latinos (H/L) ages 18–74 have an overall prevalence of dyslipidemia of 65%, among the highest reported in the US. Lipid traits are highly heritable; estimates range from 20 to 70%, with common genetic variants explaining ~30% of the variance for these traits in Europeans. As serum concentrations of lipids are established therapeutic targets for many lipid-related chronic diseases, researchers have invested considerable effort into understanding the genetic epidemiology of lipid traits; however, these large-scale efforts have almost exclusively considered Caucasians. Understudied at-risk populations provide a powerful design to gain insight into genetic mechanisms for disease because they can exhibit finer haplotypic structure and have different underlying causal variants. To ensure ancestrally diverse populations are not the last to benefit from the new era of precision medicine, we must both increase representation of ancestrally diverse populations in genetic research and develop expedited strategies for translating genomics for clinical utility. First, to enrich discovery, we will conduct the first large-scale GWAS and rare variant analyses for lipid-related traits in H/L. We will meta-analyze, fine-map, perform multivariate associations, and validate effects in all available H/L samples in analyses that will include >50,000 samples. Second, to interpret function, we will move GWAS findings into an interpretable biological context and characterize the regulatory mechanisms involved in lipid regulation via tissue-specific functional analysis, and ancestry-specific validation of effects using RNAseq data in two independent H/L cohorts. 
Identification of genes and pathways associated with lipid levels elucidates important basic biology about human metabolism, but is not necessarily clinically translatable. Thus, to evaluate the clinical significance of lipid-associated genetic risk factors, we will use multiple massive genetic and electronic medical record repositories (including the Multiethnic Cohort, BioME, and BioVU) to identify clinical outcomes associated with single variants and genetically regulated expression of lipid-associated genes in H/L phenome-wide. Our design focuses effort on discovery of new variants and loci by pioneering genetic studies of lipid-related traits in diverse H/L populations, functional interpretation of variant effects via gene-based annotation and expression prediction with robust validation, and characterizing the clinical outcomes predicted by lipid-associated genetics in three large DNA bio-banks with linked electronic medical records. These population-specific, function- and outcome-oriented approaches will advance understanding of the genetic etiology of lipids and related traits with high H/L disparities of risk, revealing new biologic pathways and providing new avenues for precision treatment for H/L, a population that will constitute ~35% of the US population by the year 2050.

    Genetic Epidemiology of Causal Variants Across the Life Course Phase II (Calico II)
    Project Number: U01-HG007416

    As part of the Population Architecture Using Genomics in Epidemiology (PAGE, 2007-), this grant seeks to expand understanding of how ancestry-specific differences in allele frequencies and LD may explain differences in risks of common traits and conditions. Recent studies have identified rare genetic variants that are likely to contribute to common diseases and traits and observed that rare variants likely to be functional, such as those in coding and regulatory regions, tend to be population-specific. PAGE II has genotyped over 50,000 samples using MEGA, an Illumina high density custom exomechip array. The MEGA data is being imputed in PAGE to the 1000 Genomes panel. PAGE also sequenced 1,000 samples representative of 21 populations from the Americas. PAGE has harmonized phenotype data for ~300 trait variables. These datasets will be analyzed to continue emphasis on characterizing population-level disease risks in non-European-descent individuals. Cohorts in PAGE II are: CALiCo (Causal Variants Across the Life Course, a consortium of ARIC, CARDIA, HCHS/SOL, Strong Heart Studies), ISMMS (Mount Sinai BioMe Biobank), MEC (Multiethnic Cohort), WHI (Women’s Health Initiative), and Stanford University (PAGE Global Reference Panel). Genotyping services were provided by the Center for Inherited Disease Research (CIDR) and sequence data were provided by the McDonnell Genome Institute at Washington University School of Medicine.

    Exome Variants Underlying Weight Gain from Adolescence to Adulthood
    Project Number: R01-HD057194

    The objective of the proposed research is to investigate how genetic variation influences weight-related traits during the transition from adolescence to adulthood – a critical risk period for weight gain. Genome-wide association studies (GWAS) have identified >70 well-replicated loci influencing weight-related traits, some of which vary by race/ethnicity. Few studies have examined the genetic architecture of these traits during this critical period; the discovered loci are largely common variants that explain only a fraction of the estimated trait heritability. Fine-mapping studies suggest allelic heterogeneity; many causal variants remain to be determined. Recent attention has shifted to coding variants, some of which may have larger effect sizes and the potential to explain more trait heritability. We build on our successes in R01 HD057194 and capitalize on nationally representative, ethnically diverse, prospective and well-characterized data on 10,581 individuals from the National Longitudinal Study of Adolescent Health (Add Health) to assess the association between weight-related traits and coding variants across a 15-year lifecycle period of dramatic weight gain between adolescence and adulthood. In addition, to make full use of this excellent resource, we combine our data with extant exome data from the Cohorts for Heart and Aging Research in Genomic Epidemiology (CHARGE) consortium (n>91,000) to further assess associations with adiposity phenotypes, an approach that will be particularly informative and powerful for the discovery of novel coding variants. 
Further, to fully ensure that we capitalize on the uniqueness of our longitudinal data on adolescent to adult weight gain, we combine our data with two well-characterized, age-matched cohorts with exome data (China Health and Nutrition Survey, CHNS, n=1,951; Cebu Longitudinal Health and Nutrition Survey, CLHNS, n=1,691) living under different environmental conditions but experiencing high levels of weight gain analogous to Add Health. Using all three datasets, we will determine the genetic and epidemiological architecture of causal variants; identify functional SNPs and genes; and, using advanced and innovative statistical modeling, examine differential genetic effects by age, time, and under varying environmental circumstances to downstream cardiometabolic risk factors (diabetes-, blood pressure-, and inflammatory-related markers). We will test novel hypotheses on tempo and timing of risk as well as address each piece of the complex system linking genetic markers, weight-related outcomes, and cardiometabolic risk factors, in the context of a variety of environmental and behavioral confounders. In sum, these data provide outstanding resources for examining low frequency coding variants associated with weight-related and cardiometabolic traits – a rapidly emerging area of science. Our longitudinal and complex analyses in this understudied age range will provide critical information about risk in the transition from adolescence into adulthood, a period of rapid weight gain when precursors of adult disease are developing. Our work will shed light on the progression of risk to inform efforts to mitigate early development of disease risk.

  • PI Kimon Divaris

    Genome-wide Association Study of Early Childhood Caries
    Project Number: U01-DE025046

    Early childhood caries (ECC) is the most common chronic disease of childhood, and one that is characterized by marked social, economic, and racial disparities. The prevalence of ECC in the US increased by 15% during the last 2 decades, and according to the most recent national data 28% of children ages 2-5 are affected. Importantly, management of ECC among young children often requires complex and costly restorative procedures with the use of advanced behavior management techniques including sedation and general anesthesia. ECC can have severe sequelae for children’s general health and well-being, and confers important social and economic impacts on their families, communities, and the health system. Caries is a multifactorial disease with a substantial genetic component, estimated at 40-60%, but little is known regarding specific contributing genetic factors. A large-scale genome-wide association study (GWAS) involving a multi-cohort meta-analysis recently nominated 29 loci as associated with dental caries among adult, predominantly European-American (EA), populations. So far, the only GWAS examining caries in the primary dentition employed a sample of 1,300 EA children ages 3-12 and nominated 7 genes, 2 of which showed additional evidence of association in follow-up studies among children and adults. Here, we propose to conduct a large-scale GWAS of ECC involving a multi-ethnic community-based sample of 6,000 children ages 3 and 4. For Aim 1, we will undertake a comprehensive clinical dental characterization of a state-representative sample of approximately 6,000 children enrolled in Early Head Start and Head Start programs in North Carolina. We will use a tested clinical examination protocol including saliva sample collection for DNA extraction, and will use the latest International Caries Detection and Assessment System (ICDAS) visual diagnostic criteria to determine disease prevalence and severity. 
We will collect and store dental plaque samples for future microbiome analyses that will be funded separately. To identify genetic variants that are associated with ECC, in Aim 2, we will conduct a trans-ethnic GWAS of ECC and related traits, utilizing high-density genotyping, imputation to 1000 Genomes Project reference panels and advanced statistical approaches to leverage differences in genetic structure between racial/ethnic groups.
    In Aim 3, we will utilize publicly available GWAS data of early childhood and adult caries to determine the racial/ethnic and age-group generalization/transferability of loci, genes, and gene sets/pathways identified in our study. Our group’s experience in conducting dental epidemiologic studies among young children, including an ongoing collaboration with EHS/HS, and our expertise in large-scale, multi-ethnic GWAS create a unique opportunity to carry out this important investigation and advance the knowledge base of genomics of dental caries. The study will improve our understanding of ECC’s epidemiology and genomic etiology, key knowledge to reduce the burden of disease in populations of young children, including those underrepresented in research.

    Pediatric HIV/AIDS Cohort Study (PHACS) Data and Operations Center
    PI: G Seage; UNC Subcontract PI: Kimon Divaris
    Project Number: U01-HD052102

    The Pediatric HIV/AIDS Cohort Study (PHACS) was created in 2005 to evaluate the clinical course of perinatally acquired HIV infection among adolescents and pre-adolescents and the consequences of fetal and neonatal exposure to HIV and antiretroviral chemotherapy among a representative cohort of children in the United States. A cohort of 450 perinatally infected adolescents and preadolescents (Adolescent Master Protocol, AMP, age 7-16 at enrollment) was established to evaluate the impact of HIV and ART on sexual maturation, pubertal development, and socialization; and a drug toxicity surveillance system, the Surveillance Monitoring for Anti-Retroviral Toxicities Study (SMARTT), enrolled 1,934 perinatally HIV exposed uninfected children to evaluate long-term effects of in-utero ART exposure. PHACS is composed of a Scientific Leadership Group (SLG), which is overseen by a Coordinating Center, a Data and Operations Center (DOC), and 24 clinical sites. The Department of Epidemiology and the Center for Biostatistics in AIDS Research (CBAR) at the Harvard School of Public Health, Westat, and the Frontier Science Foundation collaborate to form the PHACS DOC. The DOC collaborates with the SLG to define the PHACS research agenda; provides methodological support for the development of all PHACS analytic projects; merges data from pre-existing databases from previous cohorts (PACTG 219/219C, WITS, Legacy); maintains clinical site subcontracts and trains and monitors sites in proper procedures for PHACS research; plans and conducts all leadership and full PHACS network meetings; and supports an active CAB. In PHACS II, the DOC will continue the duties described above while refining its practices, as well as follow and enroll an additional 1,200-1,500 children into SMARTT. 
Together, HSPH, Westat and Frontier Science bring long histories of providing the type of methodologic and operational support required by PHACS, as well as innovative methods to enhance and maximize the efficiency of PHACS study design, conduct, and analysis. Given our prior and current professional experience, we are uniquely positioned to provide the scientific/epidemiologic and operational leadership to successfully conduct PHACS.

    Who Does What in the Oral Biofilm: the Metagenome and Metatranscriptome of Early Childhood Oral Health and Disease
    Project Number: 550KR111505

  • PI Kristin Young

    Refining Obesity Phenotypes Using Metabolomics in the Atherosclerosis Risk in Communities (ARIC) Study
    Project Number: R21-HL140419

    Over the past 30 years, the US prevalence of adult obesity has more than doubled, resulting in ~96 million obese Americans in 2016. US minority populations shoulder the majority of the obesity burden, with African Americans (AA) having the highest age-adjusted prevalence of overall and central obesity. These disparities are accompanied by rises in obesity-related morbidity, mortality, and health care expenditures, most notably from cardiovascular diseases (CVD). However, not all obese individuals have the same risk for adverse health outcomes. Central obesity is more metabolically active and contributes disproportionately to poor health compared to overall obesity. This implies that the distribution of body fat may have distinct health consequences and distinct underpinnings, including genetic predisposition. While genome-wide association studies (GWAS) have identified over 500 genomic regions associated with obesity phenotypes, few genes have been functionally validated, making it difficult to move GWAS findings into the clinic to improve patient health. Precisely measuring obesity and fat distribution may help move the field of obesity genomics forward. As obesity can result from dysregulation of energy balance, metabolites are a logical means to refine phenotypic definitions of obesity and to narrow in on the most likely causal genes underlying GWAS signals. 
The proposed study will leverage existing genetic, phenotypic, and metabolomic data from both European American and African American participants in the Atherosclerosis Risk in Communities (ARIC) Study in order to: 1) evaluate the association between metabolomic profiles and overall obesity and central obesity in ARIC to identify unique metabolomic profiles (metabotypes) that differ by body mass and body fat distribution patterns; and 2) examine genetic effects on obesity-associated metabolites (mGWAS) in ARIC to identify unique genetic underpinnings of obesity-associated metabolites and metabolic profiles, with replication and validation of both aims in the Multi-Ethnic Study of Atherosclerosis (MESA). Systematically evaluating the influences of obesity distribution and metabolomic profiles could provide important, but largely unexplored, insights into the pathogenesis of obesity and inform prevention strategies and treatment guidelines. Integrating metabolomics into GWAS will also provide the opportunity to identify the best candidate genes around GWAS signals for functional follow-up in future laboratory experiments, giving potentially actionable biological relevance to the hundreds of genetic signals for obesity.