Xuexia (Helen) Wang
Xuexia (Helen) Wang, PhD
Assistant Professor of Biostatistics
  • University of North Texas
    Office: GAB 459, 1155 Union Circle #311430
    Denton, TX 76203-5017, USA
    Phone: (940) 369-8307
    Email: xuexia.wang@unt.edu


  • PhD, Statistics/Mathematical Sciences (in Statistical Genetics), Dec. 2008, Michigan Technological University, Houghton, MI. Dissertation: “Genetic Association Studies Considering LD Information and Genome-Wide Application.” Advisor: Professor Shuanglin Zhang
  • MS, Statistics/Mathematical Sciences (in Statistical Genetics), May 2007, Michigan Technological University, Houghton, MI. Thesis: “Genome-Wide Association Tests by Two-Stage Approaches with Unified Analysis of Families and Unrelated Individuals.” Advisor: Professor Shuanglin Zhang
  • PhD, Quantitative Economics, July 2004, Capital University of Economics and Business, Beijing, China. Dissertation: “Studies on the Mechanism of Double Layered Principal-agent in Venture Capital.” Advisor: Professor Xianglan Jin
  • MS, Quantitative Economics, April 1996, Dongbei University of Finance and Economics, Dalian, China
  • BS, Math Education, July 1993, Shandong Normal University, Jinan, China



Dr. Xuexia Wang joined the biostatistics faculty in August 2011, after working for one year as an assistant research professor at the City of Hope. Her main research area is statistical genetics. She is interested in developing statistical methods and computational tools to identify genetic variants that influence susceptibility to complex diseases such as breast cancer, colon/rectum cancer, lung cancer, and prostate cancer.
She has proposed three two-stage approaches to address the multiple-testing problem in nuclear family data, as well as a new method for testing multiple-marker association in case-control data. Local ancestry at a test SNP may confound the association signal, and ignoring it can lead to spurious association. She was the first to demonstrate theoretically that adjusting for local ancestry at the test SNP is sufficient to remove the spurious association regardless of the mechanism of population stratification. She also developed two novel, powerful association tests that adjust for local ancestry.
Her current research involves the analysis of high-throughput genetic data generated from genome-wide association and next-generation sequencing studies. In particular, she is interested in population-based and family-based genetic association studies for rare and common variants, gene-set and pathway analysis, gene-gene and gene-environment interactions, admixed populations, and the genetics of gene expression.

In addition to methods development, Dr. Wang is interested in collaborating with researchers seeking to identify complex disease susceptibility genes. Her collaborative research includes studies of genetic susceptibility to breast cancer, colon/rectum cancer, lung cancer, prostate cancer, lymphoma, cardiovascular disease, childhood obesity, type 1 diabetes, type 2 diabetes, and autism, as well as to therapy-related cardiac dysfunction and avascular necrosis in childhood cancer survivors and to secondary malignancies after hematopoietic cell transplantation.

Research Interest

My broad research agenda is to understand how genetic variations contribute to the etiology of complex diseases. One of my primary objectives is to develop statistical methods and computational tools for identifying and characterizing genetic variants that influence susceptibility to complex diseases. In addition to the methodology development work, I work closely with my collaborators as a biostatistician to search for disease susceptibility factors of therapy-related cardiac dysfunction, secondary cancer, autism, and other diseases.

1. Novel statistical methods for genetic association studies

Identifying associations between genetic variants and disease phenotypes not only promotes a better understanding of disease etiology but also informs disease risk prediction, which can guide primary and secondary prevention strategies. Association tests are used to detect these associations. I have developed a series of statistical tests to detect genetic variants that influence susceptibility to complex diseases in populations, especially admixed populations such as African Americans and Hispanic Americans.
1A. Novel statistical methods to address population stratification
In admixed populations, the proportion of admixture may vary across individuals. This variation can lead to associations of the disease with loci (i.e., gene positions) unlinked to the disease locus, a phenomenon well known as ‘population stratification.’ If not appropriately controlled, it can produce both false-positive and false-negative association signals.
I was the first to propose and theoretically prove that it is important to account for differences in local ancestry when detecting genetic variants in admixed populations. I also developed two novel association approaches that adjust for local ancestry at the tested genetic region in order to remove spurious associations due to the confounding effect of local ancestry. The importance of this contribution (Wang et al. Bioinformatics 2011) to the field of statistical genetics is demonstrated by its extensive citations (cited by 32).
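The confounding effect described above can be illustrated with a small simulation. The following is only a sketch under stated assumptions, not the tests from Wang et al. (Bioinformatics 2011): the function names, the allele frequencies, the effect size of 0.5, and the use of ordinary least squares are all hypothetical choices made for illustration. A SNP with no causal effect appears associated with the trait until local ancestry is included as a covariate.

```python
import numpy as np

def simulate_admixed(n=5000, seed=0):
    """Simulate a non-causal SNP whose allele frequency depends on local ancestry.

    Hypothetical setup: local ancestry (0, 1, or 2 copies from one ancestral
    population) shifts both the genotype distribution and the trait mean,
    while the SNP itself has no effect on the trait.
    """
    rng = np.random.default_rng(seed)
    ancestry = rng.binomial(2, 0.5, size=n)        # local ancestry copies
    freq = 0.1 + 0.3 * ancestry                    # allele frequency 0.1, 0.4, or 0.7
    genotype = rng.binomial(2, freq)               # SNP genotype coded 0/1/2
    trait = 0.5 * ancestry + rng.normal(size=n)    # trait depends on ancestry only
    return genotype, ancestry, trait

def snp_effect(trait, genotype, covariate=None):
    """Least-squares estimate of the SNP coefficient, optionally adjusted."""
    columns = [np.ones(len(trait)), genotype.astype(float)]
    if covariate is not None:
        columns.append(covariate.astype(float))
    design = np.column_stack(columns)
    beta, *_ = np.linalg.lstsq(design, trait, rcond=None)
    return beta[1]

genotype, ancestry, trait = simulate_admixed()
unadjusted = snp_effect(trait, genotype)                      # spurious signal
adjusted = snp_effect(trait, genotype, covariate=ancestry)    # close to zero
print(f"unadjusted: {unadjusted:.3f}, adjusted: {adjusted:.3f}")
```

In this toy setting the unadjusted estimate reflects ancestry rather than the SNP, and the adjusted estimate shrinks toward zero.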
Population stratification can be even more problematic in rare variant association analyses because rare variants often demonstrate stronger and potentially different patterns of stratification than common variants. To correct for population stratification in genetic association studies, I proposed a novel method to Test the effect of an Optimally Weighted combination of variants in Admixed populations (TOWA), in which the analytically derived optimal weights can be calculated from existing phenotype and genotype data. TOWA up-weights rare variants and variants that have strong associations with the phenotype. Additionally, it can adjust for the direction of the association and allows for differences in local ancestry among study subjects. This contribution was published in the prestigious journal Genetic Epidemiology (Wang et al. Genetic Epidemiology, 2015; impact factor: 2.951).
1B. Novel statistical methods to test multiple genetic variants
There is strong evidence that several mutations within a single gene can interact to have a large effect on the observed phenotype, which emphasizes the importance of analyzing multiple single nucleotide polymorphisms (SNPs) that jointly represent variation within common transcripts and other functional regions, such as promoters. With the availability of a very large number of SNPs, there has been increasing interest in genetic association tests involving several closely linked loci. Methods for detecting association between disease and multiple genetic variants are being rapidly developed, including Hotelling’s T² test and the linkage disequilibrium (LD) contrast tests. I developed a new association method to test multiple-marker association based on case-control data. The case-control study is a typical design in genetic and epidemiologic studies: disease-susceptibility variants are identified by comparing features of the genetic variants, such as the mean and the variance-covariance matrix, between cases and controls. To detect association between traits and multiple genetic polymorphisms, I proposed a likelihood ratio test (Wang et al. Genetic Epidemiology, 2009). This test improves on existing methods because it can detect differences in both the means and the variance-covariance matrices between cases and controls simultaneously, whereas existing methods detect a difference in either the means (Hotelling’s T² test) or the variance-covariance matrices (LD contrast tests), but not both. Researchers can gain power by using this approach when searching for disease genes.
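For context, the mean-comparison baseline mentioned above, the two-sample Hotelling’s T² test, can be sketched as follows. This is the standard textbook statistic and not the likelihood ratio test of Wang et al. (2009). The function name and the array layout (subjects in rows, markers in columns) are assumptions made for illustration.

```python
import numpy as np

def hotelling_t2(cases, controls):
    """Two-sample Hotelling's T^2 statistic for equality of mean vectors.

    cases, controls: arrays of shape (n_subjects, n_markers).
    Returns (T2, F, (df1, df2)). Under multivariate normality with equal
    covariance matrices, F follows an F(df1, df2) distribution when the
    mean vectors are equal.
    """
    cases = np.asarray(cases, dtype=float)
    controls = np.asarray(controls, dtype=float)
    n1, p = cases.shape
    n2 = controls.shape[0]
    diff = cases.mean(axis=0) - controls.mean(axis=0)
    # Pooled sample covariance; atleast_2d handles the single-marker case.
    s1 = np.atleast_2d(np.cov(cases, rowvar=False))
    s2 = np.atleast_2d(np.cov(controls, rowvar=False))
    pooled = ((n1 - 1) * s1 + (n2 - 1) * s2) / (n1 + n2 - 2)
    t2 = (n1 * n2) / (n1 + n2) * (diff @ np.linalg.solve(pooled, diff))
    df2 = n1 + n2 - p - 1
    f_stat = df2 / (p * (n1 + n2 - 2)) * t2
    return t2, f_stat, (p, df2)
```

Because the statistic depends on the data only through the difference in mean vectors and a pooled covariance, it has little sensitivity to cases and controls that differ in their covariance structure alone, which is the gap a joint test of means and covariances targets.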
1C. Novel statistical methods to test rare variant association in next-generation sequencing data
Next-generation sequencing technology allows sequencing the whole genome of large groups of individuals, and thus makes directly testing rare variants possible. Most existing methods for rare variant association studies are essentially testing the effect of a weighted combination of variants with different weighting schemes. Performance of these methods depends on the weights being used and no optimal weights are available. By putting large weights on rare variants and small weights on common variants, these methods target rare variants only, although increasing evidence shows that complex diseases are caused by both common and rare variants. My collaborators and I analytically derived optimal weights. Based on the optimal weights, we proposed a novel Test for testing the effect of an Optimally Weighted combination of variants (TOW) and Variable Weight Test for testing the effect of an Optimally Weighted combination of variants (VW-TOW). TOW aims to test for the effects of rare variants. VW-TOW aims to test for the effects of both rare and common variants. TOW and VW-TOW are applicable to both quantitative and qualitative traits, allow covariates, can control for population stratification, and are robust to directions of effects of causal variants. These findings were published in Genetic Epidemiology (Sha and Wang et al. Genetic Epidemiology, 2012).
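The general idea of testing a weighted combination of variants can be sketched as below. This is a generic illustration, not TOW or VW-TOW: the frequency-based weights 1/sqrt(q(1-q)) are loosely modelled on earlier weighted-sum tests and are not the analytically derived optimal weights, and the function name, the smoothed frequency estimate, and the permutation scheme are assumptions made for this example.

```python
import numpy as np

def weighted_burden_test(genotypes, trait, n_perm=999, seed=0):
    """Permutation test for a weighted combination of variants.

    genotypes: (n_subjects, n_variants) minor-allele counts coded 0/1/2.
    trait: (n_subjects,) quantitative or 0/1 trait.
    Weights are frequency based, so rarer variants get larger weights.
    Returns (observed statistic, permutation p-value).
    """
    rng = np.random.default_rng(seed)
    genotypes = np.asarray(genotypes, dtype=float)
    trait = np.asarray(trait, dtype=float)
    n = genotypes.shape[0]
    # Smoothed allele-frequency estimate keeps weights finite at monomorphic sites.
    q = (genotypes.sum(axis=0) + 1) / (2 * n + 2)
    weights = 1.0 / np.sqrt(q * (1 - q))
    score = genotypes @ weights                  # one weighted score per subject
    centered_score = score - score.mean()

    def statistic(y):
        # Squared covariance-type statistic, so the test is two-sided.
        return (centered_score @ (y - y.mean())) ** 2

    observed = statistic(trait)
    permuted = np.array([statistic(rng.permutation(trait)) for _ in range(n_perm)])
    p_value = (1 + np.sum(permuted >= observed)) / (n_perm + 1)
    return observed, p_value
```

A fixed weighting scheme like this one targets rare variants and is sensitive to variants whose effects run in opposite directions, which are the limitations that data-derived weights are intended to address.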
2. Genetic association studies for cancer treatment-related adverse outcomes
An estimated 13.7 million cancer survivors are now living in the U.S., representing 4% of the entire population. This number is projected to rise to 18 million in 2022. Anthracyclines are one of the most effective classes of chemotherapeutic agents currently available for cancer patients. However, anthracycline-related cardiomyopathy and radiation/chemotherapy-associated histologically distinct new cancers or subsequent malignant neoplasms (SMNs) are two of the most serious treatment-related adverse events experienced by cancer survivors. The cumulative incidence of second cancer approaches 15% at 20 years after diagnosis of primary cancer, representing a 10-fold increased risk for cancer survivors, compared to the general population. In addition to the development of statistical methods in genetic association studies, my research focuses on investigating the genetic susceptibility to cancer treatment-related adverse outcomes.
2A. Test genetic susceptibility to anthracycline-related cardiomyopathy
Anthracyclines are among the most effective chemotherapeutic agents currently available for cancer patients. Their therapeutic potential, however, is limited by a strong dose-dependent relation with progressive and irreversible cardiomyopathy leading to congestive heart failure. Inter-individual variability in cardiomyopathy risk is observed: cumulative anthracycline exposure as low as 150 mg/m² results in cardiomyopathy in some patients, while exposure as high as 1000 mg/m² is tolerated without cardiomyopathy by others. To investigate the reasons for this variability by identifying SNPs that might modify the association between anthracycline exposure and risk of cardiotoxicity, I conducted a large-scale candidate gene (~2000 genes) study and identified a significant gene-environment (anthracycline) interaction at the gene HAS3, known to increase the risk of cardiovascular disease. This contribution has been published in the high-impact Journal of Clinical Oncology (Wang et al., 2014; impact factor: 18; cited 25 times to date). In addition, in a genome-wide association study, I identified a significant gene-environment (anthracycline) interaction at the gene CELF4 associated with anthracycline-related cardiomyopathy. This contribution is currently in press at the prestigious Journal of Clinical Oncology (Wang et al., 2015).
2B. Test genetic susceptibility to second brain cancer
Chemotherapy and radiation are cornerstones of treatment for most pediatric cancers. However, second brain tumors can develop after cranial radiation given for histologically distinct brain tumors or for management of central nervous system diseases. The risk of second brain tumors increases with increasing cumulative exposure to chemotherapy and radiation. However, for any given dose of radiation exposure, there is significant inter-individual variability in the risk of second brain tumors, suggesting a role for an interaction between genetic susceptibility and therapeutic exposures in their development. The pathways that lead to the development of second brain tumors are complex and likely to involve the actions and interactions of a large number of genes. However, few studies investigate multiple variants across several genes, partly due to the lack of appropriate statistical methods and of detailed information on therapeutic exposures. In my Matthew Larson Foundation funded project (PI: Wang), I proposed novel gene- and/or pathway-based statistical methods. Using a matched case-control study design, my collaborators and I have enrolled childhood cancer survivors with second brain tumors (84 cases) and those without second brain tumors (231 controls) using the infrastructure offered by the Children’s Oncology Group (COG). I am comprehensively investigating gene-therapy interaction using the novel statistical methods and the high-quality, deeply genotyped data, along with careful measurement of chemotherapy and radiation exposures. Findings from this study may lead to clinical justification for altering therapies for those identified to be at high risk of second brain tumors, or instituting aggressive screening for those who have already received the necessary therapeutic exposures and are identified to be at high risk.
2C. Test genetic susceptibility to cancer treatment related subsequent malignant neoplasms
Cancer survivors are at a high risk of treatment-related adverse events, with the cumulative incidence exceeding 40%. One of the most serious treatment-related adverse events is histologically distinct new cancers or subsequent malignant neoplasms (SMNs), which are also called secondary cancers. The high burden of morbidity and mortality carried by these cancer survivors creates an obligation to understand the etiology of SMNs in order to develop targeted prevention and/or intervention strategies for those identified to be at high risk. My newly awarded UWM Research Growth Initiative grant (PI: Wang) is designed to detect gene-environment (GxE) interactions for SMNs in cancer survivors with deep identified exome data, using newly proposed statistical methods. The importance of findings from this study lies not only in a better understanding of the complex interplay of genetic and environmental risk factors in complex biological pathways relevant to SMNs but also in their ability to influence risk prediction that can guide primary and secondary prevention strategies. Prospective identification of newly diagnosed cancer patients who are at high risk of developing SMNs could create opportunities to individualize therapy, maximizing therapeutic benefit and minimizing SMN risk by using alternative therapies.
Future research work
Pleiotropy, the effect of one variant on multiple traits, is a widespread phenomenon in complex diseases. Joint analysis of multiple traits, such as systolic and diastolic blood pressures evaluated in hypertension, can increase statistical power to detect disease-susceptibility genetic variants. However, testing rare variants for multiple traits in admixed populations such as African Americans and Hispanic Americans is challenging due to the extreme rarity of individual variants, allelic heterogeneity, and confounding by population stratification. Population stratification can be even more problematic in rare variant association analyses for multiple traits because rare variants often demonstrate stronger and potentially different patterns of stratification than common variants. To correct for population stratification for multiple traits in genetic association studies, I plan to propose two novel methods to test the gene- or pathway-based effect of genetic variants in admixed populations. In the first method, using a generalized linear model, I will treat the combination of local-ancestry-weighted variants as the response variable and the traits as predictors. In the second method, also using a generalized linear model, I will treat the ancestry-based weighted dosage score as the response variable and the traits as predictors. To test association between the multiple traits and the genomic region, I will develop a score test for the first method and a bootstrap confidence interval test for the second. I will use simulated next-generation sequencing data to show that the proposed tests have controlled type I error rates, whereas naïve application of existing rare variant tests for multiple traits leads to inflated type I error rates. I will evaluate the power of the novel methods in simulation studies and real data analysis.
Using the proposed novel methods, I will test genetic susceptibility for multiple traits of anthracycline-related cardiomyopathy. Findings from this study will lead to a clinical justification for altering therapies for those identified to be at high risk, or instituting aggressive screening for those who have already received the necessary therapeutic exposures and are identified to be at high risk.
My long-term research goal is to assemble and lead a fully funded team of graduate students and postdocs that will substantially contribute to our understanding of the genetic variations contributing to complex diseases. In my current position, I am fortunate to be able to support my work on the development of novel statistical methods for the analysis of whole-genome, whole-exome, and exome-chip data through several funded projects. These projects will form the core of my research program in the near future as my career transitions from my current level to the next stage. I will continue to work with my collaborators and seek more funding opportunities as principal investigator and co-investigator to support my research on methodological development in statistical genetics and applied epidemiology. In particular, my future research proposals will focus on developing and applying novel statistical approaches that address the analytic needs of geneticists and epidemiologists and on investigating genetic variants that influence susceptibility to primary cancer and its treatment-related adverse outcomes.

Professional Activities:

Work Experience

  • Assistant Professor Aug. 2011 – Present Division of Biostatistics, School of Public Health, University of Wisconsin-Milwaukee (UWM), Milwaukee, WI
  • Assistant Professor (Adjunct) April 2015 – Present School of Medicine, University of Alabama at Birmingham, Birmingham, AL 35233
  • Assistant Professor (Adjunct) Sept. 2013 – Present Doctoral program in the Biomedical and Health Informatics of UWM
  • Assistant Research Professor Sept. 2010 - Aug. 2011 Department of Population Sciences, City of Hope, Duarte, CA
  • Postdoctoral Researcher Jan. 2009 - Sept. 2010 (Mentor: Dr. Mingyao Li) Department of Biostatistics and Epidemiology, University of Pennsylvania (UPenn) School of Medicine, Philadelphia, PA
  • Graduate Instructor and Research Assistant Sept. 2005 - Dec. 2008 Department of Mathematical Sciences, Michigan Technological University, Houghton, MI
  • Visiting Scholar Mar. - Aug. 2005 College of Business Administration, Missouri State University, Springfield, MO
  • Associate Professor Sept. 2004 - Mar. 2005 School of Economics, Capital University of Economics and Business, Beijing, China
  • Assistant Professor Sept. 1998 - Sept. 2004 Information College, Capital University of Economics and Business, Beijing
  • Lecturer April 1996 - Sept. 1998 Information College, Capital University of Economics and Business, Beijing

Selected Honors and Awards

  1. Travel Grant of 2009 Joint Mathematics Meetings American Mathematical Association, Fall 2008
  2. Travel Grant of Graduate School Michigan Technological University, Fall 2006
  3. Graduate Research/ Teaching Assistantship Michigan Technological University, Aug. 2005 - Dec. 2008
  4. Outstanding Research Award of Young Faculty Capital University of Economics and Business, May, 2004

Professional Memberships

  • Member, American Statistical Association, 2004 - Present
  • Member, American Society of Human Genetics, 2009 - Present
  • Member, Sigma Xi, 2012 - Present
  • Member, American Association for the Advancement of Science, 2012 - Present
  • Associate member of the City of Hope Comprehensive Cancer Center, 2010 - 2011

Publications



  1. Wang X, Zhang SL, Li Y, and Sha Q. A powerful approach to test an optimally weighted combination of rare variants in admixed populations. Genetic Epidemiology. 39:294-305, 2015. PMID:25758547
  2. Wang X, Sun CL, Quiñones-Lombraña A, Singh P, et al. CELF4 variant and Anthracycline-related Cardiomyopathy – A COG Study (ALTE03N1). JCO, 2015. (in press)
  3. Wang X, Zhao X, and Zhou J. Testing rare variants for hypertension using family-based tests with different weighting schemes. BMC Proc. 2015 (in press)
  4. Xia S, Kohli M, Meijun D, Dittmar R, Lee A, Nandy D, Yuan T, Guo Y, Wang Y, Tschannen M, Worthey E, Jacob H, See W, Kilari D, Wang X, Hovey R, Huang CC, and Wang L. Plasma genetic and genomic abnormalities predict treatment response and clinical outcome in advanced prostate cancer. Oncotarget, Published online, April 15, 2015
  5. Kalkbrenner A, Windham G, Serre ML, Akita Y, Wang X. et al. Particulate Matter Exposure, Prenatal and Postnatal Windows of Susceptibility, and Autism Spectrum Disorders. Epidemiology. 26(1):30-42, 2015. PMID:25286049
  6. Xia S, Huang CC, Le M, Ditmar R, Du M, Yuan T, Guo Y, Wang Y, Wang X, Tsai S, Suster S, Mackinnon AC and Wang L. Genomic variations in plasma cell free DNA differentiate early stage lung cancers from normal controls. Lung Cancer, 2015. (in press)
  7. Zhu H, Wang Z, Wang X, Sha Q. A novel statistical method for rare variants association studies in general pedigrees. BMC Proc., 2015 (in press)
  8. Wang X, Liu W, Sun CL, et al. Hyaluronan synthase 3 (HAS3) variant and Anthracycline-related Cardiomyopathy – A report from the Children’s Oncology Group. JCO. 32(7):647-53, 2014. PMID:24470002
  9. Wang X*, Oldani MJ, Zhao X, et al. A review of cancer risk prediction models with genetic variants. Cancer Informatics. Suppl. 2 19-28, 2014. PMID:25288876
  10. Zhao X, Sha Q, Zhang S, and Wang X*. Testing optimally weighted combination of variants for Hypertension. BMC Proc. 8(Suppl 1):S59, 2014.
  11. Ahrenhoerster LS, Tate ER, Lakatos PA, Wang X, Laiosa MD. Developmental exposure to 2,3,7,8-tetrachlorodibenzo-p-dioxin attenuates capacity of hematopoietic stem cells to undergo lymphocyte differentiation. Toxicology and Applied Pharmacology, 277(2):172-182, 2014.
  12. Sha Q, Wang X, Wang XL, Zhang SL. Detecting association of rare and common variants by testing an optimally weighted combination of variants. Genetic Epidemiology. 36(6):561-571, 2012. PMID:22714994
  13. Ferguson J, Hinkle C, Mehta N, Bagheri R, DerOhannessian S, Shah R, Wolfe M, Bradfield J, Hakonarson H, Wang X, Master S, Rader D, Li M, Reilly M. Translational studies of lipoprotein-associated phospholipase A2 in inflammation and atherosclerosis. Journal of the American College of Cardiology. 59(8):764-72, 2012.
  14. Wang X, Zhu X, Qin H, Cooper R, Ewens W, Li C, Li M. Adjustment for local ancestry in genetic association analysis of admixed populations. Bioinformatics. 27(5):670-677, 2011.
  15. Cappola T, Matkovich S, Wang W, Booven D, Li M, Wang X, et al. Loss-of-function DNA sequence variant in the CLCNKA chloride channel implicates the cardio-renal axis in interindividual heart failure risk variation. Proceedings of the National Academy of Sciences (PNAS). 108(6):2456-2461, 2011.
  16. Shen H, Bielak L, Ferguson J, Streeten E, Yerges-Armstrong L, Liu J, Post W, O'Connell J, Hixson J, Kardia S, Sun Y, Jhun S, Wang X, et al. Association of the Vitamin D Metabolism gene CYP24A1 with coronary artery calcification. Arteriosclerosis, Thrombosis, and Vascular Biology. 2010; 0: ATVBAHA.110.211805v1.
  17. Wang X, Sha Q, Zhang SL. A new association test to test multiple-marker association. Genetic Epidemiology. 33:164-71, 2009. PMID:18720476
  18. Wang X, Qin H, and Sha Q. Incorporating multiple-marker information to detect risk loci for rheumatoid arthritis. BMC Proc. 3(Suppl 7):s28, 2009. PMID: 20018018
  19. Wang X, Zhang Z, Zhang SL and Sha Q. Genome-wide association tests by two-stage approaches with unified analysis of families and unrelated individuals. BMC Proc. 1(Suppl 1):S140, 2007. PMID:18466484
  20. Wang X*. A study on the principal-agent model of the sides in venture capital with their stock's proportionate. Quantitative & Technical Economics. 2:121-126, 2005.
  21. Wang X*. Analysis and Forecasting on the Stainless Steel Market, Brilliance. 8:20-21, 2004.
  22. Wang X*. Study of optimal revenue models in subsection venture capital. Journal of Capital University of Economics and Business. 3:57-60, 2004.
  23. Wang X*. Car sales unaffected by rising insurance costs. Financial News, July 6th, 6, 2004
  24. Wang X*. Forecast of banks’ information processing trends. Financial News, June 26th, 5, 2004.
  25. Jin XL and Wang X. Analysis of developing state of national economy of Beijing. Economic Management Study. 4:31-35, 2001.
  26. Wang X*. An economics simulation with SWARM: agent-based model and object-oriented programming. Quantitative & Technical Economics. 5:63-64, 2001.
  27. Xie Z, Wang X. The Structural Analysis of Mortgage-backed Securities, Contemporary Legal Science. 1:127-135, 2001.
  28. Wang X*. An analysis of and countermeasures for Beijing's real estate investment. Journal of Capital University of Economics and Business. 4:71-74, 2000.

Manuscripts in Review and in Preparation

  1. Wang Z, Wang X, Sha Q. Joint analysis of multiple traits in rare variant association studies. Annals of Human Genetics, 2015. (in review)
  2. Wang X, Sun CL, Singh S, Bhatia S. Genetic variation as a modifier of association between therapeutic exposure and brain cancer in cancer survivors. 2015. (Ready to submit to the Journal of Clinical Oncology)
  3. Wang X and Qian D. Fast and accurate p-value estimation by fitting a density adaptive distribution, 2015 (in preparation)
  4. Wang X, Zhu X, Li C, Li M. A family-based local ancestry adjustment method in genetic association analysis of admixed populations, 2015 (in preparation)
  5. Wang X, Zhang SL, Li M, Sha Q. Testing an optimally weighted combination of rare variants for quantitative trait in admixed populations, 2015 (in preparation)

Book Chapters

  1. Tihua Jing, Xu Jing, Ling Ning, Wang X et al. (2004) Economic Simulations in SWARM: Agent-based Modeling and Object-oriented Programming (Chapters 4, 8, and 11). Social Science Documentation Publishing House, Beijing.
  2. Tongsan Wang, Shouyi Zhang, Jianguo Qi, Fuqiang Li, Wang X et al. (2001) 21st Century Quantitative Econometrics. China Statistics Press, Beijing.
  3. Daiguang Hu, Hongye Gao, Wang X et al. (2000) Dictionary of Economics. Economic Science Press, Beijing.


