TY - JOUR
T1 - Accurate Local-Ancestry Inference in Exome-Sequenced Admixed Individuals via Off-Target Sequence Reads
AU - Hu, Youna
AU - Willer, Cristen
AU - Zhan, Xiaowei
AU - Kang, Hyun Min
AU - Abecasis, Gonçalo R.
N1 - Funding Information:
We thank two anonymous reviewers for their constructive comments, which helped us to greatly improve the manuscript. This work was made possible by research funding from the National Institutes of Health (particularly grants HG007022, HG006513, and HL102925).
PY - 2013/11/7
Y1 - 2013/11/7
N2 - Estimates of the ancestry of specific chromosomal regions in admixed individuals are useful for studies of human evolutionary history and for genetic association studies. Previously, this ancestry inference relied on high-quality genotypes from genome-wide association study (GWAS) arrays. These high-quality genotypes are not always available when samples are exome sequenced, and exome sequencing is the strategy of choice for many ongoing genetic studies. Here we show that off-target reads generated during exome-sequencing experiments can be combined with on-target reads to accurately estimate the ancestry of each chromosomal segment in an admixed individual. To reconstruct local ancestry, our method SEQMIX models aligned bases directly instead of relying on hard genotype calls. We evaluate the accuracy of our method through simulations and analysis of samples sequenced by the 1000 Genomes Project and the NHLBI Grand Opportunity Exome Sequencing Project. In African Americans, we show that local-ancestry estimates derived by our method are very similar to those derived with Illumina's Omni 2.5M genotyping array and much improved in relation to estimates that use only exome genotypes and ignore off-target sequencing reads. Software implementing this method, SEQMIX, can be applied to analysis of human population history or used for genetic association studies in admixed individuals.
AB - Estimates of the ancestry of specific chromosomal regions in admixed individuals are useful for studies of human evolutionary history and for genetic association studies. Previously, this ancestry inference relied on high-quality genotypes from genome-wide association study (GWAS) arrays. These high-quality genotypes are not always available when samples are exome sequenced, and exome sequencing is the strategy of choice for many ongoing genetic studies. Here we show that off-target reads generated during exome-sequencing experiments can be combined with on-target reads to accurately estimate the ancestry of each chromosomal segment in an admixed individual. To reconstruct local ancestry, our method SEQMIX models aligned bases directly instead of relying on hard genotype calls. We evaluate the accuracy of our method through simulations and analysis of samples sequenced by the 1000 Genomes Project and the NHLBI Grand Opportunity Exome Sequencing Project. In African Americans, we show that local-ancestry estimates derived by our method are very similar to those derived with Illumina's Omni 2.5M genotyping array and much improved in relation to estimates that use only exome genotypes and ignore off-target sequencing reads. Software implementing this method, SEQMIX, can be applied to analysis of human population history or used for genetic association studies in admixed individuals.
UR - http://www.scopus.com/inward/record.url?scp=84890238082&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84890238082&partnerID=8YFLogxK
U2 - 10.1016/j.ajhg.2013.10.008
DO - 10.1016/j.ajhg.2013.10.008
M3 - Article
C2 - 24210252
AN - SCOPUS:84890238082
SN - 0002-9297
VL - 93
SP - 891
EP - 899
JO - American Journal of Human Genetics
JF - American Journal of Human Genetics
IS - 5
ER -