Sequence variations in the public human genome data reflect a bottlenecked population history

Gabor Marth; Greg Schuler; Raymond Yeh; Ruth Davenport; Richa Agarwala; Deanna Church; Sarah Wheelan; Jonathan Baker; Ming Ward; Michael Kholodov; Lon Phan; Eva Czabarka; Janos Murvai; David Cutler; Stephen Wooding; Alan Rogers; Aravinda Chakravarti; Henry C. Harpending; Pui Yan Kwok; Stephen T. Sherry

doi:10.1073/pnas.222673099

Eugene McDermott Center for Human Growth & Development

Research output: Contribution to journal › Article › peer-review

81 Scopus citations

Abstract

Single-nucleotide polymorphisms (SNPs) constitute the great majority of variations in the human genome, and as heritable variable landmarks they are useful markers for disease mapping and resolving population structure. Redundant coverage in overlaps of large-insert genomic clones, sequenced as part of the Human Genome Project, comprises a quarter of the genome, and it is representative in terms of base compositional and functional sequence features. We mined these regions to produce 500,000 high-confidence SNP candidates as a uniform resource for describing nucleotide diversity and its regional variation within the genome. Distributions of marker density observed at different overlap length scales under a model of recombination and population size change show that the history of the population represented by the public genome sequence is one of collapse followed by a recent phase of mild size recovery. The inferred times of collapse and recovery are Upper Paleolithic, in agreement with archaeological evidence of the initial modern human colonization of Europe.

Original language	English (US)
Pages (from-to)	376-381
Number of pages	6
Journal	Proceedings of the National Academy of Sciences of the United States of America
Volume	100
Issue number	1
DOIs	https://doi.org/10.1073/pnas.222673099
State	Published - Jan 7 2003

ASJC Scopus subject areas

General

Cite this

Marth, G., Schuler, G., Yeh, R., Davenport, R., Agarwala, R., Church, D., Wheelan, S., Baker, J., Ward, M., Kholodov, M., Phan, L., Czabarka, E., Murvai, J., Cutler, D., Wooding, S., Rogers, A., Chakravarti, A., Harpending, H. C., Kwok, P. Y., & Sherry, S. T. (2003). Sequence variations in the public human genome data reflect a bottlenecked population history. Proceedings of the National Academy of Sciences of the United States of America, 100(1), 376-381. https://doi.org/10.1073/pnas.222673099

Marth, G, Schuler, G, Yeh, R, Davenport, R, Agarwala, R, Church, D, Wheelan, S, Baker, J, Ward, M, Kholodov, M, Phan, L, Czabarka, E, Murvai, J, Cutler, D, Wooding, S, Rogers, A, Chakravarti, A, Harpending, HC, Kwok, PY & Sherry, ST 2003, 'Sequence variations in the public human genome data reflect a bottlenecked population history', Proceedings of the National Academy of Sciences of the United States of America, vol. 100, no. 1, pp. 376-381. https://doi.org/10.1073/pnas.222673099

@article{c41cd533b3cf491aa3d6374262056b56,

title = "Sequence variations in the public human genome data reflect a bottlenecked population history",

abstract = "Single-nucleotide polymorphisms (SNPs) constitute the great majority of variations in the human genome, and as heritable variable landmarks they are useful markers for disease mapping and resolving population structure. Redundant coverage in overlaps of large-insert genomic clones, sequenced as part of the Human Genome Project, comprises a quarter of the genome, and it is representative in terms of base compositional and functional sequence features. We mined these regions to produce 500,000 high-confidence SNP candidates as a uniform resource for describing nucleotide diversity and its regional variation within the genome. Distributions of marker density observed at different overlap length scales under a model of recombination and population size change show that the history of the population represented by the public genome sequence is one of collapse followed by a recent phase of mild size recovery. The inferred times of collapse and recovery are Upper Paleolithic, in agreement with archaeological evidence of the initial modern human colonization of Europe.",

author = "Gabor Marth and Greg Schuler and Raymond Yeh and Ruth Davenport and Richa Agarwala and Deanna Church and Sarah Wheelan and Jonathan Baker and Ming Ward and Michael Kholodov and Lon Phan and Eva Czabarka and Janos Murvai and David Cutler and Stephen Wooding and Alan Rogers and Aravinda Chakravarti and Harpending, {Henry C.} and Kwok, {Pui Yan} and Sherry, {Stephen T.}",

year = "2003",

month = jan,

day = "7",

doi = "10.1073/pnas.222673099",

language = "English (US)",

volume = "100",

pages = "376--381",

journal = "Proceedings of the National Academy of Sciences of the United States of America",

issn = "0027-8424",

publisher = "National Academy of Sciences",

number = "1",

}

TY - JOUR

T1 - Sequence variations in the public human genome data reflect a bottlenecked population history

AU - Marth, Gabor

AU - Schuler, Greg

AU - Yeh, Raymond

AU - Davenport, Ruth

AU - Agarwala, Richa

AU - Church, Deanna

AU - Wheelan, Sarah

AU - Baker, Jonathan

AU - Ward, Ming

AU - Kholodov, Michael

AU - Phan, Lon

AU - Czabarka, Eva

AU - Murvai, Janos

AU - Cutler, David

AU - Wooding, Stephen

AU - Rogers, Alan

AU - Chakravarti, Aravinda

AU - Harpending, Henry C.

AU - Kwok, Pui Yan

AU - Sherry, Stephen T.

PY - 2003/1/7

Y1 - 2003/1/7

AB - Single-nucleotide polymorphisms (SNPs) constitute the great majority of variations in the human genome, and as heritable variable landmarks they are useful markers for disease mapping and resolving population structure. Redundant coverage in overlaps of large-insert genomic clones, sequenced as part of the Human Genome Project, comprises a quarter of the genome, and it is representative in terms of base compositional and functional sequence features. We mined these regions to produce 500,000 high-confidence SNP candidates as a uniform resource for describing nucleotide diversity and its regional variation within the genome. Distributions of marker density observed at different overlap length scales under a model of recombination and population size change show that the history of the population represented by the public genome sequence is one of collapse followed by a recent phase of mild size recovery. The inferred times of collapse and recovery are Upper Paleolithic, in agreement with archaeological evidence of the initial modern human colonization of Europe.

UR - http://www.scopus.com/inward/record.url?scp=0037422542&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=0037422542&partnerID=8YFLogxK

U2 - 10.1073/pnas.222673099

DO - 10.1073/pnas.222673099

M3 - Article

C2 - 12502794

AN - SCOPUS:0037422542

SN - 0027-8424

VL - 100

SP - 376

EP - 381

JO - Proceedings of the National Academy of Sciences of the United States of America

JF - Proceedings of the National Academy of Sciences of the United States of America

IS - 1

ER -
