Effects of ascertainment bias and marker number on estimations of barley diversity from high-throughput SNP genotype data

M. Moragues, J. Comadran, R. Waugh, I. Milne, A. J. Flavell, Joanne R. Russell (Lead / Corresponding author)

    Research output: Contribution to journal (Article)

    70 Citations (Scopus)

    Abstract

    The capability of molecular markers to provide information of genetic structure is influenced by their number and the way they are chosen. This study evaluates the effects of single nucleotide polymorphism (SNP) number and selection strategy on estimates of germplasm diversity and population structure for different types of barley germplasm, namely cultivar and landrace. One hundred and sixty-nine barley landraces from Syria and Jordan and 171 European barley cultivars were genotyped with 1536 SNPs. Different subsets of 384 and 96 SNPs were selected from the 1536 set, based on their ability to detect diversity in landraces or cultivated barley in addition to corresponding randomly chosen subsets. All SNP sets except the landrace-optimised subsets underestimated the diversity present in the landrace germplasm, and all subsets of SNP gave similar estimates for cultivar germplasm. All marker subsets gave qualitatively similar estimates of the population structure in both germplasm sets, but the 96 SNP sets showed much lower data resolution values than the larger SNP sets. From these data we deduce that pre-selecting markers for their diversity in a germplasm set is very worthwhile in terms of the quality of data obtained. Second, we suggest that a properly chosen 384 SNP subset gives a good combination of power and economy for germplasm characterization, whereas the rather modest gain from using 1536 SNPs does not justify the increased cost and 96 markers give unacceptably low performance. Lastly, we propose a specific 384 SNP subset as a standard genotyping tool for middle-eastern landrace barley.

    Original language: English
    Pages (from-to): 1525-1534
    Number of pages: 10
    Journal: Theoretical and Applied Genetics
    Volume: 120
    Issue number: 8
    DOI: 10.1007/s00122-010-1273-1
    Publication status: Published - May 2010

    Keywords

    • SINGLE-NUCLEOTIDE POLYMORPHISMS
    • LINKAGE DISEQUILIBRIUM
    • GENETIC DIVERSITY
    • POPULATION PARAMETERS
    • HORDEUM-SPONTANEUM
    • WILD BARLEY
    • ASSOCIATION
    • LANDRACES
    • DISCOVERY
    • LOCI

    Cite this

    Moragues, M.; Comadran, J.; Waugh, R.; Milne, I.; Flavell, A. J.; Russell, Joanne R. / Effects of ascertainment bias and marker number on estimations of barley diversity from high-throughput SNP genotype data. In: Theoretical and Applied Genetics. 2010; Vol. 120, No. 8. pp. 1525-1534.
    @article{dd232fda477940a389d3d7e7c9277815,
    title = "Effects of ascertainment bias and marker number on estimations of barley diversity from high-throughput SNP genotype data",
    abstract = "The capability of molecular markers to provide information of genetic structure is influenced by their number and the way they are chosen. This study evaluates the effects of single nucleotide polymorphism (SNP) number and selection strategy on estimates of germplasm diversity and population structure for different types of barley germplasm, namely cultivar and landrace. One hundred and sixty-nine barley landraces from Syria and Jordan and 171 European barley cultivars were genotyped with 1536 SNPs. Different subsets of 384 and 96 SNPs were selected from the 1536 set, based on their ability to detect diversity in landraces or cultivated barley in addition to corresponding randomly chosen subsets. All SNP sets except the landrace-optimised subsets underestimated the diversity present in the landrace germplasm, and all subsets of SNP gave similar estimates for cultivar germplasm. All marker subsets gave qualitatively similar estimates of the population structure in both germplasm sets, but the 96 SNP sets showed much lower data resolution values than the larger SNP sets. From these data we deduce that pre-selecting markers for their diversity in a germplasm set is very worthwhile in terms of the quality of data obtained. Second, we suggest that a properly chosen 384 SNP subset gives a good combination of power and economy for germplasm characterization, whereas the rather modest gain from using 1536 SNPs does not justify the increased cost and 96 markers give unacceptably low performance. Lastly, we propose a specific 384 SNP subset as a standard genotyping tool for middle-eastern landrace barley.",
    keywords = "SINGLE-NUCLEOTIDE POLYMORPHISMS, LINKAGE DISEQUILIBRIUM, GENETIC DIVERSITY, POPULATION PARAMETERS, HORDEUM-SPONTANEUM, WILD BARLEY, ASSOCIATION, LANDRACES, DISCOVERY, LOCI",
    author = "M. Moragues and J. Comadran and R. Waugh and I. Milne and Flavell, {A. J.} and Russell, {Joanne R.}",
    year = "2010",
    month = "5",
    doi = "10.1007/s00122-010-1273-1",
    language = "English",
    volume = "120",
    pages = "1525--1534",
    journal = "Theoretical and Applied Genetics",
    issn = "0040-5752",
    publisher = "Springer Verlag",
    number = "8",

    }

    Effects of ascertainment bias and marker number on estimations of barley diversity from high-throughput SNP genotype data. / Moragues, M.; Comadran, J.; Waugh, R.; Milne, I.; Flavell, A. J.; Russell, Joanne R. (Lead / Corresponding author).

    In: Theoretical and Applied Genetics, Vol. 120, No. 8, 05.2010, p. 1525-1534.

    TY - JOUR

    T1 - Effects of ascertainment bias and marker number on estimations of barley diversity from high-throughput SNP genotype data

    AU - Moragues, M.

    AU - Comadran, J.

    AU - Waugh, R.

    AU - Milne, I.

    AU - Flavell, A. J.

    AU - Russell, Joanne R.

    PY - 2010/5

    Y1 - 2010/5

    N2 - The capability of molecular markers to provide information of genetic structure is influenced by their number and the way they are chosen. This study evaluates the effects of single nucleotide polymorphism (SNP) number and selection strategy on estimates of germplasm diversity and population structure for different types of barley germplasm, namely cultivar and landrace. One hundred and sixty-nine barley landraces from Syria and Jordan and 171 European barley cultivars were genotyped with 1536 SNPs. Different subsets of 384 and 96 SNPs were selected from the 1536 set, based on their ability to detect diversity in landraces or cultivated barley in addition to corresponding randomly chosen subsets. All SNP sets except the landrace-optimised subsets underestimated the diversity present in the landrace germplasm, and all subsets of SNP gave similar estimates for cultivar germplasm. All marker subsets gave qualitatively similar estimates of the population structure in both germplasm sets, but the 96 SNP sets showed much lower data resolution values than the larger SNP sets. From these data we deduce that pre-selecting markers for their diversity in a germplasm set is very worthwhile in terms of the quality of data obtained. Second, we suggest that a properly chosen 384 SNP subset gives a good combination of power and economy for germplasm characterization, whereas the rather modest gain from using 1536 SNPs does not justify the increased cost and 96 markers give unacceptably low performance. Lastly, we propose a specific 384 SNP subset as a standard genotyping tool for middle-eastern landrace barley.

    AB - The capability of molecular markers to provide information of genetic structure is influenced by their number and the way they are chosen. This study evaluates the effects of single nucleotide polymorphism (SNP) number and selection strategy on estimates of germplasm diversity and population structure for different types of barley germplasm, namely cultivar and landrace. One hundred and sixty-nine barley landraces from Syria and Jordan and 171 European barley cultivars were genotyped with 1536 SNPs. Different subsets of 384 and 96 SNPs were selected from the 1536 set, based on their ability to detect diversity in landraces or cultivated barley in addition to corresponding randomly chosen subsets. All SNP sets except the landrace-optimised subsets underestimated the diversity present in the landrace germplasm, and all subsets of SNP gave similar estimates for cultivar germplasm. All marker subsets gave qualitatively similar estimates of the population structure in both germplasm sets, but the 96 SNP sets showed much lower data resolution values than the larger SNP sets. From these data we deduce that pre-selecting markers for their diversity in a germplasm set is very worthwhile in terms of the quality of data obtained. Second, we suggest that a properly chosen 384 SNP subset gives a good combination of power and economy for germplasm characterization, whereas the rather modest gain from using 1536 SNPs does not justify the increased cost and 96 markers give unacceptably low performance. Lastly, we propose a specific 384 SNP subset as a standard genotyping tool for middle-eastern landrace barley.

    KW - SINGLE-NUCLEOTIDE POLYMORPHISMS

    KW - LINKAGE DISEQUILIBRIUM

    KW - GENETIC DIVERSITY

    KW - POPULATION PARAMETERS

    KW - HORDEUM-SPONTANEUM

    KW - WILD BARLEY

    KW - ASSOCIATION

    KW - LANDRACES

    KW - DISCOVERY

    KW - LOCI

    UR - http://www.scopus.com/inward/record.url?scp=77955293622&partnerID=8YFLogxK

    U2 - 10.1007/s00122-010-1273-1

    DO - 10.1007/s00122-010-1273-1

    M3 - Article

    VL - 120

    SP - 1525

    EP - 1534

    JO - Theoretical and Applied Genetics

    JF - Theoretical and Applied Genetics

    SN - 0040-5752

    IS - 8

    ER -