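Evaluation on genotype imputation performance of three porcine 50K SNP chips from chip data to sequencing data

ZENG Haonan, ZHONG Zhanming, XU Zhiting, TENG Jinyan, YUAN Xiaolong, LI Jiaqi, ZHANG Zhe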

Citation: ZENG Haonan, ZHONG Zhanming, XU Zhiting, et al. Evaluation on genotype imputation performance of three porcine 50K SNP chips from chip data to sequencing data[J]. Journal of South China Agricultural University, 2022, 43(4): 10-15. DOI: 10.7671/j.issn.1001-411X.202110032

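Funding: China Agriculture Research System of the Ministry of Finance and the Ministry of Agriculture and Rural Affairs
    About the author:

    ZENG Haonan, master's student, mainly engaged in research on animal genetics and breeding. E-mail: hnzeric@hotmail.com

    Corresponding author:

    ZHANG Zhe, professor, PhD, mainly engaged in research on animal genetics and breeding. E-mail: zhezhang@scau.edu.cn

  • CLC number: S828.2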

  • Abstract:
    Objective

    Porcine 50K SNP (single nucleotide polymorphism) chips are widely used and accepted in pig genomic breeding. Genotype imputation can greatly increase the amount of genotype data without increasing genotyping cost, which facilitates the genetic dissection and genetic evaluation of complex traits. This study aimed to evaluate the performance of imputing genotypes from three porcine SNP chips to sequence data.

    Method

    A total of 48 Duroc pigs genotyped with all three chips served as the target panel, and whole-genome sequencing data from 260 pigs served as the reference panel. Genotype imputation was performed with Beagle 5.1 to compare the imputation performance of the Geneseek 50K, ZhongxinⅠ 50K and Liquid 50K chips from chip data to sequence data.

    Result

    The three chips originally contained 50697, 57466 and 50885 SNPs, respectively. After imputation to sequence level without any quality control, the per-locus imputation accuracies (genotype consistencies) were 0.886, 0.886 and 0.898, respectively. After removing loci with DR2 (dosage R-squared) < 0.95, the imputation accuracies (genotype consistencies) rose to 0.974, 0.976 and 0.969, with 3393066, 3139095 and 3320627 loci remaining, respectively.

    Conclusion

    Imputing genotypes from different chips to sequence data is feasible. Imputation yields high-quality, high-density genotype data, which lays a foundation for subsequent breeding application research.
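
    The DR2 quality-control step reported in the abstract can be sketched roughly as follows. This is a minimal illustration only, assuming the imputed genotypes are stored as VCF text whose INFO column carries a `DR2=` entry; the helper names `parse_dr2` and `filter_by_dr2` and the sample records are hypothetical and are not taken from the paper.

```python
def parse_dr2(info_field):
    """Return the DR2 value from a VCF INFO string, or None if absent."""
    for entry in info_field.split(";"):
        key, _, value = entry.partition("=")
        if key == "DR2":
            # Multi-allelic records can carry comma-separated values;
            # taking the smallest is an assumption of this sketch.
            return min(float(v) for v in value.split(","))
    return None


def filter_by_dr2(vcf_lines, threshold=0.95):
    """Keep header lines and records whose DR2 is at least `threshold`."""
    kept = []
    for line in vcf_lines:
        if line.startswith("#"):
            kept.append(line)
            continue
        info = line.rstrip("\n").split("\t")[7]  # INFO is the 8th VCF column
        dr2 = parse_dr2(info)
        if dr2 is not None and dr2 >= threshold:
            kept.append(line)
    return kept


# Made-up records for illustration only.
example = [
    "##fileformat=VCFv4.2",
    "1\t1000\t.\tA\tG\t.\tPASS\tDR2=0.98;AF=0.31;IMP\tGT\t0|1",
    "1\t2000\t.\tC\tT\t.\tPASS\tDR2=0.62;AF=0.05;IMP\tGT\t0|0",
]
filtered = filter_by_dr2(example)
```

    With the default threshold of 0.95, only the header and the first record survive; lowering `threshold` to 0.8 would correspond to the looser DR2≥0.8 standard listed in Table 2.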

  • Figure 1. Distribution of loci among the three chips

    Figure 2. Distribution of MAF, DR2 and imputation accuracy (genotype correlation)

    Vertical lines mark the 95% confidence interval of each point

    Table 1. Genotype consistency and correlation at overlapping loci between chips

    Chip pair                       Consistency   Correlation
    Geneseek 50K - ZhongxinⅠ 50K    0.999         0.996
    Geneseek 50K - Liquid 50K       0.991         0.985
    ZhongxinⅠ 50K - Liquid 50K      0.991         0.987
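
    Comparing chips at overlapping loci first requires identifying the loci each pair of chips shares. The sketch below shows one way this could be done, assuming each chip's SNPs are keyed by (chromosome, position); the toy coordinates and the helper name `shared_loci` are hypothetical and do not reflect the real chip content.

```python
from itertools import combinations


def shared_loci(chips):
    """Map each pair of chip names to the set of loci present on both."""
    return {
        (a, b): chips[a] & chips[b]
        for a, b in combinations(sorted(chips), 2)
    }


# Toy loci keyed by (chromosome, position); not real chip content.
chips = {
    "Geneseek 50K": {("1", 100), ("1", 250), ("2", 40)},
    "ZhongxinⅠ 50K": {("1", 100), ("2", 40), ("3", 75)},
    "Liquid 50K": {("1", 100), ("3", 75), ("4", 10)},
}
pairs = shared_loci(chips)
common_to_all = set.intersection(*chips.values())
```

    Genotype consistency and correlation would then be computed only over the loci in each pairwise set.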

    Table 2. Imputation accuracy of the three chips from chip data to sequencing data 1)

                                           Quality control standard
    Chip             No quality control    MAF≥0.1          DR2≥0.8          DR2≥0.95
    Geneseek 50K     0.886 (0.828)         0.873 (0.838)    0.938 (0.917)    0.974 (0.966)
    ZhongxinⅠ 50K    0.886 (0.823)         0.876 (0.835)    0.944 (0.918)    0.976 (0.959)
    Liquid 50K       0.898 (0.814)         0.866 (0.825)    0.930 (0.909)    0.969 (0.960)

     1) Values are per-locus genotype consistency (genotype correlation). Genotype consistency is the proportion of genotypes that are completely identical out of the total number of genotypes; genotype correlation is the Pearson correlation coefficient between genotypes after converting them to dosage coding of 0, 1 and 2.
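
    The two accuracy measures defined in the footnote to Table 2 can be illustrated with a short sketch. It assumes that, for a single locus, the true and imputed genotypes of the same animals are available as equal-length lists of 0/1/2 dosages; the function names and the example genotypes are hypothetical, and this is not the authors' code.

```python
from math import sqrt


def genotype_consistency(true_gt, imputed_gt):
    """Proportion of genotypes that are identical between the two lists."""
    if len(true_gt) != len(imputed_gt) or not true_gt:
        raise ValueError("genotype lists must be non-empty and equal length")
    matches = sum(t == i for t, i in zip(true_gt, imputed_gt))
    return matches / len(true_gt)


def genotype_correlation(true_gt, imputed_gt):
    """Pearson correlation between 0/1/2 dosage-coded genotypes."""
    n = len(true_gt)
    if n != len(imputed_gt) or n == 0:
        raise ValueError("genotype lists must be non-empty and equal length")
    mean_t = sum(true_gt) / n
    mean_i = sum(imputed_gt) / n
    cov = sum((t - mean_t) * (i - mean_i) for t, i in zip(true_gt, imputed_gt))
    var_t = sum((t - mean_t) ** 2 for t in true_gt)
    var_i = sum((i - mean_i) ** 2 for i in imputed_gt)
    if var_t == 0 or var_i == 0:
        return float("nan")  # undefined when either side has no variation
    return cov / sqrt(var_t * var_i)


# Made-up genotypes for six animals at one locus.
true_gt = [0, 1, 2, 1, 0, 2]
imputed_gt = [0, 1, 2, 2, 0, 2]
consistency = genotype_consistency(true_gt, imputed_gt)
correlation = genotype_correlation(true_gt, imputed_gt)
```

    In this toy case five of six genotypes match, so the consistency is 5/6. The correlation is undefined for a locus with no variation on either side, which is why the sketch returns NaN there rather than a number.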
Publication history
  • Received: 2021-10-25
  • Published online: 2023-05-17
  • Issue date: 2022-07-09