
Time: 2019-07-02 11:12:54

Accurate circular consensus long-read sequencing improves variant detection and assembly of a human genome

Aaron M. Wenger, Paul Peluso, […] Michael W. Hunkapiller

Nature Biotechnology volume 37, pages 1155–1162


Abstract

The DNA sequencing technologies in use today produce either highly accurate short reads or less-accurate long reads. We report the optimization of circular consensus sequencing (CCS) to improve the accuracy of single-molecule real-time (SMRT) sequencing (PacBio) and generate highly accurate (99.8%) long high-fidelity (HiFi) reads with an average length of 13.5 kilobases (kb). We applied our approach to sequence the well-characterized human HG002/NA24385 genome and obtained precision and recall rates of at least 99.91% for single-nucleotide variants (SNVs), 95.98% for insertions and deletions <50 bp (indels) and 95.99% for structural variants. Our CCS method matches or exceeds the ability of short-read sequencing to detect small variants and structural variants. We estimate that 2,434 discordances are correctable mistakes in the ‘genome in a bottle’ (GIAB) benchmark set. Nearly all (99.64%) variants can be phased into haplotypes, further improving variant detection. De novo genome assembly using CCS reads alone produced a contiguous and accurate genome with a contig N50 of >15 megabases (Mb) and concordance of 99.997%, substantially outperforming assembly with less-accurate long reads.
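For readers less familiar with the metrics quoted in the abstract, the following is a minimal Python sketch of how a contig N50 and precision/recall are conventionally computed. It is not the authors' pipeline. The function names `n50` and `precision_recall` are hypothetical, and the contig lengths and variant counts are toy values chosen for illustration, not data from the paper.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    N50 is the length L of the contig at which, walking from the longest
    contig downward, the running total first reaches half of the summed
    assembly length.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("no contigs given")


def precision_recall(true_pos, false_pos, false_neg):
    """Return (precision, recall) from variant-call counts against a benchmark."""
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    return precision, recall


# Toy values only (hypothetical, not from the paper).
toy_contigs = [80, 70, 50, 40, 30, 20, 10]   # total length 300
toy_n50 = n50(toy_contigs)                    # 80 + 70 reaches 150, so N50 is 70

toy_precision, toy_recall = precision_recall(true_pos=990, false_pos=10, false_neg=40)

print(toy_n50, round(toy_precision, 4), round(toy_recall, 4))
```

A higher N50 means that half of the assembled sequence lies in contigs at least that long. Precision penalizes false variant calls, while recall penalizes benchmark variants that were missed.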

