TY - JOUR
T1 - Rapid and accurate alignment of nucleotide conversion sequencing reads with HISAT-3N
AU - Zhang, Yun
AU - Park, Chanhee
AU - Bennett, Christopher
AU - Thornton, Micah
AU - Kim, Daehwan
N1 - Funding Information:
This work was supported in part by the National Institute of General Medical Sciences (NIH) under grant R01-GM135341 and by the Cancer Prevention Research Institute of Texas (CPRIT) under grant RR170068 to D.K.
Publisher Copyright:
© 2021 Zhang et al. This article is distributed exclusively by Cold Spring Harbor Laboratory Press for the first six months after the full-issue publication date (see https://genome.cshlp.org/site/misc/terms.xhtml). After six months, it is available under a Creative Commons License (Attribution-NonCommercial 4.0 International), as described at http://creativecommons.org/licenses/by-nc/4.0/.
PY - 2021/7
Y1 - 2021/7
AB - Sequencing technologies using nucleotide conversion techniques such as cytosine to thymine in bisulfite-seq and thymine to cytosine in SLAM-seq are powerful tools to explore the chemical intricacies of cellular processes. To date, no one has developed a unified methodology for aligning converted sequences and consolidating alignment of these technologies in one package. In this paper, we describe hierarchical indexing for spliced alignment of transcripts–3 nucleotides (HISAT-3N), which can rapidly and accurately align sequences consisting of any nucleotide conversion by leveraging the powerful hierarchical index and repeat index algorithms originally developed for the HISAT software. Tests on real and simulated data sets show that HISAT-3N is faster than other modern systems, with greater alignment accuracy, higher scalability, and smaller memory requirements. HISAT-3N therefore becomes an ideal aligner when used with converted sequence technologies.
UR - http://www.scopus.com/inward/record.url?scp=85109025912&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85109025912&partnerID=8YFLogxK
U2 - 10.1101/gr.275193.120
DO - 10.1101/gr.275193.120
M3 - Article
C2 - 34103331
AN - SCOPUS:85109025912
SN - 1088-9051
VL - 31
SP - 1290
EP - 1295
JO - Genome Research
JF - Genome Research
IS - 7
ER -