TY - JOUR
T1 - pRESTO
T2 - A toolkit for processing high-throughput sequencing raw reads of lymphocyte receptor repertoires
AU - Vander Heiden, Jason A.
AU - Yaari, Gur
AU - Uduman, Mohamed
AU - Stern, Joel N.H.
AU - O'Connor, Kevin C.
AU - Hafler, David A.
AU - Vigneault, Francois
AU - Kleinstein, Steven H.
PY - 2014/7/1
Y1 - 2014/7/1
N2 - Summary: Driven by dramatic technological improvements, large-scale characterization of lymphocyte receptor repertoires via high-throughput sequencing is now feasible. Although promising, the high germline and somatic diversity, especially of B-cell immunoglobulin repertoires, presents challenges for analysis requiring the development of specialized computational pipelines. We developed the REpertoire Sequencing TOolkit (pRESTO) for processing reads from high-throughput lymphocyte receptor studies. pRESTO processes raw sequences to produce error-corrected, sorted and annotated sequence sets, along with a wealth of metrics at each step. The toolkit supports multiplexed primer pools, single- or paired-end reads and emerging technologies that use single-molecule identifiers. pRESTO has been tested on data generated from Roche and Illumina platforms. It has a built-in capacity to parallelize the work between available processors and is able to efficiently process millions of sequences generated by typical high-throughput projects.
AB - Summary: Driven by dramatic technological improvements, large-scale characterization of lymphocyte receptor repertoires via high-throughput sequencing is now feasible. Although promising, the high germline and somatic diversity, especially of B-cell immunoglobulin repertoires, presents challenges for analysis requiring the development of specialized computational pipelines. We developed the REpertoire Sequencing TOolkit (pRESTO) for processing reads from high-throughput lymphocyte receptor studies. pRESTO processes raw sequences to produce error-corrected, sorted and annotated sequence sets, along with a wealth of metrics at each step. The toolkit supports multiplexed primer pools, single- or paired-end reads and emerging technologies that use single-molecule identifiers. pRESTO has been tested on data generated from Roche and Illumina platforms. It has a built-in capacity to parallelize the work between available processors and is able to efficiently process millions of sequences generated by typical high-throughput projects.
UR - http://www.scopus.com/inward/record.url?scp=84903741407&partnerID=8YFLogxK
U2 - 10.1093/bioinformatics/btu138
DO - 10.1093/bioinformatics/btu138
M3 - Article
C2 - 24618469
AN - SCOPUS:84903741407
SN - 1367-4803
VL - 30
SP - 1930
EP - 1932
JO - Bioinformatics
JF - Bioinformatics
IS - 13
ER -