Correction of sequencing errors in a mixed set of reads

Leena Salmela, Helsinki University, Finland, http://www.cs.helsinki.fi/u/lmsalmel/

Abstract
High-throughput DNA sequencing technologies produce large sets of
short reads that may contain errors. Most sequencing technologies
produce reads in base space, i.e. the reads are sequences of
characters A, C, G, T, and N. The SOLiD sequencing platform encodes each
pair of adjacent bases as one of four colors, so its reads are sequences
of colors. Errors in the reads complicate downstream processing,
such as de novo assembly. Error correction aims to reduce
the error rate of the reads. I will present an error correction tool
based on the suffix trie for correcting substitutions, insertions, and
deletions in a mixed set of reads produced by various sequencing
platforms. This tool is based on the SHREC program, which is aimed at
correcting reads from the SOLEXA/Illumina sequencing platform.

Author: Eric Rivals

Date: Feb 2011
