Technical Report Number
An exponential increase in the speed of DNA sequencing over the past decade has driven demand for fast, space-efficient algorithms to process the resultant data. The first step in processing is alignment of many short DNA sequences, or reads, against a large reference sequence. This work presents WOODSTOCC, an implementation of short-read alignment designed for Graphics Processing Unit (GPU) architectures. WOODSTOCC translates a novel CPU implementation of gapped short-read alignment, which has guaranteed optimal and complete results, to the GPU. Our implementation combines an irregular trie search with dynamic programming to expose regularly structured parallelism. We first describe this implementation, then discuss its port to the GPU. WOODSTOCC’s GPU port exploits three generally useful techniques for extracting regular parallelism from irregular computations: dynamic thread mapping with a worklist, kernel stage decoupling, and kernel slicing. We discuss the performance impact of these techniques and suggest further opportunities for improvement.
Cole, Stephen V.; Gardner, Jacob R.; and Buhler, Jeremy D., "WOODSTOCC: Extracting Latent Parallelism from a DNA Sequence Aligner on a GPU" Report Number: WUCSE-2015-003 (2015). All Computer Science and Engineering Research.
Builds on previous version published in ISPDC 2014 proceedings with updated performance results and discussion.Permanent URL: http://dx.doi.org/10.7936/K7416V81