Massive Parallel Computing to Accelerate Genome-Matching Source Code

The attached files contain the final source code from GPU Computing Gems Emerald Edition, Chapter 12, "Massive Parallel Computing to Accelerate Genome-Matching", which explores the problem of read/genome sequence alignment in the CUDA environment. The emphasis is on understanding and exploring the different ways CUDA code can be expressed and optimized.

Note that the attached code is still "alpha"-level development code: it needs a great deal of polishing and refinement before it can offer a robust, friendly user experience. So don't expect a finished product!

The program is designed to work with fasta-style reads and genomes. Genomes are stored as text base pairs: a single line at the top of the file holds the fasta metadata (beginning with a ">" character), followed by a single line that contains the entire genome. Read/target files alternate between a fasta metadata line and the sequence on the next line. Alternatively, the fasta metadata can be omitted from the target files.
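
The file layout described above can be sketched in a few lines of C++. This is only an illustration of the format, not code taken from the attached archive: the names FastaRecord, readGenome, and readReads are hypothetical, and the sketch assumes that blank lines may be skipped and that Windows-style line endings may be present.

```cpp
#include <istream>
#include <sstream>   // std::istringstream, handy for trying the parsers without files
#include <string>
#include <vector>

// Hypothetical record type: one metadata line (without the leading '>')
// plus one sequence line. The header is empty when metadata was omitted.
struct FastaRecord {
    std::string header;
    std::string sequence;
};

// Drop a trailing '\r' left over from Windows line endings.
static void stripCarriageReturn(std::string& line) {
    if (!line.empty() && line[line.size() - 1] == '\r') {
        line.erase(line.size() - 1);
    }
}

// Genome file: one optional '>' metadata line, then the whole genome on one line.
FastaRecord readGenome(std::istream& in) {
    FastaRecord genome;
    std::string line;
    while (std::getline(in, line)) {
        stripCarriageReturn(line);
        if (line.empty()) continue;            // assumption: blank lines are ignored
        if (line[0] == '>' && genome.header.empty()) {
            genome.header = line.substr(1);    // metadata line
            continue;
        }
        genome.sequence = line;                // the single genome line
        break;
    }
    return genome;
}

// Read/target file: metadata and sequence lines alternate; metadata may be omitted.
std::vector<FastaRecord> readReads(std::istream& in) {
    std::vector<FastaRecord> reads;
    std::string pendingHeader;
    std::string line;
    while (std::getline(in, line)) {
        stripCarriageReturn(line);
        if (line.empty()) continue;            // assumption: blank lines are ignored
        if (line[0] == '>') {
            pendingHeader = line.substr(1);    // applies to the next sequence line
        } else {
            FastaRecord record;
            record.header = pendingHeader;
            record.sequence = line;
            reads.push_back(record);
            pendingHeader.clear();
        }
    }
    return reads;
}
```

Both functions take a std::istream, so they work equally with a std::ifstream opened on a real file or with a std::istringstream holding a short test string such as ">r1\nACGT\n>r2\nTTGA\n".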

The code was developed and compiled under Microsoft Visual Studio 2005; it uses C++, CUDA, OpenMP, and Windows libraries. A port to Linux/Unix and GCC has not been attempted, but it should not be difficult.

Feel free to direct any questions/comments to BenW.


File attachments: 
Genome Search.zip (69.1 KB)