Technical Report Number
The Human Genome Project (HGP) has led to the deposit of human genomic sequence, in the form of sequenced clones, into databases such as the DNA Data Bank of Japan (DDBJ) (Tateno and Gojobori, 1997), the European Molecular Biology Laboratory (EMBL) Nucleotide Sequence Database (Stoesser et al., 1999), and GenBank (Benson et al., 1998). Many of these sequenced clones occur in regions where other sequencing has taken place, either within the same sequencing center or at other centers throughout the world. Assembling extended segments of genomic sequence by looking at overlapping end segments is therefore desirable, but is currently available only in a limited sense from the National Center for Biotechnology Information (NCBI) (http://www.ncbi.nlm.nih.gov/genome/seq/) and Oak Ridge National Laboratory's (ORNL) Genome Channel (http://compbio.ornl.gov/tools/channel/). We attempt to collate a definitive set of nonredundant extended segments of human genomic sequence by taking individual human entries in GenBank greater than 25 kilobases (kb) in length and extending them on either end. We also address several difficulties that arise when attempting to extend segments.
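As a rough illustration of the idea of extending an entry through an overlapping end segment, the following Python sketch filters entries by the 25 kb length threshold and joins two sequences that share an end overlap. It is not the method used in the report: the function names, the `min_overlap` parameter, and the requirement of an exact (error-free) match are all assumptions made for illustration, and real clone overlaps would need alignment that tolerates sequencing differences.

```python
MIN_ENTRY_LENGTH = 25_000  # the "greater than 25 kb" threshold from the abstract


def long_entries(entries, min_length=MIN_ENTRY_LENGTH):
    """Keep only entries whose sequence is strictly longer than min_length bases.

    `entries` is a hypothetical dict mapping an accession string to its sequence.
    """
    return {acc: seq for acc, seq in entries.items() if len(seq) > min_length}


def end_overlap(left, right, min_overlap=20):
    """Return the length of the longest suffix of `left` that exactly equals
    a prefix of `right`, or 0 if no overlap of at least min_overlap exists.

    Assumption: exact string identity; min_overlap should be at least 1.
    """
    max_len = min(len(left), len(right))
    for k in range(max_len, min_overlap - 1, -1):
        if left[-k:] == right[:k]:
            return k
    return 0


def extend_right(left, right, min_overlap=20):
    """Extend `left` on its right end with the non-overlapping part of `right`.

    Returns `left` unchanged when no sufficient end overlap is found.
    """
    k = end_overlap(left, right, min_overlap)
    if k == 0:
        return left
    return left + right[k:]
```

Extending on the left end is symmetric: calling `extend_right(neighbor, entry)` joins a neighbor whose right end overlaps the start of the entry. The quadratic scan over candidate overlap lengths keeps the sketch simple and would be too slow for whole-database comparisons.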
Rouchka, Eric C. and States, David J., "Assembly and Analysis of Extended Human Genomic Contig Regions" Report Number: WUCS-99-10 (1999). All Computer Science and Engineering Research.