Document Type

Technical Report

Department

Computer Science and Engineering

Publication Date

1999-01-01

Filename

WUCS-99-10.PDF

Technical Report Number

WUCS-99-10

Abstract

The Human Genome Project (HGP) has led to the deposit of human genomic sequence, in the form of sequenced clones, into databases such as the DNA Data Bank of Japan (DDBJ) (Tateno and Gojobori, 1997), the European Molecular Biology Laboratory (EMBL) Nucleotide Sequence Database (Stoesser et al., 1999), and GenBank (Benson et al., 1998). Many of these sequenced clones occur in regions that have also been sequenced, either within the same sequencing center or at other centers throughout the world. Assembling extended segments of genomic sequence by identifying overlapping end segments is desirable, but is currently available only in a limited sense from the National Center for Biotechnology Information (NCBI) (http://www.ncbi.nlm.nih.gov/genome/seq/) and Oak Ridge National Laboratory's (ORNL) Genome Channel (http://compbio.ornl.gov/tools/channel/). We attempt to collate a definitive set of nonredundant extended segments of human genomic sequence by taking individual human entries in GenBank greater than 25 kilobases (kb) and extending them on either end. We address several difficulties that arise when attempting to extend segments.

Comments

Permanent URL: http://dx.doi.org/10.7936/K7QF8R3Q
