A Model for Detecting Motifs in Biological Sequences

Andrew F. Neuwald, Washington University in St Louis
Phillip P. Green, Washington University in St Louis

Abstract

A method for detecting patterns in biological sequences is described. It incorporates rigorous statistics for determining significance and an algebraic system that, in combination with a depth-first search procedure, can be used to search efficiently for all patterns up to a specified length. The method includes a context-free command language grammar and is formulated using a mathematical model amenable to additions and enhancements. The method was implemented and verified by detecting various types of patterns in protein sequences.