Document Type

Technical Report

Publication Date






Technical Report Number



Motivation: Pseudoknots have generally been excluded from the prediction of RNA secondary structures because they are difficult to model and computationally expensive to predict. Although several dynamic programming algorithms exist for predicting pseudoknots using thermodynamic approaches, they are neither reliable nor efficient. Comparative methods, on the other hand, are more reliable, but they are often applied in an ad hoc manner and require expert intervention. Maximum weighted matching (Tabaska et al., Bioinformatics, 14:691-9, 1998), an algorithm for pseudoknot prediction with comparative analysis, suffers from low prediction accuracy in many cases. Here we present an algorithm, iterative loop matching, for predicting RNA secondary structures including pseudoknots reliably and efficiently. The method can use thermodynamic information, comparative information, or both, and can therefore predict structures for both aligned sequences and individual sequences.

Results: We have tested the algorithm on a number of RNA families, including structures both with and without pseudoknots. Using 8–12 homologous sequences, the algorithm correctly identifies more than 90% of base-pairs for short sequences and 80% overall. It correctly predicts nearly all pseudoknots. Furthermore, it produces very few spurious base-pairs for sequences without pseudoknots. Comparisons show that our algorithm is both more sensitive and more specific than the maximum weighted matching method. In addition, our algorithm has high prediction accuracy on individual sequences, comparable to the PKNOTS algorithm (Rivas & Eddy, J Mol Biol, 285:2053-68, 1999), while using far fewer computational resources.

Availability: The program has been implemented in ANSI C and is freely available for academic use at ~zhang/projects/rna/ilm/.


Permanent URL: