C. Meyer and R. Giegerich:
Matching and significance evaluation of combined sequence-structure motifs in RNA.
Z. Phys. Chem., 216:193--216, 2002.
Algebraic Dynamic Programming is a powerful
method for designing and implementing versatile pattern matching algorithms
on sequences; here we consider mixed sequence and secondary
structure motifs in RNA. A recurring challenge when designing
new pattern matchers is to provide a statistical analysis of
pattern significance. We demonstrate that by the use of so-called
canonical pattern descriptions, the expected number of hits
on a sequence of length $n$ can be computed a priori, using
the pattern matcher itself. This provides a systematic way to
calibrate the specificity of pattern matching algorithms. The
technique is illustrated using iron-responsive elements (IRE) and
selenocysteine insertion sequence (SECIS) elements.
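
The idea of reusing the matcher for significance estimation can be sketched as follows. This is a minimal illustration, not the authors' implementation: the toy hairpin motif (a stem of `stem_len` base pairs around a loop that starts with `loop_prefix`), the function and algebra names, the i.i.d. uniform base model, and the decision to allow G-U wobble pairs are all assumptions made here. The same recurrence is evaluated twice, once with an algebra that tests a concrete sequence (yielding the number of hits) and once with an algebra that returns probabilities (yielding the expected number of hits on a random sequence of length n). The sum is only an expectation of the hit count because every (start, loop length) combination describes exactly one candidate match, which is the role canonical pattern descriptions play in the abstract.

```python
from fractions import Fraction

# Assumption: Watson-Crick pairs plus G-U wobble pairs count as base pairs.
PAIRS = {("A", "U"), ("U", "A"), ("C", "G"), ("G", "C"), ("G", "U"), ("U", "G")}


def count_algebra(seq):
    """Evaluation on a concrete sequence: each constraint scores 1 or 0."""
    return {
        "pair": lambda i, j: int((seq[i], seq[j]) in PAIRS),
        "base": lambda i, c: int(c == "N" or seq[i] == c),
    }


def prob_algebra():
    """Evaluation on a random sequence (assumed i.i.d. uniform bases):
    each constraint scores the probability that it is satisfied."""
    return {
        "pair": lambda i, j: Fraction(len(PAIRS), 16),
        "base": lambda i, c: Fraction(1) if c == "N" else Fraction(1, 4),
    }


def evaluate(n, algebra, stem_len, loop_prefix, loop_min, loop_max):
    """Sum the algebra's score over every candidate match of the toy motif.

    A candidate is identified by its start position and loop length, so
    each candidate is enumerated exactly once.
    """
    total = 0
    for loop_len in range(max(loop_min, len(loop_prefix)), loop_max + 1):
        width = 2 * stem_len + loop_len
        for start in range(n - width + 1):
            score = 1
            # Stem: position start+d pairs with its mirror position.
            for d in range(stem_len):
                score *= algebra["pair"](start + d, start + width - 1 - d)
            # Loop: the first bases must match the prefix ('N' = any base).
            for t, c in enumerate(loop_prefix):
                score *= algebra["base"](start + stem_len + t, c)
            total += score
    return total


# Hits on one concrete sequence.
seq = "GGGGAAACCC"
hits = evaluate(len(seq), count_algebra(seq), 3, "GA", 4, 4)

# Expected number of hits on a random sequence of length 20.
expected = evaluate(20, prob_algebra(), 3, "GA", 4, 5)
```

Swapping only the algebra keeps the search space identical in both runs, so the a priori estimate refers to exactly the set of candidates the matcher would report. Under a non-uniform base composition the two probability functions would change, but the recurrence would not.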