ALCOMFT-TR-01-145
|

|
Arun Jagota, Rune Bang Lyngsø and Christian Nørgaard Storm Pedersen
Comparing Stochastic Models
Århus.
Work package 4.
June 2001.
Abstract: Stochastic models are commonly used in bioinformatics, e.g. hidden
Markov models for modeling sequence families or stochastic
context-free grammars for modeling RNA secondary structure
formation. Comparing data is a common task in bioinformatics, and it
is thus natural to consider how to compare stochastic models. In
this paper we present the first study of the problem of comparing a
hidden Markov model and a stochastic context-free grammar. We
describe how to compute their co-emission - or collision -
probability, i.e. the probability that they independently generate
the same sequence. We also consider the related problem of finding a
run through a hidden Markov model and derivation in a grammar that
generate the same sequence and have maximal joint probability by a
generalization of the CYK algorithm for parsing a sequence by a
stochastic context-free grammar. We illustrate the methods by an
experiment on RNA secondary structures.
Postscript file: ALCOMFT-TR-01-145.ps.gz (158 kb).
System maintainer Gerth Stølting Brodal <gerth@cs.au.dk>