ALCOMFT-TR-01-144

Rune Bang Lyngsø and Christian Nørgaard Storm Pedersen
Complexity of Comparing Hidden Markov Models
Århus. Work package 4. June 2001.
Abstract: The basic theory of hidden Markov models was developed and applied to problems in speech recognition in the late 1960's, and has since then been applied to numerous problems, e.g. biological sequence analysis. In this paper we consider the problem of computing the most likely string generated by a given model, and its implications on the complexity of comparing hidden Markov models. We show that computing the most likely string, and approximating its probability within any constant factor, is \mathbfNP-hard, and establish the NP-hardness of comparing two hidden Markov models under the \ell_\infty- and \ell₁-norms. We discuss the applicability of the technique used to other measures of distance between probability distributions. In particular we show that it cannot be used to prove NP-hardness of determining the Kullback-Leibler distance between the probability distributions of two hidden Markov models, or of comparing them under the \ell_k-norm for any fixed even integer k.
Postscript file: ALCOMFT-TR-01-144.ps.gz (162 kb).

System maintainer Gerth Stølting Brodal <gerth@cs.au.dk>