April 15, 2005

The Viterbi Algorithm

G. David Forney Jr. recently posted on arXiv an extremely interesting article on the history and development of the Viterbi algorithm: The Viterbi Algorithm: A Personal History.

The Viterbi algorithm originated as a decoding algorithm for convolution codes. However, since then it's utility has been widespread. In particular, in speech recognition and computational linguistics, the Viterbi algorithm is used for decoding the most likely state sequence and it is also used in the Forward-Backward algorithm in Hidden Markov Models.

The interesting points from the above article:

  • Viterbi devised the algorithm to help him teach:
    the Viterbi algorithm for convolution codes ... came out of my teaching ... I found information theory difficult to teach, so I started developing some tools.
  • The Viterbi algorithm, when first published, was not known to be related to dynamic programming methods and also not known to provide the optimal or maximum likelihood solution. The original paper states that:
    this decoding algorithm is clearly suboptimal
    It was G. D. Forney, Jr. who later proved that the Viterbi algorithm was an exact recursive algorithm for the shortest path through a trellis diagram. The relationship to dynamic programming then became clear.

The article also provides various places where the Viterbi algorithm has been used in practice, including the Galileo mission to Jupiter in 1992 (it was used to boost the transmission bandwidth when the primary antenna failed to deploy).

Of course, nowadays there are many applications of Viterbi in Computational Linguistics where it is used for many sequence learning tasks, from finding person names or gene names in text, to word segmentation in languages like Chinese, and in Biological Sequence Analysis where it is used to find exon or intron boundaries in DNA sequences.

The article also mentions various relationships between algorithms for "codes on graphs" and Pearl's belief propogation algorithm for Bayesian networks. The following paper is a good reference on this topic (this paper is cited in the above article, but was first pointed out to me by Hassan Ait-Kaci):

S. M. Aji and R. J. McEliece, "The generalized distributive law," IEEE Trans. Inform. Theory, vol. 46, pp. 325-343, Mar. 2000.

Posted by anoop at 02:05 PM