Mathematics Department

University of California

Berkeley, California 94720 USA

**ABSTRACT:**
Hidden Markov Models (HMMs) are widely used in pattern recognition
applications, most notably speech recognition. However, they have been
studied primarily on a practical level, with the HMM matrices as the
fundamental objects and without considering the viewpoint of an
observer trying to accurately predict the future output. This
dissertation is a study of the processes represented by HMMs using the
concepts and techniques of stochastic automata derived from the study
of dynamical systems and their complexity. The goal is to understand
these processes in the language of stochastic automata. Along the
way, certain ideas of stochastic automata are characterized in a
measure-theoretic manner.

We begin by defining a process to be a stationary measure space of bi-infinite sequences. We define a process state to be a conditional distribution on the future of a process, corresponding to the state of knowledge held by an observer who has seen some or all of the process's history. This definition is similar in spirit to ideas used in dynamical systems, and it formalizes the notion of a "deterministic state" used in automata theory. We describe the process states of HMM processes, and we give a necessary condition for a process to have an HMM representation.
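As a concrete illustration of a process state, consider the "golden mean" process — a standard toy example, not one taken from the dissertation — in which a 1 is never followed by a 1. The observer's state of knowledge after a history is a distribution over hidden states, and it determines the conditional distribution on the entire future. A minimal sketch, with the labeled-matrix notation an assumption of this example:

```python
import numpy as np

# The "golden mean" process as a 2-state HMM: state A may emit 0 or 1
# (each with probability 1/2); state B must emit 0.  So a 1 is never
# followed by a 1.  T[a][i, j] = Pr(go to state j, emit a | state i).
T = {0: np.array([[0.5, 0.0], [1.0, 0.0]]),
     1: np.array([[0.0, 0.5], [0.0, 0.0]])}
pi = np.array([2/3, 1/3])          # stationary distribution over states

def word_prob(word, start=pi):
    """Probability the stationary process assigns to a finite word."""
    v = start
    for a in word:
        v = v @ T[a]
    return v.sum()

def process_state(history):
    """The observer's state of knowledge after seeing `history`:
    a distribution over hidden states, which determines the
    conditional distribution on the entire future."""
    v = pi
    for a in history:
        v = v @ T[a]
    return v / v.sum()

# After seeing a 1, the observer knows the machine is in state B,
# so the next symbol must be a 0.
print(process_state([1]))                        # [0. 1.]
print(word_prob([0], start=process_state([1])))  # 1.0
```

Two histories induce the same process state exactly when they leave the observer with the same conditional distribution over all future words.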

Following [Balasubramanian] and [Ito et al.], we define Generalized Hidden Markov Models (GHMMs). These are structurally and operationally the same as HMMs, except that parameters which are interpreted as probabilities in defining HMMs are allowed to be negative in GHMMs. We describe necessary and sufficient conditions for two GHMMs (and thus two HMMs) to represent the same process, and we give a method for finding the smallest possible GHMM equivalent to a given one.
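The flavor of these equivalence and minimization results can be sketched with a toy example; the models below and the rank criterion follow the standard linear-algebraic treatment and are assumptions of this illustration, not material quoted from the dissertation. Two HMMs with different numbers of states can assign identical probabilities to every word, and the rank of a matrix of word probabilities reveals the smallest equivalent GHMM:

```python
import itertools
import numpy as np

def word_prob(pi, T, word):
    """P(word) = pi @ T[w1] @ ... @ T[wn] @ 1."""
    v = pi
    for a in word:
        v = v @ T[a]
    return v.sum()

# Two HMMs for the "golden mean" process (no two 1s in a row).  The
# second splits the first's state A into two identical copies, so it
# carries a redundant state.  Both models are illustrative examples.
pi2 = np.array([2/3, 1/3])
T2 = {0: np.array([[0.5, 0.0], [1.0, 0.0]]),
      1: np.array([[0.0, 0.5], [0.0, 0.0]])}

pi3 = np.array([1/3, 1/3, 1/3])
T3 = {0: np.array([[0.25, 0.25, 0.0],
                   [0.25, 0.25, 0.0],
                   [0.5,  0.5,  0.0]]),
      1: np.array([[0.0, 0.0, 0.5],
                   [0.0, 0.0, 0.5],
                   [0.0, 0.0, 0.0]])}

# Equivalence: both models assign the same probability to every word
# (checked here for all words up to length 6).
for n in range(1, 7):
    for w in itertools.product([0, 1], repeat=n):
        assert abs(word_prob(pi2, T2, w) - word_prob(pi3, T3, w)) < 1e-12

# The dimension of the smallest equivalent GHMM is the rank of the
# Hankel-style matrix H[u, v] = P(uv), indexed by pasts u, futures v.
words = [()] + [w for k in (1, 2) for w in itertools.product([0, 1], repeat=k)]
H = np.array([[word_prob(pi3, T3, u + v) for v in words] for u in words])
print(np.linalg.matrix_rank(H))   # 2 — the 3-state model is not minimal
```

Note that state-splitting shows why equality of the matrices is too strong a notion of equivalence: only the word probabilities, not the hidden-state structure, are observable.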

Going further, we give an algorithm for constructing a GHMM that represents a process from the probabilities that the process assigns to words. We prove that, for every process, either the algorithm constructs a GHMM that represents the given process or no such representation exists. This characterizes the set of processes representable by GHMMs. Finally, we describe an implementation of this algorithm which constructs GHMMs from sample sequences.
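A crude sketch of the empirical side — not the dissertation's algorithm — is to estimate word probabilities from a sample sequence by sliding-window counts and read an estimated minimal GHMM dimension off the singular values of the empirical word-probability matrix. The sampler, sample size, and word lengths below are all assumptions of this illustration:

```python
import itertools
from collections import Counter
import numpy as np

rng = np.random.default_rng(0)

# Sample the "golden mean" process (a 1 is never followed by a 1);
# the process and all numbers here are illustrative assumptions.
def sample(n):
    out, state = [], "A"
    for _ in range(n):
        if state == "B":
            out.append(0); state = "A"       # B must emit 0
        else:
            x = int(rng.integers(2))         # A emits 0 or 1 fairly
            out.append(x)
            state = "B" if x == 1 else "A"
    return out

seq = sample(100_000)

# One pass per length: sliding-window counts of all short words.
counts = {n: Counter(tuple(seq[i:i + n]) for i in range(len(seq) - n + 1))
          for n in range(1, 5)}

def est_prob(word):
    """Empirical probability of a word of length <= 4."""
    n = len(word)
    return 1.0 if n == 0 else counts[n][tuple(word)] / (len(seq) - n + 1)

# Empirical Hankel-style matrix over short pasts u and futures v.
words = [()] + [w for k in (1, 2) for w in itertools.product([0, 1], repeat=k)]
H = np.array([[est_prob(u + v) for v in words] for u in words])

# Two singular values stand clear of the sampling noise, so the
# estimated minimal GHMM dimension is 2.
print(np.round(np.linalg.svd(H, compute_uv=False), 3))
```

With finite data the matrix is only approximately low-rank, so in practice the rank decision becomes a threshold on singular values relative to the sampling noise.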

D. R. Upper, "Theory and Algorithms for Hidden Markov Models and Generalized Hidden Markov Models", Ph.D. Dissertation, Mathematics Department, University of California, Berkeley (February 1997). [ps.gz]= 895kb zipped [pdf]= 1,672kb

*In parts:*

**Chapters 1-2:** [ps.gz]= 182kb zipped, [pdf]= 284kb
**Chapter 3:** [ps]= 298kb, [pdf]= 487kb
**Chapter 4:** [ps]= 205kb, [pdf]= 419kb
**Chapters 5-6:** [ps]= 160kb, [pdf]= 290kb
**End:** [ps]= 112kb, [pdf]= 213kb