Karl Young, Space Sciences Division, NASA Ames Research Center, Mail Stop 245-3, Moffett Field, California 94035, USA
James P. Crutchfield, Physics Department, University of California, Berkeley, California 94720, USA
ABSTRACT: We review the thermodynamics of estimating the statistical fluctuations of an observed process. Since any statistical analysis involves a choice of model class -- either explicitly or implicitly -- we demonstrate the benefits of a careful choice. For each of three model classes, a particular model is reconstructed from data streams generated by four sample processes. Each estimated model's thermodynamic structure is then used to estimate the typical behavior and the magnitude of deviations for the observed system, and these estimates are compared to the known fluctuation properties. The type of analysis advocated here, which uses estimated model class information, recovers the correct statistical structure of these processes from simulated data. The current alternative -- direct estimation of the Rényi entropy from time series histograms -- uses neither prior nor reconstructed knowledge of the model class. In most cases it fails to recover the process's statistical structure from finite data: unpredictability is overestimated. In this analysis, we introduce the fluctuation complexity as a measure of a process's total range of allowed statistical variation. It is a new and complementary characteristic in that it differs from both the process's information production rate and its memory capacity.
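As a rough illustration of the histogram-based alternative mentioned in the abstract, the following sketch computes the order-q Rényi entropy of the empirical distribution of length-L words in a symbol sequence, then divides by L to obtain a naive per-symbol rate. It is a minimal sketch under stated assumptions, not the procedure used in this paper: the function names `renyi_block_entropy` and `renyi_rate_estimate`, the use of overlapping words, and the simple division by block length are all illustrative choices.

```python
import math
from collections import Counter


def renyi_block_entropy(symbols, block_len, q):
    """Order-q Renyi entropy, in bits, of the empirical distribution
    of overlapping words of length block_len (hypothetical helper)."""
    n_words = len(symbols) - block_len + 1
    if n_words <= 0:
        raise ValueError("sequence is shorter than the block length")

    # Histogram of overlapping words of length block_len.
    counts = Counter(
        tuple(symbols[i:i + block_len]) for i in range(n_words)
    )
    probs = [c / n_words for c in counts.values()]

    # q = 1 is the Shannon limit of the Renyi family.
    if abs(q - 1.0) < 1e-12:
        return -sum(p * math.log2(p) for p in probs)

    # General case: H_q = log2(sum_i p_i^q) / (1 - q).
    return math.log2(sum(p ** q for p in probs)) / (1.0 - q)


def renyi_rate_estimate(symbols, block_len, q):
    """Naive per-symbol rate: block entropy divided by block length
    (an assumed normalization, chosen here for simplicity)."""
    return renyi_block_entropy(symbols, block_len, q) / block_len
```

An estimator of this kind uses only word counts. It makes no use of a model class, which is the property the abstract contrasts with the reconstruction-based analysis.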