John Mahoney

Occam's Quantum Strop

Our paper "Occam's Quantum Strop: Synchronizing and Compressing Classical Cryptic Processes via a Quantum Channel" was just published in Scientific Reports!

Lather up and get your copy here.

Non-technical overview

We are interested in discrete, stationary, stochastic processes.

What are those, you might ask?

Some examples:

- The Fair Coin Process: just an infinite sequence of independent flips of a coin. Flip a coin. If it is heads, write down 0. If tails, write down 1. Repeat.
- The Golden Mean Process: Flip a coin. If it is heads, write down 0 and start over. If it is tails, write down 1, then write down 0, then start over.
- An Almost Fair Process: Flip 100 coins. If the number of heads is 49 or more, write 0. Then flip a single coin 100 times, each time recording 0 for heads and 1 for tails.
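For the curious, the first two processes can be sketched directly from their verbal descriptions. This is a minimal illustration, not code from the paper; the function names are ours, and the Almost Fair Process is omitted since its rule above is only partially specified.

```python
import random

def fair_coin(n):
    """Fair Coin Process: each symbol is one independent fair coin flip."""
    return [random.randint(0, 1) for _ in range(n)]

def golden_mean(n):
    """Golden Mean Process: heads -> emit 0; tails -> emit 1 then 0."""
    out = []
    while len(out) < n:
        if random.random() < 0.5:  # heads
            out.append(0)
        else:                      # tails: a 1 is always followed by a 0
            out.extend([1, 0])
    return out[:n]
```

One consequence of the Golden Mean rule is visible immediately: the output can never contain two 1s in a row, which is the sense in which it is "more complicated" than the fair coin.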

Clearly, we can make these instructions as complicated as we like, but the basic idea is that we have some way of writing down symbols that has to do with various coin flips and other rules.

You might already have the sense that the Golden Mean process is more complicated than the Fair Coin process. And that the Almost Fair Process is much more complicated than the other two.

There is a natural way of quantifying this complication which tells us something like: how many things must we "keep track of" when following this procedure?

Interestingly, there is a way of keeping track of everything using quantum mechanics that generally makes things less complicated. You might argue that anything to do with quantum mechanics is inherently more complicated. Fair enough. But I am willing to trust the quantum mechanicians out there to actually build and operate this thing so that I don't have to worry about these details.

Then we find that these quantum rules really are simpler in a meaningful way.

The gist of it is that when going through some of these more complicated process procedures you will sometimes find yourself in the position of having to remember something that you know will almost certainly end up not having any impact.

Crypticity quantifies the difference between the "cost" and the "value" of prediction. What feature of a process makes it cost a lot to predict, but makes the prediction not that valuable to perform? Imagine a future where weather forecasting has become a huge industry. For instance, station A makes the forecast that it might possibly rain tomorrow, but then it will definitely be sunny for the rest of the week. Specifically, it places a probability of 25% on rain tomorrow. Station B has a similar but different forecast. It claims that the probability of rain tomorrow is 20%, but agrees that the rest of the week will be sunny. What would we do with this information? These forecasts are probably similar enough that the difference is not worth worrying about.

Imagine though that you are a sunscreen and umbrella magnate. And that you are responsible for all daily sales of sunscreen and umbrellas in California. Filling the shelves with just the right amount of each will, over many such forecasts, lead to either the success or failure of your empire. In that case, you would probably pay attention to each of these forecasts.

We could take this fiction to the extreme, supposing it is advantageous to incorporate information from each of a very large number of such similar forecasts. Maybe your action is determined by something like the average of the predictions. Or maybe you have a particular station you trust. In any case, the important point is that each played some small role in determining your action.

If you had based your decision on just one of these forecasts, the action would have probably been very similar. Instead of sending 1000 umbrellas to Sacramento, you might have sent 990.

Incorporating more information allows you an edge in the market, and so this extra work might be worth it, but what does it cost? For starters, you have to subscribe to all of these esoteric weather stations on cable. And more, you have to sit and watch them. In closer analogy to our work, you also have to read, record, and process all of this information.

After all of this work, tomorrow we find that, as expected, the forecasts all say "sun".

While the initial meta-forecast was composed of many sub-forecasts (to be combined in some prescribed way), the next day's forecast required only one state. In the language of (probabilistic) finite-state machines, this is characterized by a set of several states all leading to the same state. This kind of "path-merger" is exactly the structure that the quantum representation is able to take advantage of.
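The Golden Mean Process from earlier gives the smallest example of such a machine with a path-merger. This is our own sketch (the state names A and B are ours, not notation from the paper): from state A you either emit 0 and stay, or emit 1 and move to B, from which a 0 returns you to A. Both paths merge back into the same state.

```python
import random

# Golden Mean Process as a probabilistic finite-state machine.
# Each entry is (probability, emitted symbol, next state).
# From A: emit 0 (prob 1/2) and stay in A, or emit 1 (prob 1/2) and go to B.
# From B: emit 0 (prob 1) and return to A.
# Both branches lead back to A: a "path-merger".
machine = {
    "A": [(0.5, 0, "A"), (0.5, 1, "B")],
    "B": [(1.0, 0, "A")],
}

def step(state):
    """Sample one transition: return (emitted symbol, next state)."""
    r, cum = random.random(), 0.0
    for prob, symbol, nxt in machine[state]:
        cum += prob
        if r < cum:
            return symbol, nxt
    return machine[state][-1][1:]  # guard against float rounding
```

Running this machine reproduces the process exactly, and the merger is what a quantum representation can exploit: states whose futures overlap need not be stored as fully distinct, mutually exclusive alternatives.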

In our original (classical) setup, the many forecasts were very similar, but we kept track of them "orthogonally". Quantum mechanics allows us to .....