## Hardy's Paradox

I was surprised by the interest in the last post, and I think a continuation of the topic using new material is in order.

As a side note, apparently I got upgraded on Lubos' scale to "confused", so I hope I have finally put to rest the misunderstanding that I am a closet admirer of classical physics, which was absurd to begin with since I am deriving quantum mechanics from physical principles. Currently I am reading Jean Bricmont's new book, Making Sense of Quantum Mechanics, which is of course written from the point of view of the Bohmian interpretation (Bricmont is a well-known Bohmian supporter), and I will report on it after I am done reading.

Now back to local realism. In an interferometer one cannot locate the path a particle travels, because extracting the "which way" information destroys the interference. Moreover, by adjusting the lengths of the two paths, one can tune the interference so that one output collects all the particles while the other output records a null result. So what would happen if we combine two such interferometers so that they touch at a point P, as in the picture below?

This is the setup of Lucien Hardy's thought experiment, and the experiment has one more twist: in one interferometer we inject electrons, while in the other we inject their antiparticles: positrons.

If the two interferometers are not touching, they are tuned in such a way that all the particles arrive at the "c" detectors and none at the "d" detectors (c for constructive interference, d for destructive interference). Suppose now that the loops touch at point P. If the particle from the right loop goes through P, it blocks the w+/u+ path of the particle in the left interferometer, so the superposition in the left interferometer is destroyed. As a result, the left particle can now be detected at the d+ detector.

So far this is nothing fancier than the interferometer discussion from "Where is Waldo?". But now comes the catch. If we use electrons and positrons, when they both arrive at P we get an annihilation resulting in a photon, represented as $$\gamma$$ in the picture above. The key question for local realists is: can we detect the positron at d+ and the electron at d- at the same time?

When the interferometers do not touch, simultaneous detection at d+ and d- can only happen if a path is blocked, which kills the interference phenomena. However, in the touching case, when both particles take the paths through P, you get annihilation and a gamma photon instead. Any local realistic description of the experiment cannot predict simultaneous detection of the electron and the positron at d- and d+. Still, quantum mechanics allows this to occur.
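The arithmetic behind this claim can be checked directly. Below is a minimal numerical sketch of the two overlapping interferometers; the symmetric 50/50 beam-splitter matrix is my assumed convention, not the only possible one. Modeling the annihilation by removing the component where both particles pass through P leaves a nonzero joint d+/d- amplitude, reproducing the well-known probability of 1/16.

```python
import numpy as np

# Basis for each particle: (u, w) = (overlapping arm, free arm).
# Symmetric 50/50 beam splitter (an assumed convention).
BS = np.array([[1, 1j], [1j, 1]]) / np.sqrt(2)

psi1 = BS @ np.array([1, 0])       # state after the first beam splitter

# Sanity check: with two beam splitters and no overlap, everything
# exits one port -- the "c" detector; the "d" port stays dark.
out_single = BS @ psi1
assert abs(out_single[0]) < 1e-12

# Joint electron-positron state before the second beam splitters.
joint = np.kron(psi1, psi1)        # basis order: uu, uw, wu, ww

# Annihilation at P removes the |u+ u-> component (state left unnormalized).
joint[0] = 0

# Apply the second beam splitter in each interferometer.
out = np.kron(BS, BS) @ joint

# Probability of joint detection at d+ and d- (both dark ports = index 0).
p_dd = abs(out[0])**2
print(p_dd)   # close to 0.0625 = 1/16
```

Without the annihilation the d ports are perfectly dark, so any nonzero joint d+/d- rate is already in conflict with the local realistic account.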

This thought experiment is independent of initial conditions and hence free of the superdeterminism loophole. The challenge for the local realist is to explain why we sometimes get simultaneous detection at detectors d+/d- when the interferometer paths touch at P, and never when they do not.

## GHZ for die-hard local realists

Last time I discussed the impossibility of locating the particle along the path of an interferometer. However, there is an even stronger argument against local realism, due to Greenberger, Horne, and Zeilinger and popularized by Sidney Coleman in his famous "Quantum Mechanics in Your Face" talk.

The setting is as follows: from a central station, three electrons are sent every minute to three very distant laboratories, say on the Moon, Mars, and Neptune, where three experimentalists decide to measure either the spin on the x axis or the spin on the y axis, recording +1 or -1 based on the deflection of the electron in a Stern-Gerlach device. Their decision to measure on the x or y axis is left to their own free will. The experiment is run for many years, collecting a huge amount of data. Then the lab logs are brought back to Earth and compared. The following correlation emerges:

whenever the three experimentalists measure one spin on the x axis and two on the y axis, the product of the answers is +1.

Now is this consistent with quantum mechanics predictions? The initial GHZ state is:

$$|\psi\rangle = \frac{1}{\sqrt{2}}(|+++\rangle - |---\rangle)$$

and for example measuring x-y-y in laboratories 1-2-3 yields

$$\sigma_x^{(1)}\sigma_y^{(2)}\sigma_y^{(3)}|\psi\rangle = |\psi\rangle$$

because $$\sigma_x$$ flips a + into a -, and $$\sigma_y$$ does the same while contributing a factor of $$i$$ (on +) or $$-i$$ (on -). The two $$\sigma_y$$ factors give $$i \cdot i = -1$$ on $$|+++\rangle$$ and $$(-i)(-i) = -1$$ on $$|---\rangle$$, so $$|+++\rangle \to -|---\rangle$$ and $$|---\rangle \to -|+++\rangle$$, which maps $$|\psi\rangle$$ back to itself. The same holds for the other two combinations: y-x-y and y-y-x.

So far so good, but what would happen if all three experimentalists decided to measure on the x axis? What would a die-hard local realist predict?

A local realist thinks the values of the measurements exist independent of measurement ("the Moon is there even when I am not looking at it"), and the experiment simply reveals those values. Since the three laboratories are far apart, the decision of what to measure in one laboratory cannot influence what is measured at the other two. After all, the measurements are done every minute as the electrons arrive, and it takes light more than one minute to travel between any two of the laboratories.

So if the spin value exists independent of measurement, we have three equations:

$$SpinX_1 SpinY_2 SpinY_3 = +1$$
$$SpinY_1 SpinX_2 SpinY_3 = +1$$
$$SpinY_1 SpinY_2 SpinX_3 = +1$$

and by multiplication we get

$$SpinX_1 SpinX_2 SpinX_3\, SpinY_1^2\, SpinY_2^2\, SpinY_3^2 = +1$$

Since the spins are either +1 or -1, the square of any spin is +1 and we simplify the equation to:

$$SpinX_1 SpinX_2 SpinX_3 = +1$$

What does quantum mechanics predict?

Recall that $$\sigma_x$$ flips a + into a -, and so

$$\sigma_x^{(1)}\sigma_x^{(2)}\sigma_x^{(3)}|\psi\rangle = - |\psi\rangle$$

and the experimental results confirm that indeed

$$SpinX_1 SpinX_2 SpinX_3 = -1$$

in agreement with quantum mechanics and in blatant violation of local realism.
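The two eigenvalue identities used above are easy to verify numerically. Here is a minimal sketch with numpy; the basis convention $$|+\rangle = (1,0)$$, $$|-\rangle = (0,1)$$ is my choice:

```python
import numpy as np

# Pauli matrices in the sigma_z basis
sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]], dtype=complex)

plus = np.array([1, 0], dtype=complex)    # |+>
minus = np.array([0, 1], dtype=complex)   # |->

def kron3(a, b, c):
    # Tensor product of three single-spin objects
    return np.kron(np.kron(a, b), c)

# GHZ state (|+++> - |--->)/sqrt(2)
psi = (kron3(plus, plus, plus) - kron3(minus, minus, minus)) / np.sqrt(2)

# One x and two y measurements: eigenvalue +1
assert np.allclose(kron3(sx, sy, sy) @ psi, psi)
assert np.allclose(kron3(sy, sx, sy) @ psi, psi)
assert np.allclose(kron3(sy, sy, sx) @ psi, psi)

# ...but three x measurements flip the sign: eigenvalue -1
assert np.allclose(kron3(sx, sx, sx) @ psi, -psi)
```

The four assertions hold simultaneously, which is exactly what no assignment of pre-existing values $$\pm 1$$ can reproduce.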

## Where is Waldo?

Two posts ago I discussed the bouncing droplets topic, but since that was part of an April Fools' joke, it may have left a wrong impression about the topic. Today I want to present how quantum superposition contradicts the classical idea of a particle trajectory. For fun we will call our particle Waldo, and try to locate it along the path.

What I will talk about today is based on an example given by Jean Bricmont in his book Making Sense of Quantum Mechanics, which in turn is inspired by David Albert's book Quantum Mechanics and Experience. As a disclaimer, Bohmian quantum mechanics does provide a location for "Waldo", but I will not discuss how in this post. However, as we will see, classical ideas are incompatible with the quantum effect of superposition.

For our discussion we want to measure the spin of an electron on the z and x axes. We know how to do that: simply pass the electron through a zone of an inhomogeneous magnetic field along that axis and observe the deflection. We can think of this as a box with one input slot and two output slots which sorts the incoming electrons. Given the "z" or "x" orientation of the magnet inside this imaginary box, we have two kinds of sorting boxes, and experimentally we establish two rules:

• once an electron is sorted one way by a box, passing the same electron through another box of the same kind sorts it the same way (repeated measurements yield the same result).
• if an electron is sorted one way by a box, passing it through another box of a different kind results in 50%-50% outcomes from the new box.

Now let's have a source of electrons which were already sorted UP by a Z-Box and pass them through an X-Box (no pun intended). By the rules above we get 50% of them sorted x-UP and 50% sorted x-DOWN. Let us further pass each of the two beams through Z-Boxes, and according to the rules above we have a 4-way split as in the picture below.

Now here comes the surprise: make the final two Z-boxes a single Z-box by bouncing the electron beams off mirrors and recombining them before the one and only final Z-box:

What percentages do we obtain? Remember that the boxes are imaginary boundaries for the inhomogeneous magnetic field, so instead of using mirrors we can take the first setting and bring the boxes closer and closer, overlapping the magnetic fields, until the two boxes become one. The first picture would make us predict that with only one box we would get 25+25=50% z UP and 25+25=50% z DOWN, while in fact we get 100% z UP and 0% z DOWN. This quantum surprise has a name: quantum superposition. It is superposition which distinguishes quantum from classical mechanics.

Now suppose we block one of the paths in the second picture (which illustrates a Mach-Zehnder interferometer). Then, applying the second rule from above, we would get 25% z UP, 25% z DOWN, and 50% lost to the barrier blocking the path.
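The bookkeeping above can be made concrete by treating the boxes as measurements in the z and x spin bases. This is a minimal sketch of the recombination argument; the state vectors and conventions are my own choices, not Bricmont's notation:

```python
import numpy as np

# Spin states in the z basis
z_up = np.array([1.0, 0.0])
x_up = np.array([1.0, 1.0]) / np.sqrt(2)
x_down = np.array([1.0, -1.0]) / np.sqrt(2)

# Amplitudes on the two arms after the X-Box
a_up = x_up @ z_up          # 1/sqrt(2)
a_down = x_down @ z_up      # 1/sqrt(2)

# Coherent recombination: add the amplitudes of the two arms back together
recombined = a_up * x_up + a_down * x_down
p_z_up = abs(z_up @ recombined)**2
print(p_z_up)               # ~1.0: 100% z UP, 0% z DOWN

# Block the x-DOWN arm: only the x-UP amplitude survives
blocked = a_up * x_up
p_blocked_up = abs(z_up @ blocked)**2
print(p_blocked_up)         # ~0.25: 25% z UP
# likewise 25% z DOWN, and the missing 50% is absorbed by the barrier
```

The classical prediction of 50/50 comes from adding probabilities (25% + 25%) instead of amplitudes; adding the amplitudes first restores the original z-UP state exactly.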

Why is superposition at odds with the notion of trajectory? Which path does Waldo take when going through the interferometer? A classical way of thinking would be as follows:
• Does Waldo take the upper branch path? No: if that were the whole story, blocking the lower path should change nothing, yet blocking it yields a 50-50 split at the final Z-Box, while with both paths open the result is all z UP.
• Does Waldo take the lower branch path? No, by the same argument with the roles of the two paths reversed.
• Does Waldo take both paths? No: we always find Waldo in one path or the other if we try to see where he is.
• Does Waldo take no path? No: if both paths are blocked, nothing is detected at the end.
The trouble with this reasoning is counterfactual definiteness, which is obeyed by classical mechanics. In the first two arguments above we assume that placing a barrier in one path would not change the behavior along the other path. In fact, quantum mechanics is contextual, and the experimental setting plays a critical role in what is measured. In classical physics we assume that the properties of objects exist even when they have not been measured, and this is not the case in the quantum world.

## How does RSA encryption work?

Since we have been discussing quantum computers, one of their main applications is to break RSA encryption. But what is this scheme and how does it work? Let's start with how encryption evolved during the history of mankind. In ancient times one method was to peel the bark off a tree branch in a spiral fashion and keep the stick. With the bark still attached, you write the message vertically, and after you peel it off the message appears garbled. To decode it, you wrap the bark back on a branch of the right diameter and the letters neatly align again. This is a rather poor method of hiding a message, and better methods were required.

A big improvement in antiquity was the substitution cipher (Caesar's cipher is the simplest example), in which each letter is replaced by another letter. This method provides a large number of possible encryptions, but it has a simple weakness: in any language some words and letter combinations are very common. For example, in English the word "the" is the most common, and from it you can easily guess the substitutions for the letters T, H, and E. Working down the frequency list you can break this method relatively easily. The countermeasure was to change the substitution daily by some algorithm available to both the sender and the receiver. A classic example of this was the Enigma machine used by Germany in the Second World War.

Now, if you need to change the encryption after each message anyway, is there a method which is unbreakable? Indeed there is, and it is based on XOR. In ASCII each letter is represented by a number, and each number has a binary representation of 0s and 1s. If you have a one-time cipher key (a sequence of random letters the same length as the message to be sent), you take the binary representation of the key and do a bit-wise XOR between the bits of the text and the bits of the key. The operation is reversible and, as long as you never reuse the key (which would open up the frequency attack explained above), the encryption is unbreakable. So why not use this method all the time and call it a day? Because the management of the keys is horrendous in practice.
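For concreteness, here is a minimal sketch of the one-time pad in Python; the message and the use of the standard `secrets` module for key generation are my own illustrative choices:

```python
import secrets

# One-time pad: XOR with a random key of the same length as the message.
message = b"ATTACK AT DAWN"
key = secrets.token_bytes(len(message))   # fresh random key, used once

# Encryption and decryption are the same operation: XOR is its own inverse.
ciphertext = bytes(m ^ k for m, k in zip(message, key))
decrypted = bytes(c ^ k for c, k in zip(ciphertext, key))

assert decrypted == message
```

Because each key bit is random and used once, the ciphertext carries no statistical fingerprint of the message, which is why this scheme is provably unbreakable.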

The next improvement came from using asymmetric keys. The key idea is that of factorizing a number into primes. It is trivial to multiply numbers, but there is no efficient known classical algorithm to do the factorization: trial division takes time proportional to the square root of the number. Say we have a 400-digit number; its square root is a 200-digit number. The lifetime of the universe in seconds is an 18-digit number, so a computer testing a million candidates per second can check only about $$10^{24}$$ possibilities in that time. This means the computer would have to run for about $$10^{176}$$ lifetimes of the universe to factor a 400-digit number!

So what is the RSA algorithm?

 Adi Shamir, Ron Rivest and Len Adleman

Step 1: pick two large prime numbers p and q and multiply them.
Step 2: compute (p-1)(q-1) and find a small number e relatively prime to (p-1)(q-1). The product pq together with e forms the public key you broadcast to the world.

Recall that each letter has a corresponding ASCII code and can be represented as a number.

Step 3: the sender wants to encode a number M. Using the public key he computes the encryption of M as $$M^e\ (mod~pq) = C$$ and sends C to you.
Step 4: you compute (once, at key generation) a number d such that $$ed = 1\ (mod~(p-1)(q-1))$$; this d is your private key.
Step 5: finally you compute $$C^d\ (mod~pq) = M$$.

The computations are in general time consuming, and in practice the method is used only to encrypt the unbreakable one-time XOR keys.
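The steps above can be sketched with toy numbers; the primes 61 and 53 and the message 65 are illustrative choices far too small for real security, and `pow(e, -1, phi)` for the modular inverse needs Python 3.8+:

```python
# Toy RSA with small primes (real keys use primes hundreds of digits long).
p, q = 61, 53
n = p * q                      # pq = 3233, part of the public key
phi = (p - 1) * (q - 1)        # 3120
e = 17                         # public exponent, coprime with phi
d = pow(e, -1, phi)            # private exponent: e*d = 1 (mod phi)

M = 65                         # the message, as a number smaller than n
C = pow(M, e, n)               # Step 3: encrypt with the public key (e, n)
assert pow(C, d, n) == M       # Step 5: decrypting with d recovers M
```

Note that an eavesdropper who could factor n into p and q would immediately recover phi and hence d, which is exactly why RSA's security rests on the hardness of factoring.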

To break the RSA encryption one needs to be able to factorize the pq product. One way to do it is by Shor's algorithm. In this approach all steps are fast except the key part, which is a quantum Fourier transform. In one of the prior posts I discussed a classical embedding of a quantum computer in what amounts to an analog computer. The speedup in this setting is due to the ability to represent analog signals which can compute a Fourier transform quickly.

The key question to ask is whether the computational speedup in a quantum computer is due to quantum mechanics, or due to the nature of the analog structure being utilized in Shor's algorithm. Supporters of the Many Worlds Interpretation like David Deutsch contend it is quantum mechanics, but my take is that this assertion is proven false by several concrete realizations of analog computational devices. Besides the emulation device created by Brian R La Cour and Granville E Ott, there are other prototype analog devices using optical waves capable of quickly performing the key step in Shor's algorithm. The fathers of this line of research are David Ferry and Harris Akis from Arizona State University, who published the first paper on this in 2001: Quantum wave processing, in Superlattices and Microstructures - a journal unknown to the quantum foundations community.

## Annealing and the D-Wave Quantum Computer

Today I will stay in the realm of quantum computers and I want to talk about a commercial company that delivers more than 1000 qbits in a quantum chip. This is far more than what the classical embedding scheme from last time can achieve, but on close inspection it is not that impressive. Granted, the D-Wave company can boast about venture capital investment in the 100 to 200 million range, but it is the limitations of what they can do that prevent the company from being in the hundreds of billions range.

So what is annealing, and how does D-Wave use it to solve problems? Let me start by explaining what I did for my PhD thesis some time ago. I was working in the area of optical fibers, which are used to send signals (phone calls, internet traffic, TV signals). You want to increase the bit rate to take full advantage of all the available bandwidth, but propagation over large distances makes the light pulses overlap with each other. Think of the big undersea fiber optic cables between, say, California and Japan. Key to this is the dispersion property of the optical fiber, and the dispersion at one point of the fiber can differ from the dispersion at another point (due to variations in manufacturing). A method is needed to measure this dispersion, and the only method available was to cut the fiber into 1 meter sections and measure the dispersion of each section. How can we do this in a non-destructive way? The theory of light propagation in optical fibers is well known: from Maxwell's equations, in an appropriate approximation, one obtains the so-called nonlinear Schrodinger equation, and numerical simulation methods are available to obtain the output given any input and any dispersion function.

Now the idea is simple: send some optical pulses through the fiber and measure what comes out. Then, starting with a constant dispersion map, simulate the propagation of the input pulses and obtain some output, which will differ from the measured one. Change the dispersion map values and repeat the simulation until the simulated output matches the experimental one.

So what I had was an optimization problem: I had to minimize a "cost" function, where the cost is some measure of the mismatch between the experimental and the simulated output. The problem is that the cost function is a function of many variables (over 1000). Now how can you find the minimum of an unknown function of a single variable? Start at a point and evaluate the function. Take a step left and evaluate the function. Take a step right and evaluate the function again. Then you know whether you need to move left or right; it is as simple as that. To generalize to multiple variables, you repeat this in all directions and determine the direction of steepest descent. In other words, you determine the gradient. But how do you use the gradient? Naively you take a step along its direction, but this is not the best strategy because you encounter the steep valley problem. I won't go into detail on what that problem is and the strategies to get around it, because even if you solve it you encounter the next problem: the local minima problem.
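The probe-in-every-direction procedure above is just a finite-difference gradient descent. Here is a minimal sketch on a toy cost function of my own choosing; the real cost in the thesis problem involved a full pulse-propagation simulation, not a simple quadratic:

```python
import numpy as np

def cost(v):
    # Stand-in for the real mismatch function (illustrative choice).
    x, y = v
    return (x - 1.0)**2 + 4.0 * (y + 2.0)**2

def num_gradient(f, v, h=1e-6):
    # Finite-difference gradient: probe a small step along each axis.
    g = np.zeros_like(v)
    for i in range(len(v)):
        step = np.zeros_like(v)
        step[i] = h
        g[i] = (f(v + step) - f(v - step)) / (2 * h)
    return g

v = np.array([5.0, 5.0])                   # arbitrary starting point
for _ in range(300):
    v = v - 0.1 * num_gradient(cost, v)    # step against the gradient

print(v)   # converges close to the minimum at (1, -2)
```

For a smooth single-basin function this works fine; the trouble described next starts when the landscape has many basins.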

In optimization problems you are given an unknown landscape to explore, and your task is to find the absolute minimum of the landscape. Many problems fit into this category, from training neural networks to econometric problems. Annealing is a strategy for solving the local minima problem, and the inspiration comes from the technique used to grow crystals. The basic idea is as follows:

suppose I give you a tray with a two-dimensional landscape built on it and I place a small ball at an arbitrary spot in this landscape. The ball will naturally get trapped in a local minimum. Then I ask you to shake the tray up and down with a given amplitude such that the ball can jump out of the local minimum and land in another one (the amplitude corresponds to the temperature). I also tell you to keep shaking, but to slowly reduce the amplitude over time. When the process ends, the ball will have settled into a deep minimum, hopefully the global one. Similarly, when you grow a crystal you slowly reduce the temperature, which allows the molecules to find their minimum-energy positions in the growing crystal.
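The shaking-tray procedure translates almost line by line into simulated annealing. Here is a minimal sketch on a one-dimensional double-well landscape of my own choosing, using the standard Metropolis acceptance rule; the schedule and step sizes are illustrative:

```python
import math
import random

def landscape(x):
    # Double well: local minimum near x = +1, global minimum near x = -1.
    return (x * x - 1.0)**2 + 0.3 * x

random.seed(0)
x, T = 1.0, 2.0                    # start trapped in the local well, "hot"
best_x, best_f = x, landscape(x)

for _ in range(20000):
    x_new = x + random.uniform(-0.3, 0.3)       # shake the tray
    dE = landscape(x_new) - landscape(x)
    # Metropolis rule: always accept downhill moves, sometimes uphill ones
    if dE < 0 or random.random() < math.exp(-dE / T):
        x = x_new
    if landscape(x) < best_f:
        best_x, best_f = x, landscape(x)
    T = max(T * 0.9995, 1e-3)      # slowly reduce the "shaking amplitude"

print(best_x)   # settles somewhere near x = -1, the global well
```

At high temperature the ball hops freely over the barrier between the wells; as the temperature drops it can no longer escape, so it ends up frozen into the deepest basin it has visited.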

The D-Wave chip is nothing but an annealing machine. However, it has a quantum twist: in quantum mechanics one encounters tunneling, and because of it the "ball exploring the landscape" can tunnel from one local minimum to the next. This is the basis of its quantum speedup, because the annealing schedule can be run faster. There was much debate about whether D-Wave is quantum and whether it has any speedup over a classical system, with Scott Aaronson as the "skeptic-in-chief". I think it is clear now that their chip is quantum, but the speedups are not really there because the kinds of landscapes programmable on their computer are too simplistic. Increases of several orders of magnitude in the number of qbits are needed, but the good news is that D-Wave is doubling their qbit count every year.

So how does it work?

The basic building block is a Josephson junction which hosts a qbit. Because of the manufacturing process, different junctions are not identical and they need to be calibrated with an additional loop and a very precise magnetic flux going through it. The chip is cooled to 15 millikelvin for superconductivity and noise reduction, and it is shielded to attenuate the Earth's magnetic field by a factor of 50,000. The noise is still large, and repeating a computation results in a different answer about 5% of the time.

Four superconducting loops are elongated and stacked next to each other horizontally. Then the same thing is done vertically, and the two layers are coupled, resulting in an 8-qbit cell which realizes an Ising Hamiltonian. The cells are connected to more cells, resulting in the final chip. The couplings and the biases in the Hamiltonian are controllable and represent the actual problem being solved. This programming is done at room temperature. Then the chip is cooled and the annealing process starts.

The annealing corresponds to gradually turning off an initial Hamiltonian and turning on the Hamiltonian of the problem being solved. If done slowly enough, the system remains in the ground state at all times during the process. At the end, the final spin configuration is read out and this provides the answer. The process is repeated many times to either average the answers or pick the best one.
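To make the Ising Hamiltonian concrete, here is a toy problem of the kind such a chip is programmed with, solved by brute-force enumeration, which is exactly what annealing is meant to avoid at large sizes. The couplings and biases are my illustrative choices:

```python
import itertools

# A small Ising problem: h are the biases, J the couplings (the "program").
# Ferromagnetic chain of 6 spins with a small field favoring spin up.
n = 6
h = [-0.1] * n                                 # bias toward s_i = +1
J = {(i, i + 1): -1.0 for i in range(n - 1)}   # neighbors want to align

def energy(s):
    # Ising energy: E = sum_i h_i s_i + sum_(i,j) J_ij s_i s_j
    return sum(h[i] * s[i] for i in range(n)) + \
           sum(Jij * s[i] * s[j] for (i, j), Jij in J.items())

# Brute force over all 2^n spin configurations.
ground = min(itertools.product([-1, +1], repeat=n), key=energy)
print(ground)   # (1, 1, 1, 1, 1, 1): all spins aligned up
```

The annealer's promise is to find this minimizing configuration without enumerating the $$2^n$$ possibilities, which becomes hopeless for a classical exhaustive search at around $$n \sim 60$$.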

Now which problems can you solve using a D-Wave computer? In principle all tasks a traditional quantum computer can do, but most problems (other than optimization problems) are ill suited for the D-Wave chip. For example, on their 1000-qbit chip D-Wave was able to factorize the product of two primes, 149 and 73: 10877 = 149 x 73, but this is a far cry from what is needed to break encryption. The clever way the factorization task is modeled on a D-Wave machine needs a number of qbits many orders of magnitude larger than what is currently available. There is a lot of hype around D-Wave, but the commercially viable practical applications are few and far between. I do not think the company covers its costs from sales; it relies on venture capital and large grants from big companies like Google to keep increasing the number of qbits. So far they have delivered on that, but if they hit an insurmountable physical limit the funding will evaporate. The value of the company lies mostly in its patents, and I feel there is a very long road ahead for them to provide a genuine quantum speedup in a viable commercial setting. However, it is a start.

## Classical emulation of a quantum computer

I am now resuming my scheduled topics and I will start a small series on the classical-quantum debate. I will start with a very interesting paper by Brian R La Cour and Granville E Ott on the possibility of emulating a quantum computer by purely classical means. I met Brian last year at a conference but did not attend his presentation due to a scheduling conflict. However, I later looked up his research and found it most intriguing.

The modern trend in quantum mechanics is not to discuss how counterintuitive the theory is, or the Bohr-Einstein debate, but how to use it to create useful applications not possible by classical means. One such application is the quantum computer, which holds the promise of a massive speedup compared with classical ones. So the race is on to build such a thing, but to be of any value you need many quantum bits. However, quantum states are very brittle and decoherence is your enemy. Against this background comes the proposal by La Cour and Ott, who say that you do not actually need quantum mechanics and can emulate qbits using classical resources! Moreover, they built a working prototype. So what is going on here?

First, let me say this proposal does not contradict quantum mechanics in any way, and the resulting system is purely classical. However, the scheme is quite clever and is able to realize the speedups of quantum algorithms. The idea is that the computer is not digital but analog, using waves which naturally obey the superposition principle.

Here is the clever encoding scheme of a qbit corresponding to two complex numbers $$\alpha$$ and $$\beta$$:

$$\psi(t) = \alpha e^{i \omega_0 t} + \beta e^{-i \omega_0 t}$$

To recover $$\alpha$$ and $$\beta$$ from the signal one multiplies the signal with $$e^{\pm i \omega_0 t}$$ and then applies a low pass filter.
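The encoding and recovery can be checked numerically. Here is a minimal sketch with assumed toy values for $$\alpha$$, $$\beta$$, and $$\omega_0$$; the "low pass filter" is modeled as an average over one full carrier period, which kills the cross term oscillating at $$2\omega_0$$:

```python
import numpy as np

# Toy qbit amplitudes and carrier frequency (illustrative assumptions)
alpha, beta = 0.6, 0.8j
w0 = 2 * np.pi
T = 2 * np.pi / w0                   # one full carrier period

t = np.linspace(0, T, 1024, endpoint=False)
psi = alpha * np.exp(1j * w0 * t) + beta * np.exp(-1j * w0 * t)

# Demodulate and "low pass filter" = average over a full period:
# the unwanted term oscillates at 2*w0 and averages to zero.
alpha_rec = np.mean(psi * np.exp(-1j * w0 * t))
beta_rec = np.mean(psi * np.exp(+1j * w0 * t))

print(alpha_rec, beta_rec)   # recovers values close to 0.6 and 0.8j
```

This also makes the extraction cost visible: reading out an amplitude requires integrating over a time of order $$2\pi/\omega_0$$, the slowdown mentioned below.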

To add more qbits they use increasing signal frequencies, $$\omega_i = 2^i \omega_0$$, and this ultimately provides the practical limitation of the scheme due to the finite amount of bandwidth. The scheme cannot accommodate more than about 40 qbits, but a 10-qbit system has the same processing power as a 1 GHz standard digital computer. While there is a speedup from using quantum algorithms, there is also a slowdown from signal extraction: the inner product of two wavefunctions is computed as a time integral of length $$2\pi/\omega_0$$, and $$\omega_0$$ has to be small to accommodate as many qbits as possible.

But while there is superposition and a Hilbert space present in the emulation scheme, there is no randomness and no true quantum behavior, so what can be achieved with this emulation? Surprisingly a lot, because all quantum gate operations can be implemented.

The inherent determinism of the computation can be "fixed" by adding random noise. But why do that? It makes sense if it helps speed up the extraction of partial information (remember that the inner product requires a lengthy time integration). More interesting is the realization of the teleportation protocol. The key sentence in the paper is this:

"although Alice is in possession of Bobʼs qubit, she does not manipulate it."

For separated systems you do not actually get correlations higher than the Bell limit, simply because there are no truly separated systems in this embedding scheme. So in a way it is cheating to call this teleportation; physically, teleportation can be achieved only with quantum mechanics. However, the goal is to obtain a computer able to run quantum algorithms, and the actual algorithm is implementable in this embedding.

Apart from practical applications (which are rather modest due to the practical limit of 40 qubits), this research is most valuable in opening up new roads in exploring foundations and computability. If a quantum computer is realizable with classical resources, why stop here and not realize a Popescu-Rohrlich box computer? Does anything like a PR box computer and a PR box algorithm exist? Nobody explored such concepts before because PR boxes do not exist in nature. But now we see that quantum algorithms do not rely on spatially separated correlations. What is the relationship between a PR box and superposition? Does superposition prevent the existence of a PR box? And can we prove the Church-Turing thesis false?