## Solving Hilbert’s sixth problem (part three of many)

### A tale of two products

Thinking about how to proceed with deriving quantum mechanics, I came to the conclusion that it is best to reverse the order of presentation a bit and first review the goal of the derivation: a symmetric and a skew-symmetric product.

Let us start with classical mechanics and good old-fashioned Newton’s second law: F = ma. Let’s consider the simplest case: the one-dimensional motion of a point particle in a potential V(x):

F = -dV/dx
a = dv/dt, with v the velocity.
Introducing the Hamiltonian as the sum of kinetic and potential energy, H(p,x) = p²/2m + V(x), we have:

dp/dt = -∂V/∂x = -∂/∂x (p²/2m + V) = -∂H/∂x
and
dx/dt = v = ∂/∂p (p²/2m + V) = ∂H/∂p

In general one does not talk of point particles; it is customary to introduce generalized coordinates q and generalized momenta p to take advantage of the various symmetries of the problem (q = q_1, q_2, …, q_n; p = p_1, p_2, …, p_n):

dp/dt = -∂H/∂q
dq/dt = +∂H/∂p

with H = H(q,p,t)

We observe two very important things right away: the equations are first order in time, and the q’s and p’s appear in one-to-one correspondence.

Now we can introduce the Poisson bracket of two functions f(p,q) and g(p,q) as follows:

{f,g} = Σ_i (∂f/∂q_i ∂g/∂p_i - ∂f/∂p_i ∂g/∂q_i)

as a convenience to express the equations of motion like this:

dp/dt = {p, H}
dq/dt = {q, H}

The Poisson bracket defines a skew-symmetric product between any two functions f,g:

{f,g} = f ∘ g = -g ∘ f = -{g, f}, and more importantly this product obeys the so-called Jacobi identity:

{f,{g,h}} + {h,{f,g}} + {g,{h,f}} = 0

This identity follows directly from the definition of the Poisson bracket, by expanding and canceling the partial derivatives.
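The cancelation can be checked symbolically; here is a small sketch (using sympy, with arbitrary test functions of my own choosing) for one degree of freedom:

```python
import sympy as sp

q, p = sp.symbols('q p')

def pb(f, g):
    """Poisson bracket {f,g} = df/dq dg/dp - df/dp dg/dq (one degree of freedom)."""
    return sp.diff(f, q)*sp.diff(g, p) - sp.diff(f, p)*sp.diff(g, q)

# three arbitrary test functions on phase space
f = q**3*p + p**2
g = sp.sin(q)*p**3
h = q**2 + q*p

jacobi = sp.simplify(pb(f, pb(g, h)) + pb(h, pb(f, g)) + pb(g, pb(h, f)))
print(jacobi)  # 0
```

Every term produced by expanding the nested brackets cancels identically, as it should for any smooth f, g, h.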

The Jacobi identity and the skew-symmetry property define a Lie algebra.

So in classical mechanics, in phase space, one defines two products: the regular function multiplication f(q,p)·g(q,p), which is a symmetric product, and the Poisson bracket {f(q,p), g(q,p)}, which is a skew-symmetric product.

Now onto quantum mechanics. In quantum mechanics one replaces the Poisson bracket with the commutator [A, B] = AB - BA, which can be understood as a skew-symmetric product between operators on a Hilbert space. There is also a symmetric product: the Jordan product, defined as one half of the anti-commutator: {A,B} = ½(AB + BA)

The commutator also obeys the Jacobi identity:

[A,[B,C]] + [C,[A,B]] + [B,[C,A]] =

[A, BC-CB] + [C, AB-BA] + [B, CA-AC] =

ABC - ACB - BCA + CBA + CAB - CBA - ABC + BAC + BCA - BAC - CAB + ACB = 0

and the commutator also defines a Lie algebra, just like in classical mechanics.
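As a quick numerical sanity check (an illustration with random matrices of my own choosing, not part of the derivation), the skew-symmetry and the Jacobi identity of the commutator can be verified directly:

```python
import numpy as np

rng = np.random.default_rng(0)

def comm(A, B):
    """Commutator [A,B] = AB - BA."""
    return A @ B - B @ A

# three arbitrary complex matrices
A, B, C = [rng.standard_normal((4, 4)) + 1j*rng.standard_normal((4, 4))
           for _ in range(3)]

# skew-symmetry: [A,B] = -[B,A]
print(np.allclose(comm(A, B), -comm(B, A)))  # True

# Jacobi identity: [A,[B,C]] + [C,[A,B]] + [B,[C,A]] = 0
J = comm(A, comm(B, C)) + comm(C, comm(A, B)) + comm(B, comm(C, A))
print(np.allclose(J, 0))  # True
```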

How can we understand the Jordan product? In quantum mechanics operators do not commute and we cannot simply take the function multiplication.

To generate real spectra and positive probability predictions, observable operators must be self-adjoint: O = O†, meaning that in matrix form they equal their transposed complex conjugate. However, because transposition reverses the order of factors, the product of two self-adjoint operators is in general not self-adjoint:

(AB)† = B†A† = BA ≠ AB

However, the Jordan product preserves self-adjointness:

{A,B}† = ½ ( (AB)† + (BA)† ) = ½ (BA + AB) = {A,B}

if A = A† and B = B†.

In quantum mechanics the Jordan product is a symmetric product.
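A small numerical illustration of this (with random Hermitian matrices of my own choosing):

```python
import numpy as np

rng = np.random.default_rng(1)

def dagger(M):
    """Hermitian conjugate (transpose + complex conjugate)."""
    return M.conj().T

def jordan(A, B):
    """Jordan product {A,B} = (AB + BA)/2."""
    return (A @ B + B @ A) / 2

# two random self-adjoint matrices: X + X^dagger is always Hermitian
X = rng.standard_normal((4, 4)) + 1j*rng.standard_normal((4, 4))
Y = rng.standard_normal((4, 4)) + 1j*rng.standard_normal((4, 4))
A, B = X + dagger(X), Y + dagger(Y)

# the plain product is (generically) NOT self-adjoint...
print(np.allclose(A @ B, dagger(A @ B)))  # False
# ...but the Jordan product is
print(np.allclose(jordan(A, B), dagger(jordan(A, B))))  # True
```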

Both classical and quantum mechanics have a symmetric and a skew-symmetric product:

|                | CM              | QM             |
|----------------|-----------------|----------------|
| Symmetric      | f·g             | Jordan product |
| Skew-symmetric | Poisson bracket | Commutator     |

Both classical and quantum mechanics have dualities:

CM: duality between qs and ps: q <---> p
QM: duality between observables and generators:q <---> -i ħ ∂/∂q = p

So in this post we solved the simple direct problem: extracting a symmetric and a skew-symmetric product.

In subsequent posts we will show two important things:
1) we will derive the symmetric and skew-symmetric products of classical and quantum mechanics from composability
2) we will solve the inverse problem: derive classical and quantum mechanics from the two products.

In the meantime: HAPPY NEW YEAR!

## Solving Hilbert’s sixth problem (part two of many)

### Picking the physical principles

We can now try to pick essential physics principles. Suppose we play God and we need to select the building blocks of reality. To avoid infinite regression (who created God?) we need something which is timeless. Outside space and time, the only things which qualify are mathematical relationships. Euclidean geometry existed well before ancient Greeks, and E=mc^2 was valid before Einstein and before the solar system was formed. The names of the mathematical relationships are just historical accidents.

Fine, but the nature of mathematical relationships is very different than the nature of reality. Sticks and stones may break my bones, but when was the last time you heard that someone was killed by Pythagoras’ theorem? If reality is nothing but mathematical relationships arranged in a way to avoid contradictions, we need to look at the essential differences between mathematics and reality (http://arxiv.org/abs/1001.4586).

One key difference is that of “objective reality”. How can we quantify this? Objective reality means that any two observers can agree on statements about nature. In other words, one can define a universal (non-contextual) notion of truth. In mathematics truth is defined as a consequence of the axioms but in nature truth is defined as the agreement with experiment. Between two incompatible axiomatic systems there is no possible concept of true and false and the same statement can be true in one system, and false in another. Take for example the statement p=”two parallel lines do not intersect”. The same p is true in Euclidean geometry and false in non-Euclidean geometry.

If universal truth is to exist, it implies the possibility to reason consistently and to define probabilities. In a more mundane setting we demand positivity: it is what defines a bit. We are not specifying what kind of bit we are talking about: a classical bit, a quantum qubit, or a current probability density (zbit); only that objective reality (it) can generate information on which any two observers can agree.

There is another key difference between the abstract world of math and the concrete real world. In mathematics there is a disjoint collection of mathematical structures, and the job of a mathematician is to explore this landscape and find bridges between seemingly isolated areas. Nature, on the other hand, is uniform, and the laws of nature are the same everywhere (invariant). There are no island universes in our reality (even if the multiverse may exist, we cannot interact with other pockets of reality with different laws of physics). In mathematics two triangles can be combined to form something other than another triangle, but in nature the laws of physics for system A and the laws of physics for system B are the same as the laws of physics for the combined system A+B. For example, the Newtonian laws of motion for the Earth are the same as the Newtonian laws of motion for the Sun, and they do not change when we consider the Earth+Sun system. This may look trivial, but it is an extremely powerful observation, and from it we will derive three kinds of dynamics: classical mechanics, quantum mechanics, and another type of mechanics not present in nature (which we will show violates the positivity condition).

The second physical principle we consider is composition: the laws of nature are invariant under tensor composition.

So if you are God, your requirements for the job are: use timeless mathematical structures as your building blocks, do it in such a way that you create objective reality (the ability to define a context-independent notion of truth), and make sure that the laws of reality (physics) are invariant. If however you are a physicist wanting to solve Hilbert’s sixth problem, your starting physical principles are: positivity and composition. The idea that reality is made of nothing but mathematical structures is known as the “mathematical universe hypothesis”, but Tegmark’s proposal is carried out incorrectly: it looks at the similarity between mathematics and reality and proposes computability as the physical principle. The right way is to look at the differences, and this leads to composition and positivity. Composition (or composability, which is part of the name of this blog) was initially proposed and explored by Emile Grgin and Aage Petersen, while positivity (objective reality) as a physical principle was first proposed and explored by the author.

Next time I will start using composability (or the invariance of the laws of nature under tensor composition) to start deriving three (and only three) possible dynamics (two of which being classical and quantum mechanics) in the Hamiltonian formalism.

## Solving Hilbert’s sixth problem (part one of many)

### Outside in or inside out?

In 1900 David Hilbert proposed a set of problems to guide mathematics in the 20th century. Among them problem six asks for the axiomatization of physics.

Solving problem six is a huge task and the current consensus is that it is a pseudo-problem but I will attempt to prove otherwise in this and subsequent posts. I will also start formulating the beginning of the answer.

Let’s first try to get a feel for the magnitude of the problem. What does axiomatizing physics mean? Suppose the problem is solved and we have the solution on a piece of paper in front of us. Should we be able to answer any physics question without using experiments? Is the answer supposed to be a Theory of Everything? Let’s pause for a second and reflect on what we just stated: eliminate the need for experiments in physics!!! This is huge.

But what about Gödel’s incompleteness theorem? Because of it, mathematics is not axiomatizable and has an infinite landscape. Do the laws of physics have an infinite landscape too?

The biggest roadblock to solving Hilbert’s sixth problem turns out to be Gödel’s incompleteness theorem. Let’s get the gist of it. Start with an antinomy (any antinomy will do): this statement is false. If the statement is true, its content is accurate, but its content says that the statement is false. Contradiction. Likewise, if the statement is false, its negation is true, but the negation states that the statement is true. Again we have a contradiction. This was well known long before Gödel as the liar’s paradox. But now let’s follow Gödel and replace true and false with provable and unprovable. We get: this statement is unprovable. Suppose the statement is false. Then the statement is provable. Then there exists a proof of a false statement, and therefore the reasoning system is inconsistent. The only way to restore consistency is for the statement to be true. Hence we just constructed a true but unprovable statement!

Now take a sufficiently powerful axiomatic system S_n with axioms a_1, a_2, …, a_n (at the minimum S_n must include the arithmetic of the natural numbers). Construct a statement P not provable in this axiomatic system (Gödel does this using the diagonal argument). Then we can add P to a_1, …, a_n and construct the axiomatic system S_{n+1} = a_1, a_2, …, a_n, P. We can also construct another axiomatic system S’_{n+1} = a_1, a_2, …, a_n, not P. Both S_{n+1} and S’_{n+1} are consistent systems, but together they are incompatible (because P and not P cannot both be true at the same time). The process can be repeated forever, and hence in mathematics there is no “Theory of Everything Mathematical”, no unified axiomatic system, and mathematics has an infinite domain.

So it looks like the goal of axiomatizing physics is hopeless. Mathematics is infinite, and mathematicians seem able to keep exploring the mathematical landscape forever. Since mathematicians are part of nature too, axiomatizing physics seems to demand the axiomatization of mathematics as well. Case closed, Hilbert’s sixth problem must be a pseudo-problem, right?

However, it turns out there is another way to do axiomatization. Let’s start by looking at nature. We see that space-time is four dimensional, we see that nature is quantum at its core, we see that the Standard Model has definite gauge symmetries. Nature is written in the language of mathematics. But WHY are some mathematical structures preferred by Nature over others? We cannot say that those mathematical structures are unique; all mathematical structures are unique! We can say that some mathematical structures are distinguished.

Solving Hilbert’s sixth problem demands as a prerequisite finding a mechanism to distinguish a handful of mathematical structures from the infinite world of mathematics.

And in a well known case we know the answer. Consider the special theory of relativity: this is a theory based on a physical principle. Finding essential physical principles is what needs to be done first. Suppose we now have all nature’s physical principles written in front of us. What is the next step? The next step is to use them as filters to select distinguished mathematical structures. If we pick the principles correctly, the accepted mathematical structures will be those and only those which are distinguished by nature as well.

Whatever gets selected does not need to be a closed-form theory of everything, and so we bypass the limitation from Gödel’s incompleteness theorem. Now this program is actually very feasible. Next time I will show how to pick the physical principles; we’ll pick two principles, and in subsequent posts I’ll use those principles to derive quantum and classical mechanics step by step in a very rigorous mathematical way (it is rather lengthy to derive quantum mechanics and I don’t know how many posts I’ll need for it). It turns out that quantum and classical mechanics are also theories of nature based on physical principles, just like the theory of relativity. The role played by the constancy-of-the-speed-of-light postulate will be played in the new case by Bell’s theorem. In the process of deriving quantum mechanics we’ll make great progress towards solving Hilbert’s sixth problem, but we’ll still fall far short of a full “theory of everything”.

## Soliton Theory (part 2 of 2)

Besides the KdV equation covered last time, some well known soliton equations are:

-Nonlinear Schrodinger equation (NLSE):

i ∂t ψ = -½ ∂xx ψ + k |ψ|² ψ

-Sine-Gordon equation:

φtt – φxx + sin φ = 0

-Kadomtsev-Petviashvili (KP) equation:

∂x (∂t u + u ∂x u + 2ε² ∂xxx u) ± ∂yy u = 0

I am most familiar with the nonlinear Schrodinger equation and its variations. NLSE was proposed some 40 years ago for describing light propagation in optical fibers. People have used signal fires to transmit information since ancient times (or since Middle Earth :) ), but one needs guided light transmission to eliminate interference from the atmosphere (rain, fog, etc.). However, using ordinary glass is not practical because ordinary glass is not transparent enough. Imagine looking through a glass 30 miles thick!!! 30 miles is the typical span for an optical fiber because it takes dissipation 30 miles to reduce the intensity by half, and at that point signal amplification is required. Advances in glass manufacturing resulted in ultra-transparent optical fibers (at certain wavelengths) close to the theoretical transparency limit (this limit is due to Rayleigh scattering, which is also responsible for the blue color of the sky).

From Maxwell’s equations one can derive the NLSE. The anharmonic electron oscillation generates the nonlinear |ψ|²ψ term, the ∂xx ψ term corresponds to ordinary dispersion in the optical fiber, and a soliton is a “light bullet” which carries one bit. Using solitons, a single optical fiber can carry giga (10⁹) bits of information per second (the usual soliton pulse is measured in picoseconds (10⁻¹² s), but femtosecond pulses for shorter distances are possible too). A single telephone call requires 64 kbit/s, so a 1 Gbit/s optical line can carry 15,625 concurrent calls. In 1999 rates of 300 Gbit/s for a single fiber were commercially achieved, and an optical cable bundle has much more than a single optical fiber. Advances in long-distance transmission rates were so great that, despite Moore’s law, computers started to be viewed as the hopeless communication bottleneck.
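To see the soliton’s robustness in action, here is a minimal split-step Fourier sketch for the focusing case (k = -1) of the NLSE above, propagating the exact sech pulse; the grid and step sizes are arbitrary choices:

```python
import numpy as np

# focusing NLSE: i psi_t = -1/2 psi_xx - |psi|^2 psi  (k = -1)
# exact one-soliton solution: psi(x,t) = sech(x) exp(i t / 2)
N, L = 256, 40.0
dx = L / N
x = (np.arange(N) - N//2) * dx
k = 2*np.pi * np.fft.fftfreq(N, d=dx)

psi = 1/np.cosh(x)               # soliton profile at t = 0

dt, steps = 1e-3, 500
lin = np.exp(-0.5j * k**2 * dt)  # exact dispersive (linear) step in mode space
for _ in range(steps):
    psi = np.fft.ifft(lin * np.fft.fft(psi))       # dispersion
    psi = psi * np.exp(1j * np.abs(psi)**2 * dt)   # nonlinear phase rotation

print(np.max(np.abs(psi)))       # stays close to 1: the pulse keeps its shape
```

The dispersion alone would spread the pulse; the nonlinearity alone would steepen it. The soliton is the balance of the two, and the peak amplitude survives the propagation.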

For all their potential, optical solitons never materialized in practice, and there is a funny disconnect between academic research and industry which I have encountered several times. For example, after I graduated I had a job interview at Bell Labs hoping to amaze them with my NLSE research, but using low-intensity traditional pulses they had already achieved in practice an order of magnitude higher transmission rate than what had been demonstrated with solitons at that time (based on publications in experimental journals). Solitons are not very practical due to arrival-time instabilities and the need to use expensive active pulse repeaters every 30 miles instead of passive amplification (imagine an active repeater malfunctioning at the bottom of the Pacific Ocean in an undersea cable).

Another academia-industry disconnect I encountered was the one on training a neural network for OCR (optical character recognition). Just like soliton theory generated thousands of research papers but the industry never adopted it, in neural networks there is an extensively studied “back propagation method”. In practice this is all useless and the industry has much more effective techniques for training neural networks, but they are all trade secrets. With the industry knowledge, reading the back propagation papers made for a good laugh.

But solitons are a nice topic in themselves and they do have very interesting mathematical properties.

In practice there are two standard techniques for computing soliton solutions: the Lax pair and the zero curvature condition. The Lax pair is linked with a Sturm-Liouville equation, and solitonic equations have an infinite number of conservation laws. Discovering a lot of conserved quantities for a new equation is a sure clue that the equation admits soliton solutions.

The KdV equation can be expressed as a Hamiltonian system in two distinct ways. Such a system is called bi-Hamiltonian, and the interplay between the two Hamiltonians generates an infinite number of conservation laws.

Another way one can solve the solitonic equations is by the Riemann-Hilbert problem. Through this problem there is an unexpected link between solitons and renormalization in field theory.

In an initial value problem for solitonic equations, part of the initial condition excites dispersive waves, and (if the energy is large enough) part excites the solitonic pulses. To solve the Riemann-Hilbert problem, given a closed curve on the Riemann sphere (the complex plane + infinity), one has to marry two functions on each side of the curve given some initial value on the curve. In soliton land, the initial value on the curve corresponds to the dispersive waves, and solitons correspond to poles on the Riemann sphere. This is why solitons are robust: once there, the poles cannot be eliminated (in the absence of friction or additional effects which destroy the infinite number of conservation laws).

An excellent review on the Riemann-Hilbert problem and solitons can be found here. The gist of the Riemann-Hilbert problem is: reconstruct an analytic function from its singularities. As Alexander Its points out, in its most general way, integrability means that local properties (singularities) determine global behavior.

## Soliton theory (part 1 of 2)

In the last post I listed the amazing lectures of Mr. Bender on perturbation theory. If you managed to follow the lectures to the end, you got to see WKB perturbation theory in action. The lectures ended with extracting a “beyond all orders” behavior.

After watching a powerful movie, don’t you want sometimes to have a different ending? For example the movie Inception:

If you have not watched it, I won’t spoil it by saying more, but if you did, you understand what I mean.

So I cannot help it, and I’ll attempt to give an alternative ending to Mr. Bender’s lectures by venturing into the wonderful area of soliton theory.

In lessons 13, 14 and 15 you got to see how to solve a potential well in Schrodinger’s equation using WKB. At each turning point there is a reflected wave, and one may ask: are there potential wells for which there is no reflection? What would a reflectionless potential look like? This is the starting point of the so-called Inverse Scattering Theory.

But let’s start in historical fashion.

In 1832 a certain gentleman, Mr. John Scott Russell, got amazed by a peculiar wave along an English canal:

“I was observing the motion of a boat which was rapidly drawn along a narrow channel by a pair of horses, when the boat suddenly stopped – not so the mass of water in the channel which it had put in motion; it accumulated round the prow of the vessel in a state of great agitation, then suddenly leaving it behind, rolled forward with great velocity, assuming the form of a large solitary elevation, a rounded, smooth and well-defined heap of water, which continued its course along the channel apparently without change of form or diminution of speed. I followed it on horseback, and overtook it still rolling on at a rate of some eight or nine miles an hour, preserving its original figure some thirty feet long and a foot to a foot and a half in height. Its height gradually diminished, and after a chase of one or two miles I lost it in the windings of the channel.”

It was not until 1895 that Diederik Korteweg and Gustav de Vries derived the equation of this wave:

U_t + 6 U U_x + U_xxx = 0

(now called the KdV equation)

Because the equation contains the second power of U (in the term 6 U U_x), it is a nonlinear equation, and in particular a nonlinear partial differential equation: an ugly beast of intractable complexity which nobody knew how to solve.
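Although nobody knew how to solve it at the time, one can nowadays verify symbolically that the famous sech² pulse solves the KdV equation; a small sympy sketch (κ is a free amplitude parameter):

```python
import sympy as sp

x, t = sp.symbols('x t')
kappa = sp.symbols('kappa', positive=True)

# one-soliton candidate: amplitude 2*kappa^2, speed 4*kappa^2
u = 2*kappa**2 / sp.cosh(kappa*(x - 4*kappa**2*t))**2

residual = sp.diff(u, t) + 6*u*sp.diff(u, x) + sp.diff(u, x, 3)
residual = sp.simplify(residual.rewrite(sp.exp))
print(residual)  # 0: the pulse solves the KdV equation exactly
```

Taller solitons (bigger κ) are also faster and narrower, which is why two of them can overtake each other in a collision.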

Fast forward to the mid-1950s: Fermi, Pasta, and Ulam were doing computer modeling for a certain problem and noticed some odd periodic behavior.

Further investigating this in 1965, Zabusky and Kruskal observed the kind of solitary waves Mr. Russell witnessed. Those solitary waves or pulses were able to pass through one another with no perturbation whatsoever (and hence the term solitons) which was a very bizarre behavior.

The mathematical breakthrough occurred in 1967, when Gardner, Greene, Kruskal and Miura discovered the inverse scattering technique for solving the KdV equation.

But what is inverse scattering? Let us start simpler with linear partial differential equations. How do we solve the initial value problem? The standard technique is that of a Fourier transform.

In a Fourier transform we multiply a function f(x) with exp(ikx) and then integrate over all x. What results is a function F(k). Fine, but what does this have to do with partial differential equations?

Let us take the Fourier transform of the linear partial differential equation. Wherever we have a partial derivative, we integrate by parts and transfer the derivative to the exp(ikx) term. In turn this pulls an ik factor out of the exponential, and under the integral sign we have transformed the linear partial differential equation into a polynomial equation in k. Wow (applause please)!!!

Solving polynomial equations is MUCH easier than solving partial differential equations.

So the general technique is the following: extract the (Fourier) modes, solve the easy time-evolution problem for the (Fourier) modes, and perform an inverse (Fourier) transform to obtain the solution at a later time:

F(k, t=0) ---solve the easy time evolution (polynomial in k)---> F(k, t=T)
    ^                                                                |
    |                                                                |
    | Fourier transform                      inverse Fourier transform
    |                                                                |
    |                                                                v
f(x, t=0)                                                        f(x, t=T)
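As a concrete illustration of the scheme, here is a sketch (with arbitrarily chosen grid and wavenumber) solving the linear part of the KdV equation, u_t + u_xxx = 0, by exactly this mode-space method:

```python
import numpy as np

# u_t + u_xxx = 0: in mode space each Fourier mode obeys d/dt u_hat = i k^3 u_hat,
# so it just picks up a phase exp(i k^3 T): no coupling between modes.
N = 128
x = np.linspace(0, 2*np.pi, N, endpoint=False)
k = np.fft.fftfreq(N, d=1.0/N)       # integer wavenumbers on a 2*pi-periodic domain

k0, T = 3, 0.7
u0 = np.cos(k0 * x)                  # a single mode as initial condition

u_hat = np.fft.fft(u0) * np.exp(1j * k**3 * T)   # trivial evolution in mode space
u = np.real(np.fft.ifft(u_hat))                  # transform back

exact = np.cos(k0*x + k0**3 * T)     # the mode moves with phase speed -k0^2
print(np.max(np.abs(u - exact)))     # agreement to round-off
```

Because the phase speed depends on k, a general initial profile disperses; this is exactly the dispersion the soliton’s nonlinearity will later balance.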

Something similar happens for nonlinear partial differential equations, where the role of the Fourier transform is played by solving a scattering problem for the Schrodinger equation. The scattering potential in the Schrodinger equation is the solution of the nonlinear partial differential equation. The typical solitonic solution is of the form 1/cosh. Solving nonlinear partial differential equations takes this form:

scattering data(t=0) ---simple (linear) time evolution---> scattering data(t=T)
    ^                                                          |
    |                                                          |
    | direct scattering problem       inverse scattering problem
    |                                                          |
    |                                                          v
f(x, t=0)                                                  f(x, t=T)

Solving the direct scattering problem is straightforward, and here one could use WKB for example, though in practice this is not how it is done. Using WKB to derive the reflectionless property (this is why solitons pass through each other unperturbed) would yield the 1/cosh solution.

The tricky part is the inverse scattering, done by solving the Gelfand-Levitan-Marchenko equation. This is where the computation becomes intensive. The technique makes use of Jost functions to define the boundary conditions, but those are details.

Soliton theory is a very nice and rich area with unexpected links across mathematics and physics. Next time I’ll present some famous solitonic equations, the unexpected link with renormalization theory, and the real-world potential usefulness of solitons.

## Mathematical rigor in theoretical physics

We have seen in the last post that mathematical sloppiness can easily lead you astray. But is this always the case?

This is not a new problem. John von Neumann sorted out the mathematical foundation of quantum mechanics in Mathematical Foundations of Quantum Mechanics. The first time I read this book I found it incredibly boring: this is what you learn in school. Then I learned to appreciate its sheer brilliance. The reason it looks so boring is because it was so good it became the standard. At the same time, competing with von Neumann was Dirac who introduced the well known “Dirac functions” – an invaluable tool for any quantum mechanics computation. Here is what von Neumann had to say about it:

“The method of Dirac [...] in no way satisfies the requirements of mathematical rigor – not even if these are reduced in a natural and proper fashion to the extent common elsewhere in theoretical physics” – OUCH!!!

Now I am not a historian and I don’t know the year von Neumann wrote the book (I only have the year of the English translation), but it was probably in the 1930s, well before the theory of distributions put Dirac’s delta function on a solid mathematical foundation.

Fast forward to the present: I have found a series of outstanding lectures by Carl Bender which shook me to the core regarding what it means to be a theoretical physicist. Towards the end of the series the fog clears, and I came back to my original beliefs about mathematical rigor along the lines of von Neumann, but Mr. Bender managed to give me a scare with some mathematical voodoo.

To give you a taste of the lectures, let me ask a question from Lecture 4: How much is:

1 - 1 + 1 - 1 + … = ?

This is stupid, you may say: it is clearly a divergent series. Worse, you can make it converge to any number. You pick one, say 26: collect 26 of the +1 terms up front and then cancel the rest of the series in pairs. Does Hilbert’s hotel ring a bell?

1 + 0 – 1 + 1 + 0 – 1 + 1 + 0 – 1 + … = ?

Would it surprise you if I could prove that this gives a different answer than the first series? And all that we have added is an infinite number of zeros!!!

Let’s proceed.

First we can introduce the Euler summation machine which takes a divergent series and spits out a number E:

So let our sum Sum(a_n) be divergent. Construct the following function:

f(x) = Sum(a_n x^n) for |x| < 1, where the sum converges

Define E=lim_{x->1} f(x)

Let’s apply it to: 1 – 1 + 1 – 1 + 1 – 1 …

f(x) = 1-x+x^2-x^3+… = 1/(1+x)

Therefore E = 1/2

Can we make other machines in this spirit?

Yes, and here is another one, the Borel summation:

Again Sum (a_n) is not convergent.

We know that: Integral_0^∞ dt e^(-t) t^n = n!, which means that

1 = Integral_0^∞ dt e^(-t) t^n / n!

Replace Sum(a_n) -> Sum(a_n)·1 = Sum(a_n Integral_0^∞ dt e^(-t) t^n / n!)

Then flip the sum with the integral:

B = Integral_0^∞ dt e^(-t) Sum(a_n t^n / n!)
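For the alternating series a_n = (-1)^n the Borel machine can be run symbolically: the inner sum is e^(-t), and the integral gives 1/2. A sympy sketch:

```python
import sympy as sp

t = sp.symbols('t', positive=True)
n = sp.symbols('n', integer=True, nonnegative=True)

# Borel transform of a_n = (-1)^n:  Sum (-1)^n t^n / n! = exp(-t)
borel_transform = sp.summation((-1)**n * t**n / sp.factorial(n), (n, 0, sp.oo))
print(borel_transform)   # exp(-t)

# B = Integral_0^oo dt e^(-t) * exp(-t)
B = sp.integrate(sp.exp(-t) * borel_transform, (t, 0, sp.oo))
print(B)                 # 1/2
```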

Does E = B? Yes, and here is why:

E and B are machines obeying two rules:

Rule 1: summation property
S(a0 + a1 + a2+ …) = a0 + S(a1 + a2+ …)

Rule 2: linearity
S( Sum(alpha a_n + beta b_n)) = alpha S( Sum(a_n)) + beta S( Sum(b_n))

Let’s apply it to our two divergent series:

sum(1 -1 + 1 -1 + …) = S

S=1+ sum(-1 + 1 -1 + …) (by Rule 1)
S = 1 – sum(1 - 1 + 1 - 1 + …) (by Rule 2)
S = 1-S
2*S= 1
S= 1/2 BINGO!

Now the second series
S =         sum( 1 + 0 - 1 + 1 + 0 - 1 + 1 + …)
S = 1 +     sum( 0 - 1 + 1 + 0 - 1 + 1 + 0 + …) (by Rule 1)
S = 1 + 0 + sum(-1 + 1 + 0 - 1 + 1 + 0 - 1 + …) (by Rule 1 again)
3S = 1 + 1 + 0 + nothing (adding the three lines: the shifted series cancel term by term, without commuting the order of the terms)
S = 2/3

Let’s double check with Euler:
f(x) = 1 - x^2 + x^3 - x^5 + x^6 - x^8 + …
= (1 + x^3 + x^6 + …) - (x^2 + x^5 + x^8 + …)
= 1/(1-x^3) - x^2/(1-x^3) = (1-x^2)/(1-x^3)
lim_{x->1} f(x) = 2/3
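Both Euler sums can also be checked numerically by evaluating the series at x slightly below 1 (a small sketch; the number of terms is an arbitrary choice):

```python
import numpy as np

def euler_value(pattern, x, n_terms=100_000):
    """Evaluate Sum a_n x^n with periodically repeating coefficients a_n."""
    a = np.resize(pattern, n_terms)   # tile the coefficient pattern
    return float(np.sum(a * x**np.arange(n_terms)))

x = 0.999
print(euler_value([1, -1], x))     # close to 1/2  for 1 - 1 + 1 - 1 + ...
print(euler_value([1, 0, -1], x))  # close to 2/3  for 1 + 0 - 1 + 1 + 0 - 1 + ...
```

Pushing x closer to 1 (with correspondingly more terms) drives the two values toward 1/2 and 2/3 respectively.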

Mr. Bender is also making provocative (but true) statements like:

“If you are given a series and you have to add it up the dumbest thing that you can possibly do is add it up […] and if the series diverges it’s not only a stupid idea, it doesn't work.”

Here is the complete series of lectures (1 through 15) on YouTube.

Enjoy!

## Holonomy in quantum mechanics

### Bohm-Aharonov effect

Let’s start with the definition of holonomy: if you walk in a closed loop and the object you carry changes when you complete the loop, then you have experienced a holonomy.

Now this sounds plain crazy, so a simple example can illustrate it. Suppose you are a hunter living on the Equator and you go on a quest to explore the Earth. You walk a quarter of the Earth’s circumference along the Equator going east, you travel north all the way to the North Pole, and then you go straight south back to the starting point of your journey. During your journey you carry your spear with you, always making sure it keeps pointing in the same direction. For definiteness’ sake, let’s say that originally your spear was pointing towards the North Pole. While you walk along the Equator and then towards the North Pole your spear keeps pointing north. However, on the last leg of the journey, your spear will be pointing west. So upon your arrival the spear has a different orientation, even though you always carried it parallel to itself. This is the result of the Earth’s curvature.
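The hunter’s trip can be simulated: discretize the loop and parallel-transport the vector by repeatedly projecting it onto the new tangent plane (a numerical sketch, with step counts chosen arbitrarily):

```python
import numpy as np

def transport(path, v):
    """Parallel-transport v along a discretized path on the unit sphere by
    projecting out the normal component at each step and restoring the length."""
    for point in path:
        v = v - np.dot(v, point) * point
        v = v / np.linalg.norm(v)
    return v

def quarter_arc(a, b, n=2000):
    """Quarter great-circle arc from a to b (orthogonal unit vectors)."""
    th = np.linspace(0.0, np.pi/2, n)
    return np.cos(th)[:, None]*a + np.sin(th)[:, None]*b

start = np.array([1.0, 0.0, 0.0])   # a point on the Equator
east  = np.array([0.0, 1.0, 0.0])   # a quarter circumference to the east
pole  = np.array([0.0, 0.0, 1.0])   # the North Pole

# quarter Equator going east, up a meridian to the pole, down a meridian back home
loop = np.vstack([quarter_arc(start, east),
                  quarter_arc(east, pole),
                  quarter_arc(pole, start)])

spear = np.array([0.0, 0.0, 1.0])   # initially pointing north
spear = transport(loop, spear)
print(np.round(spear, 3))           # approximately [0, -1, 0]: pointing west
```

The 90-degree rotation equals the solid angle enclosed by the loop (one octant of the sphere, π/2), which is the holonomy of the journey.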

Now something very similar happens in Einstein’s general relativity: the presence of mass curves space-time, and although you travel along geodesics (the straightest possible lines), nearby geodesics do not stay parallel. We feel this lack of parallelism as gravity.

Fine, we understand this, but what does holonomy have to do with quantum mechanics? Suppose I have a box with a quantum device which, when I press a button, flashes either a red light or a blue light. Suppose that every time I press the button only the red light flashes. Now I go on a similar quest around a closed loop in space, and when I press the button only the blue light flashes. This happens every time my loop encircles a region of magnetic field, even though I cannot detect any magnetic field anywhere on my path.

Now this is downright freaky: there are no forces whatsoever along my path, and still there is a measurable effect. Welcome to the wonderful Bohm-Aharonov effect.

Mathematicians usually refer to geometric phases and call this effect a topological one. But surprisingly, it has a nice mathematical explanation in terms of boundary conditions and domains in standard quantum mechanics. When learning quantum mechanics, pesky boundary conditions and domains tend to be ignored as pedantic crossing of t’s and dotting of i’s. This is the typical cavalier physicist’s attitude towards mathematical rigor. Don’t believe me? Ask any physicist to tell you the difference between hermitean and self-adjoint. If one in one hundred knows the difference you are lucky.

The point is that self-adjointness demands that the domain of the operator and the domain of its hermitean counterpart (complex conjugated and transposed) be identical!!!
A hermitean operator may even have an infinity of different self-adjoint extensions, and their eigenvalues (the observed values) are all distinct.
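A concrete numerical illustration of this (a sketch of my own making): the self-adjoint extensions of the momentum operator -i d/dx on an interval are labeled by a boundary phase theta, and each choice of theta produces a different spectrum:

```python
import numpy as np

def momentum_spectrum(theta, N=200):
    """Spectrum of the finite-difference momentum operator -i d/dx on [0,1]
    with the twisted boundary condition psi(1) = exp(i*theta) * psi(0)."""
    h = 1.0 / N
    P = np.zeros((N, N), dtype=complex)
    for j in range(N):
        P[j, (j + 1) % N] += -1j / (2 * h)   # central difference
        P[j, (j - 1) % N] += 1j / (2 * h)
    # the boundary phase theta labels the self-adjoint extension
    P[N - 1, 0] *= np.exp(1j * theta)
    P[0, N - 1] *= np.exp(-1j * theta)
    return np.linalg.eigvalsh(P)

# each extension has a different spectrum: eigenvalues near 2*pi*n + theta
for theta in (0.0, np.pi / 2, np.pi):
    eigs = momentum_spectrum(theta)
    print(theta, eigs[np.argmin(np.abs(eigs))])
```

Same formal differential operator, different domains, measurably different eigenvalues; this is exactly the mechanism at work in the Bohm-Aharonov setup, where the magnetic flux fixes the phase.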

Here is an example of how sloppiness can get you into trouble: http://www.mth.kcl.ac.uk/~streater/lostcauses.html#VIII

Now for the Bohm-Aharonov effect, it is the boundary condition which selects the eigenvalues. And this boundary condition comes from the magnetic flux carried by the solenoid.

An excellent review of those topics can be found in Asher Peres’ classic “Quantum Theory: Concepts and Methods”, and I hope I have whetted your appetite to read this outstanding book.

Now, although true, changes without forces bother a lot of people. Last year a proposal was made to restore the role of forces in the Bohm-Aharonov effect: http://arxiv.org/abs/1110.6169 The paper has a clever idea: the apparatus, and not the particle, feels the force. But it is fundamentally flawed, because any force would give away the “which way” information, and that would destroy the interference pattern. Holonomy is a fundamental property of Nature and cannot be explained away.

## What is the number system of quantum mechanics?

### Gauge theory and quantions

Let me start with the answer to the problem I posted last time. The hard part in finding the projectors in quantum mechanics when the number system is non-commutative is realizing the proper place to put the scalars:

|psi>lambda<psi|

with lambda*lambda = lambda. If the number system is a division algebra (as in the case of quaternionic quantum mechanics), by dividing by lambda it follows that lambda = 1. However, this is not the case for quantionic quantum mechanics. Expressing lambda as a linear combination of the Pauli matrices and the identity operator over the complex numbers:

lambda = a0 I + a1 Sigma1 +a2 Sigma2 + a3 Sigma3

and using the algebraic properties of the Pauli matrices, from lambda*lambda = lambda one gets:

a0 = ½
a1^2 + a2^2 + a3^2 = ¼

which is the equation of a sphere. A sphere can be parameterized by two numbers: latitude and longitude. However, in this case the latitude and longitude are complex numbers, and the parametrization needs 2×2 = 4 real numbers. This is where the overall coefficient of 4 comes from.
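As a quick sanity check (my own sketch, variable names arbitrary): expanding lambda^2 = lambda with the Pauli algebra forces a0 = 1/2 and a1^2 + a2^2 + a3^2 = a0 - a0^2 = 1/4, and any complex coefficients satisfying this produce an idempotent lambda:

```python
import numpy as np

I2 = np.eye(2)
sigma = [np.array([[0, 1], [1, 0]], dtype=complex),
         np.array([[0, -1j], [1j, 0]]),
         np.array([[1, 0], [0, -1]], dtype=complex)]

# pick a1, a2 freely (complex!) and solve a1^2 + a2^2 + a3^2 = 1/4 for a3
a1, a2 = 0.3 + 0.1j, -0.2 + 0.05j
a3 = np.sqrt(0.25 - a1**2 - a2**2 + 0j)

lam = 0.5 * I2 + a1 * sigma[0] + a2 * sigma[1] + a3 * sigma[2]
print(np.allclose(lam @ lam, lam))  # True: lambda is idempotent
```

Note that the constraint involves squares, not squared moduli, which is why complex coefficients are allowed.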

Coming back to gauge theory, the gauge degrees of freedom leave every observable of the physical system invariant. So why bother with them, one may ask? Because they have very big physical consequences.

Let’s explain it using a nice analogy from everyday life. Think of a thick rug.

At each point in the rug there is a piece of fiber sticking out. Suppose somebody drags something across the rug (it could be a toy truck, for example) and leaves a mark on it. How can we describe this local disturbance in the rug’s fibers? If we look very closely we notice displacements between nearby fibers. Mathematicians call this a “connection”, which allows one to quantify correctly the notion of change (covariant derivative) and the notion of moving from one place to another (parallel transport). What this means is that if the rug is rolled up or twisted in some way, we need to add the local disturbance to the global rug twist to predict correctly the location of each fiber.

Fine, it is not hard to grasp those mathematical concepts, but what does this have to do with physics?

The remarkable fact is that what mathematicians call connections, physicists call potentials (like the electromagnetic potential).

It took some time to recognize the gauge theory mathematical structure in the electromagnetic field, but if you recall from Maxwell’s equations, there is an electromagnetic four-potential Aµ defined up to a gauge, and the electromagnetic tensor is Fµν = ∂µ Aν - ∂ν Aµ. The covariant derivative in this case is Dµ = ∂µ - i Aµ.
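The gauge invariance of Fµν can be verified directly: under Aµ -> Aµ + ∂µχ the tensor is unchanged, because mixed partial derivatives commute. A small numerical sketch (the potentials and the gauge function below are arbitrary illustrative choices of mine):

```python
import numpy as np

# arbitrary smooth four-potential and gauge function (illustrative choices)
def A(mu, X):
    t, x, y, z = X
    return [np.sin(x * y), t * z, np.cos(t) + x**2, x * y * z][mu]

def chi(X):
    t, x, y, z = X
    return np.exp(0.1 * t) * np.sin(x + 2 * y - z)

def d(f, mu, X, h=1e-4):
    """Central-difference partial derivative of f with respect to X[mu]."""
    Xp, Xm = list(X), list(X)
    Xp[mu] += h
    Xm[mu] -= h
    return (f(Xp) - f(Xm)) / (2 * h)

def F(pot, mu, nu, X):
    return d(lambda Y: pot(nu, Y), mu, X) - d(lambda Y: pot(mu, Y), nu, X)

def A_gauged(mu, X):            # the gauge-transformed potential A + d(chi)
    return A(mu, X) + d(chi, mu, X)

X0 = [0.3, 1.1, -0.4, 0.8]
diffs = [abs(F(A_gauged, m, n, X0) - F(A, m, n, X0))
         for m in range(4) for n in range(4)]
print(max(diffs))  # ~ 0 (numerical noise): F is gauge invariant
```

Whatever gauge function χ you pick, every component of F comes out the same.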

Grgin’s book has a nice parallel between the gauge theory of electromagnetism, gravity, and quantionic quantum mechanics (electroweak gauge theory).

But where does the gauge degree of freedom come from? It comes from the inner product in quantum mechanics over an arbitrary number system. Basically, it is the exponent part in a generalized polar decomposition of the number system, similar to the polar form of complex numbers.

So here is how it works: at each point of space-time we attach a “fiber” in the form of a quantion. Where ordinary quantum mechanics has functions of complex variables, here we have functions of quantionic variables. A key difference is in normalization. In a quantionic quantum field, by the Zovko interpretation, we demand the conservation of a quantionic current stemming from the inner product q^* q. Everything else follows from this.

In gauge theories, the “marks of the toy truck on the rug” are actually particles (lines in a Feynman diagram). The forces are generated by gradients of the potentials, and the potentials are the “connections” allowing “parallel transport”, that is, the means to compare nearby points in space-time. In the Standard Model there are three gauge symmetries, U(1), SU(2), SU(3), corresponding to three fundamental forces: electromagnetism, the weak force, and the strong force. Grand unification theories (GUTs) seek to find a common “underlying rug”. A general feature of GUTs is cross-talk between related “fibers”, which means that particles are not stable; in particular the proton is not stable and eventually decays into leptons.

When a quantion is expressed as a 4x4 matrix, the null entries (8 of them) are used by the electroweak potentials. The covariant derivative is a right quantion, and the commutativity between left quantions and right quantions assures the Leibniz identity needed to turn a right quantion into a derivation.

In quantionic quantum mechanics there is a notion of curvature and holonomy as well.

Quantionic quantum mechanics has this dual interpretation, as ordinary quantum mechanics or as a gauge theory. As ordinary quantum mechanics, the distinct number system predicts a new physical phenomenon not present in complex quantum mechanics: the zitter effect.

Can ordinary complex quantum mechanics predict holonomy effects too? The answer is yes and we will cover it next time.

## What is the number system of quantum mechanics?

### Inherently relativistic quantum mechanics

It was a cold January morning in 2007, during rush hour in the L’Enfant Plaza metro station in Washington DC, when Joshua Bell, one of the best violin virtuosos in the world, played some of the best classical pieces on a Stradivarius violin for about an hour. Do you think his masterful performance drew a crowd? The lottery kiosk nearby was attracting much more attention.

In that same year, 2007, Emile Grgin published his book Structural Unification of Quantum Mechanics and Relativity. Still to this day people buy into the “lottery” idea that quantum mechanics is solely about information and Born’s rule.

But maybe Grgin’s results were “crackpot”. That was the reaction of the arXiv when http://arxiv.org/abs/1204.3562 was reclassified from the quantum section to the general section dedicated to “laymen’s fantasies”.

So let’s prove quantionic quantum mechanics is the real deal and not some crazy idea. First, quantions are the simplest non-trivial type I von Neumann algebra. All linear associative algebras have matrix and vector representations which come in pairs: a left matrix algebra acting on a column vector, and a right matrix algebra acting on a row vector.

For example, a complex number z = a + ib can be represented as the left matrix and column vector pair:

a  -b
b   a

and

a
b

or as the right matrix and row vector pair:

a   b
-b  a

and

a  b
It is a simple theorem that for linear associative algebras the left and right matrix representations commute. For quantions, the left-matrix and column-vector representations are:

q1   q3   0    0
q2   q4   0    0
0     0    q1  q3
0     0    q2  q4
and

q1
q2
q3
q4

and the right and row representations are:

q1   0    q3   0
0    q1    0    q3
q2   0    q4   0
0    q2    0   q4

and

q1 q2 q3 q4

Both representations play a major role in the physics. For now, note that since in the 4x4 matrix algebra representation the left quantions L and the right quantions R have non-overlapping parts, the von Neumann double commutant theorem (http://en.wikipedia.org/wiki/Von_Neumann_bicommutant_theorem) holds: L'' = L. In turn, this puts the corresponding quantionic Hilbert module theory on a solid foundation.
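The commutation of the two representations is easy to verify: in Kronecker-product form the left representation above is I ⊗ Q and the right one is Q ⊗ I, and such factors always commute. A numerical sketch (the helper names are mine):

```python
import numpy as np

def left_rep(q):
    """4x4 left representation from the post: block-diagonal copies of Q."""
    Q = np.array([[q[0], q[2]], [q[1], q[3]]])   # q1 q3 / q2 q4
    return np.kron(np.eye(2), Q)

def right_rep(q):
    """4x4 right representation: the same Q, with the Kronecker order swapped."""
    Q = np.array([[q[0], q[2]], [q[1], q[3]]])
    return np.kron(Q, np.eye(2))

rng = np.random.default_rng(0)
q = rng.normal(size=4) + 1j * rng.normal(size=4)
p = rng.normal(size=4) + 1j * rng.normal(size=4)

# kron(I, Q) and kron(P, I) commute for any Q and P
print(np.allclose(left_rep(q) @ right_rep(p), right_rep(p) @ left_rep(q)))  # True
```

The non-overlapping sparsity patterns of the two 4x4 matrices are exactly the block structures the Kronecker products produce.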

To bring the discussion to more familiar ground, here is the 1-to-1 mapping between quantions and spinors:

q1                      -Psi2
q2   <---> sqrt(2)       Psi3^*
q3                       Psi1
q4                       Psi4^*

Psi1                      q3
Psi2   <---> 1/sqrt(2)   -q1
Psi3                      q2^*
Psi4                      q4^*
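A quick consistency check (my own sketch) that the two maps above really are inverses of each other:

```python
import numpy as np

def spinor_to_quantion(psi):
    p1, p2, p3, p4 = psi
    return np.sqrt(2) * np.array([-p2, np.conj(p3), p1, np.conj(p4)])

def quantion_to_spinor(q):
    q1, q2, q3, q4 = q
    return np.array([q3, -q1, np.conj(q2), np.conj(q4)]) / np.sqrt(2)

rng = np.random.default_rng(2)
psi = rng.normal(size=4) + 1j * rng.normal(size=4)
q = spinor_to_quantion(psi)

print(np.allclose(quantion_to_spinor(q), psi))  # True: the maps are inverses
```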

Dirac’s current

j^mu = Psi^{dagger} gamma^0 gamma^mu Psi

is the same as:

j^mu = (q^* q)^mu

Dirac’s equation in quantionic formalism reads:

D |q) = i m Gamma^1 |q^* )

with D a right derivation quantion.

The Klein-Gordon equation in quantionic formalism reads:

(D’Alembert + m^2) |q) = 0

with |q) the column quantion.

Dirac went from the Klein-Gordon equation to a linear equation by taking the square root of the d’Alembertian:

(gamma^mu partial_mu) * (gamma^mu partial_mu) =   D’Alembert

Quantions offer another decomposition of the d’Alembert operator:

D^sharp D = D’Alembert

with D a right derivation quantion and the sharp operation the parity-reversal transformation P of a quantion (quantions have the discrete CPT = I symmetry).

Quantionic time evolution is best understood in the gauge theory sense, but to wrap up the standard quantum mechanics description, recall this from Hardy’s paper:

|mn> = 1/sqrt(2) (|m> + |n>)
|MN> = 1/sqrt(2) (|m> + i|n>)

and that there were

½ N(N-1) projectors of the form |mn><mn|

Because quantionic quantum mechanics has a non-commutative number system, the most general projectors here are of the form:

|mn>P<mn|
|MN>P<MN|

with P^2 = P and P a unit quantion. We know that p+ = (1+sigma)/2 and p- = (1-sigma)/2 are spin projection operators, where sigma is any one of the Pauli matrices.

From here it is not hard to show that there are 4 linearly independent unit quantionic projectors P (one for the identity and one for each of the 3 Pauli matrices) corresponding to the spin degree of freedom, or to the 4 spinor components. Hardy’s formula then reads:

K = 4N^2 for N quantions

An instructive exercise for the reader is to investigate why quaternionic quantum mechanics does not have projectors of this type (I’ll give the answer in the next post).

Next time I’ll finish the quantionic quantum mechanics presentations from the gauge theory point of view.

## What is the number system of quantum mechanics?

### Quantionic quantum mechanics

So far we have seen that quantum mechanics can be expressed over real numbers, complex numbers, and quaternions. Physically, quantum mechanics over the reals and quaternions does not lead to new predictions. Also, this short list implies that quantum mechanics is about the Born rule and information. We shall see that this is not the complete story.

Quantionic quantum mechanics is a direct counterexample to arbitrarily restricting the number system to division algebras. Quantionic quantum mechanics was discovered by Emile Grgin (or Gergin) – I am proud to state that my “Grgin number” is one. Working in Peter Bergmann’s group, Grgin joined forces with Aage Petersen, Bohr’s personal assistant, and investigated a line of thought from Bohr about the correspondence principle. This resulted in the composability principle and a dual quantum-classical mechanics framework (pre C* algebra) better than Segal’s quantum algebraic formalism. This work happened in the early 70s and was forgotten after Grgin left academia to work in industry. Recently Grgin retired and restarted working in this area, resulting in quantionic quantum mechanics – a towering achievement which unfortunately is not well known.

Probably the best way to think of quantionic quantum mechanics is in terms of Darwin’s evolution: it is a “missing link” between regular quantum mechanics and gauge theory. When talking about quantions, one can take either the point of view of standard quantum mechanics or the point of view of field theory. Its physical content is identical with Dirac’s theory of the electron, but its formalism is most illuminating.

So let’s start the story from the trusted Born rule. It implies that quantum mechanics is only about information. But is it? All experiments are done in space-time, and events require 4 coordinate numbers (x, y, z, t) to be located. In turn, Lorentz transformations and special relativity teach us about 4-vectors, so in a relativistic quantum mechanics it is natural to think not of probabilities but of probability currents. The starting point of quantionic quantum mechanics is a generalization of Born’s rule to probability currents (called the “Zovko interpretation”, after the person who discovered it).

Complex numbers have two norms; let’s call them A for algebraic and M for metric. In the matrix representation, a complex number z = a + ib corresponds to a 2x2 matrix:

a   b
-b  a

The algebraic norm A is defined as A(z) = z^{dagger} z:

a  -b     a  b                   1  0
b   a    -b  a   = (a^2 + b^2)   0  1

The metric norm is the determinant: M(z) = det(z) = a^2 + b^2

The value is the same, but the meaning is very different. Quantionic quantum mechanics aims to lift this degeneracy and have two fully distinct norms.
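Before lifting the degeneracy, we can exhibit it concretely (a tiny sketch of my own):

```python
import numpy as np

a, b = 1.5, -0.7
z = np.array([[a, b], [-b, a]])   # matrix representation of a + ib

A = z.T @ z                        # algebraic norm A(z) = z^dagger z
M = np.linalg.det(z)               # metric norm M(z) = det(z)

print(np.allclose(A, (a**2 + b**2) * np.eye(2)))  # True
print(np.isclose(M, a**2 + b**2))                 # True
```

Two conceptually different constructions, one and the same number a^2 + b^2; that coincidence is special to the complex numbers.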

Quantions are based on the SL(2, C) ~ SO(3,1) (local) isomorphism and are defined as follows:

q1  q3   0   0
q2  q4   0   0
0    0   q1  q3
0    0   q2  q4

with q1, q2, q3, q4 complex numbers. The multiplication rule for the algebra of quantions (q1, q2, q3, q4) follows from the matrix multiplication rule.

OK, this is a bit dry, so to put it in a physics context: for quantions, M corresponds to relativity (to picking a particular frame of reference), and A corresponds to quantum mechanics and the inner product.

For a quantion Q:
A(Q) = Q^dagger Q :  a future oriented 4-vector
M(Q) = det(Q) : a complex number

The reason for the block-diagonal zero elements has to do with gauge theory and the quantion-spinor correspondence; to simplify the idea, we can talk for now about a reduced quantion:

q1  q3
q2  q4

A hermitean reduced quantion:

r   z
z*  s

corresponds to a 4-vector (p0, p1, p2, p3) in Minkowski space as follows:

p0 + p3        p1+ i p2
p1 – i p2       p0 – p3

More importantly, A(Q) = Q^dagger Q is a future-oriented 4-vector (recall from prior posts that spin factors are realizations of the Jordan algebra).

The fundamental theorem of quantionic algebra is that the algebraic and metric norm commute: AM(Q) = MA(Q)
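Both claims, that A(Q) is future oriented and that the two norms commute, reduce to the identity det(Q^dagger Q) = |det Q|^2, and can be checked numerically (an illustrative sketch of mine, with the 4-vector components read off via the Minkowski correspondence given above):

```python
import numpy as np

rng = np.random.default_rng(1)
Q = rng.normal(size=(2, 2)) + 1j * rng.normal(size=(2, 2))  # a reduced quantion

AQ = Q.conj().T @ Q          # the algebraic norm A(Q): a hermitean matrix

# read off the 4-vector via  AQ = [[p0+p3, p1+i p2], [p1-i p2, p0-p3]]
p0 = AQ.trace().real / 2
p3 = (AQ[0, 0] - AQ[1, 1]).real / 2
p1 = AQ[0, 1].real
p2 = AQ[0, 1].imag

print(p0 >= 0, p0**2 - p1**2 - p2**2 - p3**2 >= -1e-9)  # future oriented
print(np.isclose(np.linalg.det(AQ).real,
                 abs(np.linalg.det(Q))**2))             # M(A(Q)) = A(M(Q))
```

Since det(Q^dagger Q) = |det Q|^2 >= 0 and p0 is half the sum of the squared moduli of the entries of Q, the 4-vector A(Q) is automatically causal and future pointing.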

Pictorially this can be represented as a commuting diagram:

q  ------A------>  A(q) = future-oriented 4-vector on Minkowski space
|                   |
M                   M
|                   |
z  ------A------>  x = |z|^2 >= 0

p  ------A------>  A(p) = null vector
|                   |
M                   M
|                   |
0  ------A------>  0

with p a quantion of determinant zero and z = M(q) a complex number.

The 0-x axis represents the good old-fashioned Born rule: the experimental predictions are positive probabilities.

A(q), inside the Minkowski cone whose boundary consists of the null vectors A(p), represents the Zovko interpretation as a probability current.

But is quantionic quantum mechanics nothing but complex quantum mechanics with relativistic symmetry? No, the story is more subtle (and this is where spin enters the picture).

In standard complex (non-relativistic) quantum mechanics, a state omega of a quantum mechanical system is a linear functional on the space of observables A with the properties:

omega(A^dagger A) >= 0 for all observables A
omega(I) = 1

The Hilbert space A has the inner product:

<A, B> = Tr (A^dagger B)

In quantionic quantum mechanics the inner product does not involve the trace. Here:

<A, B> = M (A^dagger B) = determinant (A^dagger B)

In turn this demands the existence of spin.

If I can coin a word, in complex quantum mechanics one talks of qubits, while in quantionic quantum mechanics one talks of z-bits (z from Zovko).

In quantionic quantum mechanics superposition is based on quantions. Recall Adler’s requirements for a modulus function N:

(1)        N(0) = 0
(2)        N(phi) > 0 if phi is not 0
(3)        N(r phi) = |r| N(phi)
(4)        N(phi_1 + phi_2) <= N(phi_1) + N(phi_2)

where N = AM in this case.

Now (2) has to be relaxed: there are nonzero quantions phi for which N(phi) = 0, and also (4) is not true in general.
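Taking N(Q) = |det Q|^2 as one concrete reading of N = AM, both failures are exhibited by zero-determinant quantions (an illustrative sketch of mine, not from Grgin’s book):

```python
import numpy as np

# a concrete reading of the modulus: N(Q) = |det Q|^2 (i.e. N = A o M)
def N(Q):
    return abs(np.linalg.det(Q))**2

phi1 = np.array([[1, 0], [0, 0]], dtype=complex)  # nonzero, but det = 0
phi2 = np.array([[0, 0], [0, 1]], dtype=complex)

print(N(phi1))         # 0.0 although phi1 != 0 : condition (2) fails
print(N(phi1 + phi2))  # 1.0 > N(phi1) + N(phi2) = 0 : condition (4) fails
```

Two “null” quantions can sum to one of nonzero modulus, so the triangle inequality (4) has no chance.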

The fact that (4) does not hold is not a problem, because it does not hold in general for Hilbert modules either, which generalize C* algebras to vector bundles (field theory). Quantionic quantum mechanics can be understood as electroweak gauge theory!!!

Naively, the relaxation of condition (2) can be treated with the tools of C* algebra in the usual GNS construction:

Here “singular” means that the sesquilinear form may fail to satisfy the non-degeneracy property of an inner product. By the Cauchy–Schwarz inequality, the degenerate elements, x in A satisfying ρ(x* x) = 0, form a vector subspace I of A. By a C*-algebraic argument, one can show that I is a left ideal of A. The quotient space A/I is an inner product space, and its Cauchy completion in the quotient norm is a Hilbert space H.

In quantionic quantum mechanics this quotient space is empty! This is another clue that there is something fundamentally different at play here, and that the Jordan spin factor algebra, although embeddable in the n×n matrix case, is a beast of its own.

From the zero determinant quantions, it is clear that quantionic algebra is a non-division associative algebra!!!

And why do we need division in quantum mechanics? It is silly to arbitrarily insist on a mathematical property just for our convenience. In what quantum mechanics book or paper have you seen any direct division by wavefunctions?

The beauty of quantionic quantum mechanics is that nothing needs to be postulated except Zovko’s interpretation. In subsequent posts we’ll see how the Dirac equation follows naturally, we’ll investigate the spinor-quantion correspondence, we’ll give a counterexample to Hardy’s formula K = N^r (for quantionic quantum mechanics the formula is K = 4 N^r), and we’ll present quantionic quantum mechanics from the gauge theory point of view.