Saturday, December 30, 2017

Is Walter Lewin wrong about Kirchhoff's law? 


While thinking about how to organize the upcoming material on geometry and physics, I came across a recent controversy regarding an electromagnetism lecture by Walter Lewin, and I want to talk about it in this post.

First, a little background. When I started teaching undergraduate physics I needed help on how to structure the lectures. Personally I wanted to emulate the Feynman lectures, but physics departments impose on you either Giancoli or Young and Freedman. I turned to the internet for help and I found the outstanding undergraduate lectures by Walter Lewin, which I adapted to my needs - thank you, Professor Lewin. In Lecture 16 on electromagnetic induction, which you can watch below,


at minute 34:54 the fireworks begin. The setup starts unassumingly: a trivial circuit with a 1 V battery and two resistors in series. The current is computed, as well as the voltage drop across the larger resistor. Then the battery is removed and replaced by a solenoid in the middle of the circuit, which generates an increasing magnetic field such that the induced voltage is the same 1 V as the battery. Now the question becomes: what is the voltage drop across each resistor? The answer is that two voltmeters connected to the same points, one on one side of the loop and one on the other side, will record different voltages of opposite polarities! In this case one will record +0.9 V and the other -0.1 V. And the experiment confirms this!!!

For  a complete step-by-step derivation see: web.mit.edu/8.02/www/Spring02/lectures/lecsup3-15.pdf

This is deeply at odds with our intuition, and there are a lot of "proofs" of why Lewin is wrong. Here is one example, which has the advantage over others that it is in English.



Lewin responded to those challenges in two distinct posts, in a poor manner in my opinion: he simply repeated his explanation slowly and loudly, and did not address the root cause of the discomfort people experience when presented with this counter-intuitive phenomenon. So I will attempt to add my own pedagogical explanation of how to understand it.

First, are the two voltmeters connected to the same points? It is not clearly visible in the video how the two voltmeters are connected, but if you have access to an electronics lab you can connect the two voltmeters to the very same points and repeat the experiment. So how is it possible that two identical voltmeters connected to the same points read different values? It is because different currents flow through the two voltmeters.

Now you may say: this is absurd; why would different currents flow through the two voltmeters? Can I simply use only one voltmeter, place it on the right-hand side of the circuit, and then flip it to the left-hand side? Why would the reading change?

The reading does not change as long as you do not cross the region of changing magnetic field. However, as you cross the changing magnetic field, the voltmeter reading changes gradually from the right-hand side value of +0.9 V to the left-hand side value of -0.1 V. We can actually compute how this happens, and I will do it below.

But before doing that, I want to point out what is wrong in the "disproof" video above. First, the solenoid is the same size as the circuit and is placed under it. This allows the gradual voltage change due to crossing the changing magnetic field to be misinterpreted as a voltage drop on the copper wiring. Second, the explanation is inconsistent with regard to the current intensities. If you compute the current in the copper wire due to a 1 V difference you do not get 1 milliampere, but a huge current, because copper wire has almost zero resistance. Also, there is a big misunderstanding about how to compute the flux in the claim that a vertical plane has no flux going through it. If the loop were completely vertical this would be correct, but the voltmeter loop also contains a horizontal path closing the circuit, and this has non-zero magnetic flux crossing through it.

So now on to the computation. I will use only one voltmeter, placed on the right as in the first picture below:


Then I move the connection points A and B and flip the voltmeter wire to cross the magnetic region (see the second picture and notice that the direction of i stays the same: from the voltmeter to r2).

Let's work out the math in the first case. There are two loops with currents I and i and the resistance of the voltmeter is R >> r1, r2:

I(r1+r2) - i r2 = E    (induction law)
i r2 - I r2 + i R = 0  (Kirchhoff)

In the second equation r2 << R, so we can ignore the first term i r2: this gives -I r2 + i R = 0, which means i << I. In the first equation we can then ignore the -i r2 term, resulting in:

I (r1 + r2) = E from which we extract the current in the main loop.

Then the voltage V read by the voltmeter is V = i R = I r2 = E r2 /(r1+r2) = 0.9 V, using the resistor and E values from the lecture.

Now we proceed to flip the voltmeter from right to left. Suppose that during the flipping the voltmeter loop crosses a fraction s of the changing magnetic flux area (the yellow area). The first equation reads as before:

I(r1+r2) - i r2 = E    (induction law)
but the second one is changed to:

i r2 - I r2 + i R = -sE  (induction law)

The same order-of-magnitude tricks apply and we get:

I (r1 + r2) = E
-I r2 + i R = - sE

and the voltage recorded by the voltmeter is V = i R = I r2 - sE

V = E r2 /(r1+r2) - sE  with  \(s \in (0,1)\)

So on the right side the voltmeter reads E r2 /(r1+r2) = +0.9 V, and on the left side it reads
E r2 /(r1+r2) - E = -E r1/(r1 + r2) = -0.1 V, with all the intermediate values in between as we flip the device.
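As a sanity check, we can also solve the two-loop system exactly, without the R >> r1, r2 approximation. Here is a minimal Python sketch; the values r1 = 100 ohm, r2 = 900 ohm and a 10 megohm voltmeter are my assumptions, chosen to reproduce the +0.9 V and -0.1 V readings:

```python
# Exact solution of the two-loop system for a voltmeter loop crossing a
# fraction s of the changing-flux area. The resistor values are assumptions
# chosen to match the +0.9 V / -0.1 V readings from the lecture.
E, r1, r2, R = 1.0, 100.0, 900.0, 10e6

def voltmeter_reading(s):
    """Solve I*(r1+r2) - i*r2 = E and -I*r2 + i*(r2+R) = -s*E; return V = i*R."""
    det = (r1 + r2) * (r2 + R) - r2 * r2    # determinant of the 2x2 linear system
    i = (r2 * E - s * E * (r1 + r2)) / det  # Cramer's rule for the voltmeter current
    return i * R

for s in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(f"s = {s:.2f} -> V = {voltmeter_reading(s):+.3f} V")
# prints +0.900, +0.650, +0.400, +0.150, -0.100 (up to tiny R >> r corrections)
```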

This effect is shown in Mabilde's video from 17:40 to 18:00, but his explanation is wrong.

Alternatively, when the voltmeter is flipped all the way to the left we can consider a loop not enclosing the area of changing magnetic field, and arrive at the same -0.1 V by using Kirchhoff's law, just as we did on the right-hand side.

If we consider both left and right voltmeters we can also understand why they record different values. We can consider the outer loop, apply the induction law to it, and see that a current "i" flows from one voltmeter through the other due to the enclosed area of changing magnetic field. Since the voltmeters are polarity sensitive, one records a positive voltage and the other a negative voltage. In fact, we can simplify the setting further by removing the two resistors completely; in this case one voltmeter would record +0.5 V and the other -0.5 V:



In conclusion, Walter Lewin is completely correct. The controversy stems from blind trust in Kirchhoff's circuit laws (which are valid only when there is no changing magnetic flux through the circuit) and from a misunderstanding of Maxwell's equations.

Sunday, December 17, 2017

Physics and Geometry


Returning to gauge theory: physics is best understood in terms of geometry. The area is vast, and I am considering how best to explain the key concepts in the most intuitive way. Today I want to start with the broad picture and present the relevant mathematical landscape. We need to consider two kinds of transformations:
  • transformations in space-time
  • gauge transformations of physical fields.
The mathematical machinery involved uses fiber bundles and Cartan's language of differential forms. There are two key differential forms: the connection 1-form and the curvature 2-form. Another essential ingredient is parallel transport. In terms of physics, parallel transport corresponds to the transport of physical information. When we parallel transport around a closed loop, the final state is in general different from the initial state.

Here is a simple example from ordinary high-school geometry. You are walking on Earth along the equator and you carry with you an arrow which points North. After walking a quarter of the Earth's circumference you decide to walk all the way to the North Pole. At every point during your journey you keep the orientation of your arrow at time \(t\) parallel to the orientation of your arrow at time \(t+\Delta t\). Initially the arrow was perpendicular to the direction of travel, and when you start going North the arrow is parallel to the direction of travel. Once at the North Pole you decide to take the shortest path back to your starting point. What is the orientation of your arrow when you get back? Try this with a pencil and a ball.
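For this particular loop the answer can be computed with the Gauss-Bonnet theorem (a standard result, quoted here without proof): the arrow returns rotated by the integral of the curvature over the surface enclosed by the loop. Our loop (a quarter of the equator plus two meridians) bounds one eighth of the sphere, so:

\(\Delta\theta = \int_S K dA = \frac{1}{R^2}\cdot\frac{4\pi R^2}{8} = \frac{\pi}{2}\)

The arrow comes back rotated by 90 degrees, even though you never rotated it locally.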

The parallel transport on a closed loop can be used to define curvature. In terms of physics, gauge theory teaches us that:

force=curvature

When discussing gauge theory, the natural language is that of bundles. One starts with product bundles, but then one proceeds to general bundles obtained by gluing together product bundles. For the Standard Model product bundles are enough, but for gauge theory on curved space-time (string theory, quantum gravity) you need the most general bundles.

One problem arises then: how to relate what different observers see?

Changes of observers are described by cocycles. Cocycles depend on both the topology of the space-time manifold and the structure of the gauge group. The deviation of vector bundles from product bundles is measured by the so-called characteristic classes (Chern, Euler, Pontryagin, Stiefel-Whitney, Thom classes).

Explaining the mathematical machinery of all this is a very ambitious project, and I am not sure how far I can carry the series, but I will try. The geometry involved is very beautiful (at least to me). Please stay tuned.

Sunday, December 3, 2017

Industry vs. Academia


Before returning to gauge theory I want to discuss a topic from my personal experience. I started my career in academia, switched to industry where I was very successful, and managed to come back to academia. As such I can offer a good perspective on both and hopefully clarify some misconceptions.

For a little history: I published my first paper in my second year of college, and after graduation I joined the most productive theoretical research group in Romania, which was publishing about a paper a month. The sky was the limit and I chose to get my PhD in the US. I went to UMCP, and after graduation I joined my adviser's recently created company, switching to industry, where I rapidly climbed the corporate ladder. Now I am back in academia (with a foot still in industry).

In academia there is this perception (and arrogance) that the outside world is not smart enough. This could not be further from the truth. The smartest person I ever met was Alain Connes (also the most arrogant person I ever met), but beside this outlier, I can state with confidence that industry has, on average, the smarter people. In general their abilities correlate strongly with the amount of money or power they amass. Another misconception is that in industry you do not work on interesting or hard problems. The opposite is true, and I saw time and time again how state-of-the-art knowledge in industry is 10 to 20 years ahead of state-of-the-art knowledge in academia. The best industry knowledge is kept under wraps on a need-to-know basis as trade secrets, and you need to be high enough on the food chain to get to know it.

For comparable skill sets, in industry you earn 2 to 3 times what you would make in academia. When I was choosing my PhD adviser, I considered a thesis in particle phenomenology with Professor Rabindra Mohapatra. I had to qualify to be his grad student, and after I did that I got to see the fine print of the deal. The entire particle physics group at UMCP (10 professors) had to pool their resources to fund a single research assistant (RA). In comparison, the professors in the EE department each had between 10 and 20 RAs due to industry contracts. This turned me off and I picked a better-funded adviser. A few years after graduation I was working in industry and was in the market for a house. It just so happened that Professor Mohapatra was selling his house, and I got to visit it as a buyer. I did not buy it in the end, but within a few years in industry I already had the same buying power as a distinguished full professor in academia.

When you climb the corporate ladder in industry there are standard steps. You start as an intern, where the requirement is to have potential to grow and to be liked by the group. Then you are a junior engineer, working under close supervision. When you can work independently in one area you become a senior engineer, and a typical salary is around $90K/year. If you can work across the board in any area, you are a principal engineer. From principal the next step is supervisor, where you are responsible for the results of an entire team. The next step is manager: a supervisor with the power to do performance evaluations and make salary decisions. Up to this point the focus is on work, and as a manager you get to know the dirty secrets, but you still do not have a seat at the big boys' table. Next is director, who is an execution machine. The most important skill of a director is to defend his back, as peers and people above are out to get him. A director with industry knowledge is a vice president. A VP can run the company, but he is not yet vetted on the social side: he is not a member of the C-level club (CEO, COO, CFO). At C-level you are a god for the company and your statements carry legal responsibility. For people entering the industry, one general word of advice is that the human resources department is never your friend - don't get fooled by their friendliness or the lunch events they organize. They are there to prevent the company from being sued, and they do the dirty work of firing people - which no manager enjoys.

I managed to climb all the way to the director and VP level, and at every large company I worked for, the C-level people were all crooks - no exceptions. Very smart, very competent, very polished, experts in the art of dissimulation and manipulation, and rotten to the core. Power does corrupt. I came to know countless unbelievable horror stories.

Now back to physics and academia. One thing I hated was the publish-or-perish state of affairs. You do not have a big breakthrough every day, and 99% of published papers are utterly useless. People organize into groups citing each other's meaningless results, and if your paper is correct most of the referee's comments are about not citing something. When I exited academia I made the decision to come back when I had something important to say. Coming back turned out to be much, much harder than I thought.

Everything was an uphill battle: first results, gaining arXiv endorsement, first paper, first conference, first teaching assignments. When you are in the system you cannot see how much your adviser is supporting you in your academic career. If you worked in industry and want to come back to academia only because you like physics, you will not be successful. The first thing you need is a research domain of your own, along with meaningful results. What matters most is what problems you are working on, and for this you need good guidance. Good guidance is hard to come by even when you are in academia. I was lucky to have met a remarkable person: Emile Grgin, who worked on a research topic from Bohr. Bohr passed this topic to his personal assistant Aage Petersen, who worked with Grgin as postdocs at Yeshiva University in the 70s. Grgin switched to industry, and I met him as he was retiring, still interested in making more progress on this topic. I learned the topic and carried the torch. Luckily for me, to no merit of mine, the area turned out to be a gold mine. So now I had something meaningful to work on and something meaningful to say, and it was time to start the transition back to academia. Also, there was no competition in my research area - a wonderful thing.

I might have been a young tiger in academia before, but now I was a big nobody as far as academia was concerned. The first hurdle was rising above the large background noise of crackpots. I identified the organization FQXi and won a third prize in one of their essay contests. This opened up connections for me, and now people would no longer ignore my emails or give me the cold shoulder. I started blogging for FQXi, and this is when I hit the second roadblock. Past the crackpots there is a layer of charlatans. I ran into the biggest of them all, Joy Christian. The problem was that he had powerful friends with a lot of clout, and fighting Joy got me blacklisted and delayed my academia comeback plan by 2-3 years, until Joy's credibility was all gone. Finally, after languishing in the desert of blacklisting, I gained arXiv endorsement, started to get invited to conferences, and started publishing again. I also became a referee for various journals. It was time to start a blog (this one). The blog provided the much-needed discipline to work a given set of hours a day on physics. In industry you live under the constant shadow of a deadline, and it is very hard to set hours aside in the day to do anything else productive. Now it was time to complete the switch and make money again from a physics job. Here I ran into another large problem. Physics pays peanuts, and I cannot go back to living like a poor grad student anymore: I have a large mortgage to pay, as well as college expenses for my kids to the tune of hundreds of thousands of dollars. I needed a side job in industry to make up the difference, but no job provides that flexibility. The answer was to start my own consultancy company. This is no easy task, but I did have all the knowledge I needed.

So I finally had my first paying job at a university, juggling my company work and my physics duties. Physics was no longer a hobby. Today I am very busy, working over 80 hours a week, and I am climbing the academic levels very rapidly. Coming back to academia is a game of building your credibility. A funny observation is that at every level, no matter how low, there are a lot of politics being played. To me this is very amusing. I know how to read people and give them what they expect from me, when they expect it.

One more observation. Industry is capitalism, with the good and the bad. Academia is still run by the rules of a feudal system. There are big established hierarchies which are defended at length, without regard for truth. Sadly, most people in academia do not seek truth but power. You can see the hierarchy at conferences in the order and length of time given to the speakers. I find this obscene. In a small world where everybody knows everybody, the arrogance of most of the hot shots goes through the roof. There are also scientific crooks who abuse their power for personal gain. There are notable exceptions, however. One of the most down-to-earth, nice persons, who listens to what the lowliest grad student has to say, is 't Hooft. I don't agree with his approach to quantum mechanics, but I value him as a decent person. Another decent person I admire is Avshalom Elitzur. For the record, I am not associated with either of them or their groups, and I am not praising them for any future gains. I am genuinely impressed by the way they conduct themselves in a sea of feudal arrogance.

Sunday, October 29, 2017

The electromagnetic field


Continuing from last time, today I will talk about the electromagnetic field as a gauge theory.

1. The gauge group

In this case the gauge group is \(U(1)\) - the phase rotations. This group is commutative. It can be determined by starting from Dirac's equation and demanding that the group leave the Dirac probability current density:

\(j^\mu = \Psi^\dagger \gamma^0 \gamma^\mu \Psi\)

invariant.

2. The covariant derivative giving rise to the gauge field

Here the covariant derivative takes the form:

\(D_\mu = \partial_\mu - i A_\mu\)

To determine the gauge connection \(A_\mu\) we can substitute this expression in Dirac's equation:

\(i\gamma^\mu D_\mu \Psi = m\Psi\)

and require the equation to be invariant under a gauge transformation:

\(\Psi^{'} = e^{i \chi}\Psi\)

which yields:

\(A^{'}_{\mu} = A_\mu + \partial_\mu \chi\)

This shows that:
- the general gauge field for Dirac's equation is an arbitrary vector field \(A_\mu (x)\)
- the part of the gauge field which compensates for an arbitrary gauge transformation of the Dirac field \(\Psi (x)\) is the gradient of an arbitrary scalar field.
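Both statements can be checked mechanically. Here is a minimal symbolic sketch (assuming sympy is available; one coordinate suffices, and the function names are my own) verifying that \(A^{'} = A + \partial \chi\) makes the covariant derivative transform covariantly, \(D^{'}\Psi^{'} = e^{i\chi} D\Psi\):

```python
# Symbolic check that A' = A + d(chi) compensates the local phase change
# Psi' = exp(i*chi)*Psi, i.e. D'Psi' = exp(i*chi)*D Psi for D = d - i*A.
import sympy as sp

x = sp.symbols('x', real=True)  # one representative coordinate is enough
psi = sp.Function('psi')(x)
A = sp.Function('A')(x)
chi = sp.Function('chi')(x)

def D(field, potential):
    return sp.diff(field, x) - sp.I * potential * field

psi_prime = sp.exp(sp.I * chi) * psi
A_prime = A + sp.diff(chi, x)

delta = sp.simplify(D(psi_prime, A_prime) - sp.exp(sp.I * chi) * D(psi, A))
print(delta)  # 0: the covariant derivative transforms exactly like Psi itself
```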


3. The integrability condition

Here we want to extract a physically observable object out of a given vector field \(A_\mu (x)\). From above it follows that there is no external potential if \(A_\mu = \partial_\mu \chi\) and this is the case if and only if:

\(\partial_\mu A_\nu - \partial_\nu A_\mu = 0\)

4. The curvature

The curvature measures the amount of failure for the integrability condition and by definition is the left-hand side of the equation from above:

\(F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu\)

and this is the electromagnetic field tensor.

5. The algebraic identities

There is only one algebraic identity in this case stemming from the curvature tensor antisymmetry:

\(F_{\mu\nu} +F_{\nu\mu} = 0\)

6. The homogeneous differential equations

If we take the derivative of \(F_{\mu\nu}\) and we do a cyclic sum we obtain:

\(F_{\mu\nu , \lambda} + F_{\lambda\mu , \nu} + F_{\nu\lambda , \mu} = 0\)

which is analogous to the Bianchi identity in general relativity.

This identity can be expressed using the Hodge dual as follows:

\(\partial_\rho {* F}^{\rho\mu} = 0\)

and this is nothing but two of Maxwell's equations:

\(\nabla \cdot \overrightarrow{B} = 0\)
\(\nabla \times \overrightarrow{E} + \frac{\partial}{\partial t} \overrightarrow{B} = 0\)
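The cyclic-sum identity above holds for an arbitrary potential and can also be verified mechanically; here is a short sympy sketch (my own illustration, with made-up function names):

```python
# Symbolic check of the cyclic identity for F_uv built from an arbitrary
# potential A_mu(t, x, y, z).
import sympy as sp

coords = sp.symbols('t x y z')
A = [sp.Function(f'A{m}')(*coords) for m in range(4)]

def F(mu, nu):
    return sp.diff(A[nu], coords[mu]) - sp.diff(A[mu], coords[nu])

mu, nu, lam = 0, 1, 2  # any three indices work the same way
cyclic = (sp.diff(F(mu, nu), coords[lam])
          + sp.diff(F(lam, mu), coords[nu])
          + sp.diff(F(nu, lam), coords[mu]))
print(sp.simplify(cyclic))  # 0: the identity holds for any A_mu
```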

7. The inhomogeneous differential equations

If we take the derivative \(\partial_\alpha \partial_\beta F^{\alpha\beta}\) we get zero, because F is antisymmetric and \(\partial_{\alpha\beta} = \partial_{\beta\alpha}\); the vector \(\partial_\beta F^{\alpha\beta}\) is therefore divergenceless. We interpret it as the current of a conserved quantity, the source of the electromagnetic field, and we write:

\(\partial_\rho F^{\mu\rho} = 4\pi J^\mu\)

where the constant of proportionality comes from recovering Maxwell's theory (recall that last time \(8\pi G\) came from similar arguments).

From this we now get the other two Maxwell's equations:

\(\nabla \cdot \overrightarrow{E} = 4\pi \rho\)
\(\nabla \times \overrightarrow{B} - \frac{\partial}{\partial t} \overrightarrow{E} = 4\pi \overrightarrow{j}\)

Now we can compare general relativity with electromagnetism:

Coordinate transformation - Gauge transformation
Affine connection \(\Gamma^{\alpha}_{\rho\sigma}\) - Gauge connection \(iA_\mu\)
Gravitational potential \(\Gamma^{\alpha}_{\rho\sigma}\) - electromagnetic potential \(A_\mu\)
Curvature tensor \(R^{\alpha}_{\beta\gamma\delta}\) - electromagnetic field \(F_{\mu\nu}\)
No gravitation \(R^{\alpha}_{\beta\gamma\delta} = 0\) - no electromagnetic field \(F_{\mu\nu} = 0\)


Sunday, October 8, 2017

The gravitational field


Today we will start implementing the 7-point roadmap in the case of the gravitational field. Technically gravity does not form a gauge theory, but since it was the starting point of Weyl's insight I will start with it as well, and next time I will show how the program works in the case of the electromagnetic field.

1. The gauge group

The "gauge group" in this case is the group of general coordinate transformations in a real four-dimensional Riemannian manifold M. Now the argument against Diff M as a gauge group comes from locality. An active diffeomorphism can move a state localized near the observer to one far away which can be different. However, for the sake of argument I will abuse this today and considered Diff M as a "gauge group" because of the deep similarities (which will explore in subsequent posts) between this and proper gauge theories like electromagnetism and Yang-Mills.

2. The covariant derivative giving rise to the gauge field

For a vector field \(f^\alpha\) the covariant derivative is defined as follows:

\(D_\rho f^\alpha = \partial_\rho f^\alpha +{\Gamma}^{\alpha}_{\rho\sigma} f^\sigma\)

where \({\Gamma}^{\alpha}_{\rho\sigma}\) is called an affine connection. If we demand that the metric tensor be a covariant constant under D, we find that the connection is:

\({\Gamma}^{\sigma}_{\mu\nu} = \frac{1}{2}g^{\sigma\rho}[g_{\rho\mu,\nu} + g_{\rho\nu,\mu} - g_{\mu\nu,\rho}]\)

where \(f_{\rho,\sigma}  = \partial_\sigma f_\rho\) 


3. The integrability condition

We define this condition as the commutativity of covariant derivatives. If we introduce the notation \(D_\mu D_\nu f_\sigma = f_{\sigma;\nu\mu}\), we can write the condition as:

\(f_{\rho;\mu\nu} - f_{\rho;\nu\mu} = 0\)

Computing the expression above yields:

\(f_{\rho;\mu\nu} - f_{\rho;\nu\mu} = f_\sigma {R}^{\sigma}_{\rho\mu\nu}\)
where
\({R}^{\sigma}_{\rho\mu\nu} = {\Gamma}^{\tau}_{\rho\mu}{\Gamma}^{\sigma}_{\tau\nu} - {\Gamma}^{\tau}_{\rho\nu}{\Gamma}^{\sigma}_{\tau\mu} + {\Gamma}^{\sigma}_{\rho\mu,\nu} - {\Gamma}^{\sigma}_{\rho\nu,\mu}\)

4. The curvature

From the above, the integrability condition is \({R}^{\sigma}_{\rho\mu\nu} = 0\), where R is called the Riemann curvature tensor.
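To make the formulas concrete, here is a minimal sympy sketch (my own illustration) computing the connection and a curvature component for the two-sphere of radius \(R_0\), the textbook example of a curved space:

```python
# Connection and curvature for the 2-sphere metric
# ds^2 = R0^2 (d theta^2 + sin^2(theta) d phi^2), following the formulas above.
import sympy as sp

theta, phi, R0 = sp.symbols('theta phi R0', positive=True)
x = [theta, phi]
g = sp.Matrix([[R0**2, 0], [0, R0**2 * sp.sin(theta)**2]])
ginv = g.inv()
n = 2

def Gamma(s, m, v):
    # Gamma^s_{mv} = (1/2) g^{sr} (g_{rm,v} + g_{rv,m} - g_{mv,r})
    return sp.simplify(sum(ginv[s, r] * (sp.diff(g[r, m], x[v])
                                         + sp.diff(g[r, v], x[m])
                                         - sp.diff(g[m, v], x[r]))
                           for r in range(n)) / 2)

def Riemann(s, r, m, v):
    # R^s_{rmv} with the sign convention used in this post
    val = sum(Gamma(t, r, m) * Gamma(s, t, v) - Gamma(t, r, v) * Gamma(s, t, m)
              for t in range(n))
    val += sp.diff(Gamma(s, r, m), x[v]) - sp.diff(Gamma(s, r, v), x[m])
    return sp.simplify(val)

print(Riemann(0, 1, 0, 1))  # -sin(theta)**2: nonzero, so the sphere is curved
```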

5. The algebraic identities

The algebraic identities come from the symmetry properties of the curvature tensor, which reduce its 256 components to only 20 independent ones. I am too tired to type the proof of the reduction to 20, but you can easily find it online.
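For reference, the identities in question are the standard textbook ones, stated for the tensor with the first index lowered by the metric:

\(R_{\alpha\beta\gamma\delta} = -R_{\beta\alpha\gamma\delta} = -R_{\alpha\beta\delta\gamma}\)
\(R_{\alpha\beta\gamma\delta} = R_{\gamma\delta\alpha\beta}\)
\(R_{\alpha\beta\gamma\delta} + R_{\alpha\gamma\delta\beta} + R_{\alpha\delta\beta\gamma} = 0\)

Counting the components left free by these symmetries in four dimensions yields the 20 independent ones.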

6. The homogeneous differential equations

If we take the derivative of the Riemann tensor we obtain a differential identity known as the Bianchi identity:

\({R}^{\sigma}_{\rho\mu\nu;\tau} + {R}^{\sigma}_{\rho\tau\mu;\nu} + {R}^{\sigma}_{\rho\nu\tau;\mu} = 0\)

7. The inhomogeneous differential equations

This equation is of the form:

geometric concept = physical concept

And in this case we use the stress-energy tensor \(T_{\mu\nu}\) and look for a geometric object with the same mathematical properties: symmetric and divergenceless, built out of the curvature tensor. The left-hand side is the Einstein tensor:

\(G_{\mu\nu} = R_{\mu\nu} - \frac{1}{2}g_{\mu\nu}R\)

The constant of proportionality comes from recovering Newton's gravitational equation in the nonrelativistic limit. In the end one obtains Einstein's equation:

\(G_{\mu\nu} = 8\pi G T_{\mu\nu}\)

Next time I will go through the same process for the electromagnetic field and map the similarities between the two cases. Please stay tuned.

Sunday, September 24, 2017

The Math of Gauge Theories

With a bit of a delay I am resuming the posts on gauge theory and today I will talk about the math involved. 

In gauge theory you consider the base space-time as a manifold and you attach at each point an object, or what is called a fiber, forming what is called a fiber bundle. The picture you should have in mind is that of a rug.


The nature of the fibers is unimportant at the moment, but they should obey at least the properties of a linear space. 

Physically, think of the fibers as internal degrees of freedom at each spacetime point; a physical configuration then corresponds to a definite location along the fiber, for each fiber.

The next key concept is that of a gauge group. A gauge group is the group of transformations which do not affect the observables of the theory. 

Mathematically, the gauge symmetry depends on how we relate points between nearby fibers, and to make this precise we need only one critical step: defining a covariant derivative.

Why do we need this? Because an arbitrary gauge transformation does not change the physics, while the ordinary derivative sees both the infinitesimal changes of the fields and the infinitesimal changes due to an arbitrary gauge transformation. Basically, we need to compensate for the derivative of an arbitrary gauge transformation.

If d is the ordinary derivative, let's call D the covariant derivative; their difference (which is a linear operator) is called a differential connection, a gauge field, or a potential:

A(x) = D - d

D and d act differently: d "sees" the neighbourhood behaviour but ignores the value of the function on which it acts, and D acts on the value but is blind to the neighbourhood behaviour.   

The condition we impose on D is that it must satisfy the Leibniz identity, because it is a derivative:

D(fg) = (Df)g+f(Dg)

which in turn demands:

A(fg) = (Af)g+f(Ag)
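To see why, subtract the two Leibniz identities, using A = D - d:

A(fg) = D(fg) - d(fg) = (Df)g + f(Dg) - (df)g - f(dg) = (Af)g + f(Ag)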

In general only one part of A may be used to compensate for gauge transformations; the remaining part represents an external field that may be interpreted as a potential. When no external potentials are involved, A usually respects integrability conditions. Those conditions depend on the concrete gauge theory, and we will illustrate them in subsequent posts.

When external fields are present, the integrability conditions are not satisfied and this is captured by what is called a curvature. The name comes from general relativity where lack of integrability is precisely the space-time curvature.

The symmetry properties arising out of the curvature construction give rise to algebraic identities.

Next in gauge theories we have the homogeneous and inhomogeneous differential equations. Examples of homogeneous differential equations are the Bianchi identities in general relativity and the two homogeneous Maxwell's equations. The inhomogeneous equations are related to the sources of the fields (the current in electrodynamics, and the stress-energy tensor in general relativity).

So to recap, the steps used to build a gauge theory are:

1. the gauge group
2. the covariant derivative giving rise to the gauge field
3. integrability condition
4. the curvature
5. the algebraic identities
6. the homogeneous equations
7. the inhomogeneous equations

In the following posts I will spell out this outline, first for general relativity and then for electromagnetism. Technically general relativity is not a gauge theory because diffeomorphism invariance cannot be understood as a gauge group, but the math similarities are striking and there is a deep connection between diffeomorphism invariance and gauge theory which I will spell out in subsequent posts. So for now please accept this sloppiness, which will get corrected in due time.

Monday, September 4, 2017

The Bohm-Aharonov effect


Today we come back to gauge theory and continue with Weyl's ideas. With the advent of quantum mechanics, Weyl realized that he could reinterpret his change of scale as a change in the phase of the wavefunction. Suppose we make the following change to the wavefunction:

\(\psi \rightarrow \psi e^{ie\lambda/\hbar}\)

The overall phase does not affect the Born rule, so we did not change the physics (here \(\lambda\) does not depend on space and time, and this is called a global phase transformation). Let's now make the phase change depend on space and time, \(\Lambda = \Lambda (x,t)\), and see where it leads.

To justify this assume we are studying charged particle motion in an electromagnetic field and suppose that \(\Lambda\) corresponds to a gauge transformation for the electromagnetic field potentials \(A\) and \(\phi\):

\(A\rightarrow A + \nabla \Lambda\)
\(\phi \rightarrow \phi - \partial_t \Lambda\)

This should not change the physics and in particular it should not change Schrodinger's equation. To make Schrodinger's equation invariant under a local \(\Lambda\) change we need to add  \(-eA\) to the momentum quantum operator:

\(-i\hbar \nabla \rightarrow -i\hbar \nabla -eA\)

And the Schrodinger equation of a charged particle in an electromagnetic field reads:

\([\frac{1}{2m}{(-i\hbar\nabla -eA)}^2 + e\phi +V]\psi = -i\hbar\frac{\partial \psi}{\partial t}\)

But why do we have the additional \(eA\) term to begin with? Its origin is in the Lorentz force. If \(B = \nabla \times A\) and \(E = -\nabla \phi - \dot{A}\), the Lagrangian takes the form:

\(L = \frac{1}{2} mv^2 - e\phi + ev\cdot A\)

which yields the canonical momenta to be:

\(p_i = \frac{\partial L}{\partial \dot{x}_i} = mv_i + eA_i\)

and adding \(-eA\) to the momenta in the Hamiltonian yields the Lorentz force from Hamilton's equations of motion.

Coming back to Schrodinger's equation, we notice that the electric and magnetic fields E and B do not enter the equation; instead we have the electromagnetic potentials. Suppose we have a long solenoid with a non-zero magnetic field B inside and zero magnetic field outside. Outside the solenoid, in classical physics, we cannot detect any change whether or not a current flows through the wire. However, the vector potential is not zero outside the solenoid (\(\nabla\times A = 0\) does not imply \(A=0\)), and Schrodinger's equation has different solutions when \(A = 0\) and when \(A\ne 0\).

From this insight Bohm and Aharonov came up with a clever experiment to put this to the test: in a double-slit experiment, they proposed to add a long solenoid after the slits. Record the interference pattern with no current flowing through the solenoid, and repeat the experiment with the current creating a magnetic field inside the solenoid. Since the electrons do not enter the solenoid, classical physics predicts no difference, but in quantum mechanics the vector potential is not zero and the interference pattern shifts. Unsurprisingly, the experiment confirms the theoretical computation precisely.
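For the record, the size of the effect follows from the phase accumulated along the two paths (a standard computation):

\(\Delta\varphi = \frac{e}{\hbar}\oint \overrightarrow{A}\cdot d\overrightarrow{l} = \frac{e\Phi_B}{\hbar}\)

where \(\Phi_B\) is the magnetic flux trapped inside the solenoid; the fringes shift whenever the current changes the flux, even though the electrons travel only through field-free regions.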

There are several important points to be made. First, there is no classical explanation of the effect: E and B are not fundamental, but \(\phi\) and \(A\) are. It is mind-boggling that even today there are physicists who do not accept this and continue to look for effects rooted in E and B. Second, gauge symmetry is not just an accidental symmetry of Maxwell's equations but a basic physical principle which turns out to govern all fundamental forces in nature. Third, the right framework for gauge theory is geometrical, and we will explore it in depth in subsequent posts. Please stay tuned.

Due to travel, the next post is delayed 2 days.

Sunday, August 20, 2017

Impressions from Yellowstone


I was on vacation for a week in Yellowstone, so I will put the physics posts on hold and share what I saw. First, the park is simply amazing and I highly recommend visiting if you have the chance. You need at least 3 days as a bare minimum. The main road is shaped like the number 8, and on the west (left) side you get to see lots of fuming hot spots ejecting steam and sulfur.



The colors are due to bacteria and different bacteria live at different temperatures giving the hot spots rings of color.

On the south side you get the geysers and Old Faithful which erupts every 90 minutes.


You need to be there approximately 1 hour before the eruption to get a seat on the benches which surround Old Faithful. There are other geysers, but you don't know when they erupt.

On the east side, at the bottom of the 8, there is Yellowstone Lake, which gives rise to the Yellowstone River and the Yellowstone canyon. There is not much to do at the lake; the water is very cold. The river forms two large waterfalls and you can visit them on both sides.



Coming north on the east side, you encounter more waterfalls and a few bison. If you are lucky you get to see bears in the distance, usually eating a dead moose. By the way, there is a big business ripoff in terms of bear spray. You can buy a can for $50, but you should instead rent one for $10/day when you hike in the forest. Even better, just buy a $1 bell to wear, to let the wildlife know you are there (bears avoid people if they can hear them coming).

You can hike Mt. Washburn (4 hour round trip hike) to get a panoramic view of the park 50 miles in any direction.



There is nothing to see on the east-west road at the middle of the 8, but north of it, on the east side, there is another road leading east into Lamar Valley. This is where you see a ton of wildlife: bison, moose, wolves. There are literally thousands of bison in big herds which often cross the road.




Driving in the park is slow (25 mph) due to the many attractions on the side and the traffic jams caused by animals. You need one day for the north loop, one day for the south loop, and one day for Lamar Valley.

Yellowstone sits on top of a supervolcano which erupted 7 times in the past: when it erupts, it covers half of the US with volcanic ash. There is a stationary hot spot of magma, and because the tectonic plate moves, different eruptions occurred in different places. The past eruption locations trace a clear path on the map.


Yellowstone park is located in the caldera (the volcano crater) of the last eruption.

Sunday, August 6, 2017

The origins of gauge theory


After a bit of an absence I am back, resuming my usual blog activity. However, I am extremely busy and I will create new posts every two weeks from now on. I am starting a series explaining gauge theory, and today I will begin at the beginning, with Hermann Weyl's proposal.


In 1918 Hermann Weyl attempted to unify gravity with electromagnetism (the only two forces known at the time) and in the process he introduced the idea of gauge theory. He espoused his ideas in his book "Space Time Matter", which I personally find hard to read. Usually the leading physics people have crystal clear original papers - von Neumann, Born, Schrodinger - but Weyl's book combines mathematical musings with metaphysical ideas in an unclear direction. The impression I got was of a mathematical, physical and philosophical random walk, testing all possible ways and directions to see where progress could be made. He got lucky, and his lack of cohesion saved the day, because he could not spot the simple counterarguments against his proposal which could have stopped him cold in his tracks. But what was his motivation and what was his approach?

Weyl liked the local character of general relativity and proposed (for purely philosophical reasons) the idea that all physical measurements are relative. In particular, the norm of a vector should not be thought of as an absolute value, but as a value that can change at various points of spacetime. To compare norms at different points you need a "gauge", like the device used on train tracks to make sure the tracks remain at a fixed distance from each other. Another word he used was "calibration", but the name "gauge" stuck.

So now suppose we have a norm \(N(x)\) of a vector and we do a shift to \(x + dx\). Then:

\(N(x+dx) = N(x) + \partial_{\mu}N dx^{\mu}\)

Also suppose that there is a scaling factor \(S(x)\):

\(S(x+dx) = S(x) + \partial_{\mu}S dx^{\mu}\)

and so to first order we get that N changes by:

\((\partial_{\mu} N + N \partial_{\mu} S) dx^{\mu}\)
Since for a second gauge \(\Lambda\), \(S\) transforms like:

\(\partial_{\mu} S \rightarrow \partial_{\mu} S  +\partial_{\mu} \Lambda \)

and since in electromagnetism the potential changes like:

\(A_{\mu} \rightarrow A_{\mu} + \partial_{\mu} \Lambda \)

Weyl conjectured that \(\partial_{\mu} S = A_{\mu}\).

However this is disastrous because (as pointed out by Einstein to Weyl on a postcard) it implies that clocks would change their frequencies based on the paths they travel (and since you can make atomic clocks, it implies that atomic spectra are not stable).

Later on, with the advent of quantum mechanics, Weyl changed his idea of a scale change into that of a phase change of the wavefunction, and the original objections became moot. Still, more needed to be done for gauge theory to become useful.

Next time I will talk about the Bohm-Aharonov effect and the importance of potentials in physics, as a segue into the proper math for gauge theory.

Please stay tuned.

Monday, July 10, 2017

The main problem of MWI is the concept of probability


Now it is my turn to present the counterarguments against many worlds. All known derivations of the Born rule in MWI have (documented) issues of circularity: the Born rule is injected into the derivation in some form or another. However, the problem is deeper: there is no good way to define probability in MWI.

Probability can be defined either in the frequentist approach, as the limit of relative frequency for large numbers of trials, or subjectively, as information update in the Bayesian approach. Both approaches make the same predictions.

It is generally assumed by all MWI supporters that branch counting leads to incorrect predictions, and because of this the focus is shifted to subjective probabilities and the "apparent emergence" of the Born rule. However, this implicitly breaks the frequentist-subjective probability relationship. The only way one can use the frequentist approach is by branch counting. Let's look at a simple example.

Suppose you work at a factory which makes fair (quantum) coins which land 50% up and 50% down. Your job is quality assurance and you are tasked with finding the defective coins. Can you do your job in an MWI quantum universe? The only thing you can do is flip the coin many times and see if it lands about 50% up and 50% down. For a fair coin there is no issue. However, for a biased coin (say 80%-20%) you get the very same branching outcomes as in the case of the fair coin, and you cannot do your job.
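Here is a quick combinatorial illustration of the problem (my own sketch, not from the factory story): after N flips there are \(2^N\) branches regardless of the amplitudes, so the fraction of branches exhibiting any given relative frequency is fixed by pure counting, and it is identical for the fair and the 80%-20% coin:

```python
# Branch counting after N binary splits: there are 2^N branches no matter
# what the amplitudes are, so branch statistics cannot reveal the bias.
from math import comb

N = 100

def branch_fraction(lo, hi):
    """Fraction of the 2^N branches whose relative 'up' frequency lies in [lo, hi]."""
    return sum(comb(N, m) for m in range(N + 1) if lo <= m / N <= hi) / 2**N

print(branch_fraction(0.45, 0.55))  # ~0.73: most branches look like a fair coin
print(branch_fraction(0.75, 0.85))  # ~5e-7: almost no branch shows 80/20 statistics
```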



There is only one way to fix the problem: consider that the world does not split into 2 branches, up and down, but, say, into 1 million up branches and 1 million down branches. In this case you can think that in the unfair case the world splits into 1.6 million up worlds and 400 thousand down worlds. This would fix the concept of probability in MWI, restoring the link between frequentist and subjective probabilities, but it is not what MWI supporters claim. Plus, it has problems of its own with irrational numbers, and the solution is only approximate to some limit of precision, which can be refuted by any experiment run long enough.

So, to boil the problem down: in MWI there is no difference in outcomes between a fair coin toss and an unfair coin toss; in both cases you get an "up world" and a "down world". Repeating the coin toss any number of times does not change the nature of the problem in any way. Physics is an experimental science and we test the validity of theories against experiments. Discarding branch counting in MWI is simply unscientific.

Now, in the last post Per argued for MWI. I asked him to show what would happen if we flip a fair and an unfair coin three times, to run his argument through an elementary example rather than hide behind general equations. After some back and forth, Per computed the distribution \(\rho\) in the fair and unfair cases (to match quantum mechanics predictions), but the point is that \(\rho\) must arise out of the relative frequencies and not be computed by hand. Because the relative frequencies are identical in the two cases, \(\rho\) must be injected by a different mechanism. His computation of \(\rho\) is the point where circularity is introduced into the explanation. If you look back at his post, this comes from his equation 5, which is derived from equation 3. Equation 3 assumes the Born rule and is the root cause of the circularity in his argument. Per's equation 7 recovers the Born rule in the limit after assuming the Born rule in equation 3 - q.e.d.

Sunday, June 25, 2017

Guest Post defending MWI


As promised, here is a guest post from Per Arve. I am not interjecting my opinion in the main text but I will ask questions in the comments section.

Due to the popularity of this post I am delaying the next post for a week.

The reason to abandon the orthodox interpretation of quantum mechanics is its incompleteness. Bohr and Heisenberg refused the possibility of describing the measurement process as a physical process. This is encoded in Bohr's claim that the quantum world cannot be understood. Such an attitude served to avoid endless discussions about the weirdness of quantum mechanics and to divert attention to the description of microscopic physics with quantum mechanics. Well done! A limited theory is better than no theory.

But we should always try to find theories that describe the larger set of processes in a unified way. The work by Everett and the later development of decoherence theory by Zeh, Zurek and others have given us elements to describe the measurement process, too, as a quantum mechanical process. Their analysis of the measurement process implies that the unitary quantum evolution leads to the emergence of separate new "worlds". The appearance of separate "worlds" can only be avoided if there is some mechanism that breaks unitarity.

The most well-known problem of Everett's interpretation is that of the derivation of the Born rule. I describe the solution of that problem here. (You can also check my article on the arxiv [1603.01625] Postulates for and measurements in Everett's quantum mechanics)

The main point is to prove that physicists experience the Born rule. That is done by taking an outside view of the parallel worlds created in a measurement situation. The question of what probability is from the perspective of an observer inside a particular branch is more a matter of philosophy than of science.

The natural way to find out where something is located is to test with some force and find out where we find resistance. The force should not be so strong that it modifies the system we want to probe. This corresponds to the first order perturbation of the energy due to the external potential U(x),

\(\Delta E =\int d^3 x {|\psi (x)|}^2 U(x)\)  (1)

This shows that \({|\psi(x)|}^2\) tells us where the system is located. (Here, spin and similar indexes are omitted.)

The argument for the Born rule relies on the fact that one may ignore the presence of the system in regions where the integrated value of the absolute square of the wave function is very small.

In order to have a well defined starting point I have formulated two postulates for Everett's quantum mechanics.

EQM1 The state is a complex function of positions and a discrete index j for spin etc,

\(\Psi = \psi_j (t, x_1, x_2, ...) \)  (2)

Its basic interpretation is given by the fact that the density

\(\rho_j (t, x_1, x_2,...) = {|\psi_j (t, x_1, x_2, ...)|}^2 \)  (3)

answers where the system is in position, spin, etc.

It is absolute square integrable, normalized to one:

\( \int \int···dx_1dx_2 ··· \sum_j {|\psi_j (t, x_1, x_2, ...)|}^2 = 1\)  (4)

This requirement signifies that the system has to be somewhere, not everywhere. If the value of the integral is zero, the system doesn’t exist anywhere.

EQM2 There is a unitary time development of the state, e.g.,

\(i \partial_t \Psi = H\Psi \),

where H is the hermitian Hamiltonian. The term unitary signifies that the value of the left hand side in (4) is constant for any state (2).

Consider the typical measurement where something happens in a reaction and what comes out is collected in an array of detectors, for instance the Stern-Gerlach experiment. Each detector will catch particles that have a certain value of the quantity B we want to measure.

Write the state that enters the array of detectors as a sum of components that enter the individual detectors, \(|\psi \rangle = \sum c_b |b\rangle\), where b is one of the possible values of B. When that state has entered the detectors we can ask: where is it? The answer is that it is distributed over the individual detectors. The distribution is

\(\rho_b = {|c_b|}^2 \)  (5)

This is derived by integrating the density (3) over each detector, using the fact that the states \(|b\rangle\) have support only inside their own detector.

The interaction between \(|\psi \rangle\) and the detector array will cause decoherence. The total system of detector array and \(|\psi \rangle\) splits into separate "worlds" such that the different values b of the quantity B will belong to separate "worlds".

After repeating the measurement N times, the distribution that answers how many times the value \(b=u\) has been measured is

\(\rho(m:N | u) = b(N,m) {(\rho_u)}^{m} {(\rho_{\neg u})}^{N-m} \)  (6)

where \(b(N,m)\) is the binomial coefficient \(N\) over \(m\), and \(\rho_{\neg u}\) is the sum over all \(\rho_b\) except \(b=u\).

The relative frequency \(z=m/N\) is then given by

\(\rho(z|u) \approx \sqrt{N/(2\pi \rho_u \rho_{\neg u})} \exp(-N{(z-\rho_u)}^2/(2\rho_u \rho_{\neg u})) \)  (7)

This approaches a Dirac delta \(\delta(z - \rho_u)\). If the tails of (7) with low integrated value are ignored, we are left with a distribution with \(z \approx \rho_u\). This shows that the observer experiences a relative frequency close to the Born value. Reasonably, the observer will therefore believe in the Born rule.

The palpability of the densities (6) and (7) may be seen by replacing the detectors by a mechanism that captures and holds the system at the different locations. Then, we can measure to what extent the system is at the different locations (4) using an external perturbation (1). In principle, also the distribution from N measurements is directly measurable if we consider N parallel experiments. The relative frequency distribution (7) is then also in principle a directly measurable quantity.

A physicist who believes in the Born rule will use it for statistical inference in quantum experiments. According to the analysis above, this will work just as well as we expect it to when using the Born rule in a single-world theory.

A physicist who believes in a single world will view the Born rule as a law about probabilities. A many-worlder may view it as a rule that can be used for inference about quantum states as if the Born rule is about probabilities.

With my postulates, Everett's quantum mechanics describes the world as we see it. That is what should be discussed. Not whether it pleases anybody or not.

If the reader is interested what to do in a quantum russian roulette situation, I have not much to offer. How to decide your future seems to be a philosophical and psychological question. As a physicist, I don't feel obliged to help you with that.

Per Arve, Stockholm June 24, 2017

Sunday, June 18, 2017

Impressions from FQMT 2017


Poster, FQMT

I just came back from Vaxjo, where I had a marvelous time. It does sound cliché, but this year was the best conference organized by Professor Khrennikov, and I got many pleasant and unexpected surprises.

The conference did have one drawback: every day after the official talks we continued the discussions about quantum mechanics well past midnight at "The Bishops Arms", where we drank too many beers, causing me to gain a few pounds :)

At the conference I had the chance to meet and talk with Philippe Grangier (he worked with Aspect on the famous Bell experiment), and I witnessed him giving the most cogent comments on all the talks, from experimental to theoretical. He even surprised me when he asked, at the end of my presentation, why I am using time to derive the Leibniz identity, when any other symmetry would do. Indeed this is true, but the drawback is that any other symmetry lacks generality later on, during the composition arguments. Suppose we compose two physical systems, one with a given continuous symmetry and another without: the composed system will lack that symmetry. The advantage of using time is that it works in all cases where energy is conserved.

Grangier presented his approach to quantum mechanics reconstruction using contextuality and continuity (as in Hardy's five reasonable axioms paper). The problem with continuity is that it lacks physical intuition/motivation. Why not impose right away the C* condition \(||a^* a|| = {||a||}^2\) and recover everything from it?

Bob Coecke and Aleks Kissinger's book on the pictorial formalism, "Picturing Quantum Processes", was finally ready and was advertised at the conference. If you go to www.cambridge.org/pqp you can get it with a 20% discount by entering the code COECKE2017 at checkout.

Coecke's talk was about causal theories, and his main idea was: "time reversal of any causal theory = eternal noise". This looks deep, but it is really a trivial observation: you cannot extract anything meaningful from, or control, signals which have an information starting point, because the starting point corresponds to the notion of false, and anything is derivable from false.

Robert Raussendorf, from the University of British Columbia in Vancouver, gave a nice talk about measurement-based quantum computation, where measurements are used to control the computation, and he identified a cohomological framework for it.

One surprise talk for me was the one given by Marcus Appleby from the University of Sydney, who presented a framework of equivalence for quantum mechanics between the finite and infinite dimensional cases. This is of particular importance to me, as I recovered quantum mechanics in the finite dimensional case only and I am searching for an approach to handle the infinite dimensional case.

I made new friends there and I got very good advice and ideas - a big thank you. I also got to give many in person presentations of my quantum reconstruction program.

There was one person claiming he had solved the puzzles of the many worlds interpretation. I sat next to him at the conference dinner and I invited him to write a guest post on this blog to present his solution. As a disclaimer, I think MWI lacks a proper notion of probability and I have yet to see a solution, but I am open to new arguments. What I would like to see is an explanation of how to reconcile a 50-50% world split with quantum probabilities of 80-20%. I did not see this explained in his presentation to my satisfaction, but maybe I was not understanding the argument properly.

Saturday, June 10, 2017

Jordan-Banach, Jordan-Lie-Banach, C* algebras, and quantum mechanics reconstruction


This is a short post, written while waiting for my flight at Dulles Airport on my way to Vaxjo, Sweden, for a physics conference.

First, some definitions. A Jordan-Banach algebra is a Jordan algebra with the usual norm properties of a Banach algebra. A Jordan-Lie-Banach algebra is a Jordan-Banach algebra which is at the same time a Lie algebra. A Jordan-Lie algebra is the composability two-product algebra which we obtained earlier using category theory arguments.

Last time I hinted at this week's topic, which is the final step in reconstructing quantum mechanics using category theory arguments. What we obtain from category theory is a Jordan-Lie algebra, which in the finite dimensional case has the spectral properties for free, because the spectrum is uniquely defined in an algebraic fashion (things get very tricky in the infinite dimensional case). So in the finite dimensional case JL=JLB.

But how can we go from a Jordan-Banach algebra to a C* algebra? In general it cannot be done. C* algebras correspond to quantum mechanics, while on the Jordan side we also have the octonionic algebra, which is exceptional and thus cannot be related to quantum mechanics, because the octonions are not associative. However, we can define state spaces for both Jordan-Banach and C* algebras and we can investigate their geometry. The geometry is definable in terms of projector elements, which obey \(a*a = a\); in turn this defines the pure states as the boundary of the state spaces. If the two geometries are identical, we are in luck.

Now the key question is: under what circumstances can we complexify a Jordan-Banach algebra to get a C* algebra?

In nature, observables play a dual role as both observables and generators. In literature this is called dynamic correspondence. Dynamic correspondence is the essential ingredient: any JB algebra with dynamic correspondence is the self-adjoint part of a C* algebra. This result holds in general and can be established by comparing the geometry of the state spaces for JB and C* algebras.

Now for the punch line: a JL algebra comes with dynamic correspondence and I showed that in prior posts. The conclusion is therefore:

in the finite dimensional case: JL is a JLB algebra which gives rise to a C* algebra by complexification and by GNS construction we obtain the standard formulation of quantum mechanics. 

Quantum mechanics is fully reconstructed in the finite dimensional case from physical principles using category theory arguments! 

By the way this is what I'll present at the conference (the entire series on QM reconstruction).

Sunday, June 4, 2017

From composability two-product algebra to quantum mechanics

Last time we introduced the composability two-product algebra consisting of the Lie algebra \(\alpha\) and the Jordan algebra \(\sigma\) along with their compatibility relationship. This structure was obtained by categorical arguments using two natural principles of nature:

- laws of nature are invariant under time evolution
- laws of nature are invariant under system composition

What we did not obtain were spectral properties. However, in the finite dimensional case we do not need spectral properties, and we can fully recover quantum mechanics in this particular case. The trick is to classify all possible two-product algebras, because there are only a handful of them. This is achieved with the help of the Artin-Wedderburn theorem.

First, some preliminaries. We need to introduce a Jordan-Lie-Banach (JLB) algebra by augmenting the composability two-product algebra with spectral properties:
- a JLB-algebra is a composability two-product algebra with the following two additional properties:
  • \(||x\sigma x|| = {||x||}^{2}\)
  • \(||x\sigma x||\leq ||x\sigma x + y\sigma y||\)
Then we can define a C* algebra by complexification of a JLB algebra, where the C* norm is:

\(||a+ib|| = \sqrt{{||a||}^{2}+{||b||}^{2}}\)

Conversely, from a C* algebra we define a JLB algebra as its self-adjoint part, where the Jordan product is:

\(a\sigma b = \frac{1}{2}(ab+ba)\)

and the Lie part is:

\(a\alpha b = \frac{i}{\hbar}(ab-ba)\)
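As a quick sanity check of these definitions, here is a small numerical sketch of my own (not part of the reconstruction itself, and assuming numpy): for random self-adjoint matrices both products land back in the self-adjoint part, and the two JB norm axioms above hold for the operator norm.

```python
import numpy as np

rng = np.random.default_rng(0)
hbar = 1.0

def random_hermitian(n):
    m = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    return (m + m.conj().T) / 2

def sigma(a, b):   # Jordan part: (ab + ba)/2
    return (a @ b + b @ a) / 2

def alpha(a, b):   # Lie part: (i/hbar)(ab - ba)
    return 1j / hbar * (a @ b - b @ a)

def norm(a):       # operator (spectral) norm
    return np.linalg.norm(a, 2)

a, b = random_hermitian(4), random_hermitian(4)

# both products return self-adjoint matrices
assert np.allclose(sigma(a, b), sigma(a, b).conj().T)
assert np.allclose(alpha(a, b), alpha(a, b).conj().T)

# JB norm axioms: ||a sigma a|| = ||a||^2 and ||a sigma a|| <= ||a sigma a + b sigma b||
assert np.isclose(norm(sigma(a, a)), norm(a) ** 2)
assert norm(sigma(a, a)) <= norm(sigma(a, a) + sigma(b, b)) + 1e-12
```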

From the C* algebra we recover the usual quantum mechanics formulation by the GNS construction, which yields:

- a Hilbert space H
- a distinguished vector \(\Omega\) in H arising out of the identity of the C* algebra
- a representation \(\pi\) of the algebra as linear operators on H
- a state \(\omega\) on the C* algebra represented as \(\omega (A) = \langle \Omega, \pi (A)\Omega\rangle_{H}\)

Conversely, from quantum mechanics a C* algebra arises as bounded operators on the Hilbert space.
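In the finite dimensional case the GNS construction is concrete enough to run on a laptop. Here is a minimal toy sketch of my own for the algebra of 2x2 matrices with the state \(\omega(A) = Tr(\rho A)\) given by a full-rank density matrix: the Hilbert space vectors are matrices, the inner product is \(\langle X, Y\rangle = Tr(\rho X^{\dagger}Y)\), \(\Omega\) is the identity, and \(\pi\) acts by left multiplication.

```python
import numpy as np

rho = np.diag([0.8, 0.2])            # a full-rank density matrix defining the state
omega = lambda a: np.trace(rho @ a)  # the state omega(A) = Tr(rho A)

def inner(x, y):                     # GNS inner product <X, Y> = Tr(rho X^dag Y)
    return np.trace(rho @ x.conj().T @ y)

Omega = np.eye(2)                    # cyclic vector coming from the algebra identity
pi = lambda a: (lambda x: a @ x)     # representation: pi(A) X = AX

A = np.array([[1, 2j], [-2j, 3]])    # a self-adjoint test observable
B = np.array([[0, 1], [1, 0]])

# the state is recovered as a vector state: omega(A) = <Omega, pi(A) Omega>
assert np.isclose(omega(A), inner(Omega, pi(A)(Omega)))
# left multiplication is multiplicative: pi(AB) Omega = pi(A) pi(B) Omega
assert np.allclose(pi(A @ B)(Omega), pi(A)(pi(B)(Omega)))
```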

The infinite dimensional case is a much harder open problem. Jumping from the Jordan-Banach operator algebra side to C* and von Neumann algebras is very tricky and involves characterizing the state spaces of operator algebras. Fortunately all this is already settled by the works of Alfsen, Shultz, Stormer, Topping, Hanche-Olsen, Kadison, and Connes.

Sunday, May 21, 2017

The algebraic structure of quantum and classical mechanics


Let's recap what we have derived so far. We started by considering time as a continuous functor and derived the Leibniz identity from it. Then, for a particular kind of time evolution which allows a representation as a product, we derived two products \(\alpha\) and \(\sigma\) for which we obtained the fundamental bipartite relations.

Repeated applications of the Leibniz identity proved that \(\alpha\) is a Lie algebra and \(\sigma\) is a Jordan algebra, and gave an associator identity between them:

\([A,B,C]_{\sigma} + \frac{J^2 \hbar^2}{4}[A,B,C]_{\alpha} = 0\)

where \(J\) is a map between generators and observables encoding Noether's theorem.

Now we can combine the Jordan and Lie products as:

\(\star = \sigma\pm \frac{J \hbar}{2}\alpha\)

and it is not hard to show that this product is associative (pick \(\hbar = 2\) for convenience):

\([f,g,h]_{\star} = (f\sigma g \pm J f\alpha g)\star h - f\star(g\sigma h \pm J g\alpha h) =\)
\((f\sigma g)\sigma h \pm J(f\sigma g)\alpha h \pm J(f\alpha g)\sigma h + J^2 (f\alpha g)\alpha h\)
\(- f\sigma (g\sigma h) \mp J f\sigma (g\alpha h) \mp J f\alpha (g\sigma h) - J^2 f\alpha (g\alpha h) =\)
\([f, g, h]_{\sigma} + J^2 [f, g, h]_{\alpha} \pm J\{(f\sigma g)\alpha h + (f\alpha g)\sigma h - f\sigma (g\alpha h) - f\alpha (g\sigma h)\} = 0\)

because the first part is zero by the associator identity and the second part is zero by applying the Leibniz identity. In the Hilbert space representation the star product is nothing but the complex multiplication of operators in ordinary quantum mechanics.
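To see this concretely, here is a quick numerical check of my own (with the minus sign choice, \(\hbar = 2\), \(J = i\), and the commutator/anticommutator realization given further below in this post): the star product of two Hermitian matrices is exactly their ordinary matrix product.

```python
import numpy as np

hbar = 2.0   # the convenient choice made above
J = 1j       # J^2 = -1 for quantum mechanics

rng = np.random.default_rng(1)

def herm(n):
    m = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    return (m + m.conj().T) / 2

A, B = herm(3), herm(3)

sigma = lambda a, b: (a @ b + b @ a) / 2            # Jordan product
alpha = lambda a, b: 1j / hbar * (a @ b - b @ a)    # Lie product

star = lambda a, b: sigma(a, b) - (J * hbar / 2) * alpha(a, b)

# with the minus sign the star product is plain matrix multiplication,
# hence associative (the plus sign gives the opposite product BA)
assert np.allclose(star(A, B), A @ B)
```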

Now we can introduce the algebraic structure of quantum (and classical) mechanics:

A composability two-product algebra is a real vector space equipped with two bilinear maps \(\sigma \) and \(\alpha \) such that the following conditions apply:

- \(\alpha \) is a Lie algebra,
- \(\sigma\) is a Jordan algebra,
- \(\alpha\) is a derivation for \(\sigma\) and \(\alpha\),
- \([A, B, C]_{\sigma} + \frac{J^2 \hbar^2}{4} [A, B, C]_{\alpha} = 0\),
where \(J \rightarrow (-J)\) is an involution mapping generators and observables, \(1\alpha A = A\alpha 1 = 0\), and \(1\sigma A = A\sigma 1 = A\)

For quantum mechanics \(J^2 = -1\). In the finite dimensional case the composability two-product algebra is enough to recover the full formalism of quantum mechanics by using the Artin-Wedderburn theorem.

The same structure applies to classical mechanics with only one change: \(J^2 = 0\).

In the classical mechanics case, in phase space, the usual Poisson bracket representation of the product \(\alpha\) can be constructively derived from the above:
\(f\alpha g = \{f,g\} = f \overset{\leftrightarrow}{\nabla} g = \sum_{i=1}^{n} \frac{\partial f}{\partial q^i} \frac{\partial g}{\partial p_i} - \frac{\partial f}{\partial p_i} \frac{\partial g}{\partial q^i}\)

and the product \(\sigma\) is then ordinary function multiplication.
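Here is a small symbolic check of my own, assuming sympy, that this bracket has the properties we derived abstractly (antisymmetry, the Leibniz rule, and the Jacobi identity) on sample functions of one degree of freedom:

```python
import sympy as sp

q, p = sp.symbols('q p')

def poisson(f, g):  # {f, g} = f_q g_p - f_p g_q
    return sp.diff(f, q) * sp.diff(g, p) - sp.diff(f, p) * sp.diff(g, q)

f, g, h = q**2 * p, q * p**3, q + p**2   # arbitrary test functions

# antisymmetry
assert sp.simplify(poisson(f, g) + poisson(g, f)) == 0
# Leibniz rule over ordinary multiplication
assert sp.simplify(poisson(f, g * h) - poisson(f, g) * h - g * poisson(f, h)) == 0
# Jacobi identity
assert sp.simplify(poisson(f, poisson(g, h)) + poisson(h, poisson(f, g))
                   + poisson(g, poisson(h, f))) == 0
```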

In the quantum mechanics case, in the Hilbert space representation, we have the commutator and the Jordan product:

\(A\alpha B = \frac{i}{\hbar}  (AB - BA)\)
\(A\sigma B = \frac{1}{2} (AB + BA)\)

or in the phase space representation the Moyal and cosine brackets:

\(\alpha = \frac{2}{\hbar}\sin (\frac{\hbar}{2} \overset{\leftrightarrow}{\nabla})\)
\(\sigma = \cos (\frac{\hbar}{2} \overset{\leftrightarrow}{\nabla})\)

where the associative product is the star product.
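As an illustration of my own (assuming sympy, and truncating the sine and cosine series at their first two terms), one can check that in the limit \(\hbar \rightarrow 0\) the Moyal bracket reduces to the Poisson bracket and the cosine bracket to ordinary function multiplication:

```python
import sympy as sp

q, p, hbar = sp.symbols('q p hbar')

def bidiff(f, g, n):
    # n-th power of the bidirectional operator:
    # sum_k C(n,k) (-1)^k (d_q^{n-k} d_p^k f)(d_q^k d_p^{n-k} g)
    return sum(sp.binomial(n, k) * (-1)**k
               * sp.diff(sp.diff(f, q, n - k), p, k)
               * sp.diff(sp.diff(g, q, k), p, n - k)
               for k in range(n + 1))

def moyal_alpha(f, g):   # (2/hbar) sin((hbar/2) nabla), truncated after the cubic term
    x = hbar / 2
    return (2 / hbar) * (x * bidiff(f, g, 1) - x**3 / 6 * bidiff(f, g, 3))

def cosine_sigma(f, g):  # cos((hbar/2) nabla), truncated after the quadratic term
    x = hbar / 2
    return bidiff(f, g, 0) - x**2 / 2 * bidiff(f, g, 2)

f, g = q**3 * p**2, q**2 * p**3
poisson = sp.diff(f, q) * sp.diff(g, p) - sp.diff(f, p) * sp.diff(g, q)

# alpha = Poisson bracket + O(hbar^2), sigma = f*g + O(hbar^2)
assert sp.expand(moyal_alpha(f, g) - poisson).subs(hbar, 0) == 0
assert sp.expand(cosine_sigma(f, g) - f * g).subs(hbar, 0) == 0
```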

Update: Memorial Day holiday interfered with this week's post. I was hoping to make it back home on time to write it today, but I got stuck on horrible traffic for many hours. I'll postpone the next post for a week.

Monday, May 15, 2017

The Jordan algebra of observables


Last time, from concrete representations of the products \(\alpha\) and \(\sigma\) we derived this identity:

\([A,B,C]_{\sigma} + \frac{i^2 \hbar^2}{4}[A,B,C]_{\alpha} = 0\)

Let's use this in the particular case \(C = A\sigma A\). What does the left hand side say?

\([A,B,C]_{\sigma} = (A\sigma B) \sigma (A\sigma A) - A\sigma (B \sigma (A \sigma A))\)

which, if we drop \(\sigma\) for convenience, reads:

\((AB)(AA) - A(B(AA))\)

If the right hand side is zero then we get the Jordan identity:

\((xy)(xx) = x(y(xx))\) where \(xy = yx\)

Now let's compute the right hand side and show it is indeed zero:

\([A,B,A\sigma A]_{\alpha} =  (A\alpha B) \alpha (A\sigma A) - A\alpha (B \alpha (A \sigma A))\)

Using Leibniz identity in the second term we get:

\((A\alpha B) \alpha (A\sigma A) - (A\alpha B) \alpha (A\sigma A) - B \alpha (A\alpha (A\sigma A)) = - B \alpha (A\alpha (A\sigma A))\)

But \(A\alpha (A\sigma A) = 0 \) because

\(A\alpha (A\sigma A) = (A\alpha A) \sigma A + A\sigma (A\alpha A) \)

and \(A\alpha A = -A\alpha A = 0\) by skew symmetry.

Therefore, due to the associator identity, the product \(\sigma\) is a Jordan algebra. Now we need to arrive at the associator identity using only the ingredients derived so far. This is tedious but it can be done using only the Jacobi and Leibniz identities. Grgin and Petersen derived it in 1976 and you can see the proof here
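For the skeptical reader, both facts used above are easy to confirm numerically in the concrete commutator/anticommutator realization; a small numpy sketch of my own:

```python
import numpy as np

rng = np.random.default_rng(2)

def herm(n):
    m = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    return (m + m.conj().T) / 2

sigma = lambda a, b: (a @ b + b @ a) / 2     # Jordan product
alpha = lambda a, b: 1j * (a @ b - b @ a)    # Lie product (hbar = 1)

A, B = herm(4), herm(4)

# A alpha (A sigma A) = 0, the key step in the derivation above
assert np.allclose(alpha(A, sigma(A, A)), 0)

# Jordan identity: (A sigma B) sigma (A sigma A) = A sigma (B sigma (A sigma A))
lhs = sigma(sigma(A, B), sigma(A, A))
rhs = sigma(A, sigma(B, sigma(A, A)))
assert np.allclose(lhs, rhs)
```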

The associator identity is better written as:

\([A,B,C]_{\sigma} + \frac{J^2 \hbar^2}{4}[A,B,C]_{\alpha} = 0\)

where \(J\) is a map from the product \(\alpha\) to the product \(\sigma\). The existence of this map is equivalent to Noether's theorem. It just happens that in the quantum mechanics case \(J^2 = -1\) and the imaginary unit maps anti-Hermitean generators to Hermitean observables.

In the classical physics case \(J^2 = 0\), which means that the product \(\sigma\) is associative (in fact it is ordinary function multiplication) and the product \(\alpha\) can be proven to be the Poisson bracket, but that is a topic for another day as we continue to derive the mathematical structure of quantum mechanics. Please stay tuned.

Sunday, May 7, 2017

Lie, Jordan algebras and the associator identity


Before I continue the quantum mechanics algebraic series, I want to first state my happiness about the defeat of the far (alt)-right candidate in France despite Putin's financial and hacking support. Europe has much better antibodies than the US against scum like Trump. In the US the disease caused by the inoculation of hate perpetuated over many years by Fox News has to run its course before things will get better.

Back to physics. First I will show that the product \(\alpha\) is indeed a Lie algebra. This is utterly trivial because we only need to show antisymmetry and the Jacobi identity:

\(a\alpha b = -b\alpha a\)
\(a\alpha (b\alpha c) + c\alpha (a\alpha b) + b\alpha (c\alpha a) = 0\)

We already know that the product \(\alpha\) is antisymmetric, and we know that it obeys the Leibniz identity:

\(a\alpha (b\circ c) =  (a\alpha b) \circ c + b\circ (a\alpha c) \)

where \(\circ\) can stand for either \(\alpha\) or \(\sigma\). When \(\circ = \alpha\) we get:

\(a\alpha (b\alpha c) =  (a\alpha b) \alpha c + b\alpha (a\alpha c) \)

which by antisymmetry becomes

\(a\alpha (b\alpha c) = - c \alpha (a\alpha b) - b\alpha (c\alpha a) \)

In other words, the Jacobi identity.

Therefore the product \(\alpha\) is in fact a Lie algebra. Now we want to prove that the product \(\sigma\) is a Jordan algebra.

This is not as simple as the Lie algebra proof, and we will do it with the help of a new concept: the associator. Let us first define it. The associator of an arbitrary product \(\circ\) is defined as follows:

\([a,b,c]_{\circ} = (a\circ b)\circ c - a\circ (b\circ c)\)

as such it measures the lack of associativity. 

It is helpful now to look at the concrete realizations of the products \(\alpha\) and \(\sigma\) in quantum mechanics to know where we want to arrive. In quantum mechanics the product alpha is the commutator and the product sigma is the anticommutator:

\(A \alpha B = \frac{i}{\hbar}[A,B] = \frac{i}{\hbar}(AB - BA)\)
\(A\sigma B = \frac{1}{2}\{A, B\} = \frac{1}{2}(AB+BA)\)

Let's compute the alpha and sigma associators:

\([A,B,C]_{\alpha} = \frac{-1}{\hbar^2}([AB-BA, C] - [A, BC-CB]) = \)
\(=\frac{-1}{\hbar^2}(ABC-BAC-CAB+CBA - ABC+ACB+BCA-CBA)\)
\(= \frac{-1}{\hbar^2}(-BAC-CAB +ACB+BCA)\)


\([A,B,C]_{\sigma} = \frac{1}{4}(\{AB+BA, C\} - \{A, BC+CB\}) = \)
\(=\frac{1}{4}(ABC+BAC+CAB+CBA - ABC-ACB-BCA-CBA) = \)
\(=\frac{1}{4}(BAC+CAB -ACB-BCA)  \)

and so we have the remarkable relationship:

\([A,B,C]_{\sigma} + \frac{i^2 \hbar^2}{4}[A,B,C]_{\alpha} = 0\)

What is remarkable is that the Jordan and Lie algebras fail associativity in precisely the same way, and because of this they can later be combined into a single operation. The identity above also holds the key for proving the Jordan identity.
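The identity is also easy to test numerically; here is a quick check of my own with random Hermitian matrices (taking \(\hbar = 1\)):

```python
import numpy as np

rng = np.random.default_rng(3)
hbar = 1.0

def herm(n):
    m = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    return (m + m.conj().T) / 2

alpha = lambda a, b: 1j / hbar * (a @ b - b @ a)
sigma = lambda a, b: (a @ b + b @ a) / 2

def associator(a, b, c, product):
    return product(product(a, b), c) - product(a, product(b, c))

A, B, C = herm(4), herm(4), herm(4)

# [A,B,C]_sigma + (i^2 hbar^2 / 4) [A,B,C]_alpha = 0
lhs = associator(A, B, C, sigma) - hbar**2 / 4 * associator(A, B, C, alpha)
assert np.allclose(lhs, 0)
```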

Next time I'll show how to derive the identity above using only the ingredients we proved so far and then I'll show how Jordan identity arises out of it. Please stay tuned.

Sunday, April 30, 2017

The origin of the symmetries of the quantum products


Quantum mechanics has three quantum products: 
  • the Jordan product of observables
  • the commutator product used for time evolution
  • the complex number multiplication of operators 
The last product is a composite construction of the first two, so it is enough to study the Jordan product and the commutator. In the notation of prior posts, the Jordan product is called \(\sigma\) and the commutator is called \(\alpha\). We will derive their full properties using category theory arguments and the Leibniz identity. But before doing this, I want to briefly review the two products. The commutator is well known and I will not spend time on it. Instead I will give the motivation for the Jordan product.

In quantum mechanics the observables are represented as self-adjoint operators: \(O = O^{\dagger}\). If we want to create another self-adjoint operator out of two self-adjoint operators A and B, simple multiplication won't work because \((AB)^{\dagger} = B^{\dagger} A^{\dagger} = BA \ne AB\). The solution is a symmetrized product: \(A\sigma B = (AB+BA)/2\). A lot of the quantum mechanics formalism transfers to the Jordan algebra of observables, but this is a relatively forgotten approach because it is rather cumbersome (the Jordan product is not associative, only power associative) and, as expected, it does not produce any predictions different from the standard formalism based on complex numbers.
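A tiny numpy illustration of my own of both points, the need for symmetrization and the lack of associativity:

```python
import numpy as np

A = np.array([[1, 1j], [-1j, 2]])    # self-adjoint
B = np.array([[0, 2], [2, 1]])       # self-adjoint
C = np.array([[3, 1j], [-1j, 0]])    # self-adjoint

sigma = lambda x, y: (x @ y + y @ x) / 2

# the plain product AB is not self-adjoint, the symmetrized product is
assert not np.allclose((A @ B).conj().T, A @ B)
assert np.allclose(sigma(A, B).conj().T, sigma(A, B))

# the Jordan product is not associative in general
assert not np.allclose(sigma(sigma(A, B), C), sigma(A, sigma(B, C)))
```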

Now back to obtaining the symmetry properties of the Jordan product \(\sigma\) and the commutator \(\alpha\). At first we cannot say anything about the symmetry of the product \(\sigma\). However, we do know that the product \(\alpha\) obeys the Leibniz identity. We have already used it to derive the fundamental composition relationships, so what else can we do? We can apply it to a bipartite system:

\(f_{12}\alpha_{12}(g_{12}\alpha_{12}h_{12}) = g_{12}\alpha_{12}(f_{12}\alpha_{12}h_{12}) + (f_{12}\alpha_{12}g_{12})\alpha_{12} h_{12}\)

where

\(\alpha_{12} = \alpha\otimes \sigma + \sigma\otimes\alpha\)

Now the key observation is that on the right hand side \(f\) and \(g\) appear in reverse order. Remember that the functions involved in the relationship above are free of constraints, and judicious picks of their values lead to great simplifications because \(1 \alpha f = f\alpha 1 = 0\). The computation is tedious and I will skip it, but what you get in the end is this:

\(f_1\alpha h_1 \otimes [f_2 \alpha g_2 + g_2 \alpha f_2 ] = 0\)

which means that the product alpha is antisymmetric: \(f\alpha g = -g\alpha f\).

If we use this property in the fundamental bipartite relationship, we obtain in turn that the product sigma is symmetric: \(f\sigma g = g\sigma f\).

Next time we will prove that \(\alpha\) is a Lie algebra and that \(\sigma\) is a Jordan algebra. Please stay tuned.

Sunday, April 16, 2017

The fundamental bipartite relations


Continuing from where we left off last time, we introduced the most general composite products for a bipartite system:

\(\alpha_{12} = a_{11}\alpha \otimes \alpha + a_{12} \alpha\otimes\sigma + a_{21} \sigma\otimes \alpha + a_{22} \sigma\otimes\sigma\)
\(\sigma_{12} = b_{11}\alpha \otimes \alpha + b_{12} \alpha\otimes\sigma + b_{21} \sigma\otimes \alpha + b_{22} \sigma\otimes\sigma\)

The question now becomes: are the \(a\)'s and \(b\)'s free parameters, or can we say something about them? To start, let's normalize the product \(\sigma\) like this:

\(f\sigma I = I\sigma f = f\)

which can always be done. Now in:

\((f_1 \otimes f_2)\alpha_{12}(g_1\otimes g_2) = \)
\(=a_{11}(f_1 \alpha g_1)\otimes  (f_2 \alpha g_2) + a_{12}(f_1 \alpha g_1) \otimes (f_2 \sigma g_2 ) +\)
\(+a_{21}(f_1 \sigma g_1)\otimes  (f_2 \alpha g_2) + a_{22}(f_1 \sigma g_1) \otimes (f_2 \sigma g_2 )\)

if we pick \(f_1 = g_1 = I\) :

\((I \otimes f_2)\alpha_{12}(I\otimes g_2) = \)
\(=a_{11}(I \alpha I)\otimes  (f_2 \alpha g_2) + a_{12}(I \alpha I) \otimes (f_2 \sigma g_2 ) +\)
\(+a_{21}(I \sigma I)\otimes  (f_2 \alpha g_2) + a_{22}(I \sigma I) \otimes (f_2 \sigma g_2 )\)

and recalling from last time that \(I\alpha I = 0\) by the Leibniz identity, we get:

\(f_2 \alpha g_2 = a_{21} (f_2 \alpha g_2 ) + a_{22} (f_2 \sigma g_2)\)

which demands \(a_{21} = 1\) and \(a_{22} = 0\).

If we make the same substitution into:

 \((f_1 \otimes f_2)\sigma_{12}(g_1\otimes g_2) = \)
\(=b_{11}(f_1 \alpha g_1)\otimes  (f_2 \alpha g_2) + b_{12}(f_1 \alpha g_1) \otimes (f_2 \sigma g_2 ) +\)
\(+b_{21}(f_1 \sigma g_1)\otimes  (f_2 \alpha g_2) + b_{22}(f_1 \sigma g_1) \otimes (f_2 \sigma g_2 )\)

we get:

\(f_2 \sigma g_2 = b_{21} (f_2 \alpha g_2 ) + b_{22} (f_2 \sigma g_2)\)

which demands \(b_{21} = 0\) and \(b_{22} = 1\).

We can play the same game with \(f_2 = g_2 = I\) and (skipping the trivial details) we get two additional conditions: \(a_{12} = 1\) and \(b_{12} = 0\).

In coproduct notation what we get so far is:

\(\Delta (\alpha) = \alpha \otimes \sigma + \sigma \otimes \alpha + a_{11} \alpha \otimes \alpha\)
\(\Delta (\sigma) = \sigma \otimes \sigma + b_{11} \alpha \otimes \alpha\)

By applying the Leibniz identity on a bipartite system, one can show after some tedious computations that \(a_{11} = 0\). The only remaining free parameter is \(b_{11}\), which can be normalized to either -1, 0, or +1 (elliptic, parabolic, or hyperbolic). Each choice corresponds to a potential theory of nature: 0 corresponds to classical mechanics and -1 to quantum mechanics.

Elliptic composability is quantum mechanics! The bipartite products obey:


\(\Delta (\alpha) = \alpha \otimes \sigma + \sigma \otimes \alpha \)
\(\Delta (\sigma) = \sigma \otimes \sigma - \alpha \otimes \alpha\)

Please notice the similarity with complex number multiplication. This is why complex numbers play a central role in quantum mechanics.
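We can verify this claim directly in the quantum mechanics realization; here is a numpy sketch of my own using the \(\hbar = 2\) convention (which makes the coefficient of \(\alpha \otimes \alpha\) exactly -1): the elliptic composite products applied to product operators reproduce the commutator and the Jordan product on the tensor product space.

```python
import numpy as np

rng = np.random.default_rng(4)
hbar = 2.0   # convention making b11 exactly -1

def herm(n):
    m = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    return (m + m.conj().T) / 2

alpha = lambda a, b: 1j / hbar * (a @ b - b @ a)
sigma = lambda a, b: (a @ b + b @ a) / 2

A1, B1, A2, B2 = herm(2), herm(2), herm(2), herm(2)
X, Y = np.kron(A1, A2), np.kron(B1, B2)

# Delta(alpha) = alpha x sigma + sigma x alpha
alpha12 = (np.kron(alpha(A1, B1), sigma(A2, B2))
           + np.kron(sigma(A1, B1), alpha(A2, B2)))
# Delta(sigma) = sigma x sigma - alpha x alpha
sigma12 = (np.kron(sigma(A1, B1), sigma(A2, B2))
           - np.kron(alpha(A1, B1), alpha(A2, B2)))

assert np.allclose(alpha12, alpha(X, Y))   # composite commutator recovered
assert np.allclose(sigma12, sigma(X, Y))   # composite Jordan product recovered
```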

At the moment the two products do not obey any other properties, but we can continue this line of argument and prove their symmetry and antisymmetry. From there we can derive their complete properties, arriving constructively at the standard formulation of quantum mechanics. Please stay tuned.