Problems vs. Puzzles

  • Dark matter is a problem.
  • Dark energy is a problem.
  • Neutrino masses are a problem.
  • The observed abundance of matter over antimatter is a problem.
  • Quantum gravity is a problem.

In contrast:

  • The hierarchy “problem” is a puzzle.
  • The strong CP “problem” is a puzzle.
  • The spectrum of fermion masses and mixing angles in the Standard Model is a puzzle.
  • The cosmological flatness “problem” is a puzzle.
  • The quantization of electric charge is a puzzle.
  • The interpretation of Quantum Mechanics is a puzzle.
  • The question of why there are exactly three spatial dimensions is a puzzle.
  • The question of why there are exactly three fermion families is a puzzle.
  • The question of why the standard model gauge symmetry is $SU(3) \times SU(2) \times U(1)$ is a puzzle.
  • The strengths of the fundamental forces are a puzzle.

A problem is an inconsistency in a given theory (observation vs. theory or within a theory). A puzzle is something that seems to require an explanation.

Problems need solutions, puzzles don’t. Nature doesn’t care about what we find strange. We can ignore puzzles and nothing goes wrong.

So it seems obvious that we should focus on real problems. They are “the most promising route to progress”.

But there are rather simple solutions for most problems.

  • We can explain dark matter using one or more new particles which we add to the particle zoo.
  • We can explain neutrino masses using right-handed neutrinos.
  • We can explain dark energy using the cosmological constant.
  • We can explain the baryon asymmetry using leptogenesis.

Of course, it is possible to come up with alternative solutions. But these are usually baroque and feel like taking a sledgehammer to crack a nut. Thus, as long as the simplest solutions aren’t experimentally excluded, it makes sense to stick with them. (Cf. Occam’s razor.)

But unfortunately, experimental confirmation of these simple solutions wouldn’t be a huge step forward. The discovery of a dark matter particle or right-handed neutrinos wouldn’t lead to a paradigm shift. In some sense, they would simply be new facts which we add to the list of what we know about nature.

Formulated differently, working on problems is a safe bet with a rather small expected return on investment. Progress is more or less guaranteed because there has to be a solution, but will probably be quite small in the grand scheme of things.

And that’s why puzzles are interesting.

They have the potential to lead to paradigm shifts. But they are a lot riskier because it isn’t even clear that a solution exists or is needed at all.

  • An explanation why there are exactly three spatial dimensions possibly requires an understanding of how spacetime itself emerges.
  • A complete explanation of the fermion spectrum possibly requires a completely new perspective on what elementary particles are.
  • An explanation of the standard model gauge symmetry probably requires a new understanding of how symmetries emerge in nature.
  • A resolution of the hierarchy puzzle probably requires a new way to understand why fundamental parameters have the values they have.

And that’s why I wouldn’t dismiss puzzles as bad problems. They are in a different category. It’s important to keep them separate. But it makes sense to think about them analogous to how it makes sense for some people to invest in startups and not only in established big companies.

A Mystery Called Wick Rotation, or: Can We Understand the “Action” Formalism?

The Wick rotation pops up as a “mere technical trick” in quantum field theoretical calculations. Making the time coordinate complex $t \to i \tau$ is described as “analytic continuation” and helps to solve integrals. Certainly, there is nothing deep behind this technical trick, right?
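To recall what kind of manipulation this is, the textbook example (nothing specific to quantum field theory) is the Gaussian integral

$$ \int_{-\infty}^{\infty} e^{-a x^2}\, dx = \sqrt{\frac{\pi}{a}}, \qquad \operatorname{Re}(a) > 0 . $$

The oscillatory integrals appearing in path integrals look like $\int e^{i b x^2}\, dx$ and don’t converge absolutely, but analytically continuing $a \to -ib$ in the formula above gives $\int_{-\infty}^{\infty} e^{i b x^2}\, dx = \sqrt{\pi/b}\, e^{i\pi/4}$. The Wick rotation is this kind of continuation, applied to the time variable of the whole path integral.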

Well, I’m no longer so sure.

There is one observation that makes me (and others) wonder:

The difference between the mystical theory of quantum fields and ordinary statistical mechanics is a Wick rotation $t \to i/(kT)$.

This is puzzling. On the one hand, we have ordinary statistical mechanics, which we understand perfectly well. When we want to make a statement about a system where we don’t know all the details, we invoke the principle of maximum entropy and obtain the famous Boltzmann distribution $\propto \exp(-E/T)$. This distribution tells us the probability of finding the system in a given macroscopic state, depending on its energy $E$. There is nothing mysterious about this. The principle of maximum entropy is simply an optimal guessing strategy in situations where we don’t know all the details. (If you don’t know this perspective on entropy, you can read about it, for example, here.) This interpretation, due to Jaynes, and the derivation of the Boltzmann distribution are completely satisfactory. It is no exaggeration to say that we understand statistical mechanics.

On the other hand, we have the mysterious “probability distribution” in quantum field theory that is known as the path integral. I know no one who claims to understand why it works. The path integral weight is proportional to $\exp(iS/\hbar)$, where $S$ denotes the action, and even Feynman admitted:

“I don’t know what action is.”

In his book QED, when he talks about the path integral, he writes:

Will you understand what I’m going to tell you? …No, you’re not going to be able to understand it. … I don’t understand it. Nobody does.

It seems as if all the mysteries of the quantum world are encapsulated in a simple Wick rotation.

I’ve been searching for quite a while but haven’t been able to find any satisfactory discussion of this curious fact.

There is some “recent” work by John Baez, which he described on his blog and in a paper. He also tried to make sense of Wick rotation by applying it to a classical mechanics example (see the homework “A spring in imaginary time” here and, additionally, the discussion here). The lesson there was that replacing time by “imaginary time” in Lagrangian mechanics turns dynamics problems involving a point particle into statics problems involving a spring. In addition, several years ago Peter Woit tried to highlight the confusion surrounding Wick rotations in a blog post. He wrote:

“I’ve always thought this whole confusion is an important clue that there is something about the relation of QFT and geometry that we don’t understand. Things are even more confusing than just worrying about Minkowski vs. Euclidean metrics. To define spinors, we need not just a metric, but a spin connection. In Minkowski space this is a connection on a Spin(3,1)=SL(2,C) bundle, in Euclidean space on a Spin(4)=SU(2)xSU(2) bundle, and these are quite different things, with associated spinor fields with quite different properties. So the whole “Wick Rotation” question is very confusing even in flat space-time when one is dealing with spinors.”

However, apart from that, there don’t seem to be any good discussions of the “meaning” of a Wick rotation, and I still don’t know what to make of it. Yet it seems clear that something very deep must be going on here. If the Boltzmann distribution can be understood perfectly by invoking the “best guess” strategy known as maximum entropy, does the path integral have a similar origin? Probably, but so far no one has been able to find it.

In statistical mechanics, our best guess for the macroscopic state our system is in is the state that can be realized through the maximal number of microscopic states. This state is known as the state with maximum entropy. Many microscopic details make no difference for the macroscopic properties, and therefore many microscopic configurations lead to the same macroscopic state. We don’t know all the microscopic details but want to make a statement about the macroscopic properties. Hence we must use the best-guess approach, and the best guess is the macroscopic state with maximum entropy.
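This best-guess procedure can even be checked numerically. Here is a minimal sketch (the energy levels and the target mean energy are made-up illustration values): fix the mean energy, solve for the Lagrange multiplier $\beta$ by bisection, and the maximum-entropy distribution that comes out is exactly the Boltzmann form $p_i \propto e^{-\beta E_i}$, beating any other distribution with the same mean energy:

```python
import math

# Made-up discrete energy levels and a target mean energy (illustration only).
E = [0.0, 1.0, 2.0, 3.0]
E_mean_target = 1.0

def boltzmann(beta):
    """Distribution with p_i proportional to exp(-beta * E_i)."""
    w = [math.exp(-beta * e) for e in E]
    Z = sum(w)
    return [wi / Z for wi in w]

def mean_energy(beta):
    return sum(pi * e for pi, e in zip(boltzmann(beta), E))

def entropy(p):
    return -sum(pi * math.log(pi) for pi in p if pi > 0)

# Solve mean_energy(beta) = E_mean_target by bisection
# (the mean energy decreases monotonically with beta).
lo, hi = 0.0, 50.0
for _ in range(200):
    mid = (lo + hi) / 2
    if mean_energy(mid) > E_mean_target:
        lo = mid
    else:
        hi = mid
beta = (lo + hi) / 2
p = boltzmann(beta)

# Any other distribution with the same mean energy has lower entropy,
# e.g. this hand-picked one (its mean energy is also 1.0):
q = [0.5, 0.1, 0.3, 0.1]
print("maxent entropy:", round(entropy(p), 4), " alternative:", round(entropy(q), 4))
```

The Lagrange-multiplier solution of the constrained maximization is unique, which is why the bisection lands on the one Boltzmann distribution compatible with the observed mean energy.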

Aren’t we doing something similar in quantum theory? We admit that we don’t know the fundamental microscopic dynamics. We don’t know which path a given particle takes from point $A$ to point $B$. Nevertheless, when pressed, we make a guess. Our best guess is the path with extremal action.

The observation by John Baez mentioned above that a Wick rotation connects a static description, like in statistical mechanics, with a dynamical description, like in quantum field theory, seems to make sense from this perspective.
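Baez’s observation follows from one substitution. Start from the action of a point particle, $S = \int dt \left[ \frac{m}{2} \left( \frac{dq}{dt} \right)^2 - V(q) \right]$, and set $t = -i\tau$, so that $dt = -i\, d\tau$ and $\left( \frac{dq}{dt} \right)^2 = -\left( \frac{dq}{d\tau} \right)^2$:

$$ S = \int (-i\, d\tau) \left[ -\frac{m}{2} \left( \frac{dq}{d\tau} \right)^2 - V(q) \right] = i \int d\tau \left[ \frac{m}{2} \left( \frac{dq}{d\tau} \right)^2 + V(q) \right] \equiv i S_E . $$

Hence $e^{iS/\hbar} = e^{-S_E/\hbar}$, and $S_E$, with its relative plus sign, is precisely the potential energy of a static elastic string (a “spring”) sitting in the potential $V$: extremizing it is a statics problem.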

Some nice thoughts in this direction are collected by Tommaso Toffoli in his two papers “What Is the Lagrangian Counting?” and “Action, Or the Fungibility of Computation”. For example, he writes:

just as entropy measures, on a log scale, the number of possible microscopic states consistent with a given macroscopic description, so I argue that action measures, again on a log scale, the number of possible microscopic laws consistent with a given macroscopic behaviour. If entropy measures in how many different states you could be in detail and still be substantially the same, then action measures how many different recipes you could follow in detail and still behave substantially the same.

In addition, I think there could be a connection to recent attempts to understand quantum theory as an extended probability theory in which we allow negative probabilities. This line of thought leads to complex probability amplitudes like the ones we know from quantum theory. For a nice introduction to this perspective on quantum theory, see this lecture by Scott Aaronson. Interestingly, he argues that this extension of probability theory is all we need to derive quantum theory. Again, the switch to complex quantities seems to make all the difference.

I think this is a good example of an obvious but, so far, not sufficiently understood connection that could yield deep insights into the quantum world. I usually write about things where I think I have understood something. Here, however, I mainly wrote to organize my thoughts and as a reminder to think more about this in the future.

To finish, here is an (incomplete) list of other places where Wick rotations are crucial:

1.) We use a Wick rotation to classify all irreducible representations of the Lorentz group. In this context, the Wick rotation is often called “Weyl’s unitary trick”.
2.) A Wick rotation is used to analyze tunneling phenomena, like, for example, the famous instanton solutions in QFT.
3.) People who consider QFT at finite temperatures make heavy use of Wick rotations.
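Point 3.) can be illustrated concretely in the simplest possible setting, a single harmonic oscillator (i.e. “QFT” in zero space dimensions). After a Wick rotation, the finite-temperature partition function $Z = \mathrm{Tr}\, e^{-H/(kT)}$ becomes a Euclidean path integral over paths that are periodic in imaginary time with period $\beta = 1/(kT)$. Below is a minimal sketch (units $\hbar = k = m = 1$; the values of $\omega$, $\beta$ and the number of lattice points are arbitrary illustration choices). The discretized Gaussian integral reduces to a product over the eigenvalues of a circulant matrix:

```python
import math

omega = 1.0     # oscillator frequency
beta = 2.0      # inverse temperature 1/(kT)
N = 1000        # number of imaginary-time lattice points
eps = beta / N  # lattice spacing in imaginary time

# Discretized Euclidean path integral with periodic boundary conditions:
# the Gaussian integral over x_0 ... x_{N-1} evaluates to
#   prod_k [4 sin^2(pi k / N) + (eps*omega)^2]^(-1/2),
# computed here as a sum of logs to avoid under/overflow.
log_Z = -0.5 * sum(
    math.log(4 * math.sin(math.pi * k / N) ** 2 + (eps * omega) ** 2)
    for k in range(N)
)
Z_lattice = math.exp(log_Z)

# Exact canonical result: Z = sum_n exp(-beta*omega*(n + 1/2)) = 1/(2 sinh(beta*omega/2))
Z_exact = 1.0 / (2.0 * math.sinh(beta * omega / 2.0))

print("lattice:", Z_lattice)
print("exact:  ", Z_exact)
```

Already at $N = 1000$ the lattice result reproduces the exact partition function to high accuracy; the only ingredient that made the calculation “thermal” was the periodic imaginary-time direction of circumference $\beta$.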

Demystifying the Hierarchy Problem

There is no hierarchy problem within the Standard Model itself. The Standard Model has only one scale, the electroweak scale, and with no hierarchy there can’t be a hierarchy problem. But of course there are good reasons to believe that the Standard Model is incomplete, and almost inevitably, if you introduce new physics at a higher scale, you get problems with hierarchies. Formulated more technically: only when the cutoff $\Lambda$ is physical do we have a real hierarchy problem.

However, whenever someone starts talking about the hierarchy problem, you should ask: which one?

  • There is a tree-level hierarchy problem which you get in many extensions of the Standard Model. As an example, consider GUT models. Here the Standard Model gauge symmetry is embedded into some larger symmetry group. This group breaks at an extremely high scale, and the remnant symmetry is what we call the Standard Model gauge symmetry. If you now write down the Higgs potential for a GUT model, the standard assumption is that all parameters in this potential are of the order of the GUT scale because, well, there isn’t any other scale, and we need to produce a GUT-scale vacuum expectation value. The mystery is then how the, in comparison, tiny vacuum expectation value of the electroweak Higgs comes about. In a GUT, the Standard Model Higgs lives in the same representation as several superheavy scalars. The superheavy masses of these scalars are no problem if we assume that all parameters in the GUT Higgs potential are extremely large. But somehow these parameters must cancel to yield the tiny mass of the Standard Model Higgs. If you write down two random large numbers, it’s extremely unlikely that they cancel so precisely that you get a tiny result. Such a cancellation needs an explanation, and this is what people call the tree-level hierarchy problem. The prefix “tree-level” refers to the fact that no loops are involved; the problem arises solely from the tree-level Higgs potential.
  • But there is also a hierarchy problem which has to do with loops, i.e. higher orders in perturbation theory. The main observation is that our bare Higgs mass $m$ (the parameter in the Lagrangian) gets modified once we move beyond tree level. While this happens for all particles, it leads to a puzzle for scalar particles like the Higgs boson, because here the loop corrections are directly proportional to the square of the cutoff scale, $\Lambda^2$. Concretely, the physical Higgs mass we can measure is given by $$ m^2_P = m^2 + \sigma (m^2) +\ldots , $$ where $m$ is the bare mass, $m_P$ the physical mass and $\sigma (m^2)$ the one-loop correction, with $\sigma(m^2) \approx -\Lambda^2$. The puzzle is now that if we want to get a light Higgs mass $m_P^2 \ll \Lambda^2$, we need to fine-tune the bare parameter $m^2$: $$ m^2 \approx \Lambda^2 +m_P^2. $$ For example, for a physical Higgs mass $m_P \approx 125$ GeV and a cutoff scale around the Planck scale, $\Lambda \approx 10^{19}$ GeV, we find that $$ m^2 \approx (1+10^{-34}) \Lambda^2 .$$ This means that our bare mass $m$ must be tuned extremely precisely to yield the light Higgs mass that we observe, and this tuning is unavoidable whenever the cutoff scale is large. If we include higher orders in perturbation theory, the situation gets even worse: at each order we must repeat the procedure and fine-tune the bare Higgs mass further. This is what people usually call the hierarchy problem, because the core of the problem is that the cutoff scale $\Lambda$ sits so far above the electroweak scale.
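The degree of tuning in the loop-level version is simple arithmetic. A minimal sketch (taking the one-loop correction to be exactly $-\Lambda^2$, a crude schematic rather than the full Standard Model result):

```python
m_P = 125.0   # physical Higgs mass in GeV
Lam = 1e19    # cutoff scale (roughly the Planck scale) in GeV

# Schematic one-loop relation m_P^2 = m^2 - Lam^2, so the bare mass must satisfy:
m_sq = Lam**2 + m_P**2

# Relative precision to which m^2 must match Lam^2:
tuning = (m_P / Lam) ** 2
print(f"m^2/Lambda^2 = 1 + {tuning:.4g}")  # about one part in 10^34
```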

Now, here’s the catch. Nature doesn’t know anything about loops. Each loop represents a term in our perturbation series, and perturbation theory is a tool we physicists invented to describe nature. Nature knows nothing about the bare mass or loop corrections; she only knows the whole thing, $m_P$, which is what we can measure in experiments. In other words, we can’t measure the bare mass $m$ or, say, the one-loop correction $\sigma (m^2)$. These parameters exist only within our description, and we can simply adjust them to yield the value of the physical Higgs mass that we need.

The situation would be very different if we could measure $m$ or $\sigma (m^2)$, for example because they could be calculated from other measurable parameters; then there would be a real problem. If two measurable parameters canceled so precisely, we would have every right to wonder. But as long as the bare mass is only something that exists in our description and isn’t measurable, there is no deep problem, because we can simply adjust these unphysical parameters at will.

Similar arguments apply to the tree-level hierarchy problem. As long as we haven’t measured the parameters of the GUT-scale Higgs potential, there is nothing to really wonder about. Maybe the large symmetry gets broken differently, was never a good symmetry in the first place, or maybe the parameters happen to cancel exactly.

Two great papers which discuss this point of view in more technical terms are

So … there isn’t really a hierarchy problem?

There is one. But to understand it we need to look at the whole situation a bit differently.


The main idea of this alternative perspective is to borrow intuition from condensed matter physics. There, we can also use field theory, because exciting the atoms our system consists of yields waves; in addition, there can be particle-like excitations, usually called phonons. For our problem, the most important observation is that here we also have a cutoff scale $\Lambda$, which represents the inverse of the atomic spacing (the lattice spacing). Beyond this scale, our description doesn’t make sense.

With this in mind, we can understand what a hierarchy problem really is from a completely new perspective. Naively, we expect that if we excite our condensed matter system, we only get small excitations; in technical terms, we generically expect correlation lengths of the order of the lattice spacing, and any longer correlation length needs an explanation. The correlation length is inversely proportional to the mass associated with the excitation. Hence, in particle physics jargon, we would say that for a system with cutoff $\Lambda$ we only expect particles with masses of order $\Lambda$, i.e. superheavy particles.
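The inverse relation between mass and correlation length can be made quantitative for a free field on a 1D lattice, where the lattice Green’s function decays like $e^{-n/\xi}$ with $\cosh(1/\xi) = 1 + (am)^2/2$ ($\xi$ measured in units of the lattice spacing $a$). A minimal sketch (the values of $am$ are pure illustration):

```python
import math

def correlation_length(am):
    """Correlation length in units of the lattice spacing for a free field of
    mass m on a 1D lattice with spacing a: the lattice Green's function decays
    like exp(-n/xi) with cosh(1/xi) = 1 + (a*m)**2 / 2."""
    return 1.0 / math.acosh(1.0 + 0.5 * am**2)

for am in (1.0, 0.1, 0.001):
    print(f"a*m = {am:g}  ->  xi/a = {correlation_length(am):.1f}")
```

For $am \ll 1$ this reduces to $\xi \approx 1/(am)$: a particle much lighter than the cutoff corresponds to correlations stretching over many lattice spacings, which is exactly the generically unexpected situation described above.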

Now the mystery is that we know that there are light elementary particles although the cutoff scale is presumably extremely high. In condensed matter jargon this means that we know that there are excitations with an extremely long correlation length compared with the fundamental lattice spacing.

This is the mystery we call the hierarchy problem.

But this is not only a helpful alternative perspective. It also allows us to think about possible solutions in completely different terms. We can now ask: under what circumstances do we get excitations with extremely long correlation length compared to the lattice spacing?

The answer is: whenever the system is close to a critical point. (The most famous example is the liquid-vapor critical point.)

A solution of the hierarchy problem, therefore, requires an explanation of why nature seems to sit so close to a critical point.

There are, as far as I know, two types of possible answers.

  • Either someone/something tuned the fundamental parameters externally (whatever that means in this context), just as condensed matter systems can be brought to a critical point by adjusting the temperature and pressure.
  • Or, there is a dynamical reason why nature evolved towards a critical point. This is known as self-organized criticality.

In the first category, we have multiverse-anthropic-principle type explanations.

If you are, like me, not a fan of these types of arguments, there is good news: self-organized criticality is a thing in nature. There are many known systems which evolve automatically towards a critical point. The most famous one is a sandpile.
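The sandpile example can be simulated in a few lines. Below is a minimal sketch of the Bak–Tang–Wiesenbeld-style model (grid size and grain count are arbitrary illustration values): grains are dropped on a grid, any site holding 4 or more grains topples and sends one grain to each neighbour, and grains falling off the edge are lost. Without tuning any parameter, the system organizes itself into a state with avalanches of many different sizes:

```python
import random

def topple(grid, n):
    """Relax the grid: any site with >= 4 grains topples, giving one grain to
    each of its 4 neighbours (grains leave the system at the edges).
    Returns the avalanche size, i.e. the number of topplings."""
    size = 0
    unstable = [(i, j) for i in range(n) for j in range(n) if grid[i][j] >= 4]
    while unstable:
        i, j = unstable.pop()
        if grid[i][j] < 4:
            continue  # already relaxed by an earlier toppling
        grid[i][j] -= 4
        size += 1
        for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            ni, nj = i + di, j + dj
            if 0 <= ni < n and 0 <= nj < n:
                grid[ni][nj] += 1
                if grid[ni][nj] >= 4:
                    unstable.append((ni, nj))
    return size

random.seed(0)
n = 20
grid = [[0] * n for _ in range(n)]
avalanches = []
for _ in range(20000):
    i, j = random.randrange(n), random.randrange(n)
    grid[i][j] += 1                   # drop one grain at a random site
    avalanches.append(topple(grid, n))

# After self-organization, avalanche sizes span many scales:
print("largest avalanche:", max(avalanches))
print("fraction of drops causing no avalanche:", avalanches.count(0) / len(avalanches))
```

The point is that nobody adjusts a temperature or pressure here: the slow driving (one grain at a time) plus the fast relaxation automatically parks the system at its critical state.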

For a brilliant discussion of self-organized criticality in general, see

A defining feature of systems close to a critical point is that we get complexity at all scales. Under normal circumstances, interesting phenomena only happen on scales comparable to the lattice spacing (which in particle physics possibly means the Planck scale). But luckily, there are complex phenomena at all scales in nature, not just at extremely small ones. Otherwise, humans wouldn’t exist. This, I think, hints beautifully at the interpretation that nature is fundamentally close to a critical point.

PS: As far as I know, Christof Wetterich was the first to notice the connection between criticality and the hierarchy problem, in the paper mentioned above. Combining it with self-organized criticality was proposed in “Self-organizing criticality, large anomalous mass dimension and the gauge hierarchy problem” by Stefan Bornholdt and Christof Wetterich. Recently, the connection between self-organized criticality and solutions of the hierarchy problem was emphasized by Gian Francesco Giudice in “The Dawn of the Post-Naturalness Era”.

PPS: Please let me know if you know any paper in which the self-organized criticality idea is applied to the hierarchy problem in fundamental physics.