Jakob Schwichtenberg

Making Sense of Particle Physics Research

I recently attended a Workshop on “Open Questions in Particle Physics and Cosmology” in Göttingen, and there, among other things, I learned a classification for ideas/models beyond the standard model.

This categorization helps me a lot as a young researcher to make sense of what is currently going on in modern particle physics. It not only helps me to understand the work of others better, but also allows me to articulate more clearly what kind of research I'm currently doing and want to do in the future.

Broadly the categorization goes as follows:

1.) Curiosity Driven Models (a.k.a. Why not?)

In these kinds of models, anything goes that is allowed by basic principles and data. Curiosity-driven research is characterized by no restrictions regarding the type of particle or interaction. In general, there is no further motivation for the addition of some particle besides that it is not yet excluded by the data.

For this reason, many such research projects include a large number of these "curiosity-driven" particle and interaction additions and then perform parameter scans to determine the current experimental bounds for the various models.

For example, a prototypical curiosity-driven research project checks the current mass and interaction-strength bounds for spin-0, spin-1/2, spin-1, … particles that would show up through a specific signature at the LHC.

Usually, such models are called simplified models.

The motivation behind such efforts is to scan the possible “model landscape” as systematically as possible.

2.) Data Driven Models

Models in this second category are invented as a response to an experimental "anomaly", i.e. an observation that does not quite match what the theory predicts. Usually, the statistical significance is between 2 and 4 sigma and thus below the "magical" 5 sigma at which people start talking about a discovery. There can be many reasons for such an anomaly: an experimental error, an error in the interpretation of the experimental data, an error in the standard theory prediction, or possibly just a statistical fluctuation.

Some examples of such anomalies are:

  • The current flavor anomalies in the $R_K$ and $R_{K^\star}$ observables.
  • The long-standing discrepancy in the anomalous magnetic moment of the muon, usually just called "g-2".
  • The infamous 750 GeV diphoton excess.
  • The Fermi LAT Galactic Center excess.
  • The reactor antineutrino anomalies.
  • The 3.5 keV X-ray line.
  • The positron fraction excess.
  • The DAMA/LIBRA annual modulation effect.
  • The "discovery" of gravitational waves by the BICEP2 experiment.

It is not uncommon for such data-driven models to try to explain several of these anomalies at once. For an example of a data-driven model, have a look at this paper, and for further examples, see slide 7 here.

(Take note that most of the “anomalies” from the list above are no longer “hot”. For example, the 750 GeV diphoton excess is now regarded as a statistical fluctuation, the positron fraction can be explained by pulsars, the significance of the 3.5 keV X-ray line is decreasing,  the reactor antineutrino anomalies can be explained by a “miscalculation“, the DAMA/LIBRA “discovery” has been refuted by several other experiments, the “discovery” of gravitational waves by the BICEP2 experiment is “now officially dead”…)

The motivation behind such research efforts is, of course, to be the first to propose the correct explanation if the "anomaly" turns out to be a real discovery.

3.) Theory-Driven Models

Research projects in this third category try to solve some big theoretical problem/puzzle and predict something that can be measured as a byproduct.

Examples of such puzzles are:

  • The gauge hierarchy puzzle.
  • The strong CP puzzle.
  • The quantization of electric charge.

Again, as with the data-driven models, many models in this category try to solve more than one of these puzzles. Examples are supersymmetric models, which address the gauge hierarchy puzzle; axion models, which solve the strong CP puzzle; and GUT models, which explain the quantization of electric charge.

It is important to note that this classification is only valid for research in the category “hep-ph“, i.e. high-energy physics phenomenology.  

In addition, there is, of course, a lot going on in "hep-th", i.e. high-energy physics theory. Research projects in this category are not started to make a prediction for an experiment, but rather to understand some fundamental aspect of, say, Yang-Mills theory better, or to invent new methods to calculate amplitudes.

Quite prophetic and relevant for the classification above is the following quote by Nobel Prize winner Sheldon Lee Glashow from a discussion at the "Conceptual Foundations of Quantum Field Theory" conference in 1996:

“The age of model building is done, except if you want to go beyond this theory. Now the big leap is string theory, and they want to do the whole thing; that’s very ambitious. But others would like to make smaller steps. And the smaller steps would be presumably, many people feel, strongly guided by experiment. That is to say the hope is that experiment will indicate some new phenomenon that does not agree with the theory as it is presently constituted, and then we just add bells and whistles insofar as it’s possible. But as I said, it ain’t easy. Any new architecture has, by its very nature, to be quite elaborate and quite enormous. Low energy supersymmetry is one of the things that people talk about. It’s a hell of a lot of new particles and new forces which may be just around the corner. And if they see some of these particles, you’ll see hundreds, literally hundreds of people sprouting up, who will have claimed to predict exactly what was seen. In fact they’ve already sprouted up and claimed to have predicted various things that were seen and subsequently retracted. But you see you can’t play this small modification game anymore. It’s not the way it was when Sam and I were growing up and there were lots of little tricks you could do here and there that could make our knowledge better. They’re not there any more, in terms of changing the theory. They’re there in terms of being able to calculate things that were too hard to calculate yesterday. Some smart physicists will figure out how to do something slightly better, that happens. But we can’t monkey around. So it’s either the big dream, the big dream for the ultimate theory, or hope to seek experimental conflicts and build new structures. But we are, everybody would agree that we have right now the standard theory, and most physicists feel that we are stuck with it for the time being. We’re really at a plateau, and in a sense it really is a time for people like you, philosophers, to contemplate not where we’re going, because we don’t really know and you hear all kinds of strange views, but where we are. And maybe the time has come for you to tell us where we are. ‘Cause it hasn’t changed in the last 15 years, you can sit back and, you know, think about where we are.” 

Will physicists be replaced by robots?

I always thought that such a suggestion was ridiculous. How could a robot ever do what physicists do? While many jobs seem to be in danger because of recent advances in automation – up to 47% according to recent studies – the last thing that will be automated, if ever, are jobs like that of physicists, which require creativity, right?

For example, this site, which was featured in many major publications, states that there is only a 10% chance that robots will take the job of physicists:

Recently, author James Gleick commented on how shocked professional "Go" players are by the tactics of Google's software "AlphaGo":

Sean Carroll responded and summarized how most physicists think about this:

A Counterexample

Until very recently I would have agreed. However, a few weeks ago I discovered this little paper and it got me thinking. The idea of the paper is really simple: feed measurement data into an algorithm, give it a fixed set of objects to play around with, and then let the algorithm find the laws that describe the data best. The authors argue that their algorithm is able to rediscover Maxwell's equations, which are still the best equations we have to describe how light behaves, and that it found them "in about a second". Moreover, they describe their program as a "computational embodiment of the scientific method: observation, consideration of candidate theories, and validation." That's pretty cool. Once more I was reminded that "everything seems impossible until it's done."
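To make the spirit of such an approach concrete, here is a toy sketch of my own (it is not the authors' actual algorithm; the candidate terms, the fake data, and the complexity penalty are all invented for illustration). The program gets a fixed set of building blocks, is fed noisy "measurement" data generated from a hidden law, and scores every combination of terms by how well a least-squares fit describes the data:

```python
# Toy "law discovery" sketch: search over combinations of predefined terms
# and keep the simplest combination that fits the data well.
from itertools import combinations
import numpy as np

rng = np.random.default_rng(1)

# Fake "measurement" data generated from a hidden law: y = 3*x^2 - 2*x (+ noise).
x = rng.uniform(-5, 5, 200)
y = 3 * x**2 - 2 * x + rng.normal(0, 0.5, 200)

# The fixed set of objects the algorithm is allowed to play around with.
candidate_terms = {
    "1": np.ones_like(x),
    "x": x,
    "x^2": x**2,
    "sin(x)": np.sin(x),
    "exp(x)": np.exp(x),
}

best = None
for r in range(1, len(candidate_terms) + 1):
    for names in combinations(candidate_terms, r):
        A = np.column_stack([candidate_terms[n] for n in names])
        coeffs, *_ = np.linalg.lstsq(A, y, rcond=None)
        mse = np.mean((A @ coeffs - y) ** 2)
        score = mse + 0.1 * r  # crude Occam penalty: prefer fewer terms
        if best is None or score < best[0]:
            best = (score, names, coeffs)

_, names, coeffs = best
print("best candidate law: y =", " + ".join(f"{c:.2f}*{n}" for c, n in zip(coeffs, names)))
```

Rediscovering Maxwell's equations is, of course, a vastly harder search problem, but the basic loop of "propose candidate theories, validate against data" is the same.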

Couldn't we do the same to search for new laws by feeding such an algorithm the newest collider data? Maybe the jobs of physicists aren't that safe after all?

What do physicists do?

First of all, the category “physicist” is much too broad to discuss the danger of automation. For example, there are experimental physicists and theoretical physicists. And even inside these subcategories, there are further important sub-sub-categories.

On the experimental side, there are people who actually build experiments. Those are the guys who know how to use a screwdriver. In addition, there are people who analyze the data gathered by experiments.

On the theoretical side, there are theorists and phenomenologists. The distinction here is not so clear. For example, one can argue that phenomenology is a subfield of theoretical physics, and many phenomenologists call themselves theoretical physicists. Broadly, the job of a theoretical physicist is to explain and predict how nature behaves by writing down equations. However, there are many different approaches to writing down new equations. I find the classification outlined here helpful. There is:

  1. Curiosity Driven Research; where "anything goes that is allowed by basic principles and data. […] In general, there is no further motivation for the addition of some particle besides that it is not yet excluded by the data."
  2. Data-Driven Research; where new equations are written down as a response to experimental anomalies.
  3. Theory-Driven Research; which is mostly about "aesthetics" and "intuition". The prototypical example is, of course, Einstein's invention of General Relativity.

The job of someone working in one such sub-sub-category is completely different from the jobs in another sub-sub-category. Therefore, there is certainly no universal answer to the question of how likely it is for "robots" to replace physicists. Each of the sub-sub-categories mentioned above must be analyzed on its own.

What could robots do?

Let's start with the most obvious one. Data analysis is an ideal job for robots. Unsurprisingly, several groups are already working with, or at least experimenting with, neural networks to analyze LHC data. In the traditional approach to collider data analysis, people have to invent criteria to distinguish different particles in the detector: if the angle between two detected photons is larger than X° and their combined energy is smaller than Y GeV, then the signal is, with probability Z%, some given particle. In contrast, if you use a neural network, you just have to train it on Monte Carlo data, where you know which particle is which. Then you can let the trained network analyze the collider data. In addition, after the training you can investigate the network to see what it has learned. This way neural networks can be used to find new useful variables that help to distinguish different particles in a detector. I should mention that this approach is not universally favored, because some feel that a neural network is too much of a black box to be trusted.
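As a toy illustration of this "train on Monte Carlo, then classify" workflow (a sketch of my own, not an actual LHC analysis; the two variables, their distributions, and the network size are all made up):

```python
# Toy sketch: train a small neural network on labelled "Monte Carlo" events
# instead of hand-tuning cuts on angle and energy. Purely illustrative numbers.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(42)

def generate_mc_events(n):
    """Fake Monte Carlo events: [opening angle (deg), combined energy (GeV)].
    Label 1 = signal-like particle, 0 = background."""
    signal = np.column_stack([rng.normal(25, 5, n), rng.normal(125, 10, n)])
    background = np.column_stack([rng.normal(60, 20, n), rng.normal(80, 30, n)])
    X = np.vstack([signal, background])
    y = np.array([1] * n + [0] * n)
    return X, y

X, y = generate_mc_events(5000)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

# The network takes over the role of the hand-written cuts ("angle > X°, energy < Y GeV").
clf = make_pipeline(StandardScaler(),
                    MLPClassifier(hidden_layer_sizes=(16, 16), max_iter=500, random_state=0))
clf.fit(X_train, y_train)

print("accuracy on held-out Monte Carlo:", clf.score(X_test, y_test))
```

In a real analysis the inputs would be many more reconstructed quantities, and inspecting what the trained network relies on is exactly how one hopes to find new discriminating variables.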

What about theoretical physicists?

In the tweet quoted above, Sean Carroll argues that “Fundamental physics is analogous to “the rules of Go.” Which are simple and easily mastered. Go *strategy* is more like bio or neuroscience.” Well yes, and no. Finding new fundamental equations is certainly similar to inventing new rules for a game. This is broadly the job of a theoretical physicist. However, the three approaches to “doing theoretical physics”, mentioned above, are quite different.

In the first and second approach, the "rules of the game" are pretty much fixed. You write down a Lagrangian and afterward compare its predictions with measured data. The new Lagrangian involves new fields, new coupling constants etc., but must be written down according to fixed rules. Usually, only terms that respect the rules of special relativity are allowed. Moreover, we know that the simplest possible terms are the most important ones, so you focus on them first. (More complicated terms are "non-renormalizable" and therefore suppressed by some large scale.) Given some new field or fields, writing down the Lagrangian and deriving the corresponding equations of motion is straightforward. Moreover, while deriving the experimental consequences of a given Lagrangian can be quite complicated, the general rules for how to do it are fixed. The framework that allows us to derive predictions for colliders or other experiments starting from a Lagrangian is known as Quantum Field Theory.
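For concreteness, here is the standard textbook version of this recipe for a single new real scalar field $\phi$ (with couplings $m$, $\kappa$, $\lambda$): Lorentz invariance plus renormalizability leaves only a handful of terms,

$$\mathcal{L} = \frac{1}{2}\partial_\mu \phi\, \partial^\mu \phi - \frac{1}{2}m^2\phi^2 - \frac{\kappa}{3!}\phi^3 - \frac{\lambda}{4!}\phi^4 ,$$

and the equation of motion follows mechanically from the Euler-Lagrange equation:

$$\partial_\mu \frac{\partial \mathcal{L}}{\partial(\partial_\mu \phi)} - \frac{\partial \mathcal{L}}{\partial \phi} = 0 \quad\Rightarrow\quad \partial_\mu\partial^\mu \phi + m^2\phi + \frac{\kappa}{2}\phi^2 + \frac{\lambda}{3!}\phi^3 = 0 .$$

Terms with higher powers of $\phi$ or more derivatives would be non-renormalizable and suppressed by some large scale, which is why one writes down only the terms above first.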

This is exactly the kind of problem that was solved, although in a much simpler setting, by Mark A. Stalzer and Chao Ju in the paper mentioned above. There are already powerful tools, like, for example, SPheno or micrOMEGAs, which are capable of deriving many important consequences of a given Lagrangian almost completely automagically. So with further progress in this direction, it seems not completely impossible that an algorithm will be able to find the best possible Lagrangian to describe given experimental data.
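To give a flavour of what "automagically" means here (this is a tiny sympy toy of my own, nowhere near what SPheno or micrOMEGAs actually do), even the step from a Lagrangian to its equations of motion can be handed to a computer algebra system in a few lines:

```python
# Minimal sympy toy: derive the equation of motion from a toy Lagrangian,
# here a one-dimensional harmonic oscillator rather than a quantum field theory.
from sympy import Function, Rational, Symbol, symbols
from sympy.calculus.euler import euler_equations

t = Symbol('t')
m, k = symbols('m k', positive=True)
x = Function('x')(t)

# L = (1/2) m xdot^2 - (1/2) k x^2
L = Rational(1, 2) * m * x.diff(t)**2 - Rational(1, 2) * k * x**2

# euler_equations applies the Euler-Lagrange equation mechanically;
# the result is equivalent to m*x'' + k*x = 0.
print(euler_equations(L, x, t))
```

Tools like SPheno and micrOMEGAs go much further and compute things like mass spectra, decay rates and relic densities, but the point is the same: once the Lagrangian is specified, much of what follows is mechanical.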

As an aside: a funny name for this goal of theoretical physics, the search for the "ultimate Lagrangian of the world", was coined by Arthur Wightman, who called it "the hunt for the Green Lion". (Source: Conceptual Foundations of Quantum Field Theory, ed. Tian Yu Cao)

What then remains on the theoretical side is "Theory-Driven Research". I have no idea how a "robot" could do this kind of research, which is probably what Sean Carroll had in mind in his tweets. For example, the algorithm by Mark A. Stalzer and Chao Ju only searches for laws that consist of some predefined objects (vectors, tensors) and uses predefined rules for how to combine them: scalar products, cross products, etc. It is hard to imagine how paradigm-shifting discoveries could be made by an algorithm like this. General relativity is a good example. The correct theory of gravity needed completely new mathematics that wasn't previously used by physicists. No physicist around 1900 would have programmed crazy rules such as those of non-Euclidean geometry into the set of allowed rules. An algorithm that was designed to guess Lagrangians will always only spit out a Lagrangian. If the fundamental theory of nature cannot be written down in Lagrangian form, the algorithm would be doomed to fail.

To summarize, there will be physicists in 100 years. However, I don’t think that all jobs currently done by theoretical and experimental physicists will survive. This is probably a good thing. Most physicists would love to have more time to think about fundamental problems like Einstein did.

Layers of Understanding

Update: I’ve now started a website motivated by the idea outlined in this post. It’s called Physics Travel Guide.com. For each topic, there are different layers such that everyone can find an explanation that speaks a language he/she understands.


 

Over the years I’ve had many discussions with fellow students about the question: when do you understand something?

Usually, I’ve taken the strong position that is summarized by this famous Vonnegut quote:

“any scientist who couldn’t explain to an eight-year-old what he was doing was a charlatan.”

In other words: you’ve only understood a given topic if you can explain it in simple terms.

Many disagree. Especially one friend, who studies math, liked to argue that some topics are simply too abstract and that such "low-level" explanations may not be possible.

Of course, the quote is a bit exaggerated. Nevertheless, I think as a researcher you should be able to explain what you do to an interested beginner student.

I don't think that any topic is too abstract for this. If no "low-level" explanation is available so far, this does not mean that it doesn't exist, but merely that it hasn't been found yet.

In my first year as a student, I went on a camping trip to Norway. At that time, I knew little math and nothing about number theory or the Riemann zeta function. During the trip, I devoured "The Music of the Primes" by Marcus du Sautoy. Du Sautoy managed to explain to a clueless beginner student why people care about prime numbers (they are like the atoms of numbers), why people find the Riemann zeta function interesting (there is a relationship between the complex zeros of the Riemann zeta function and the prime numbers), and what the Riemann hypothesis is all about. Of course, after reading the book I still didn't know anything substantial about number theory or the Riemann zeta function. However, the book gave me a valuable understanding of how people who work on these subjects think. In addition, even after several years I still understand why people get excited when someone proposes something new about the Riemann hypothesis.

I don’t know any topic more abstract than number theory and if it is possible to explain something as abstract as the Riemann zeta function to a beginner student, it can be done for any topic, too.

My point is not that oversimplified PopSci explanations are what all scientists should do and think about. Instead, my point is that any topic can be explained in non-abstract terms.

Well maybe, but why should we care? An abstract explanation is certainly the most rigorous and error-free way to introduce the topic. It truly represents the state of the art and how experts think about the topic.

While this may be true, I don’t think that this is where real understanding comes from.

Maybe you are able to follow some “explanation” that involves many abstract arguments or some abstract proof and maybe afterward you realize that the concept or theorem is correct. However, what is still missing is some understanding of why it is correct.

Here is a great example, from the book “Street-Fighting Mathematics” by Sanjoy Mahajan:

There is a formula that tells you the result for the sum of the first $n$ odd numbers:

$$ S_n = 1 + 3 + 5 + \ldots + (2n-1) = \sum_{k=1}^{n} (2k-1) = n^2 $$

You can prove this, for example, by induction. After such a proof you are certainly convinced that the formula $\sum_{k=1}^{n} (2k-1) = n^2$ is correct. But still, you have no idea why it is correct.
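For the record, the induction step that does the convincing is a single line (given that $S_n = n^2$ already holds):

$$ S_{n+1} = S_n + \big(2(n+1)-1\big) = n^2 + 2n + 1 = (n+1)^2 $$

Completely convincing, but not very illuminating.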

Now, instead consider the following pictorial explanation:

We draw each odd number as an L-shaped puzzle piece:

Source: Street-Fighting Mathematics by Sanjoy Mahajan

Then, we can draw the sum of the first $n$ odd numbers as follows:

Source: Street-Fighting Mathematics by Sanjoy Mahajan

The odd numbers, drawn as puzzle pieces, fit together such that we get an $n\times n$ square. We can see directly that the sum is $n^2$. After seeing this proof, you'll never forget why the sum of the first $n$ odd numbers equals $n^2$.

Math books especially are guilty of relying solely on abstract explanations without any pictorial explanations or analogies. I personally find this incredibly frustrating. No one gets a deep understanding by reading pages full of definitions and proofs. This type of exposition discourages beginner students and simply communicates the message "well, real math is complicated stuff".

I recently read an interesting hypothesis about how this way of teaching math became the standard. In his book “Not Even Wrong” Peter Woit writes:

“What [Mathematicians] learned long ago was that to get anywhere in the long term, the field has to insist strongly on absolute clarity of the formulation of ideas and the rigorous understanding of their implications. Modern mathematics may be justly accused of sometimes taking these standards too far, to the point of fetishising them. Often, mathematical research suffers because the community is unwilling to let appear in print the vague speculative formulations that motivate some of the best new work, or the similarly vague and imprecise summaries of older work that are essential to any readable expository literature. […] The mathematics literature often suffers from being either almost unreadable or concerned ultimately with not very interesting problems […] I hope that the trend in mathematical teaching, writing, and editing will continue to recoil from the extreme of Bourbakisme, so that explanations and non-trivial examples can be presented and physicists (to say nothing of other scientists) can once more have a fighting chance of understanding what mathematicians are up to, as they did early in the twentieth century. […]’Bourbakisme’ refers to the activities of a very influential group of French mathematicians known collectively by the pseudonym Bourbaki. Bourbaki was founded by Andre Weil and others during the 1930s, partly as a project to write a series of textbooks that would provide a completely rigorous exposition of fundamental mathematical results. They felt such a series was needed in order to have a source of completely clear definitions and theorems to use as a basis for future mathematical progress. This kind of activity is what appalled Gell-Mann, and it did nothing for improving communication between mathematicians and physicists. While their books were arid and free of any examples, in their own research and private communications the mathematicians of Bourbaki were very much engaged with examples, non-rigorous argument and conjecture. […] The Bourbaki books and the point of view from which they emerged had a bad effect on mathematical exposition in general, with many people writing very hard-to-read papers in a style emulating that of the books.

This passage reminded me of this famous quote by Chen-Ning Yang (the Yang in Yang-Mills theory):

“There are only two kinds of math books: Those you cannot read beyond the first sentence, and those you cannot read beyond the first page.”

The good news is that nowadays there exists a third kind of book that explains things pictorially and with analogies. This is where beginners should start. Here are some examples:

For example, Needham manages to give you beautiful pictures for the series expansion of the complex exponential function, which otherwise is just another formula. Another example, which I've written about here, is what is really going on between a Lie algebra and a given Lie group. You can accept the relationship as some abstract voodoo, or you can draw some pictures and gain a deep understanding that allows you to always remember the most important results.
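For reference, the formula in question, which on paper really is just another formula:

$$ e^{i\theta} = \sum_{n=0}^{\infty} \frac{(i\theta)^n}{n!} = \cos\theta + i\sin\theta $$

Needham's pictures show, roughly speaking, how the partial sums of this series spiral onto the point $\cos\theta + i\sin\theta$ on the unit circle.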

This problem with overly abstract explanations is not unique to mathematics. Many physics books suffer from it, too. A great example is how quantum field theory is usually explained by the standard textbooks (Peskin & Schroeder and co.). Most pages are full of complicated computations and comments about high-level stuff. After reading one of these books, you cannot help but have the impression: "well, quantum field theory is complicated stuff". In contrast, when you read "Student Friendly Quantum Field Theory" by Robert Klauber, you will come to the conclusion that quantum field theory is at its core quite easy. Klauber carefully explains things with pictures and draws lots of analogies. Thanks to this, after reading his book, I was always able to remember the most important, fundamental features of quantum field theory.

Another example from physics are anomalies. Usually they are introduced in a highly complicated way, although there exists a simple pictorial way to understand them. Equally, Noether's theorem is usually just proven: students accept its correctness, but have no clue why it is correct. On the other hand, there is Feynman's picture proof of Noether's theorem.

The message here is similar to what I wrote in "One Thing You Must Understand About Studying Physics". Don't get discouraged by explanations that are too abstract for your current level of understanding. On any topic there exists some book or article that explains it in a language that you can understand and that brings you to the next level. Finding this book or article can be a long and difficult process, but it is always worth it. If there really isn't anything readable on the topic that you are interested in, write it yourself!