My Book “Physics From Symmetry” has been published!

Update 10.4.16:
Almost one year after its publication, it’s time for a small recap. On the downside, the book has made me neither rich nor famous so far ;). However, there are some things that make me particularly happy:

  • Rutwig Campoamor Stursberg has published a summary and review, in which he writes about Physics from Symmetry:  “As a first contact text it is remarkably well written and motivated, and constitutes a very good preparation for the study of the hard formalism of more advanced books. […] Summarizing […] this book describes rather well the not always easy to understand subject of symmetry methods in physics, and will be a valuable addition to the bibliography for any interested reader, and even for the expert, with an alternative point of view.”
  • Peter Rabinovitch has written a very positive review for the Mathematical Association of America.
  • The reviews at are almost exclusively positive and include:
    • “This book should be a must for every science and math undergraduate.”
    • “Undergraduates, today and in the future, are so lucky to have this book. It’s the book I wish I had as an undergraduate, it would have saved a lot of time and misplaced effort.”
    • “This is the book I’ve been looking for all these years. […] Overall, an excellent, almost magical, text covering all the major areas of physics in an accessible manner. Highly recommended!”
    • “A perfect Peskin preface”
  • Similarly at and
    • “A superb book. The future approach for undergraduate physics.”
    • “Really helped me understand what was going on in my undergraduate degree. […] recommended for everyone to read”
    • “Ausgesprochen zu empfehlen für alle Physikstudenten” (which translates to “Highly recommended for all students of physics”)
  • Physics from Symmetry is among the recommended literature for the lecture “Symmetry Groups in Physics” at the University of Hamburg.
  • I’ve received hundreds of emails from readers all around the world who reported errors, had questions or simply wanted to say that they liked the book. I enjoyed every single one of them, because it reminds me that people actually read what I’ve written. Always feel free to write me at mail |at|
  • There will be a German version of Physics from Symmetry by Springer Spektrum and the plan is that it gets published next year.

So, all in all the feedback has been overwhelmingly positive and I want to say thanks to everyone who took the time to write a few words!


After almost 18 months of hard work* my book has finally been published by Springer. Thanks to everyone who helped make the manuscript better and to the team at Springer for their patience.

To learn more about the project have a look at the official homepage at

The book is 279 pages long, costs around $49.99, and is available as hardcover and PDF.


*Writing was fun but editing was … not so fun


Vectors, Forms, p-Vectors, p-Forms and Tensors

This is a topic that can cause quite a bit of confusion, so here is a short post I can come back to whenever I get confused.

Let’s start with the definition of a vector. A vector is… uhmm … I guess you have a rough idea of what a vector is. Otherwise this is stuff for another post.

(The term vector here means an abstract element of a vector space. Euclidean vectors (= arrow-like objects) are just one special example. Nevertheless, Euclidean vectors are the example that inspired the idea of abstract vectors. All objects that share some basic properties with Euclidean vectors, without needing to be arrow-like objects in Euclidean space, are called vectors.)


A one-form (1-form) is the dual object to a vector: A one-form $ \tilde w()$ eats a vector $V$ and spits out a number

\begin{equation} \tilde w(V) . \end{equation}

The word dual is used, because we can think of a vector $V()$ as an object that eats a one-form $\tilde w$ and spits out a number

\begin{equation}  V(\tilde w) \equiv \tilde w(V) . \end{equation}

We will see in a moment why defining such an object is a useful idea.

Two examples:

From matrix algebra: If we decide to call column vectors “vectors”, then row vectors are one-forms. Given a vector $ V=\begin{pmatrix} 2 \\ 4 \end{pmatrix}$, ordinary matrix multiplication (in the correct order) with a one-form $ \tilde w = \begin{pmatrix} 1 & 3 \end{pmatrix}$, results in a single real number:

\begin{equation} \tilde w(V) \equiv \begin{pmatrix} 1 & 3 \end{pmatrix} \begin{pmatrix} 2 \\ 4 \end{pmatrix} = 2+12 = 14 . \end{equation}
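The "eats a vector and spits out a number" picture can be sketched in a few lines of Python. This is only an illustration: the helper name `one_form` is made up for this sketch, and vectors are represented as plain lists of components.

```python
def one_form(components):
    """Return a function w that eats a vector (list of numbers) and spits out a number."""
    def w(vector):
        if len(vector) != len(components):
            raise ValueError("vector and one-form must have the same dimension")
        # Row vector times column vector: sum of component-wise products.
        return sum(c * v for c, v in zip(components, vector))
    return w

w = one_form([1, 3])   # the row vector (1 3) from the example above
V = [2, 4]             # the column vector (2, 4)^T from the example above
result = w(V)          # 1*2 + 3*4
print(result)          # 14
```

The one-form is literally a function waiting for its argument, which is the point of writing it as $\tilde w()$.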

Another example, used in quantum mechanics, is given by bras $\langle\Psi\vert$ and kets $\vert\Phi\rangle$. Kets are used to describe the initial state of a system, bras the final state. Together they give a single number:

\begin{equation} \langle\Psi\vert \Phi\rangle \end{equation}

which is the probability amplitude for finding (measuring) the system $\vert \Phi\rangle$ in the final state $\langle\Psi\vert$. Kets, like $\vert \Phi\rangle$, are vectors, and bras, like $\langle\Psi\vert$, are one-forms.
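For finite-dimensional states the pairing of a bra with a ket is just a sum over components, where the bra carries the complex conjugate. Here is a minimal Python sketch; the function name `braket` and the two (unnormalised) sample states are assumptions made only for illustration.

```python
def braket(psi, phi):
    """<psi|phi>: conjugate the components of psi, then contract with phi."""
    return sum(a.conjugate() * b for a, b in zip(psi, phi))

phi = [1 + 0j, 1j]       # hypothetical ket |Phi> (not normalised)
psi = [1j, 2 + 0j]       # hypothetical ket |Psi>; the bra <Psi| conjugates it

amplitude = braket(psi, phi)   # <Psi|Phi>
reverse = braket(phi, psi)     # <Phi|Psi>
print(amplitude, reverse)
```

Swapping the roles of the two states yields the complex conjugate number, which the sketch lets you check directly.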


Tensors are the natural generalization of the ideas described above. Tensors are linear operators on vectors and one-forms. A tensor of type $ \begin{pmatrix}N \\ N' \end{pmatrix}$ eats $N$ one-forms and $N'$ vectors and spits out a single number.

A $ \begin{pmatrix}2 \\ 0\end{pmatrix}$ tensor is an object $F( \quad , \quad )$ that eats two one-forms, say $\tilde w$ and $\tilde v$ and spits out a number $F( \tilde w , \tilde v )$.  A $ \begin{pmatrix}1 \\ 2\end{pmatrix}$ tensor $F( \quad ; \quad , \quad)$ eats 1 one-form $\tilde w$ and 2 vectors, say $V$ and $W$ and spits out a number $F( \tilde w ; V, W)$ .

The term “linear operator” means that tensors obey (for arbitrary numbers $a$, $b$)

\begin{equation}  F( a \tilde w + b \tilde v , \tilde z ; V, W) = a F(  \tilde w , \tilde z ; V, W) + b F(  \tilde v , \tilde z ; V, W) \end{equation}

and similarly for the other arguments. In general, the order of the arguments makes a difference

\begin{equation}  F( \tilde w , \tilde v ) \neq F( \tilde v, \tilde w ), \end{equation}

just as for a function of real variables, say $f(x,y)=4x+7y$, we have $f(1,3)\neq f(3,1)$.
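Linearity and the dependence on the argument order can be checked on a small example. The following Python sketch assumes a $ \begin{pmatrix}2 \\ 0\end{pmatrix}$ tensor given by a matrix of components, $F(\tilde w, \tilde v) = \sum_{ij} F^{ij} w_i v_j$. The component values, the one-forms and the helper name `tensor_2_0` are hypothetical.

```python
def tensor_2_0(components):
    """Build F( , ) from components F^{ij}: F(w, v) = sum_ij F^{ij} w_i v_j."""
    def F(w, v):
        return sum(components[i][j] * w[i] * v[j]
                   for i in range(len(w)) for j in range(len(v)))
    return F

F = tensor_2_0([[1, 2],
                [0, 5]])             # hypothetical components, not symmetric
w, v, z = [1, 3], [2, -1], [4, 0]    # hypothetical one-forms (as component lists)
a, b = 2, 7

# Linearity in the first argument: F(a w + b v, z) = a F(w, z) + b F(v, z)
lhs = F([a * wi + b * vi for wi, vi in zip(w, v)], z)
rhs = a * F(w, z) + b * F(v, z)
print(lhs == rhs)

# The order of the arguments matters in general
print(F(w, v), F(v, w))
```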

A special and very important kind of tensor is the antisymmetric tensor. Antisymmetry means that if we exchange two arguments, the tensor changes only by a sign:

\begin{equation}  F_\mathrm{asym}( \tilde w , \tilde v ) = - F_\mathrm{asym}( \tilde v, \tilde w ). \end{equation}

Analogously, a symmetric tensor does not care about the order of its arguments:

\begin{equation}  F_\mathrm{sym}( \tilde w , \tilde v ) =  F_\mathrm{sym}( \tilde v, \tilde w ). \end{equation}

If the tensor has more than two arguments of the same kind, the tensor is said to be totally antisymmetric (symmetric) if it is antisymmetric (symmetric) under the exchange of any two of these arguments:

\begin{equation}  F_\mathrm{tasym}( \tilde w , \tilde v, \tilde z ) = - F_\mathrm{tasym}( \tilde v, \tilde w ,\tilde z) = F_\mathrm{tasym}( \tilde v,\tilde z, \tilde w ) = - F_\mathrm{tasym}(\tilde z,  \tilde v, \tilde w ) \end{equation}

Antisymmetric tensors play a special role in mathematics and therefore they are given special names: p-forms for antisymmetric $ \begin{pmatrix} 0 \\ p \end{pmatrix}$ tensors, and p-vectors for antisymmetric $  \begin{pmatrix} p \\  0\end{pmatrix}$ tensors.
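A tensor with two arguments of the same kind can always be split into a symmetric and an antisymmetric piece by adding or subtracting the version with swapped arguments. The Python sketch below illustrates this standard construction. The tensor `F` and the one-forms are hypothetical, and `sym` / `asym` are names invented for this sketch.

```python
from fractions import Fraction

def F(w, v):
    """A hypothetical (2,0) tensor without any particular symmetry."""
    return 1 * w[0] * v[0] + 2 * w[0] * v[1] + 0 * w[1] * v[0] + 5 * w[1] * v[1]

def sym(T):
    """Symmetric part: (T(w, v) + T(v, w)) / 2."""
    return lambda w, v: Fraction(T(w, v) + T(v, w), 2)

def asym(T):
    """Antisymmetric part: (T(w, v) - T(v, w)) / 2."""
    return lambda w, v: Fraction(T(w, v) - T(v, w), 2)

F_sym, F_asym = sym(F), asym(F)
w, v = [1, 3], [2, -1]   # hypothetical one-forms

print(F_sym(w, v) == F_sym(v, w))              # symmetric part ignores the order
print(F_asym(w, v) == -F_asym(v, w))           # antisymmetric part flips sign
print(F_sym(w, v) + F_asym(w, v) == F(w, v))   # the two parts add up to F
```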

P-Forms and P-Vectors

A p-form is simply an (antisymmetric in its arguments) object that eats p vectors and spits out a real number. Analogously, a p-vector eats p one-forms and spits out a number (and is antisymmetric in its arguments). For example, a 2-form $\tilde w( \quad , \quad )$ eats two vectors $V$, $W$, spits out a real number $\tilde w(V , W)$, and is antisymmetric in its arguments: $\tilde w(V , W) = - \tilde w(W , V)$.

P-forms are important, because they are exactly the objects we need if we want to talk about areas and volumes (and higher dimensional analogues).

If we have a metric (the mathematical object defining length), defining areas and volumes is straightforward. Nevertheless, the notion of area is less restrictive than the notion of a metric, and we can define area without having to define a metric on the manifold in question.

Let’s see how this comes about.


We start by taking a step back and thinking about what properties a mathematical object describing area should have. Suppose we have two (possibly infinitesimal) vectors forming a two-dimensional parallelogram, and we need the mathematical object that tells us the area of this parallelogram:

\begin{equation}  \mathrm{area}(V,W) = \text{area of the parallelogram formed by } V \text{ and } W . \end{equation}

An obvious property is that the number $\mathrm{area}(V,W)$ ought to double if we double the length of one vector:

\begin{equation}  \mathrm{area}(2V,W) = 2 \times \text{area of the parallelogram formed by } V \text{ and } W . \end{equation}



In addition, the area object should be additive under the addition of vectors

\begin{equation}  \mathrm{area}(V,W+Z) = \mathrm{area}(V,W)  + \mathrm{area}(V,Z) . \end{equation}

A pictorial proof can be seen in the following figure:



Together these properties are what we called linearity above, and we see that this fits perfectly with the definition of a $  \begin{pmatrix} 0 \\  2 \end{pmatrix}$ tensor.

Another important property is that $\mathrm{area}(V,W) $ must vanish if $V$ and $W$ are parallel, i.e. if $W=aV$ for some number $a$, we have

\begin{equation} \mathrm{area}(V,W)  =\mathrm{area}(V,aV) \stackrel{!}{=} 0. \end{equation}

Because linearity gives $\mathrm{area}(V,aV)=a \times \mathrm{area}(V,V)$, this requirement holds for every number $a$ exactly when $\mathrm{area}(V,V) \stackrel{!}{=} 0$. Starting from this condition, we can see in the next step that $\mathrm{area}(V,W)$ must be an antisymmetric $  \begin{pmatrix} 0 \\  2 \end{pmatrix}$ tensor.

The proof is simple: From

\begin{equation}  \mathrm{area}(V,V) = 0, \end{equation}

if we write $V= U+W$, it follows

\begin{equation}  \mathrm{area}(U+W,U+W) = 0. \end{equation}

Using the linearity yields

\begin{equation} \mathrm{area}(U,U) +  \mathrm{area}(U,W)  + \mathrm{area}(W,U) + \mathrm{area}(W,W) = 0. \end{equation}

Using now $\mathrm{area}(V,V) = 0$ for $W$ and $U$, i.e., $\mathrm{area}(W,W) = 0$,  $\mathrm{area}(U,U) = 0$, yields

\begin{equation}  \underbrace{\mathrm{area}(U,U)}_{=0} +  \mathrm{area}(U,W)  + \mathrm{area}(W,U) + \underbrace{\mathrm{area}(W,W)}_{=0} =  \mathrm{area}(U,W)  + \mathrm{area}(W,U)  =  0 . \end{equation}

\begin{equation} \rightarrow   \mathrm{area}(U,W)   =  - \mathrm{area}(W,U).  \end{equation}

We conclude that demanding $\mathrm{area}(V,aV) \stackrel{!}{=} 0$ leads us directly to the property $\mathrm{area}(U,W) = - \mathrm{area}(W,U) $ of the area object. This is what we call antisymmetry.

Therefore, the appropriate mathematical object to describe area is an antisymmetric $  \begin{pmatrix} 0 \\  2 \end{pmatrix}$ tensor, which we call a 2-form.
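To make this concrete, here is a minimal Python sketch. It assumes we work in the plane with the standard area form, written in components as $\mathrm{area}(V,W) = V_1 W_2 - V_2 W_1$. This explicit formula, the helper names and the sample vectors are illustration choices, not something derived above. The sketch checks the three properties we demanded, plus the antisymmetry that followed from them.

```python
def area(V, W):
    """Signed area of the parallelogram spanned by V and W (standard 2-form in the plane)."""
    return V[0] * W[1] - V[1] * W[0]

def add(V, W):
    return [V[0] + W[0], V[1] + W[1]]

def scale(a, V):
    return [a * V[0], a * V[1]]

V, W, Z = [2, 0], [1, 3], [-1, 4]   # hypothetical vectors

print(area(scale(2, V), W) == 2 * area(V, W))           # doubling one side doubles the area
print(area(V, add(W, Z)) == area(V, W) + area(V, Z))    # additivity
print(area(V, scale(5, V)) == 0)                        # parallel vectors span no area
print(area(V, W) == -area(W, V))                        # antisymmetry
```

Note that the result carries a sign: exchanging the two vectors flips the orientation of the parallelogram, which is exactly the antisymmetry.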


What’s so special about the adjoint representation of a Lie group?

A representation is a map that sends each abstract group element to a matrix that acts on a vector space (see this post). At the beginning this can be quite confusing: If we can study the representations of any group on any vector space, where should we start?

Luckily, there exists exactly one distinguished representation, commonly called the adjoint representation.

First, recall some technicalities: The modern definition of a Lie group $G$ is that it’s a manifold whose elements satisfy the group axioms. Consequently, the Lie group looks in the neighborhood of any point (group element) like flat Euclidean space $R^n$ because that’s how a manifold is defined.

Now recall that the Lie algebra of a group is defined as the tangent space at the identity element $T_eG$ and Lie algebras are important, because, to quote from John Stillwell’s brilliant book Naive Lie Theory:

“The miracle of Lie theory is that a curved object, a Lie group G, can be almost completely captured by a flat one, the Tangent space $T_eG$ of $G$ at the identity.”

I’ve written a long post about how and why this works.

It’s often a good idea to look at the Lie algebra of a group to study its properties, because working with a vector space, like $T_eG$, is in general easier than working with some curved space, like $G$. An important theorem, called Ado’s Theorem, tells us that every finite-dimensional Lie algebra is isomorphic to a matrix Lie algebra. This tells us that knowledge of ordinary linear algebra is enough to study Lie algebras, because every such Lie algebra can be viewed as a set of matrices.

A natural idea is now to have a look at the representation of the group $G$ on the only distinguished vector space that comes automatically with each Lie group: The representation on its own tangent vector space at the identity $T_eG$, i.e. the Lie algebra of the group!

In other words, in principle, we can look at representations of a given group on any vector space. But there is exactly one distinguished vector space that comes automatically with each group: Its own Lie algebra. This representation is the adjoint representation.

In more technical terms, the adjoint representation is a special map from $G$ to the space of linear operators on the tangent space at the identity $T_eG$ that satisfies $T(gh)=T(g)T(h)$. A map with this property is called a homomorphism. What does this representation look like?

A group can act on itself by left- and right-translation, given by the usual group multiplication $h \rightarrow gh$ and $h \rightarrow hg$. Neither action is a homomorphism: denoting, for example, left-translation by $L_g$, i.e. $L_g(h) = gh$, we have $L_g(hj)=ghj \neq gh gj= L_g(h) L_g(j)$ (unless $g=e$). Instead, a homomorphism is given by a combination of left-translation by $g$ and right-translation by $g^{-1}$, commonly denoted by $I_g(h)=ghg^{-1}$ and called the adjoint action. This is a homomorphism because

$I_g(h) I_g(j) = g h g^{-1} g j g^{-1} = g hj g^{-1} = I_g(hj) \quad \checkmark$ (using the definition of the inverse, $g^{-1} g = e$).
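The difference between left-translation and the adjoint action is easy to check with concrete matrices. The following Python sketch uses exact rational arithmetic and three hypothetical invertible $2\times 2$ matrices. The helper names (`matmul`, `inverse`, `conjugate`, `left`) are made up for this illustration.

```python
from fractions import Fraction

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def inverse(A):
    """Inverse of an invertible 2x2 matrix via the adjugate formula."""
    det = Fraction(A[0][0] * A[1][1] - A[0][1] * A[1][0])
    return [[A[1][1] / det, -A[0][1] / det], [-A[1][0] / det, A[0][0] / det]]

def conjugate(g, h):
    """Adjoint action I_g(h) = g h g^{-1}."""
    return matmul(matmul(g, h), inverse(g))

def left(g, h):
    """Left-translation L_g(h) = g h."""
    return matmul(g, h)

g = [[1, 2], [3, 5]]   # hypothetical group elements (all invertible)
h = [[2, 1], [1, 1]]
j = [[1, 1], [0, 1]]
e = [[1, 0], [0, 1]]

adjoint_is_hom = conjugate(g, matmul(h, j)) == matmul(conjugate(g, h), conjugate(g, j))
left_is_hom = left(g, matmul(h, j)) == matmul(left(g, h), left(g, j))
fixes_identity = conjugate(g, e) == e
print(adjoint_is_hom, left_is_hom, fixes_identity)
```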

We have found a homomorphism that maps group elements to new group elements, $I : G \rightarrow G$, but this is not a representation because $G$ is not a vector space. Such a homomorphism to an arbitrary space (not necessarily a vector space) is called a realization. Nevertheless, we can use this homomorphism to derive a homomorphism to a vector space.

Firstly, take note that this homomorphism maps the identity to the identity for every group element $g$:

$I_g(e)= g e g^{-1}= g g^{-1} =e.$

If you haven’t already, you should now read this post, because in the following we will need notions like curves and tangent vectors, which are explained there in detail.

The property $I_g(e)=e$ means that any curve through $e$ on the manifold $G$ is mapped by this homomorphism to another (not necessarily the same) curve through $e$. Therefore the adjoint representation maps any tangent vector (of a curve on $G$) in $T_eG$ to another tangent vector in $T_eG$. In contrast, left- (and right-)translations $L_g$ map tangent vectors in $T_eG$ to tangent vectors in $T_gG$.


Left-translations $L_g$ map the identity $e$ to the point $g$. Therefore any curve through $e$ is mapped by $L_g$ to a curve through $g$.

The map induced by $I_g$, which sends any tangent vector in $T_eG$ (an element of the Lie algebra) to another tangent vector in $T_eG$, is called the adjoint transformation of $T_eG$ induced by $g$. This induced map defines a representation of the group $G$ on $T_eG$, because $T_eG$ is a vector space.
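For a matrix group this induced map is simply $X \mapsto g X g^{-1}$. As a small illustration, the Python sketch below assumes $G = SL(2,\mathbb{R})$, whose Lie algebra consists of the traceless $2\times 2$ matrices. The group, the sample matrices and all helper names are assumptions chosen for this example. It checks that the image stays in the Lie algebra, that the map is linear, and that it respects the group multiplication.

```python
from fractions import Fraction

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def inverse(A):
    det = Fraction(A[0][0] * A[1][1] - A[0][1] * A[1][0])
    return [[A[1][1] / det, -A[0][1] / det], [-A[1][0] / det, A[0][0] / det]]

def Ad(g, X):
    """Adjoint transformation of the Lie algebra element X induced by g."""
    return matmul(matmul(g, X), inverse(g))

def combine(a, X, b, Y):
    """The linear combination a X + b Y."""
    return [[a * X[i][j] + b * Y[i][j] for j in range(2)] for i in range(2)]

g = [[2, 1], [1, 1]]     # hypothetical group elements with determinant 1
h = [[1, 1], [0, 1]]
X = [[1, 2], [3, -1]]    # hypothetical traceless matrices (Lie algebra elements)
Y = [[0, 1], [0, 0]]

image = Ad(g, X)
trace_of_image = image[0][0] + image[1][1]   # stays zero: the image is again traceless
composition_ok = Ad(matmul(g, h), X) == Ad(g, Ad(h, X))
linearity_ok = Ad(g, combine(3, X, -2, Y)) == combine(3, Ad(g, X), -2, Ad(g, Y))
print(trace_of_image, composition_ok, linearity_ok)
```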

In the same spirit, we can consider Lie algebra representations (in contrast to Lie group representations). Analogously, this means that the elements of the Lie algebra act on some vector space as linear transformations. Again, a distinguished representation is given by the action of the Lie algebra elements on the distinguished vector space $T_eG$ (the Lie algebra itself).

The corresponding homomorphism can be derived from the homomorphism that defined the representation of the group $G$ on $T_eG$. The idea goes as follows:

Consider a curve $\gamma(t)$ on the manifold $G$ with $\gamma(0)=e \in G$ and tangent vector $\gamma'(0)=X \in T_eG$. Furthermore let the curve go through some arbitrary element $g \in G$. Using this curve we can rewrite the above adjoint action

$Ad_g (Y)= \underbrace{g}_{\in G} \underbrace{Y}_{\in T_eG} \underbrace{g^{-1}}_{\in G}$ as $Ad_{\gamma(t)}(Y) = \gamma(t) Y \gamma(t)^{-1}$, where $Y \in T_eG$ is the Lie algebra element that is acted on.

We get the Lie algebra homomorphism we are looking for, called $ad$ (small $a$!), by differentiating this map at the identity, i.e. at $t=0$. Differentiating yields

$ \frac{d}{dt} Ad_{\gamma(t)}(Y) \big |_{t=0}= \frac{d}{dt} \gamma(t) Y \gamma(t)^{-1} \big |_{t=0} = \gamma'(0) Y \gamma(0)^{-1} + \gamma(0) Y \frac{d}{dt} \gamma(t)^{-1} \big |_{t=0}$

For a matrix Lie group we can easily calculate $ \frac{d}{dt} \gamma(t)^{-1}$, because of the matrix identity

$\frac{d}{dt} A^{-1}(t)=-A^{-1}(t) \big(\frac{d}{dt} A(t)\big) \,A^{-1}(t)$.

This identity follows from

$\frac{d}{dt} \big(A(t)\,A^{-1}(t)\big)=\frac{d}{dt} \big(e \big)=0$. Using the product rule

$\big( \frac{d}{dt} A(t) \big) A^{-1}(t) + A(t) \big( \frac{d}{dt} A^{-1}(t) \big) =0$ and multiplying from the left with $A^{-1}(t)$ yields

$ \rightarrow A^{-1}(t) \big( \frac{d}{dt} A(t) \big) A^{-1}(t) = - A^{-1}(t) A(t) \big( \frac{d}{dt} A^{-1}(t) \big) = - \big( \frac{d}{dt} A^{-1}(t) \big) \quad \checkmark $
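As a quick sanity check of the identity $\frac{d}{dt} A^{-1}(t)=-A^{-1}(t) \big(\frac{d}{dt} A(t)\big) A^{-1}(t)$, consider the following curve of matrices (a hypothetical example, chosen only because everything can be computed by hand):

\begin{equation} A(t) = \begin{pmatrix} 1 & t \\ 0 & 1 \end{pmatrix}, \qquad A^{-1}(t) = \begin{pmatrix} 1 & -t \\ 0 & 1 \end{pmatrix}, \qquad \frac{d}{dt} A(t) = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix} . \end{equation}

Differentiating $A^{-1}(t)$ directly gives $\begin{pmatrix} 0 & -1 \\ 0 & 0 \end{pmatrix}$, and the right-hand side of the identity gives the same result:

\begin{equation} -A^{-1}(t) \Big(\frac{d}{dt} A(t)\Big) A^{-1}(t) = - \begin{pmatrix} 1 & -t \\ 0 & 1 \end{pmatrix} \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix} \begin{pmatrix} 1 & -t \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} 0 & -1 \\ 0 & 0 \end{pmatrix} . \end{equation}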

Therefore we have

$ \frac{d}{dt} Ad_{\gamma(t)}(Y) \big |_{t=0}= \frac{d}{dt} \gamma(t) Y \gamma(t)^{-1} \big |_{t=0} $

$= \gamma'(0) Y \gamma(0)^{-1} + \gamma(0) Y (-\gamma(0)^{-1} \gamma'(0) \gamma(0)^{-1}) = XY e - e Y eX e $

$ = XY-YX$, where we used the definitions for the curve we made above ($\gamma(0)=\gamma(0)^{-1}=e$ and $\gamma'(0)=X$).
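The result can also be checked numerically. The Python sketch below is only an illustration under stated assumptions: it picks the particular curve $\gamma(t)=e^{tX}$ (one possible curve with $\gamma(0)=e$ and $\gamma'(0)=X$), computes the matrix exponential by a truncated power series, and approximates the derivative at $t=0$ by a central finite difference. The matrices $X$, $Y$ and all helper names are hypothetical.

```python
def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def scale(c, A):
    return [[c * x for x in row] for row in A]

def expm(A, terms=30):
    """Matrix exponential via its power series (fine for small matrices with small entries)."""
    n = len(A)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[x / k for x in row] for row in matmul(term, A)]   # A^k / k!
        result = [[r + t for r, t in zip(rr, tr)] for rr, tr in zip(result, term)]
    return result

def Ad_curve(t, X, Y):
    """Ad_{gamma(t)}(Y) for gamma(t) = exp(tX), using gamma(t)^{-1} = exp(-tX)."""
    return matmul(matmul(expm(scale(t, X)), Y), expm(scale(-t, X)))

def commutator(X, Y):
    XY, YX = matmul(X, Y), matmul(Y, X)
    return [[a - b for a, b in zip(r1, r2)] for r1, r2 in zip(XY, YX)]

X = [[0.0, 1.0], [-1.0, 0.0]]   # hypothetical Lie algebra elements
Y = [[1.0, 2.0], [0.0, -1.0]]

h = 1e-5
plus, minus = Ad_curve(h, X, Y), Ad_curve(-h, X, Y)
derivative = [[(p - m) / (2 * h) for p, m in zip(rp, rm)] for rp, rm in zip(plus, minus)]
expected = commutator(X, Y)
max_error = max(abs(d - e) for rd, re in zip(derivative, expected) for d, e in zip(rd, re))
print(max_error < 1e-6)
```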

We see that the adjoint action of the Lie algebra on itself is given by the commutator, $ad_X(Y)=[X,Y]$. Thus, this is a way of seeing that the Lie bracket is the natural product of the tangent space $T_eG$, i.e. of the Lie algebra. The representation of the Lie algebra on itself is given by the adjoint action $ad_X$, i.e. by the Lie bracket! (Recall that a representation is, by definition, a map.)

This is a way of figuring out the Lie bracket of a given group. If we aren’t considering matrix Lie groups (for which the Lie bracket is the commutator), the Lie bracket may be something different, and the steps we followed above let us calculate the corresponding Lie bracket. Nevertheless, we know from Ado’s theorem that every finite-dimensional Lie algebra can be considered as a matrix Lie algebra with the commutator as Lie bracket.

In addition, the Lie algebra representation we derived above is often used as a model for all Lie algebra representations. A Lie algebra homomorphism (and therefore representation) can be defined as a map, respecting the adjoint action! To be precise:

A Lie algebra representation $(\Phi,V)$ of a Lie algebra $T_eG$ on some vector space $V$ is defined as a linear map
$\Phi$ that assigns to each Lie algebra element $X\in T_eG$ a linear transformation $\Phi(X)$ of $V$,
satisfying $\Phi([X,Y])= [\Phi(X), \Phi(Y)]$.

Just as a Lie group representation is defined as a map (to the linear transformations of some vector space) respecting the group multiplication, $\Phi(gh)=\Phi(g)\Phi(h)$, a Lie algebra representation is a linear map (to the linear transformations of some vector space) respecting the natural product of the Lie algebra, i.e. the Lie bracket.
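For the adjoint representation, $\Phi = ad$, the defining condition reads $ad_{[X,Y]} = [ad_X, ad_Y]$, where the bracket on the right-hand side is the commutator of linear maps. Here is a small Python sketch that checks this on one example, assuming matrix Lie algebra elements with the commutator as Lie bracket. The three matrices and the helper names are hypothetical.

```python
def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def sub(A, B):
    return [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def bracket(X, Y):
    """Lie bracket of a matrix Lie algebra: the commutator XY - YX."""
    return sub(matmul(X, Y), matmul(Y, X))

def ad(X):
    """ad_X as a linear map on the Lie algebra: Z -> [X, Z]."""
    return lambda Z: bracket(X, Z)

X = [[0, 1], [0, 0]]    # hypothetical Lie algebra elements
Y = [[0, 0], [1, 0]]
Z = [[1, 2], [3, -1]]

lhs = ad(bracket(X, Y))(Z)                      # ad_{[X,Y]} applied to Z
rhs = sub(ad(X)(ad(Y)(Z)), ad(Y)(ad(X)(Z)))     # [ad_X, ad_Y] applied to Z
print(lhs == rhs)
```

That the two sides agree for all $X$, $Y$, $Z$ is the Jacobi identity in disguise.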