Jakob Schwichtenberg

My Book “Physics From Symmetry” has been published!

Update 10.4.16:
Almost one year after its publication it’s time for a small recap. On the downside, the book has made me neither rich nor famous so far ;). However, there are some things that make me particularly happy:

  • Rutwig Campoamor Stursberg has published a summary and review, in which he writes about Physics from Symmetry:  “As a first contact text it is remarkably well written and motivated, and constitutes a very good preparation for the study of the hard formalism of more advanced books. […] Summarizing […] this book describes rather well the not always easy to understand subject of symmetry methods in physics, and will be a valuable addition to the bibliography for any interested reader, and even for the expert, with an alternative point of view.”
  • Peter Rabinovitch has written a very positive review for the Mathematical Association of America.
  • The reviews at Amazon.com are almost exclusively positive and include:
    • This book should be a must for every science and math undergraduate.
    • “Undergraduates, today and in the future, are so lucky to have this book. It’s the book I wish I had as an undergraduate, it would have saved a lot of time and misplaced effort.”
    • This is the book I’ve been looking for all these years. […] Overall, an excellent, almost magical, text covering all the major areas of physics in an accessible manner. Highly recommended!
    • A perfect Peskin preface
  • Similarly at Amazon.co.uk and Amazon.de:
    • “A superb book. The future approach for undergraduate physics.”
    • “Really helped me understand what was going on in my undergraduate degree. […] recommended for everyone to read”
    • “Ausgesprochen zu empfehlen für alle Physikstudenten” (which translates to “Highly recommended for all students of physics“)
  • Physics from Symmetry is among the recommended literature for the lecture “Symmetry Groups in Physics” at the University of Hamburg.
  • I’ve received hundreds of emails from readers all around the world who reported errors, had questions or simply wanted to say that they liked the book. I enjoyed every single one of them, because it reminds me that people actually read what I’ve written. Always feel free to write me at mail |at| jakobschwichtenberg.com.
  • There will be a German version of Physics from Symmetry by Springer Spektrum and the plan is that it gets published next year.

So, all in all the feedback has been overwhelmingly positive and I want to say thanks to everyone who took the time to write a few words!


 

After almost 18 months of hard work* my book has finally been published by Springer. Thanks to everyone who helped make the manuscript better and to the team at Springer for their patience.

To learn more about the project have a look at the official homepage at PhysicsFromSymmetry.com.

The book is 279 pages long, costs around 49.99$ and is available as hardcover and pdf.


 

*Writing was fun but editing was … not so fun

 

What’s so special about the adjoint representation of a Lie group?

A representation is a map that sends each abstract group element to a matrix that acts on a vector space (see this post). At the beginning this can be quite confusing: If we can study the representation of any group on any vector space, where should we start?

Luckily, there exists exactly one distinguished representation, commonly called the adjoint representation.

First, recall some technicalities: The modern definition of a Lie group $G$ is that it’s a manifold whose elements satisfy the group axioms. Consequently, in the neighborhood of any point (group element) the Lie group looks like flat Euclidean space $\mathbb{R}^n$, because that’s how a manifold is defined.

Now recall that the Lie algebra of a group is defined as the tangent space at the identity element, $T_eG$. Lie algebras are important because, to quote from John Stillwell’s brilliant book Naive Lie Theory:

“The miracle of Lie theory is that a curved object, a Lie group $G$, can be almost completely captured by a flat one, the tangent space $T_eG$ of $G$ at the identity.”

I’ve written a long post about how and why this works.

It’s often a good idea to look at the Lie algebra of a group to study its properties, because working with a vector space, like $T_eG$, is in general easier than working with some curved space, like $G$. An important theorem, called Ado’s Theorem, tells us that every finite-dimensional Lie algebra is isomorphic to a matrix Lie algebra. This tells us that the knowledge of ordinary linear algebra is enough to study Lie algebras, because every such Lie algebra can be viewed as a set of matrices.

A natural idea is now to have a look at the representation of the group $G$ on the only distinguished vector space that comes automatically with each Lie group: the representation on its own tangent space at the identity, $T_eG$, i.e. the Lie algebra of the group!

In other words, in principle, we can look at representations of a given group on any vector space. But there is exactly one distinguished vector space that comes automatically with each group: Its own Lie algebra. This representation is the adjoint representation.

In more technical terms, the adjoint representation is a special map from $G$ to the space of linear operators on the tangent space at the identity $T_eG$ that satisfies $T(gh)=T(g)T(h)$. A map with this property is called a homomorphism. What does this representation look like?

A group can act on itself by left- and right-translation, given by the usual group multiplication $h \mapsto gh$ and $h \mapsto hg$. Neither action is a homomorphism: denoting, for example, left-translation by $L_g$, i.e. $L_g(h)=gh$, we have $L_g(hj) \neq L_g(h)L_g(j)$, because $L_g(hj)=ghj \neq ghgj=L_g(h)L_g(j)$. Instead, a homomorphism is given by a combination of left-translation by $g$ and right-translation by $g^{-1}$, commonly denoted by $I_g(h)=ghg^{-1}$ and called the adjoint action. This is a homomorphism because

$$I_g(h)I_g(j)=ghg^{-1}gjg^{-1}=ghjg^{-1}=I_g(hj),$$ where we used the definition of the inverse, $g^{-1}g=e$.

We have found a homomorphism that maps group elements to new group elements, $I_g: G \to G$, but this is not a representation because $G$ is not a vector space. Such a homomorphism to an arbitrary space (not necessarily a vector space) is called a realization. Nevertheless, we can use this homomorphism to derive a homomorphism to a vector space.

Firstly, take note that this homomorphism maps the identity to the identity for every group element g:

$$I_g(e)=geg^{-1}=gg^{-1}=e.$$
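These claims are easy to test numerically. The following is a minimal Python sketch, assuming invertible 2×2 real matrices as stand-ins for group elements. The helper names (`matmul`, `inverse`, `conjugation`, `left_translation`) and the sample matrices are chosen purely for illustration.

```python
# 2x2 matrices as nested lists; invertible matrices stand in for group elements.
def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def inverse(a):
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det], [-a[1][0] / det, a[0][0] / det]]

def close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))

def left_translation(g, h):   # L_g(h) = g h
    return matmul(g, h)

def conjugation(g, h):        # I_g(h) = g h g^{-1}
    return matmul(matmul(g, h), inverse(g))

identity = [[1.0, 0.0], [0.0, 1.0]]
g = [[2.0, 1.0], [1.0, 1.0]]
h = [[1.0, 3.0], [0.0, 1.0]]
j = [[1.0, 0.0], [2.0, 1.0]]
hj = matmul(h, j)

# I_g(hj) == I_g(h) I_g(j), but L_g(hj) != L_g(h) L_g(j)
conjugation_is_homomorphism = close(
    conjugation(g, hj), matmul(conjugation(g, h), conjugation(g, j)))
left_translation_is_homomorphism = close(
    left_translation(g, hj), matmul(left_translation(g, h), left_translation(g, j)))

# I_g fixes the identity, L_g moves it to g
conjugation_fixes_identity = close(conjugation(g, identity), identity)
left_translation_fixes_identity = close(left_translation(g, identity), identity)

print(conjugation_is_homomorphism, left_translation_is_homomorphism)
print(conjugation_fixes_identity, left_translation_fixes_identity)
```

For this particular choice of matrices the conjugation passes both checks and the left-translation fails both, which is exactly the difference the text relies on.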

If you haven’t already, you should now read this post, because in the following we will need notions like curves and tangent vectors that are explained there in detail.

The property $I_g(e)=e$ means that any curve through $e$ on the manifold $G$ is mapped by this homomorphism to another (not necessarily the same) curve through $e$. Therefore the map induced by $I_g$ sends any tangent vector (of a curve on $G$) in $T_eG$ to another tangent vector in $T_eG$. In contrast, left- (and right-)translations $L_g$ map tangent vectors in $T_eG$ to tangent vectors in $T_gG$.


Left-translations $L_g$ map the identity $e$ to the point $g$. Therefore any curve through $e$ is mapped by $L_g$ to a curve through $g$.

The map induced by $I_g$, which sends any tangent vector in $T_eG$ (an element of the Lie algebra) to another tangent vector in $T_eG$, is called the adjoint transformation of $T_eG$ induced by $g$. This induced map defines a representation of the group $G$ on $T_eG$, because $T_eG$ is a vector space.

In the same spirit, we can consider Lie algebra representations (in contrast to Lie group representations). Analogously, this means that the elements of the Lie algebra act on some vector space as linear transformations. Again, a distinguished representation is given by the action of the Lie algebra elements on the distinguished vector space $T_eG$ (the Lie algebra itself).

The corresponding homomorphism can be derived from the homomorphism that defined the representation of the group G on TeG. The idea goes as follows:

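Consider a curve $\gamma(t)$ on the manifold $G$ with $\gamma(0)=e \in G$ and tangent vector $\gamma'(0)=X \in T_eG$. Furthermore, let the curve go through some arbitrary element $g \in G$. Using this curve we can rewrite the above adjoint action

$$\mathrm{Ad}_g(Y)=gYg^{-1}, \quad g \in G, \; Y \in T_eG,$$ as $$\mathrm{Ad}_{\gamma(t)}(Y)=\gamma(t)\,Y\,\gamma(t)^{-1}.$$

We get the Lie algebra homomorphism we are searching for, called ad (small a!), by differentiating this map at the identity, i.e. at $t=0$. Differentiating yields

$$\frac{d}{dt}\mathrm{Ad}_{\gamma(t)}(Y)\Big|_{t=0}=\frac{d}{dt}\gamma(t)Y\gamma(t)^{-1}\Big|_{t=0}=\gamma'(0)Y\gamma(0)^{-1}+\gamma(0)Y\frac{d}{dt}\gamma(t)^{-1}\Big|_{t=0}$$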

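For a matrix Lie group we can easily calculate $\frac{d}{dt}\gamma(t)^{-1}$, because of the matrix identity

$$\frac{d}{dt}A^{-1}(t)=-A^{-1}(t)\left(\frac{d}{dt}A(t)\right)A^{-1}(t).$$

This identity follows from

$$\frac{d}{dt}\left(A(t)A^{-1}(t)\right)=\frac{d}{dt}(e)=0.$$ Using the product rule

$$\left(\frac{d}{dt}A(t)\right)A^{-1}(t)+A(t)\left(\frac{d}{dt}A^{-1}(t)\right)=0$$ and multiplying from the left with $A^{-1}(t)$ yields

$$A^{-1}(t)\left(\frac{d}{dt}A(t)\right)A^{-1}(t)=-A^{-1}(t)A(t)\left(\frac{d}{dt}A^{-1}(t)\right)=-\left(\frac{d}{dt}A^{-1}(t)\right)$$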
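The sign in this identity is easy to get wrong, so here is a small numerical check. It is only a sketch: the matrix-valued function `A(t)` below is an arbitrary example chosen for illustration, and the derivative of the inverse is approximated by a central finite difference.

```python
def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def inverse(a):
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det], [-a[1][0] / det, a[0][0] / det]]

def A(t):    # hypothetical example of a smooth, invertible matrix function
    return [[1.0 + t, t * t], [t, 2.0]]

def dA(t):   # its entry-wise derivative, computed by hand
    return [[1.0, 2.0 * t], [1.0, 0.0]]

t, step = 0.3, 1e-5

# Left-hand side: d/dt A^{-1}(t), approximated by a central difference
inv_plus, inv_minus = inverse(A(t + step)), inverse(A(t - step))
numerical = [[(inv_plus[i][j] - inv_minus[i][j]) / (2 * step) for j in range(2)]
             for i in range(2)]

# Right-hand side: -A^{-1} (dA/dt) A^{-1}
inv = inverse(A(t))
product = matmul(matmul(inv, dA(t)), inv)
formula = [[-product[i][j] for j in range(2)] for i in range(2)]

max_error = max(abs(numerical[i][j] - formula[i][j]) for i in range(2) for j in range(2))
print(max_error)
```

The printed difference should be tiny compared to the matrix entries. Dropping the minus sign in `formula` makes the two sides disagree visibly.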

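Therefore we have

$$\frac{d}{dt}\mathrm{Ad}_{\gamma(t)}(Y)\Big|_{t=0}=\frac{d}{dt}\gamma(t)Y\gamma(t)^{-1}\Big|_{t=0}$$

$$=\gamma'(0)Y\gamma(0)^{-1}+\gamma(0)Y\left(-\gamma(0)^{-1}\gamma'(0)\gamma(0)^{-1}\right)=XYe-eYeXe$$

$$=XY-YX,$$ where we used the definitions for the curve we made above ($\gamma(0)=\gamma(0)^{-1}=e$ and $\gamma'(0)=X$).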
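To see this result at work, here is a minimal numerical sketch. It assumes the concrete curve $\gamma(t)=\exp(tX)$, which satisfies $\gamma(0)=e$ and $\gamma'(0)=X$, and two sample 2×2 matrices `X` and `Y` picked for illustration. The matrix exponential is computed from its truncated power series, which is good enough for such small examples.

```python
def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def scale(a, c):
    return [[c * a[i][j] for j in range(2)] for i in range(2)]

def add(a, b):
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]

def expm(a, terms=30):
    """Matrix exponential via its truncated power series."""
    result = [[1.0, 0.0], [0.0, 1.0]]
    term = [[1.0, 0.0], [0.0, 1.0]]
    for n in range(1, terms):
        term = scale(matmul(term, a), 1.0 / n)   # term = a^n / n!
        result = add(result, term)
    return result

X = [[0.0, -1.0], [1.0, 0.0]]
Y = [[1.0, 0.0], [0.0, -1.0]]

def Ad(t):
    # Ad_{gamma(t)}(Y) = gamma(t) Y gamma(t)^{-1} with gamma(t) = exp(tX)
    return matmul(matmul(expm(scale(X, t)), Y), expm(scale(X, -t)))

# Derivative at t = 0 by a central finite difference
step = 1e-5
plus, minus = Ad(step), Ad(-step)
derivative = [[(plus[i][j] - minus[i][j]) / (2 * step) for j in range(2)] for i in range(2)]

# The commutator XY - YX
commutator = add(matmul(X, Y), scale(matmul(Y, X), -1.0))

max_error = max(abs(derivative[i][j] - commutator[i][j]) for i in range(2) for j in range(2))
print(commutator, max_error)
```

For these sample matrices the commutator is `[[0, 2], [2, 0]]`, and the finite-difference derivative agrees with it up to a small numerical error.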

We see that the adjoint action of the Lie algebra on itself is given by the commutator. Thus, this is a way of seeing that the Lie bracket is the natural product of the tangent space $T_eG$, i.e. of the Lie algebra. The representation of the Lie algebra on itself is given by the adjoint action $\mathrm{ad}_X$, i.e. by the Lie bracket! (Recall that a representation is, by definition, a map.)

This is a way of figuring out the Lie bracket of a given group. If we aren’t considering matrix Lie groups (for which the Lie bracket is the commutator), the Lie bracket may be something different, and the steps we followed above let us calculate the corresponding Lie bracket. Nevertheless, we know from Ado’s theorem that every finite-dimensional Lie algebra can be considered as a matrix Lie algebra with the commutator as Lie bracket.

In addition, the Lie algebra representation we derived above is often used as a model for all Lie algebra representations. A Lie algebra homomorphism (and therefore representation) can be defined as a map, respecting the adjoint action! To be precise:

A Lie algebra representation $(\Phi,V)$ of a Lie algebra $T_eG$ on some vector space $V$ is defined as a linear map
$\Phi$ that assigns to each Lie algebra element $X \in T_eG$ a linear transformation $\Phi(X)$ of $V$,
satisfying $\Phi([X,Y])=[\Phi(X),\Phi(Y)]$.

Just as a Lie group representation is defined as a map (to the linear transformations of some vector space) respecting the group composition rule, $\Phi(gh)=\Phi(g)\Phi(h)$, a Lie algebra representation is a linear map (to the linear transformations of some vector space) respecting the natural product of the Lie algebra, i.e. the Lie bracket.
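As an illustration, one can check that the adjoint action itself satisfies this defining property, i.e. $\mathrm{ad}_{[X,Y]}=[\mathrm{ad}_X,\mathrm{ad}_Y]$, which is the Jacobi identity in disguise. The sketch below assumes a matrix Lie algebra with the commutator as Lie bracket. The sample matrices are arbitrary integer matrices, so the comparison is exact.

```python
def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def sub(a, b):
    return [[a[i][j] - b[i][j] for j in range(2)] for i in range(2)]

def commutator(a, b):       # Lie bracket of a matrix Lie algebra
    return sub(matmul(a, b), matmul(b, a))

def ad(x):                  # ad_X is the linear map Z -> [X, Z]
    return lambda z: commutator(x, z)

X = [[0, 1], [0, 0]]
Y = [[0, 0], [1, 0]]
Z = [[1, 2], [3, 4]]        # arbitrary element that the maps act on

# Phi([X, Y]) applied to Z, with Phi = ad
lhs = ad(commutator(X, Y))(Z)

# [Phi(X), Phi(Y)] applied to Z, i.e. ad_X(ad_Y(Z)) - ad_Y(ad_X(Z))
rhs = sub(ad(X)(ad(Y)(Z)), ad(Y)(ad(X)(Z)))

print(lhs, rhs)
```

Both sides agree for this choice of `X`, `Y` and `Z`. A single test element is of course no proof, but it shows what the condition $\Phi([X,Y])=[\Phi(X),\Phi(Y)]$ means in practice.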

How is a Lie Algebra able to describe a Group?

If you understand the idea Lie group = manifold, you can easily understand one of the most curious facts of Lie theory:

The Lie algebra $\mathfrak{g}$, which is defined as the tangent space at the identity $T_eG$, is able to tell us almost everything about a given Lie group $G$.

The connection between Lie algebra elements and Lie group elements is established by the exponential map. We can understand this easily from the “naive”, historical approach to Lie theory that deals with infinitesimal transformations. The differential geometry perspective enables us to understand this connection in a more pictorial way.

There are four simple concepts one needs to know in order to understand the connection between Lie algebra elements and Lie group elements: Curves, Functions, Vector Fields, and Integral Curves.

Curves:


Illustration of a curve on $M$, as a map from $\mathbb{R}^1$ into $M$. The point $\lambda$ of $\mathbb{R}^1$ is mapped onto $P$ in $M$.

One way to think about a curve on a manifold $M$ is as a continuous series of points in $M$. The definition we want to use is that a curve is a mapping from an open set of $\mathbb{R}^1$ into $M$. Therefore, a curve associates with each point $\lambda$ of $\mathbb{R}^1$ (which is just a real number) a point in $M$. The curve is said to be parametrized by $\lambda$. The point in $M$ is called the image point of $\lambda$. Two curves can be different even though they have the same image points in $M$, if they assign different parameter values to the image points.

Functions:


Illustration of a function on a manifold as a map from $M$ to $\mathbb{R}^1$

Functions are in some sense inverse to curves. A function on a manifold $M$ assigns a real number (= an element of $\mathbb{R}^1$) to each point of $M$. To make sense of the word differentiable when talking about functions on $M$, it helps to think about differentiability in terms of coordinates.

Remember that a manifold is defined as a set for which in the neighborhood of any point a map onto $\mathbb{R}^n$ exists. In other words: In the neighborhood of any point we can assign coordinate values to the points in that neighborhood. Therefore we can combine this map onto $\mathbb{R}^n$ with the map (function) from $M$ to $\mathbb{R}^1$ to get a map from $\mathbb{R}^n$ to $\mathbb{R}^1$. A function on $M$ is said to be differentiable if this combined map is differentiable in $\mathbb{R}^n$.

In the abstract sense, a function is a map $f(P)$, where $P$ denotes some point in $M$. By assigning coordinate values to this point, we have a map $f(x_1,x_2,\ldots,x_n)$. If this function is differentiable in its arguments, the function is said to be differentiable. The coordinate map itself gives us a function for each coordinate. For example, $x_2(P)$ is a function that maps each point $P$ to the corresponding coordinate value $x_2$.

Tangent Vectors:

Now we head to something that is quite hard to grasp when encountering it for the first time: The modern, abstract definition of a tangent vector.

We start with a curve that passes through some point $P$ of $M$. This curve is described, using the coordinate map, by the equations $x_i(\lambda)$. A differentiable function $f=f(x_1,x_2,\ldots,x_n)$, abbreviated $f=f(x_i)$, on $M$ assigns a value to each point of the curve. To be precise, we call the resulting function of the curve parameter $g(\lambda)$, because $f$ is a function of $(x_1,x_2,\ldots,x_n)$ and $g$ is a function of $\lambda$.

$$g(\lambda)=f(x_1(\lambda),x_2(\lambda),\ldots,x_n(\lambda))=f(x_i(\lambda))$$

We can differentiate, which yields, using the chain rule

$$\frac{dg}{d\lambda}=\sum_i\frac{dx_i}{d\lambda}\frac{\partial f}{\partial x_i}.$$

Because this equation holds for any function $f$, we can write

$$\frac{d}{d\lambda}=\sum_i\frac{dx_i}{d\lambda}\frac{\partial}{\partial x_i}.$$

(If this seems strange to you, you may check out chapter 8.1 of the Feynman Lectures about Quantum Mechanics, which has a great explanation of this idea of “removing objects from an equation” if it holds for arbitrary objects of this kind.)

If we were talking exclusively about Euclidean space, we would interpret $\frac{dx_i}{d\lambda}$ as the components of a vector tangent to the curve. The $dx_i$ are infinitesimal displacements along the curve, and dividing them by the real number $d\lambda$ gives the rate of change in this direction. Dividing by a real number only changes the scale, not the direction of the displacement.
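The chain-rule statement can be tested with concrete choices. In the following sketch the curve $x_i(\lambda)$ and the function $f$ are arbitrary examples picked for illustration. The derivative of $g(\lambda)$ is approximated by a central finite difference and compared with $\sum_i \frac{dx_i}{d\lambda}\frac{\partial f}{\partial x_i}$.

```python
import math

def curve(lam):              # hypothetical curve x_i(lambda): the unit circle
    return (math.cos(lam), math.sin(lam))

def curve_velocity(lam):     # dx_i / dlambda, computed by hand
    return (-math.sin(lam), math.cos(lam))

def f(x1, x2):               # hypothetical differentiable function on the plane
    return x1 * x1 * x2 + x2

def grad_f(x1, x2):          # (df/dx_1, df/dx_2), computed by hand
    return (2.0 * x1 * x2, x1 * x1 + 1.0)

def g(lam):                  # f evaluated along the curve
    return f(*curve(lam))

lam0, step = 0.7, 1e-5

# Left-hand side: dg/dlambda by a central finite difference
dg_numerical = (g(lam0 + step) - g(lam0 - step)) / (2 * step)

# Right-hand side: sum_i (dx_i/dlambda) (df/dx_i)
dg_chain_rule = sum(v * df for v, df in zip(curve_velocity(lam0), grad_f(*curve(lam0))))

print(dg_numerical, dg_chain_rule)
```

Swapping in a different `f` (together with its gradient) leaves the numbers `curve_velocity(lam0)` untouched, which is the point of writing the operator $\frac{d}{d\lambda}$ without any function attached.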


Illustration of two different curves having the same tangent vector at some point P

Every curve has a unique tangent vector at any point. In contrast, a tangent vector can be the tangent vector for infinitely many curves. For example, take a look at the simple curve

$$x_i(\lambda)=\lambda a_i$$
with constants $a_i$. The tangent vector at the point $P$, where $\lambda=0$, is $\frac{dx_i}{d\lambda}=a_i$. A slightly different curve
$$x_i(\mu)=\mu^2 b_i+\mu a_i$$
goes through the same point $P$ for $\mu=0$ and has the same tangent vector at $P$, $\frac{dx_i}{d\mu}=a_i$.

Illustration of two curves having the same path but different parameterizations.


Furthermore, we can reparametrize the first curve by $x_i(\nu)=(\nu^3+\nu)a_i$. This curve goes through the same points, but has different values $\nu$ associated with them. At $\nu=0$, this curve again goes through $P$ and the tangent vector is $\frac{dx_i}{d\nu}=a_i$.

We conclude: Every vector belongs to a whole equivalence class of curves at this point. More dramatically, one can say: The vector characterizes a whole equivalence class of curves at this point. The defining feature of this equivalence class is the vector.
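The three example curves can be compared numerically. This is a small sketch with arbitrarily chosen constants `a` and `b` in two dimensions. The tangent vector at the parameter value 0 is approximated by a central finite difference.

```python
a = (1.0, 2.0)     # hypothetical constants a_i
b = (0.5, -1.0)    # hypothetical constants b_i

def curve_1(lam):  # x_i(lambda) = lambda a_i
    return tuple(lam * ai for ai in a)

def curve_2(mu):   # x_i(mu) = mu^2 b_i + mu a_i
    return tuple(mu * mu * bi + mu * ai for ai, bi in zip(a, b))

def curve_3(nu):   # x_i(nu) = (nu^3 + nu) a_i, a reparametrization of curve_1
    return tuple((nu ** 3 + nu) * ai for ai in a)

def tangent_at_zero(curve, step=1e-5):
    plus, minus = curve(step), curve(-step)
    return tuple((p - m) / (2 * step) for p, m in zip(plus, minus))

tangents = [tangent_at_zero(c) for c in (curve_1, curve_2, curve_3)]
for tangent in tangents:
    print(tangent)
```

All three printed vectors agree with `a` up to numerical error, although the second curve follows a different path and the third runs through the path of the first one at a different rate.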

This observation motivates the modern definition of a tangent vector. In Euclidean space we have no problem with talking about displacements. A vector in Euclidean space points from one point to another. On a manifold there is, in general, no distance relation between points. Therefore, we need a more sophisticated idea to be able to talk about vectors.

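Consider a curve $x_i=x_i(\mu)$ through $P$. Analogous to the considerations above we have

$$\frac{d}{d\mu}=\sum_i\frac{dx_i}{d\mu}\frac{\partial}{\partial x_i}.$$

Using two numbers $a$ and $b$, we can take a look at the linear combination

$$a\frac{d}{d\lambda}+b\frac{d}{d\mu}=\sum_i\left(a\frac{dx_i}{d\lambda}+b\frac{dx_i}{d\mu}\right)\frac{\partial}{\partial x_i}.$$

The numbers $a\frac{dx_i}{d\lambda}+b\frac{dx_i}{d\mu}$ are the components of a new vector that is tangent to some curves through $P$. Therefore, for one of these curves that is parametrized by $\Phi$, we can write at $P$

$$\frac{d}{d\Phi}=\sum_i\left(a\frac{dx_i}{d\lambda}+b\frac{dx_i}{d\mu}\right)\frac{\partial}{\partial x_i}$$

and therefore

$$\frac{d}{d\Phi}=a\frac{d}{d\lambda}+b\frac{d}{d\mu}.$$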

At this point we can see that we have a one-to-one correspondence between the space of all tangent vectors at some point P and the space of all derivatives along curves at P.

The directional derivatives, like $\frac{d}{d\lambda}$, behave like the usual vectors under addition and multiplication by numbers and are therefore said to form a vector space. A basis for this vector space is always given by $\{\frac{\partial}{\partial x_i}\}$, because we have seen that any directional derivative can be written as a linear combination of the derivatives $\frac{\partial}{\partial x_i}$, the derivatives along the coordinate lines. The components in this basis are $\{\frac{dx_i}{d\lambda}\}$.

The definition of a tangent vector that mathematicians prefer is that $\frac{d}{d\lambda}$ is the tangent vector to the curve $x_i(\lambda)$. This definition is advantageous because it involves no displacements over finite separations and it makes no reference to coordinates. The definition makes sense because, as we can see above, if we pick some coordinate system, $\frac{d}{d\lambda}$ has the components $\{\frac{dx_i}{d\lambda}\}$, which are exactly the numbers one naively associates with an “arrow” tangent to a curve.


Illustration of the tangent space for some point on the sphere

The subtle point is that we are no longer allowed to add vectors located at different points of the manifold. The vectors are said to live in the tangent space of $M$ at $P$, denoted $T_PM$. Therefore vectors at different points live in different spaces and are not allowed to be added together. Using the sphere as an example, one may picture this tangent space as a plane lying tangent to the sphere at this point.

Vector Fields and Integral Curves

Another important term is vector field. A vector field is a rule that defines a vector at each point of the manifold. We have one tangent space at each point of the manifold, and a vector field is a rule that picks one vector from each tangent space. As already noted above: Every curve has a tangent vector at each point. We can turn this a bit upside-down. Given a vector field, i.e. a tangent vector at each point of the manifold, we can find, if we pick one point $P$, precisely one curve going through $P$ whose tangent vector, at every point the curve passes through, is exactly the vector that the vector field picks there. Such a curve is called an integral curve, because given a vector field $V_i(P)$, in coordinates $V_i(P)=v_i(x_j)$, the statement that these vectors are tangent vectors to curves is mathematically

$$\frac{dx_i}{d\lambda}=v_i(x_j).$$

This is just a system of first order differential equations for $x_i(\lambda)$ that always has a unique solution in some neighborhood of $P$.
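Finding an integral curve therefore amounts to solving an ordinary differential equation. Here is a minimal sketch using a classical fourth-order Runge-Kutta integrator. The vector field $v=(-y,x)$ on the plane is an assumption chosen for illustration, because its integral curve through $(1,0)$ is known exactly: the unit circle $(\cos\lambda,\sin\lambda)$.

```python
import math

def v(x, y):
    # Hypothetical vector field: at each point it picks the vector (-y, x)
    return (-y, x)

def integral_curve(x, y, lam_end, steps=1000):
    """Follow the vector field from (x, y) up to parameter value lam_end (RK4)."""
    h = lam_end / steps
    for _ in range(steps):
        k1 = v(x, y)
        k2 = v(x + 0.5 * h * k1[0], y + 0.5 * h * k1[1])
        k3 = v(x + 0.5 * h * k2[0], y + 0.5 * h * k2[1])
        k4 = v(x + h * k3[0], y + h * k3[1])
        x += h * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6
        y += h * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6
    return (x, y)

# Start at P = (1, 0) and follow the field up to lambda = 1
end_point = integral_curve(1.0, 0.0, lam_end=1.0)
expected = (math.cos(1.0), math.sin(1.0))
print(end_point, expected)
```

Starting from a different point $P$ gives a different circle, in line with the statement that exactly one integral curve passes through each point.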

The Exponential Function on a Manifold

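We are now in the position to understand the tool that connects Lie algebra elements and Lie group elements: The exponential function. Given a vector field $Y=\frac{d}{d\lambda}$, we can find the corresponding integral curve $x_i(\lambda)$ through some point. The coordinates of two points ($\lambda_0$ and $\lambda_0+\epsilon$) on this curve are related by the Taylor series

$$x_i(\lambda_0+\epsilon)=x_i(\lambda_0)+\epsilon\left(\frac{dx_i}{d\lambda}\right)_{\lambda_0}+\frac{1}{2!}\epsilon^2\left(\frac{d^2x_i}{d\lambda^2}\right)_{\lambda_0}+\ldots=\left(1+\epsilon\frac{d}{d\lambda}+\frac{1}{2!}\epsilon^2\frac{d^2}{d\lambda^2}+\ldots\right)x_i\Big|_{\lambda_0}=\exp\left(\epsilon\frac{d}{d\lambda}\right)x_i\Big|_{\lambda_0}$$

The exp notation here should be understood as a shorthand notation for a series of differential operators that act on $x_i$, evaluated at $\lambda_0$.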
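To make this shorthand concrete, here is a small sketch that sums the operator series term by term. It assumes a single coordinate function $x(\lambda)=\sin\lambda$, chosen because all of its derivatives are known in closed form: the $n$-th derivative is $\sin(\lambda+n\pi/2)$.

```python
import math

def nth_derivative_of_sin(n, lam):
    # d^n/dlambda^n sin(lambda) = sin(lambda + n*pi/2)
    return math.sin(lam + n * math.pi / 2)

def exp_shift(lam0, eps, terms=20):
    """Apply exp(eps d/dlambda) to x(lambda) = sin(lambda) at lam0, truncated series."""
    return sum(eps ** n / math.factorial(n) * nth_derivative_of_sin(n, lam0)
               for n in range(terms))

lam0, eps = 0.3, 0.5
shifted = exp_shift(lam0, eps)    # series of derivatives evaluated at lam0
exact = math.sin(lam0 + eps)      # the coordinate at the shifted parameter value
print(shifted, exact)
```

The two numbers agree to high precision: the series of derivatives at $\lambda_0$ really does move us along the curve to $\lambda_0+\epsilon$.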

The Lie algebra

The Lie algebra is defined as the tangent space at the identity, $T_eG$, of the manifold $G$. We will now see that each element of $T_eG$, i.e., each tangent vector at the identity, defines a unique vector field $V$ on $G$.

We can then find the corresponding integral curve for this vector field $V$ that goes through $e$ and has tangent vector $V_e$ at the identity. By using the exp operator, as defined above, we can get to any point on this curve. $V$ is completely determined by $V_e$ and therefore the points of $G$ on this curve can be denoted by

$$g_{V_e}(t)=\exp(tV)\big|_e$$

This is how we are able to get Lie group elements from Lie algebra elements. We have a one-to-one correspondence between tangent vectors and differential operators and are therefore able to use the exp notation to describe the corresponding Taylor series. This Taylor series connects different points on the same curve, and therefore we are able to get elements of the Lie group $G$ from each element of the Lie algebra $T_eG$.
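For matrix Lie groups this recipe reduces to the familiar matrix exponential. The following sketch assumes the group $SO(2)$ of rotations of the plane, whose Lie algebra is spanned by the single matrix `J` below. The exponential is summed as a truncated power series and compared with the rotation matrix one expects.

```python
import math

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def scale(a, c):
    return [[c * a[i][j] for j in range(2)] for i in range(2)]

def add(a, b):
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]

def expm(a, terms=30):
    """Matrix exponential via its truncated power series."""
    result = [[1.0, 0.0], [0.0, 1.0]]
    term = [[1.0, 0.0], [0.0, 1.0]]
    for n in range(1, terms):
        term = scale(matmul(term, a), 1.0 / n)   # term = a^n / n!
        result = add(result, term)
    return result

J = [[0.0, -1.0], [1.0, 0.0]]    # tangent vector at the identity of SO(2)
t = 1.2

group_element = expm(scale(J, t))
rotation = [[math.cos(t), -math.sin(t)], [math.sin(t), math.cos(t)]]

max_error = max(abs(group_element[i][j] - rotation[i][j])
                for i in range(2) for j in range(2))
print(group_element, max_error)
```

Varying `t` traces out the whole curve of rotations through the identity, starting from nothing but the tangent vector `J`.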

We will now take a look at how each tangent vector at the identity defines a unique vector field.

The Lie algebra and left-invariant vector fields

Given a Lie group $G$, natural maps (diffeomorphisms) are left- and right-translations. That is, any element $g$ of $G$ defines a map from a point $h$ of the manifold $G$ onto some new point $gh$ (left-translation) or $hg$ (right-translation).

$$h \mapsto gh \quad \text{(left-translation)} \qquad \text{or} \qquad h \mapsto hg \quad \text{(right-translation)}$$


Left-translations along $g$ map a neighborhood of $e$ onto a neighborhood of $g$. In addition, curves through $e$ are mapped onto curves through $g$, and therefore tangent vectors at $e$ onto tangent vectors at $g$.

 

Here the group acts on itself.  Of course we are able to examine how a group acts on different manifolds, too, and this idea is investigated further by representation theory. For now it suffices to have a look at these maps of G onto itself.

Everything that follows can be done with right-translations, but it’s conventional to use left-translations.

As always, we denote the identity element of the group by $e$. The left-translation map along a particular $g$ maps any neighborhood of $e$ onto a neighborhood of $g$ (see the figure). In addition, this map sends curves into curves and therefore tangent vectors into tangent vectors.

The map between tangent vectors is commonly denoted $L_g: T_e \to T_g$. A vector field $V$ is said to be left-invariant if $L_g$ maps $V$ at $e$ to $V$ at $g$ (and not to some different vector field at $g$, for example, some $W$), for all $g$. Because of the group composition law, this is equivalent to saying that a vector field is left-invariant if $L_g$ maps $V$ at $h$ to $V$ at $gh$, for any $h$.

Every vector in $T_e$ defines a left-invariant vector field. The rule for getting a vector at each point of the manifold is given by $L_g$. By multiplication with some $g \in G$, we get from any vector in $T_eG$ a vector in $T_gG$, and by definition this is what we call a left-invariant vector field.

Therefore, the set of left-invariant vector fields is isomorphic to $T_eG$, i.e., the Lie algebra of $G$. Mathematicians say the set of left-invariant vector fields is the Lie algebra of $G$. (That the left-invariant vector fields are closed under the Lie bracket, the defining feature of a Lie algebra, can be easily proved using coordinates. One needs to show that $L_g$ maps $[V,W]$ at $e$ into $[V,W]$ at $g$, which means the vector field $[V,W]$ is again left-invariant.)

The distinction is necessary because of the abstract definition of a Lie algebra. The set of all vector fields on $G$ forms a Lie algebra, too, but this is not the Lie algebra one normally has in mind when talking about the Lie algebra of a given Lie group.

Summary

In the last section we saw how each element of $T_eG$ uniquely defines a vector field on $G$. Each element of the Lie algebra can be identified with a left-invariant vector field.

To each left-invariant vector field, and therefore to each element of $T_eG$, belongs one unique integral curve through the identity element $e$ of $G$. We can get to points on this curve by using the exponential map, which is shorthand for the Taylor series

$$g_{V_e}(t)=\exp(tV)\big|_e.$$

Starting with a different element of $T_eG$ we have a different integral curve and are able to reach different points of $G$. (Strictly speaking, this does not hold for every pair of elements. For example, $v$ and $2v$ are two different elements of $T_eG$, but the corresponding integral curves go through the same points of $G$; only the “speed” (parameter value) differs. Nevertheless, for other pairs of elements we do get to different points.)
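The remark about $v$ and $2v$ can be illustrated with matrices. The sketch below assumes the group of invertible 2×2 real matrices and two sample Lie algebra elements `V` and `W`, both picked for illustration. It checks that the curve belonging to $2v$ at parameter $t$ hits the same point as the curve belonging to $v$ at parameter $2t$, while the curve belonging to `W` leaves the set of rotations that the `V` curve stays in.

```python
def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def scale(a, c):
    return [[c * a[i][j] for j in range(2)] for i in range(2)]

def add(a, b):
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]

def expm(a, terms=30):
    """Matrix exponential via its truncated power series."""
    result = [[1.0, 0.0], [0.0, 1.0]]
    term = [[1.0, 0.0], [0.0, 1.0]]
    for n in range(1, terms):
        term = scale(matmul(term, a), 1.0 / n)   # term = a^n / n!
        result = add(result, term)
    return result

def close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))

def is_rotation(a):
    # Orthogonality check: a^T a equals the identity
    transpose = [[a[j][i] for j in range(2)] for i in range(2)]
    return close(matmul(transpose, a), [[1.0, 0.0], [0.0, 1.0]])

V = [[0.0, -1.0], [1.0, 0.0]]    # generates rotations
W = [[1.0, 0.0], [0.0, -1.0]]    # generates squeezes diag(e^t, e^-t)
t = 0.4

point_on_2v_curve = expm(scale(scale(V, 2.0), t))   # curve of 2v at parameter t
point_on_v_curve = expm(scale(V, 2.0 * t))          # curve of v at parameter 2t
same_point = close(point_on_2v_curve, point_on_v_curve)

v_curve_stays_in_rotations = is_rotation(expm(scale(V, t)))
w_curve_stays_in_rotations = is_rotation(expm(scale(W, t)))
print(same_point, v_curve_stays_in_rotations, w_curve_stays_in_rotations)
```

So rescaling an element only changes how fast we run along the same curve, whereas an element pointing in a different direction leads to genuinely different group elements.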

This is how the Lie algebra $T_eG$ of a group $G$ is able to describe the group $G$.