In this post we discuss how continuous symmetries can be described mathematically. Many important features of such symmetries can be captured by something surprisingly simple: Lie algebras. In the second part of this post you will see why a new object, called the Lie bracket, is the defining feature of a Lie algebra.
As an aside: Group theory is the branch of mathematics one uses to work with symmetries. If you don’t already know what group theory is all about, have a look at my post about it.
The branch of group theory that deals with continuous symmetries is called Lie theory. Continuity means that Lie groups have elements which are arbitrarily close to the identity transformation. (The identity transformation is the transformation that changes nothing at all.)
An arbitrary group has, in general, no elements close to the identity. Take for example the symmetries of a square. The transformations that leave the square invariant are four rotations: by 0°, by 90°, by 180° and by 270°, plus some mirror symmetries. A rotation by 0.000001°, which is very close to the identity transformation (the rotation by 0°), is not among them.
Next, take a look at the symmetry transformations of a circle. Certainly, a rotation by 0.000001° is a symmetry of the circle. One says the symmetry group of the circle is continuous, because the rotation parameter (the rotation angle) can take on arbitrary (continuous) values. Mathematically, with the identity denoted $I$, an element $g$ close to the identity can be written as \begin{equation} g(\epsilon)=I+ \epsilon X \end{equation} where $\epsilon$ is, as always in mathematics, some really, really small number and $X$ is an object, called the generator, that we will talk about in a moment.
Such a small transformation barely changes anything when it acts on some object. In the limiting case it is called an infinitesimal transformation. Nevertheless, repeating such an infinitesimal transformation often enough results in a finite transformation. Think about rotations: many small rotations in one direction are equivalent to one big rotation in the same direction.
Mathematically, we can write the idea of repeating a small transformation many times as \begin{equation} h(\theta)=(I+ \epsilon X) (I+ \epsilon X) (I+ \epsilon X) \cdots = (I+ \epsilon X)^k, \end{equation} where $k$ denotes how often we repeat the small transformation.
If $\theta$ denotes some finite transformation parameter, e.g. 50° or so, and $N$ is some really big number that makes sure we are close to the identity, we can write the infinitesimal transformation from above, with $\epsilon = \theta/N$, as \begin{equation} g(\theta)=I+ \frac{\theta}{N} X. \end{equation} The transformations we want to consider are the smallest possible, which means $N$ must be as big as possible, i.e. $N \rightarrow \infty$. To get a finite transformation from such an infinitesimal transformation, one has to repeat it infinitely often. Mathematically \begin{equation} g(\theta)= \lim_{N \rightarrow \infty} \left(I+ \frac{\theta}{N} X\right)^N, \end{equation} which in the limit is just the exponential function \begin{equation} g(\theta)= \lim_{N \rightarrow \infty} \left(I+ \frac{\theta}{N} X\right)^N = e^{\theta X}. \end{equation} In some sense the object $X$ generates the finite transformation $g$, which is why it's called the generator. This will be made more precise in a moment, but first take a look at another way of arriving at the same result:
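If you want to see this limit at work, here is a small numerical sketch (not part of the derivation, just an illustration; the generator and the angle are chosen arbitrarily): it repeats the infinitesimal transformation $N$ times and compares the result with the matrix exponential.

```python
import numpy as np
from scipy.linalg import expm

# Generator of rotations in two dimensions (a concrete example choice).
X = np.array([[0.0, -1.0],
              [1.0,  0.0]])
theta = 0.9  # an arbitrary finite rotation angle in radians

for N in (10, 100, 10000):
    # Repeat the infinitesimal transformation (I + theta/N * X) N times.
    g_N = np.linalg.matrix_power(np.eye(2) + (theta / N) * X, N)
    err = np.max(np.abs(g_N - expm(theta * X)))
    print(f"N = {N:6d}, max deviation from e^(theta X): {err:.1e}")
```

As $N$ grows, the repeated infinitesimal transformation converges to $e^{\theta X}$, exactly as the limit formula claims.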
If we are considering a continuous group of transformations that are given by matrices, we can make a Taylor expansion of an element of the group around the identity, as long as the element is close to the identity, which means the transformation parameter $\theta$ is small. (For a big transformation parameter $\theta$, i.e. a bigger transformation, the first terms of the Taylor series alone are no longer a good approximation.) The Taylor series is \begin{equation} g(\theta)=I+ \frac{dg}{d\theta}\Big|_{\theta=0} \theta + \frac{1}{2}\frac{d^2g}{d\theta^2}\Big|_{\theta=0} \theta^2 + \ldots = \sum_n \frac{1}{n!} \frac{d^n g}{d\theta^n}\Big|_{\theta=0} \theta^n. \end{equation} This series expansion can be written in a more compact form: because the group elements satisfy $g(\theta)g(\theta')=g(\theta+\theta')$, all higher derivatives at the identity are powers of the first one, $\frac{d^n g}{d\theta^n}\big|_{\theta=0} = \left(\frac{dg}{d\theta}\big|_{\theta=0}\right)^n$, and the Taylor series is exactly the exponential series \begin{equation} g(\theta)= e^{\frac{dg}{d\theta}\big|_{\theta=0}\,\theta}. \end{equation} Now we can see the connection to the previous description: by comparison with the result above, $X = \frac{dg}{d\theta}\big|_{\theta=0}$.
The idea behind such lines of thought is that one can learn a lot about a group by looking at the important part of its infinitesimal elements (denoted $X$ above): the generators.
In the following posts, the reason why this is possible, and how it makes things much easier, will be explained in detail using the language of differential geometry.
For matrix Lie groups one defines the Lie algebra corresponding to the Lie group as the collection of objects that give an element of the group when exponentiated. (This is an easy definition one can use when restricting to matrix Lie groups. Later we will introduce a more general definition.)
In mathematical terms:
For a Lie group $G$ given by $n\times n$ matrices, the Lie algebra $\mathfrak{g}$ of $G$ is given by those $n\times n$ matrices $X$ such that $e^{tX} \in G$ for all $t \in \mathbb{R}$.
We know from the definition of a group that a group is more than just a collection of transformations: the definition includes a binary operation $\circ$ for combining transformations. For matrix Lie groups this is just ordinary matrix multiplication. Naively one might think that the same combination rule $\circ$ is valid for elements of the Lie algebra. This is not the case! The elements of the Lie algebra are given by matrices (a famous theorem of Lie theory, called Ado's theorem, states that every finite-dimensional Lie algebra is isomorphic to a matrix Lie algebra), but the product of two matrices of the Lie algebra does not have to be an element of the Lie algebra. Instead, there is another combination rule for the Lie algebra that is directly connected to the combination rule of the corresponding Lie group.
The connection between the combination rule of the Lie group and the combination rule of the Lie algebra is given by the famous Baker-Campbell-Hausdorff formula:
\begin{equation} \mathrm{e}^{X} \circ \mathrm{e}^{Y} = \mathrm{e}^{X+Y+\frac{1}{2}[X,Y]+\frac{1}{12}[X,[X,Y]]-\frac{1}{12}[Y,[X,Y]]+\ldots} \end{equation}
- On the left-hand side, we have the product of two elements of the Lie group, let's call them $g$ and $h$, which we can write in terms of the corresponding generators:
\begin{equation} gh= \mathrm{e}^{X} \circ \mathrm{e}^{Y} \end{equation}
- On the right-hand side we have a single element of the group, $\mathrm{e}^{X+Y+\frac{1}{2}[X,Y]+\frac{1}{12}[X,[X,Y]]+\ldots}$, and the multiplication of the group elements has been translated into a sum of Lie algebra elements. The new symbol appearing in this sum, $[\cdot,\cdot]$, is called the Lie bracket, and for matrix Lie groups it is given by $[X,Y]=XY-YX$, which is called the commutator of $X$ and $Y$. The products $XY$ and $YX$ need not be part of the Lie algebra, but their difference always is!
We learn from the Baker-Campbell-Hausdorff formula that the natural product of the Lie algebra is not, as one would naively think, ordinary matrix multiplication, but the Lie bracket $[\cdot,\cdot]$. One says the Lie algebra is closed under the Lie bracket, just as the group is closed under the corresponding composition rule $\circ$, e.g. matrix multiplication.
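As a quick plausibility check, here is a short numerical sketch of the Baker-Campbell-Hausdorff formula up to the terms shown above. The two matrices are arbitrary small random matrices (chosen purely for illustration), scaled down so that the neglected higher-order terms are negligible.

```python
import numpy as np
from scipy.linalg import expm

def comm(A, B):
    """The Lie bracket for matrices: [A, B] = AB - BA."""
    return A @ B - B @ A

rng = np.random.default_rng(0)
X = 0.01 * rng.standard_normal((3, 3))  # small, so higher BCH orders are tiny
Y = 0.01 * rng.standard_normal((3, 3))

lhs = expm(X) @ expm(Y)
rhs = expm(X + Y + 0.5 * comm(X, Y)
           + (1 / 12) * comm(X, comm(X, Y))
           - (1 / 12) * comm(Y, comm(X, Y)))

# The difference is of the order of the first neglected BCH term.
print(np.max(np.abs(lhs - rhs)))
```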
The abstract definition of a Lie algebra uses the Lie bracket as the defining object: a Lie algebra is defined by its Lie bracket. As always, things become much easier when we look at some examples.
Example: Rotations in three dimensions and the group SO(3)
We start with rotations in three dimensions. The defining feature of a rotation is that it leaves the length of every vector unchanged. The length of a vector in three dimensions is given by the standard scalar product of the vector with itself
$ length = \vec v\cdot \vec v = \vec v^T \vec v,$
where the superscript $T$ denotes transposition. We replaced the scalar product $\cdot$ by ordinary matrix multiplication (a three-dimensional vector is then just a $3 \times 1$ matrix), which requires that we transpose (exchange rows and columns of) the vector on the left to get the same result that we get using the rules of the scalar product. In what ways can we transform an arbitrary vector $\vec v$ such that its length remains unchanged? We denote the transformation by a matrix $O$, so that $\vec v' = O \vec v$. The length of this transformed vector is
$ length' = \vec v'\cdot \vec v' = \vec v'^T \vec v' = \vec v^T O^T O \vec v$ (using that $(AB)^T= B^T A^T$ holds for any two matrices $A$ and $B$),
and requiring that the matrix $O$ doesn’t change the length of the vector means
$ length’ = length \rightarrow \vec v’^T \vec v’= \vec v^T \vec v \rightarrow \vec v ^T O^T O \vec v = \vec v^T \vec v$
and therefore $O^T O =1$ must hold. This is the defining condition of a matrix Lie group called $O(3)$, where the $O$ stands for orthogonal and $3$ for three dimensions.
Unfortunately, leaving lengths unchanged is not an exclusive feature of rotations: mirror transformations have this property, too. Nevertheless, rotations preserve the orientation (right-handed coordinate system $\rightarrow$ right-handed coordinate system), whereas mirror transformations change it by definition (right-handed coordinate system $\rightarrow$ left-handed coordinate system). Therefore, we restrict the transformations to rotations if we additionally demand $\det(O)=1$. (This is a basic result of linear algebra; you can read about it, for example, at Wikipedia if you don't know it already.)
Therefore, all $3\times3$ matrices fulfilling the two conditions $O^T O =1$ and $\det(O)=1$ describe rotations in three dimensions. These two conditions define a matrix Lie group called $SO(3)$, where $O$ stands for orthogonal ($O^T O =1$) and $S$ for special ($\det(O)=1$). Familiar examples of elements of this group are the usual rotation matrices around the three coordinate axes, from which a general rotation can be built by composition. (You can check that these matrices fulfil the defining conditions.)
\begin{equation} R_x= \begin{pmatrix}
1&0 &0 \\ 0&\cos(\theta) &- \sin(\theta) \\ 0 & \sin(\theta)& \cos(\theta)
\end{pmatrix} \qquad R_y= \begin{pmatrix}
\cos(\theta)&0 &\sin(\theta) \\ 0& 1 & 0 \\ -\sin(\theta)& 0& \cos(\theta)
\end{pmatrix} \end{equation}
\begin{equation} R_z= \begin{pmatrix}
\cos(\theta)& -\sin(\theta) &0 \\ \sin(\theta) &\cos(\theta) & 0 \\ 0 & 0& 1
\end{pmatrix} \end{equation}
For example, if we want to rotate the vector
\begin{equation} \vec v = \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}
\end{equation} around the z-axis, we multiply it with the corresponding rotation matrix
\begin{equation} R_z(\theta) \vec v = \begin{pmatrix}
\cos(\theta)& -\sin(\theta) &0 \\ \sin(\theta) &\cos(\theta) & 0 \\ 0 & 0& 1
\end{pmatrix} \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} = \begin{pmatrix} \cos(\theta) \\ \sin(\theta) \\ 0 \end{pmatrix} \end{equation}
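For the skeptical reader, the same example can be recomputed numerically; the function `R_z` below is simply the rotation matrix from above, written out in code, with an arbitrarily chosen angle.

```python
import numpy as np

def R_z(theta):
    """Rotation matrix around the z-axis, as given above."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0],
                     [s,  c, 0.0],
                     [0.0, 0.0, 1.0]])

v = np.array([1.0, 0.0, 0.0])
theta = np.pi / 4  # 45 degrees
print(R_z(theta) @ v)  # -> [cos(theta), sin(theta), 0]
```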
Lie algebra of SO(3)
We can now derive the Lie algebra of $SO(3)$. The definition given in the last section was that the Lie algebra is the set of matrices that yield elements of the group upon exponentiation.
The defining conditions for the $SO(3)$ group are
\begin{equation} \label{eq:defSO3} O^T O \stackrel{!}{=} I \qquad {\mathrm{ and }} \qquad \det(O)\stackrel{!}{=}1\end{equation}
Consider an infinitesimal transformation
\begin{equation} O_{\mathrm{inf }} = I+ \epsilon J,\end{equation}
where we get a finite transformation as usual by exponentiating. Putting this into the first defining condition we get
\begin{equation}( I+ \epsilon J)^T (I+ \epsilon J)=I \qquad \rightarrow \qquad I + \epsilon J^T + \epsilon J + \underbrace{\epsilon^2 J^T J}_{\approx 0 \text{ because } \epsilon^2\approx0} = I \end{equation}
\begin{equation} \label{eq:bed1} \rightarrow J^T + J \stackrel{!}{=}0 \end{equation}
A reddit user named assjtt correctly pointed out that $J^T + J \stackrel{!}{=}0$ already implies $\mathrm{tr}(J) = 0$ (the diagonal entries must vanish), so the following lines aren't strictly necessary to derive this. Nevertheless, it is instructive to see how the trace condition also follows from the determinant condition:
Using the second condition and the identity $\det({\mathrm{e }}^{A})= {\mathrm{e }}^{{\mathrm{tr }}(A)}$ for the matrix exponential function, we see
\begin{equation} \det(O) \stackrel{!}{=} 1 \qquad \rightarrow \qquad \det({\mathrm{e }}^{\phi J}) = \mathrm{e }^{\phi\,{\mathrm{tr }}(J)} \stackrel{!}{=} 1
\end{equation}
\begin{equation} \label{eq:bed2} \rightarrow {\mathrm{tr }}(J) \stackrel{!}{=} 0 \end{equation}
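Conversely, we can check numerically that any matrix satisfying these two conditions really exponentiates into $SO(3)$. The sketch below (an illustration with an arbitrary random matrix and an arbitrary parameter) builds an antisymmetric matrix $J$, which is automatically traceless, and verifies both defining conditions for the exponentiated matrix.

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 3))
J = A - A.T            # antisymmetric: J^T = -J, hence tr(J) = 0

O = expm(0.7 * J)      # a finite transformation; 0.7 is an arbitrary parameter

print(np.allclose(O.T @ O, np.eye(3)))    # True: O^T O = I
print(np.isclose(np.linalg.det(O), 1.0))  # True: det(O) = 1
```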
Three linearly independent matrices fulfilling the conditions eq. \ref{eq:bed1} and eq. \ref{eq:bed2} are given below. (They form a basis of all $3\times3$ matrices fulfilling these conditions, from which all others can be constructed by linear combination.)
\begin{equation} \label{eq:SO3-generators} J_1= \begin{pmatrix}
0&0 &0 \\ 0&0 & -1 \\ 0 & 1& 0
\end{pmatrix} \qquad J_2= \begin{pmatrix}
0&0 &1 \\ 0&0 & 0 \\ -1 & 0& 0
\end{pmatrix} \qquad J_3= \begin{pmatrix}
0&-1 &0 \\ 1&0 & 0 \\ 0 & 0& 0
\end{pmatrix}. \end{equation}
These are the generators of the $SO(3)$ group! We can compute the Lie bracket of these generators by brute-force computation, which yields
\begin{equation} [J_i, J_j]= \epsilon_{ijk} J_k, \end{equation}
where $\epsilon_{ijk}$ is the Levi-Civita symbol.
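If you don't want to do the brute-force computation by hand, the following snippet verifies the bracket relations for the three generators directly.

```python
import numpy as np

# The SO(3) generators from eq. (SO3-generators) above.
J1 = np.array([[0, 0, 0], [0, 0, -1], [0, 1, 0]])
J2 = np.array([[0, 0, 1], [0, 0, 0], [-1, 0, 0]])
J3 = np.array([[0, -1, 0], [1, 0, 0], [0, 0, 0]])

def comm(A, B):
    return A @ B - B @ A

print(np.array_equal(comm(J1, J2), J3))  # True
print(np.array_equal(comm(J2, J3), J1))  # True
print(np.array_equal(comm(J3, J1), J2))  # True
```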
Let's see what finite transformation matrix we get from the first generator. We can focus on the lower-right $2\times2$ block $j_1$ and ignore the zeroes for a moment.
\begin{equation} J_1= \begin{pmatrix}
0 & \\
& \underbrace{\begin{pmatrix} 0& -1 \\ 1& 0 \end{pmatrix}}_{\equiv j_1} \\
\end{pmatrix} \end{equation}
We can immediately compute
\begin{equation} (j_1)^2= -I, \end{equation}
therefore
\begin{equation} (j_1)^3= \underbrace{(j_1)^2}_{=-I} j_1 = -j_1 \quad , \quad (j_1)^4= +I \quad , \quad (j_1)^5= +j_1. \end{equation} In general we have
\begin{equation} (j_1)^{2n} = (-1)^n I \qquad {\mathrm{and }} \qquad (j_1)^{2n+1} = (-1)^n j_1 \end{equation}
which we can use when evaluating the exponential function as series expansion
\begin{equation}R_{1 \mathrm{ \ fin }} = \mathrm{e }^{\phi j_1} = \sum_{n=0}^\infty \frac{\phi^n j_1^n}{n!}
\end{equation}
\begin{equation}= \sum_{n=0}^\infty \frac{\phi^{2n}}{(2n)!} \underbrace{(j_1)^{2n }}_{(-1)^n I} + \sum_{n=0}^\infty \frac{\phi^{2n+1} }{(2n+1)!} \underbrace{(j_1)^{2n+1 }}_{(-1)^n j_1}
\end{equation}
\begin{equation}= \underbrace{\left(\sum_{n=0}^\infty \frac{\phi^{2n}}{(2n)!} (-1)^n \right)}_{=\cos(\phi)}I + \underbrace{\left(\sum_{n=0}^\infty \frac{\phi^{2n+1} }{(2n+1)!} (-1)^n \right)}_{=\sin(\phi)} j_1
\end{equation}
\begin{equation} = \cos(\phi) \begin{pmatrix} 1& 0 \\ 0& 1 \end{pmatrix} + \sin(\phi) \begin{pmatrix} 0& -1 \\ 1& 0 \end{pmatrix} = \begin{pmatrix} \cos(\phi)& -\sin(\phi) \\ \sin(\phi)& \cos(\phi) \end{pmatrix}
\end{equation}
And therefore the complete finite transformation matrix is
\begin{equation}R_1= \begin{pmatrix}
1&0 &0 \\ 0&\cos(\phi) & -\sin(\phi) \\ 0 & \sin(\phi)& \cos(\phi)
\end{pmatrix}
\end{equation}
which we recognize as the well-known rotation matrix around the $x$-axis in three dimensions.
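Again, this can be confirmed numerically in a few lines: the matrix exponential of $\phi J_1$ agrees with $R_1(\phi)$ (the angle below is an arbitrary choice).

```python
import numpy as np
from scipy.linalg import expm

J1 = np.array([[0.0, 0.0, 0.0],
               [0.0, 0.0, -1.0],
               [0.0, 1.0, 0.0]])
phi = 0.6  # an arbitrary rotation angle

c, s = np.cos(phi), np.sin(phi)
R1 = np.array([[1.0, 0.0, 0.0],
               [0.0,   c,  -s],
               [0.0,   s,   c]])

print(np.allclose(expm(phi * J1), R1))  # True
```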
The group $SU(2)$ and its Lie algebra
Next we look at a Lie group called $SU(2)$, which also has a close connection to rotations in three dimensions. Unfortunately, exploring this connection takes some time, and in this post I want to focus on a different aspect. I will treat $SU(2)$ here as an abstract group defined by the conditions
\begin{equation} U^\dagger U = (U^\star )^T U = 1 \end{equation}
\begin{equation} \det(U) =1. \end{equation}
and describe the connection between $SU(2)$ and rotations in a different post. At this point you have to believe me that $SU(2)$ is incredibly important in physics and therefore worth a look.
The $S$ in $SU(2)$ again stands for special ($\det(U) =1$), the $U$ for unitary ($U^\dagger U = 1$) and $2$ for the dimension. $SU(2)$ is the set of $2 \times 2$ matrices fulfilling these conditions.
If we now want to derive the Lie algebra of $SU(2)$, we start again with the defining conditions of the group. Writing them in terms of the generators, we get
\begin{equation} ({\mathrm{e }}^{iX})^{\dagger} {\mathrm{e }}^{iX} = 1 \end{equation}
\begin{equation} \det({\mathrm{e }}^{iX})=1 \end{equation}
The first condition tells us, using the Baker-Campbell-Hausdorff formula and $[X,X]=0$, that $X$ must be Hermitian ($X^\dagger = X$). If we use the identity $\det({\mathrm{e }}^{A})= {\mathrm{e }}^{{\mathrm{tr }}(A)}$, we see from the second condition
\begin{equation} \label{eq:SU2gencondition} \det({\mathrm{e }}^{iX})= {\mathrm{e }}^{i {\mathrm{tr }}(X)} = 1 \end{equation}
that tr($X$)$\overset{!}{=} 0$. Therefore our generators must be traceless Hermitian matrices. A basis for the Hermitian traceless $2\times2$ matrices is formed by $3$ matrices. (An arbitrary complex $2\times2$ matrix has 4 complex entries and therefore 8 real degrees of freedom. Hermiticity reduces these to 4 real degrees of freedom, and tracelessness removes one more, so three remain.) Arbitrary Hermitian traceless $2\times2$ matrices can be built as linear combinations of these basis matrices, which are conventionally called the Pauli matrices.
\begin{equation}
\label{eq:Paulimatrices}
\sigma_1 =
\begin{pmatrix}
0&1 \\ 1&0
\end{pmatrix}
\qquad , \qquad
\sigma_2 =
\begin{pmatrix}
0&-i \\ i&0
\end{pmatrix}
\qquad , \qquad
\sigma_3=
\begin{pmatrix}
1&0\\0&-1
\end{pmatrix}
\end{equation}
Using this basis we are able to derive the Lie bracket. Computation yields
\begin{equation} \label{eq:sigma-warum2weg} [\sigma_i, \sigma_j]= 2 i \epsilon_{ijk} \sigma_k \end{equation}
where $\epsilon_{ijk}$ is again the Levi-Civita symbol. To get rid of the nasty $2$, it is conventional to define the generators of $SU(2)$ as $J_i= \frac{1}{2} \sigma_i$. The Lie bracket then reads
\begin{equation} [J_i, J_j]= i \epsilon_{ijk} J_k \end{equation}
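The same brute-force check as for $SO(3)$ works here; the snippet below verifies the bracket of the Pauli matrices and of the rescaled generators $J_i = \sigma_i/2$ for one representative pair (the others follow by the same pattern).

```python
import numpy as np

# The Pauli matrices from eq. (Paulimatrices) above.
s1 = np.array([[0, 1], [1, 0]], dtype=complex)
s2 = np.array([[0, -1j], [1j, 0]])
s3 = np.array([[1, 0], [0, -1]], dtype=complex)

def comm(A, B):
    return A @ B - B @ A

print(np.allclose(comm(s1, s2), 2j * s3))  # [sigma_1, sigma_2] = 2i sigma_3
J1, J2, J3 = s1 / 2, s2 / 2, s3 / 2
print(np.allclose(comm(J1, J2), 1j * J3))  # [J_1, J_2] = i J_3
```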
We see that $SU(2)$ and $SO(3)$ have the same structure: if we absorb a factor of $i$ into the $SO(3)$ generators from above, defining $\tilde J_i \equiv i J_i$, their bracket $[J_i, J_j]= \epsilon_{ijk} J_k$ becomes $[\tilde J_i, \tilde J_j]= i \epsilon_{ijk} \tilde J_k$, which is exactly the bracket relation we just found for $SU(2)$. The abstract definition of a Lie algebra uses the Lie bracket relation as its defining feature, and therefore one says: $SU(2)$ and $SO(3)$ have the same Lie algebra.
We learn:
There is one Lie algebra for each Lie group, but there can be many groups having the same Lie algebra. Therefore, the reverse identification (Lie algebra $\rightarrow$ Lie group) is, in general, not possible.
This can be quite confusing, and I want to state one advanced result at this point: there is precisely one distinguished Lie group for each Lie algebra. This group is distinguished because it has the property of being simply connected. (This notion will become clear in the following posts, when we look at Lie group theory from the differential-geometric point of view.)
To emphasize this important point:
There is precisely one distinguished (simply connected) Lie group corresponding to each Lie algebra.
This simply connected group can be thought of as the mother of all groups having the same Lie algebra, because there are maps from the simply connected group to all other groups with the same Lie algebra, but not vice versa. We could call it the mother group of this particular Lie algebra, but mathematicians tend to be less dramatic and call it the covering group. All other groups having the same Lie algebra are said to be covered by the simply connected one.
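To make the existence of such maps a bit more tangible, here is a sketch of the standard two-to-one map from $SU(2)$ onto $SO(3)$, $R_{ij} = \frac{1}{2}\mathrm{tr}(\sigma_i U \sigma_j U^\dagger)$. This formula is not derived in this post; take it purely as an illustration of the covering idea. Note in particular that $U$ and $-U$ are mapped to the same rotation.

```python
import numpy as np
from scipy.linalg import expm

sigma = [np.array([[0, 1], [1, 0]], dtype=complex),
         np.array([[0, -1j], [1j, 0]]),
         np.array([[1, 0], [0, -1]], dtype=complex)]

def to_SO3(U):
    """Map U in SU(2) to 3x3 via R_ij = tr(sigma_i U sigma_j U^dagger)/2."""
    return np.array([[0.5 * np.trace(sigma[i] @ U @ sigma[j] @ U.conj().T).real
                      for j in range(3)] for i in range(3)])

U = expm(-0.5j * 0.8 * sigma[2])  # an SU(2) element: rotation by 0.8 about z
R = to_SO3(U)

print(np.allclose(R.T @ R, np.eye(3)), np.isclose(np.linalg.det(R), 1.0))
print(np.allclose(to_SO3(U), to_SO3(-U)))  # True: U and -U give the same rotation
```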
(Image: Sophus Lie)
As a sidenote:
Nature agrees with such lines of thought! To describe the elementary particles, one uses the representations of the covering group of the Poincaré algebra (the Lie algebra belonging to the Lie group of special relativity, called the Poincaré group), instead of the naive group one derives the Poincaré algebra from in the first place. To describe nature at the most fundamental level, one must use the mother group instead of any of the other groups one can map to from it.
The point of view described in this post (the "naive perspective") is, in fact, how the inventor of the theory, Sophus Lie, looked at the theory now named after him: infinitesimal elements generating the groups of continuous transformations.
In the next posts we will take a look at a more sophisticated approach to Lie theory that helps us understand many important features of Lie's theory from a new angle.