In this post we discuss how continuous symmetries can be described mathematically. Many important features of such symmetries can be captured by something surprisingly simple: Lie algebras. In the second part of this post you will see why a new object, called the Lie bracket, is the defining feature of a Lie algebra.
As an aside: Group theory is the branch of mathematics one uses to work with symmetries. If you don’t already know what group theory is all about, have a look at my post about it.
The branch of group theory that deals with continuous symmetries is called Lie theory. Continuity means that Lie groups have elements which are arbitrarily close to the identity transformation. (The identity transformation is the transformation that changes nothing at all.)
An arbitrary group has, in general, no elements close to the identity. Take for example the symmetries of a square. The transformations that leave the square invariant are four rotations: by 0°, by 90°, by 180° and by 270°, plus some mirror symmetries. A rotation by 0.000001°, which is very close to the identity transformation (the rotation by 0°), is not among them.
Next, take a look at the symmetry transformations of a circle. Certainly, a rotation by 0.000001° is a symmetry of the circle. One says the symmetry group of the circle is continuous, because the rotation parameter (the rotation angle) can take on arbitrary (continuous) values. Mathematically, with the identity denoted $I$, an element $g$ close to the identity can be written as \begin{equation} g(\epsilon)=I+ \epsilon X \end{equation} where $\epsilon$ is, as always in mathematics, some really, really small number and $X$ is an object, called the generator, that we will talk about in a moment.
Such a small transformation barely changes anything when it acts on some object. In the limiting case it is called an infinitesimal transformation. Nevertheless, repeating such an infinitesimal transformation often enough results in a finite transformation. Think about rotations: many small rotations in one direction are equivalent to one big rotation in the same direction.
Mathematically, we can write the idea of repeating a small transformation many times as \begin{equation} h(\theta)=(I+ \epsilon X) (I+ \epsilon X) (I+ \epsilon X) \cdots = (I+ \epsilon X)^k, \end{equation} where $k$ denotes how often we repeat the small transformation.
If $\theta$ denotes some finite transformation parameter, e.g. 50° or so, and $N$ is some really big number that makes sure we are close to the identity, we can write the infinitesimal transformation from above, with $\epsilon = \theta/N$, as \begin{equation} g(\theta)=I+ \frac{\theta}{N} X. \end{equation} The transformations we want to consider are the smallest possible, which means $N$ must be as big as possible, i.e. $N \rightarrow \infty$. To get a finite transformation from such an infinitesimal transformation, one has to repeat it infinitely often. Mathematically \begin{equation} g(\theta)= \lim_{N \rightarrow \infty} \left(I+ \frac{\theta}{N} X\right)^N, \end{equation} which in the limit is just the exponential function \begin{equation} g(\theta)= \lim_{N \rightarrow \infty} \left(I+ \frac{\theta}{N} X\right)^N = e^{\theta X}. \end{equation} In some sense the object $X$ generates the finite transformation $g$, which is why it's called the generator. This will be made more precise in a moment, but first take a look at another way of arriving at the same result:
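If you want to see this limit at work, here is a small numerical sketch (not part of the derivation, just an illustration; the generator and the angle are chosen arbitrarily): it repeats the infinitesimal transformation $N$ times and compares the result with the matrix exponential.

```python
import numpy as np
from scipy.linalg import expm

# Generator of rotations in two dimensions (a concrete example choice).
X = np.array([[0.0, -1.0],
              [1.0,  0.0]])
theta = 0.9  # an arbitrary finite rotation angle in radians

for N in (10, 100, 10000):
    # Repeat the infinitesimal transformation (I + theta/N * X) N times.
    g_N = np.linalg.matrix_power(np.eye(2) + (theta / N) * X, N)
    err = np.max(np.abs(g_N - expm(theta * X)))
    print(f"N = {N:6d}, max deviation from e^(theta X): {err:.1e}")
```

As $N$ grows, the repeated infinitesimal transformation converges to $e^{\theta X}$, exactly as the limit formula claims.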
If we are considering a continuous group of transformations that are given by matrices, we can make a Taylor expansion of an element of the group around the identity, as long as the element is close to the identity, which means the transformation parameter $\theta$ is small. (For a big transformation parameter $\theta$, i.e. a bigger transformation, the first terms of the Taylor series alone are no longer a good approximation.) The Taylor series is \begin{equation} g(\theta)=I+ \frac{dg}{d\theta}\Big|_{\theta=0} \theta + \frac{1}{2}\frac{d^2g}{d\theta^2}\Big|_{\theta=0} \theta^2 + \ldots = \sum_n \frac{1}{n!} \frac{d^n g}{d\theta^n}\Big|_{\theta=0} \theta^n. \end{equation} This series expansion can be written in a more compact form: because the group elements satisfy $g(\theta)g(\theta')=g(\theta+\theta')$, all higher derivatives at the identity are powers of the first one, $\frac{d^n g}{d\theta^n}\big|_{\theta=0} = \left(\frac{dg}{d\theta}\big|_{\theta=0}\right)^n$, and the Taylor series is exactly the exponential series \begin{equation} g(\theta)= e^{\frac{dg}{d\theta}\big|_{\theta=0}\,\theta}. \end{equation} Now we can see the connection to the previous description: by comparison with the result above, $X = \frac{dg}{d\theta}\big|_{\theta=0}$.
The idea behind such lines of thought is that one can learn a lot about a group by looking at the important part of its infinitesimal elements (denoted $X$ above): the generators.
In the following posts, the reason why this is possible, and how it makes things much easier, will be explained in detail using the language of differential geometry.
For matrix Lie groups one defines the Lie algebra corresponding to the Lie group as the collection of objects that give an element of the group when exponentiated. (This is an easy definition one can use when restricting to matrix Lie groups. Later we will introduce a more general definition.)
In mathematical terms:
For a Lie group $G$ given by $n\times n$ matrices, the Lie algebra $\mathfrak{g}$ of $G$ is given by those $n\times n$ matrices $X$ such that $e^{tX} \in G$ for all $t \in \mathbb{R}$.
We know from the definition of a group that a group is more than just a collection of transformations: the definition includes a binary operation $\circ$ for combining transformations. For matrix Lie groups this is just ordinary matrix multiplication. Naively one might think that the same combination rule $\circ$ is valid for elements of the Lie algebra. This is not the case! The elements of the Lie algebra are given by matrices (a famous theorem of Lie theory, called Ado's theorem, states that every finite-dimensional Lie algebra is isomorphic to a matrix Lie algebra), but the product of two matrices of the Lie algebra does not have to be an element of the Lie algebra. Instead, there is another combination rule for the Lie algebra that is directly connected to the combination rule of the corresponding Lie group.
The connection between the combination rule of the Lie group and the combination rule of the Lie algebra is given by the famous Baker-Campbell-Hausdorff formula:
\begin{equation} \mathrm{e}^{X} \circ \mathrm{e}^{Y} = \mathrm{e}^{X+Y+\frac{1}{2}[X,Y]+\frac{1}{12}[X,[X,Y]]-\frac{1}{12}[Y,[X,Y]]+\ldots} \end{equation}
- On the left-hand side, we have the product of two elements of the Lie group, let's call them $g$ and $h$, which we can write in terms of the corresponding generators:
\begin{equation} gh= \mathrm{e}^{X} \circ \mathrm{e}^{Y} \end{equation}
- On the right-hand side we have a single element of the group, $\mathrm{e}^{X+Y+\frac{1}{2}[X,Y]+\frac{1}{12}[X,[X,Y]]+\ldots}$, and the multiplication of the group elements has been translated into a sum of Lie algebra elements. The new symbol appearing in this sum, $[\cdot,\cdot]$, is called the Lie bracket, and for matrix Lie groups it is given by $[X,Y]=XY-YX$, which is called the commutator of $X$ and $Y$. The products $XY$ and $YX$ need not be part of the Lie algebra, but their difference always is!
We learn from the Baker-Campbell-Hausdorff formula that the natural product of the Lie algebra is not, as one would naively think, ordinary matrix multiplication, but the Lie bracket $[\cdot,\cdot]$. One says the Lie algebra is closed under the Lie bracket, just as the group is closed under the corresponding composition rule $\circ$, e.g. matrix multiplication.
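As a quick plausibility check, here is a short numerical sketch of the Baker-Campbell-Hausdorff formula up to the terms shown above. The two matrices are arbitrary small random matrices (chosen purely for illustration), scaled down so that the neglected higher-order terms are negligible.

```python
import numpy as np
from scipy.linalg import expm

def comm(A, B):
    """The Lie bracket for matrices: [A, B] = AB - BA."""
    return A @ B - B @ A

rng = np.random.default_rng(0)
X = 0.01 * rng.standard_normal((3, 3))  # small, so higher BCH orders are tiny
Y = 0.01 * rng.standard_normal((3, 3))

lhs = expm(X) @ expm(Y)
rhs = expm(X + Y + 0.5 * comm(X, Y)
           + (1 / 12) * comm(X, comm(X, Y))
           - (1 / 12) * comm(Y, comm(X, Y)))

# The difference is of the order of the first neglected BCH term.
print(np.max(np.abs(lhs - rhs)))
```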
The abstract definition of a Lie algebra uses the Lie bracket as the defining object: a Lie algebra is defined by its Lie bracket. As always, things become much easier when we look at some examples.
Example: Rotations in three dimensions and the group SO(3)
We start with rotations in three dimensions. The defining feature of a rotation is that it leaves the length of every vector unchanged. The length of a vector in three dimensions is given by the standard scalar product of the vector with itself
$ length = \vec v\cdot \vec v = \vec v^T \vec v,$
where the superscript $T$ denotes transposition. We replaced the scalar product $\cdot$ by ordinary matrix multiplication (a three-dimensional vector is then just a $3 \times 1$ matrix), which requires that we transpose (exchange rows and columns of) the vector on the left to get the same result that we get using the rules of the scalar product. In what ways can we transform an arbitrary vector $\vec v$ such that its length remains unchanged? We denote the transformation by a matrix $O$, so that $\vec v' = O \vec v$. The length of this transformed vector is
$ length' = \vec v'\cdot \vec v' = \vec v'^T \vec v' = \vec v^T O^T O \vec v$ (using that $(AB)^T= B^T A^T$ holds for any two matrices $A$ and $B$),
and requiring that the matrix $O$ doesn’t change the length of the vector means
$ length’ = length \rightarrow \vec v’^T \vec v’= \vec v^T \vec v \rightarrow \vec v ^T O^T O \vec v = \vec v^T \vec v$
and therefore $O^T O =1$ must hold. This is the defining condition of a matrix Lie group called $O(3)$, where the $O$ stands for orthogonal and $3$ for three dimensions.
Unfortunately, leaving lengths unchanged is not an exclusive feature of rotations: mirror transformations have this property, too. Nevertheless, rotations preserve the orientation (right-handed coordinate system $\rightarrow$ right-handed coordinate system), whereas mirror transformations change it by definition (right-handed coordinate system $\rightarrow$ left-handed coordinate system). Therefore, we restrict the transformations to rotations if we additionally demand $\det(O)=1$. (This is a basic result of linear algebra; you can read about it, for example, at Wikipedia if you don't know it already.)
Therefore, all $3\times3$ matrices fulfilling the two conditions $O^T O =1$ and $\det(O)=1$ describe rotations in three dimensions. These two conditions define a matrix Lie group called $SO(3)$, where $O$ stands for orthogonal ($O^T O =1$) and $S$ for special ($\det(O)=1$). Familiar examples of elements of this group are the usual rotation matrices around the three coordinate axes, from which a general rotation can be built by composition. (You can check that these matrices fulfil the defining conditions.)
\begin{equation} R_x= \begin{pmatrix}
1&0 &0 \\ 0&\cos(\theta) &- \sin(\theta) \\ 0 & \sin(\theta)& \cos(\theta)
\end{pmatrix} \qquad R_y= \begin{pmatrix}
\cos(\theta)&0 &\sin(\theta) \\ 0& 1 & 0 \\ -\sin(\theta)& 0& \cos(\theta)
\end{pmatrix} \end{equation}
\begin{equation} R_z= \begin{pmatrix}
\cos(\theta)& -\sin(\theta) &0 \\ \sin(\theta) &\cos(\theta) & 0 \\ 0 & 0& 1
\end{pmatrix} \end{equation}
For example, if we want to rotate the vector
\begin{equation} \vec v = \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}
\end{equation} around the z-axis, we multiply it with the corresponding rotation matrix
\begin{equation} R_z(\theta) \vec v = \begin{pmatrix}
\cos(\theta)& -\sin(\theta) &0 \\ \sin(\theta) &\cos(\theta) & 0 \\ 0 & 0& 1
\end{pmatrix} \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} = \begin{pmatrix} \cos(\theta) \\ \sin(\theta) \\ 0 \end{pmatrix} \end{equation}
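For the skeptical reader, the same example can be recomputed numerically; the function `R_z` below is simply the rotation matrix from above, written out in code, with an arbitrarily chosen angle.

```python
import numpy as np

def R_z(theta):
    """Rotation matrix around the z-axis, as given above."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0],
                     [s,  c, 0.0],
                     [0.0, 0.0, 1.0]])

v = np.array([1.0, 0.0, 0.0])
theta = np.pi / 4  # 45 degrees
print(R_z(theta) @ v)  # -> [cos(theta), sin(theta), 0]
```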
Lie algebra of SO(3)
We can now derive the Lie algebra of $SO(3)$. The definition given in the last section was that the Lie algebra is the set of matrices that yield elements of the group upon exponentiation.
The defining conditions for the $SO(3)$ group are
\begin{equation} \label{eq:defSO3} O^T O \stackrel{!}{=} I \qquad {\mathrm{ and }} \qquad \det(O)\stackrel{!}{=}1\end{equation}
Consider an infinitesimal transformation
\begin{equation} O_{\mathrm{inf }} = I+ \epsilon J,\end{equation}
where we get a finite transformation as usual by exponentiating. Putting this into the first defining condition we get
\begin{equation}( I+ \epsilon J)^T (I+ \epsilon J)=I \qquad \rightarrow \qquad I + \epsilon J^T + \epsilon J + \underbrace{\epsilon^2 J^T J}_{\approx 0 \text{ because } \epsilon^2\approx0} = I \end{equation}
\begin{equation} \label{eq:bed1} \rightarrow J^T + J \stackrel{!}{=}0 \end{equation}
A reddit user named assjtt correctly pointed out that $J^T + J \stackrel{!}{=}0$ already implies $\mathrm{tr}(J) = 0$ (the diagonal entries must vanish), so the following lines aren't strictly necessary to derive this. Nevertheless, it is instructive to see how the trace condition also follows from the determinant condition:
Using the second condition and the identity $\det({\mathrm{e }}^{A})= {\mathrm{e }}^{{\mathrm{tr }}(A)}$ for the matrix exponential function, we see
\begin{equation} \det(O) \stackrel{!}{=} 1 \qquad \rightarrow \qquad \det({\mathrm{e }}^{\phi J}) = \mathrm{e }^{\phi\,{\mathrm{tr }}(J)} \stackrel{!}{=} 1
\end{equation}
\begin{equation} \label{eq:bed2} \rightarrow {\mathrm{tr }}(J) \stackrel{!}{=} 0 \end{equation}
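Conversely, we can check numerically that any matrix satisfying these two conditions really exponentiates into $SO(3)$. The sketch below (an illustration with an arbitrary random matrix and an arbitrary parameter) builds an antisymmetric matrix $J$, which is automatically traceless, and verifies both defining conditions for the exponentiated matrix.

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 3))
J = A - A.T            # antisymmetric: J^T = -J, hence tr(J) = 0

O = expm(0.7 * J)      # a finite transformation; 0.7 is an arbitrary parameter

print(np.allclose(O.T @ O, np.eye(3)))    # True: O^T O = I
print(np.isclose(np.linalg.det(O), 1.0))  # True: det(O) = 1
```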
Three linearly independent matrices fulfilling the conditions eq. \ref{eq:bed1} and eq. \ref{eq:bed2} are given below. (They form a basis of all $3\times3$ matrices fulfilling these conditions, from which all others can be constructed by linear combination.)
\begin{equation} \label{eq:SO3-generators} J_1= \begin{pmatrix}
0&0 &0 \\ 0&0 & -1 \\ 0 & 1& 0
\end{pmatrix} \qquad J_2= \begin{pmatrix}
0&0 &1 \\ 0&0 & 0 \\ -1 & 0& 0
\end{pmatrix} \qquad J_3= \begin{pmatrix}
0&-1 &0 \\ 1&0 & 0 \\ 0 & 0& 0
\end{pmatrix}. \end{equation}
These are the generators of the $SO(3)$ group! We can compute the Lie bracket of these generators by brute-force computation, which yields
\begin{equation} [J_i, J_j]= \epsilon_{ijk} J_k, \end{equation}
where $\epsilon_{ijk}$ is the Levi-Civita symbol.
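If you don't want to do the brute-force computation by hand, the following snippet verifies the bracket relations for the three generators directly.

```python
import numpy as np

# The SO(3) generators from eq. (SO3-generators) above.
J1 = np.array([[0, 0, 0], [0, 0, -1], [0, 1, 0]])
J2 = np.array([[0, 0, 1], [0, 0, 0], [-1, 0, 0]])
J3 = np.array([[0, -1, 0], [1, 0, 0], [0, 0, 0]])

def comm(A, B):
    return A @ B - B @ A

print(np.array_equal(comm(J1, J2), J3))  # True
print(np.array_equal(comm(J2, J3), J1))  # True
print(np.array_equal(comm(J3, J1), J2))  # True
```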
Let's see what finite transformation matrix we get from the first generator. We can focus on the lower-right $2\times2$ block $j_1$ and ignore the zeroes for a moment.
\begin{equation} J_1= \begin{pmatrix}
0 & \\
& \underbrace{\begin{pmatrix} 0& -1 \\ 1& 0 \end{pmatrix}}_{\equiv j_1} \\
\end{pmatrix} \end{equation}
We can immediately compute
\begin{equation} (j_1)^2= -I, \end{equation}
therefore
\begin{equation} (j_1)^3= \underbrace{(j_1)^2}_{=-I} j_1 = -j_1 \quad , \quad (j_1)^4= +I \quad , \quad (j_1)^5= +j_1. \end{equation} In general we have
\begin{equation} (j_1)^{2n} = (-1)^n I \qquad {\mathrm{and }} \qquad (j_1)^{2n+1} = (-1)^n j_1 \end{equation}
which we can use when evaluating the exponential function as series expansion
\begin{equation}R_{1 \mathrm{ \ fin }} = \mathrm{e }^{\phi j_1} = \sum_{n=0}^\infty \frac{\phi^n j_1^n}{n!}
\end{equation}
\begin{equation}= \sum_{n=0}^\infty \frac{\phi^{2n}}{(2n)!} \underbrace{(j_1)^{2n }}_{(-1)^n I} + \sum_{n=0}^\infty \frac{\phi^{2n+1} }{(2n+1)!} \underbrace{(j_1)^{2n+1 }}_{(-1)^n j_1}
\end{equation}
\begin{equation}= \underbrace{\left(\sum_{n=0}^\infty \frac{\phi^{2n}}{(2n)!} (-1)^n \right)}_{=\cos(\phi)}I + \underbrace{\left(\sum_{n=0}^\infty \frac{\phi^{2n+1} }{(2n+1)!} (-1)^n \right)}_{=\sin(\phi)} j_1
\end{equation}
\begin{equation} = \cos(\phi) \begin{pmatrix} 1& 0 \\ 0& 1 \end{pmatrix} + \sin(\phi) \begin{pmatrix} 0& -1 \\ 1& 0 \end{pmatrix} = \begin{pmatrix} \cos(\phi)& -\sin(\phi) \\ \sin(\phi)& \cos(\phi) \end{pmatrix}
\end{equation}
And therefore the complete finite transformation matrix is
\begin{equation}R_1= \begin{pmatrix}
1&0 &0 \\ 0&\cos(\phi) & -\sin(\phi) \\ 0 & \sin(\phi)& \cos(\phi)
\end{pmatrix}
\end{equation}
which we recognize as the well-known rotation matrix around the $x$-axis in three dimensions.
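Again, this can be confirmed numerically in a few lines: the matrix exponential of $\phi J_1$ agrees with $R_1(\phi)$ (the angle below is an arbitrary choice).

```python
import numpy as np
from scipy.linalg import expm

J1 = np.array([[0.0, 0.0, 0.0],
               [0.0, 0.0, -1.0],
               [0.0, 1.0, 0.0]])
phi = 0.6  # an arbitrary rotation angle

c, s = np.cos(phi), np.sin(phi)
R1 = np.array([[1.0, 0.0, 0.0],
               [0.0,   c,  -s],
               [0.0,   s,   c]])

print(np.allclose(expm(phi * J1), R1))  # True
```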
The group $SU(2)$ and its Lie algebra
Next we look at a Lie group called $SU(2)$, which also has a close connection to rotations in three dimensions. Unfortunately, exploring this connection takes some time, and in this post I want to focus on a different aspect. I will treat $SU(2)$ here as an abstract group defined by the conditions
\begin{equation} U^\dagger U = (U^\star )^T U = 1 \end{equation}
\begin{equation} \det(U) =1. \end{equation}
and describe the connection between $SU(2)$ and rotations in a different post. At this point you have to believe me that $SU(2)$ is incredibly important in physics and therefore worth a look.
The $S$ in $SU(2)$ again stands for special ($\det(U) =1$), the $U$ for unitary ($U^\dagger U = 1$) and $2$ for the dimension. $SU(2)$ is the set of $2 \times 2$ matrices fulfilling these conditions.
If we now want to derive the Lie algebra of $SU(2)$, we start again with the defining conditions of the group. Writing them in terms of the generators, we get
\begin{equation} ({\mathrm{e }}^{iX})^{\dagger} {\mathrm{e }}^{iX} = 1 \end{equation}
\begin{equation} \det({\mathrm{e }}^{iX})=1 \end{equation}
The first condition tells us, using the Baker-Campbell-Hausdorff formula and $[X,X]=0$, that $X$ must be Hermitian ($X^\dagger = X$). If we use the identity $\det({\mathrm{e }}^{A})= {\mathrm{e }}^{{\mathrm{tr }}(A)}$, we see from the second condition
\begin{equation} \label{eq:SU2gencondition} \det({\mathrm{e }}^{iX})= {\mathrm{e }}^{i {\mathrm{tr }}(X)} = 1 \end{equation}
that tr($X$)$\overset{!}{=} 0$. Therefore our generators must be traceless Hermitian matrices. A basis for the Hermitian traceless $2\times2$ matrices is formed by $3$ matrices. (An arbitrary complex $2\times2$ matrix has 4 complex entries and therefore 8 real degrees of freedom. Hermiticity reduces these to 4 real degrees of freedom, and tracelessness removes one more, so three remain.) Arbitrary Hermitian traceless $2\times2$ matrices can be built as linear combinations of these basis matrices, which are conventionally called the Pauli matrices.
\begin{equation}
\label{eq:Paulimatrices}
\sigma_1 =
\begin{pmatrix}
0&1 \\ 1&0
\end{pmatrix}
\qquad , \qquad
\sigma_2 =
\begin{pmatrix}
0&-i \\ i&0
\end{pmatrix}
\qquad , \qquad
\sigma_3=
\begin{pmatrix}
1&0\\0&-1
\end{pmatrix}
\end{equation}
Using this basis we are able to derive the Lie bracket. Computation yields
\begin{equation} \label{eq:sigma-warum2weg} [\sigma_i, \sigma_j]= 2 i \epsilon_{ijk} \sigma_k \end{equation}
where $\epsilon_{ijk}$ is again the Levi-Civita symbol. To get rid of the nasty $2$, it is conventional to define the generators of $SU(2)$ as $J_i= \frac{1}{2} \sigma_i$. The Lie bracket then reads
\begin{equation} [J_i, J_j]= i \epsilon_{ijk} J_k \end{equation}
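The same brute-force check as for $SO(3)$ works here; the snippet below verifies the bracket of the Pauli matrices and of the rescaled generators $J_i = \sigma_i/2$ for one representative pair (the others follow by the same pattern).

```python
import numpy as np

# The Pauli matrices from eq. (Paulimatrices) above.
s1 = np.array([[0, 1], [1, 0]], dtype=complex)
s2 = np.array([[0, -1j], [1j, 0]])
s3 = np.array([[1, 0], [0, -1]], dtype=complex)

def comm(A, B):
    return A @ B - B @ A

print(np.allclose(comm(s1, s2), 2j * s3))  # [sigma_1, sigma_2] = 2i sigma_3
J1, J2, J3 = s1 / 2, s2 / 2, s3 / 2
print(np.allclose(comm(J1, J2), 1j * J3))  # [J_1, J_2] = i J_3
```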
We see that $SU(2)$ and $SO(3)$ have the same structure: if we absorb a factor of $i$ into the $SO(3)$ generators from above, defining $\tilde J_i \equiv i J_i$, their bracket $[J_i, J_j]= \epsilon_{ijk} J_k$ becomes $[\tilde J_i, \tilde J_j]= i \epsilon_{ijk} \tilde J_k$, which is exactly the bracket relation we just found for $SU(2)$. The abstract definition of a Lie algebra uses the Lie bracket relation as its defining feature, and therefore one says: $SU(2)$ and $SO(3)$ have the same Lie algebra.
We learn:
There is one Lie algebra for each Lie group, but there can be many groups having the same Lie algebra. Therefore, the reverse identification (Lie algebra $\rightarrow$ Lie group) is, in general, not possible.
This can be quite confusing, and I want to state one advanced result at this point: there is precisely one distinguished Lie group for each Lie algebra. This group is distinguished because it has the property of being simply connected. (This notion will become clear in the following posts, when we look at Lie group theory from the differential-geometric point of view.)
To emphasize this important point:
There is precisely one distinguished (simply connected) Lie group corresponding to each Lie algebra.
This simply connected group can be thought of as the mother of all groups having the same Lie algebra, because there are maps from the simply connected group to all other groups with the same Lie algebra, but not vice versa. We could call it the mother group of this particular Lie algebra, but mathematicians tend to be less dramatic and call it the covering group. All other groups having the same Lie algebra are said to be covered by the simply connected one.
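To make the existence of such maps a bit more tangible, here is a sketch of the standard two-to-one map from $SU(2)$ onto $SO(3)$, $R_{ij} = \frac{1}{2}\mathrm{tr}(\sigma_i U \sigma_j U^\dagger)$. This formula is not derived in this post; take it purely as an illustration of the covering idea. Note in particular that $U$ and $-U$ are mapped to the same rotation.

```python
import numpy as np
from scipy.linalg import expm

sigma = [np.array([[0, 1], [1, 0]], dtype=complex),
         np.array([[0, -1j], [1j, 0]]),
         np.array([[1, 0], [0, -1]], dtype=complex)]

def to_SO3(U):
    """Map U in SU(2) to 3x3 via R_ij = tr(sigma_i U sigma_j U^dagger)/2."""
    return np.array([[0.5 * np.trace(sigma[i] @ U @ sigma[j] @ U.conj().T).real
                      for j in range(3)] for i in range(3)])

U = expm(-0.5j * 0.8 * sigma[2])  # an SU(2) element: rotation by 0.8 about z
R = to_SO3(U)

print(np.allclose(R.T @ R, np.eye(3)), np.isclose(np.linalg.det(R), 1.0))
print(np.allclose(to_SO3(U), to_SO3(-U)))  # True: U and -U give the same rotation
```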
(Image: Sophus Lie)
As a sidenote:
Nature agrees with such lines of thought! To describe the elementary particles, one uses the representations of the covering group of the Poincaré algebra (the Lie algebra belonging to the Lie group of special relativity, called the Poincaré group), instead of the naive group one derives the Poincaré algebra from in the first place. To describe nature at the most fundamental level, one must use the mother group instead of any of the other groups one can map to from it.
The point of view described in this post (the "naive perspective") is, in fact, how the inventor of the theory, Sophus Lie, looked at the theory now named after him: infinitesimal elements generating the groups of continuous transformations.
In the next posts we will take a look at a more sophisticated approach to Lie theory that helps us understand many important features of Lie's theory from a new angle.