How is a Lie Algebra able to describe a Group?

If you understand the idea Lie group = manifold, you can easily understand one of the most curious facts of Lie theory:

The Lie algebra $\frak{g}$, which is defined as the tangent space at the identity  $T_eG$, is able to tell us almost everything about a given Lie group $G$.

The connection between Lie algebra elements and Lie group elements is established by the exponential map. We can understand this easily from the “naive”, historical approach to Lie theory that deals with infinitesimal transformations. The differential geometry perspective enables us to understand this connection in a more pictorial way.

There are four simple concepts one needs to know in order to understand the connection between Lie algebra elements and Lie group elements: Curves, Functions, Vector Fields, and Integral Curves.



Illustration of a curve on M, as a map from $R^1$ into M. The point $\lambda$ of $R^1$ is mapped onto P in M.

One way to think about a curve on a manifold $M$ is as a continuous series of points in $M$. The definition we want to use is that a curve is a mapping from an open set of $R^1$ into $M$. Therefore, a curve associates with each point $\lambda$ of $R^1$ (which is just a real number) a point in $M$. The curve is said to be parametrized by $\lambda$. The point in $M$ is called the image point of $\lambda$. Two curves can be different even though they have the same image points in $M$, if they assign different parameter values to the image points.



Illustration of a function on a manifold as a map from M into $R^1$

Functions are in some sense inverse to curves. A function on a manifold $M$ assigns a real number (= an element of $R^1$) to each point of $M$. To make sense of the word differentiable when talking about functions on $M$, it helps to think about differentiability in terms of coordinates.

Remember that a manifold is defined as a set for which, in the neighborhood of any point, a map onto $R^n$ exists. In other words, we can assign coordinate values to the points in the neighborhood of any given point. Therefore, we can combine (the inverse of) this coordinate map with the function from $M$ to $R^1$ to get a map from $R^n$ into $R^1$. A function on $M$ is said to be differentiable if this combined map is differentiable in $R^n$.

In the abstract sense a function is a map $f(P)$, where $P$ denotes some point in $M$. By assigning coordinate values to this point, we have a map $f(x^1,x^2,\ldots,x^n)$. If this function is differentiable in its arguments, the function is said to be differentiable. The coordinate map itself gives us a function for each coordinate. For example, $x^2(P)$ is a function that maps each point $P$ to the corresponding coordinate value $x^2$.

Tangent Vectors:

Now we head to something that is quite hard to grasp when stumbling upon it for the first time: the modern, abstract definition of a tangent vector.

We start with a curve that passes through some point $P$ of $M$. This curve is described, using the coordinate map, by the equations $x^i(\lambda)$. A differentiable function $f=f(x^1,x^2,\ldots,x^n)$, abbreviated $f=f(x^i)$, on $M$ assigns a value to each point of the curve. To be precise, we call the resulting function of the parameter $g(\lambda)$, because $f$ is a function of $(x^1,x^2,\ldots,x^n)$ and $g$ is a function of $\lambda$.

\begin{equation} g(\lambda) = f(x^1(\lambda),x^2(\lambda),\ldots,x^n(\lambda))= f(x^i(\lambda)) \end{equation}

We can differentiate, which yields, using the chain rule

\begin{equation} \frac{dg}{d\lambda} = \sum_i \frac{dx^i}{d \lambda}\frac{ \partial f }{ \partial x^i} .\end{equation}

Because this equation holds for any function $f$, we can write

\begin{equation} \frac{d}{d\lambda} = \sum_i \frac{dx^i}{d \lambda}\frac{ \partial }{ \partial x^i} .\end{equation}

(If this seems strange to you, you may check out chapter 8.1 of the Feynman Lectures about Quantum Mechanics, which has a great explanation of “removing objects from an equation” if it holds for arbitrary objects of this kind.)
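The chain-rule identity for $\frac{dg}{d\lambda}$ can be checked numerically. The following is only a minimal sketch: the curve, the function `f`, and the finite-difference helper `derivative` are hypothetical choices made for illustration and are not part of the argument itself.

```python
import math

def curve(lam):
    """Hypothetical curve x^i(lambda) in R^2, chosen only for illustration."""
    return (math.cos(lam), lam ** 2)

def f(x, y):
    """Hypothetical differentiable function f(x^1, x^2)."""
    return x * y + math.sin(x)

def g(lam):
    """g(lambda) = f(x^i(lambda)): the function evaluated along the curve."""
    return f(*curve(lam))

def derivative(func, t, h=1e-6):
    """Central finite-difference approximation of d func / dt."""
    return (func(t + h) - func(t - h)) / (2 * h)

lam0 = 0.7
x0, y0 = curve(lam0)

# Left-hand side: dg/dlambda, differentiating along the curve directly.
lhs = derivative(g, lam0)

# Right-hand side: sum_i (dx^i/dlambda) * (partial f / partial x^i).
dx = derivative(lambda t: curve(t)[0], lam0)
dy = derivative(lambda t: curve(t)[1], lam0)
df_dx = derivative(lambda x: f(x, y0), x0)
df_dy = derivative(lambda y: f(x0, y), y0)
rhs = dx * df_dx + dy * df_dy

print(abs(lhs - rhs) < 1e-6)
```

Swapping in any other differentiable `f` should leave the agreement intact, which is the sense in which the operator equation holds "for any function".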

If we were talking exclusively about Euclidean space, we would interpret $\frac{dx^i}{d \lambda}$ as the components of a vector tangent to the curve. The $dx^i$ are infinitesimal displacements along the curve, and dividing them by the infinitesimal parameter change $d\lambda$ gives the rate of change in this direction. Dividing by a number changes only the scale, not the direction, of the displacement.


Illustration of two different curves having the same tangent vector at some point P

Every curve has a unique tangent vector at any point. In contrast, a tangent vector can be the tangent vector for infinitely many curves. For example, take a look at the simple curve

\begin{equation} x^i(\lambda)=\lambda a^i \end{equation}
with constants $a^i$. The tangent vector at the point $P$, where $\lambda=0$, is $\frac{dx^i}{d \lambda} = a^i$. A slightly different curve
\begin{equation} x^i(\mu)=\mu^2 b^i+ \mu a^i \end{equation}
goes through the same point $P$ for $\mu=0$ and has the same tangent vector at $P$, $\frac{dx^i}{d \mu} = a^i$.

Illustration of two curves having the same path but different parameterizations.

Furthermore, we can reparametrize the first curve by $x^i(\nu)=(\nu^3 + \nu) a^i$. This curve goes through the same points, but has different values $\nu$ associated with them. At $\nu=0$, this curve goes again through $P$ and the tangent vector is $\frac{dx^i}{d \nu} = a^i$.
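As a quick sanity check, the three curves above can be differentiated numerically at parameter value zero. This is a minimal sketch: the constants `a` and `b` and the helper `derivative` are hypothetical and only serve the illustration.

```python
def derivative(func, t, h=1e-6):
    """Central finite-difference approximation of d func / dt."""
    return (func(t + h) - func(t - h)) / (2 * h)

a = (1.0, 2.0)   # hypothetical constants a^i
b = (3.0, -1.0)  # hypothetical constants b^i

# The three curves from the text; i labels the coordinate x^i.
curves = {
    "lambda * a": lambda t, i: t * a[i],
    "mu^2 * b + mu * a": lambda t, i: t ** 2 * b[i] + t * a[i],
    "(nu^3 + nu) * a": lambda t, i: (t ** 3 + t) * a[i],
}

# Tangent vector components dx^i/dt at the point P (parameter value 0).
tangents = {}
for name, x in curves.items():
    tangents[name] = tuple(derivative(lambda t: x(t, i), 0.0) for i in range(2))

for name, tangent in tangents.items():
    print(name, tangent)
```

All three printed tangent vectors agree with `a` up to numerical error, although the curves themselves differ away from $P$.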

We conclude: Every vector belongs to a whole equivalence class of curves at this point. More dramatically one can say: The vector characterizes a whole equivalence class of curves at this point. The defining feature of this equivalence class is the vector.

This observation motivates the modern definition of a tangent vector. In Euclidean space we have no problem with talking about displacements. A vector in Euclidean space points from one point to another. On a manifold there is, in general, no distance relation between points. Therefore, we need a more sophisticated idea to be able to talk about vectors.

Consider a curve $x^i=x^i(\mu)$ through $P$. Analogous to the considerations above we have

\begin{equation} \frac{d}{d\mu} = \sum_i \frac{dx^i}{d \mu}\frac{ \partial }{ \partial x^i} .\end{equation}

Using two numbers $a$ and $b$, we can take a look at the linear combination

\begin{equation} a\frac{d}{d\lambda} + b \frac{d}{d\mu} = \sum_i \left( a \frac{dx^i}{d \lambda} + b \frac{dx^i}{d \mu} \right) \frac{ \partial } { \partial x^i} .\end{equation}

$ a \frac{dx^i}{d \lambda} + b \frac{dx^i}{d \mu} $ are components of a new vector that is tangent to some curves through $P$. Therefore, for one of these curves that is parametrized by $\Phi$, we can write at $P$

\begin{equation} \frac{d}{d\Phi}= \sum_i \left( a \frac{dx^i}{d \lambda} + b \frac{dx^i}{d \mu} \right) \frac{ \partial } { \partial x^i} .\end{equation}

and therefore

\begin{equation} \frac{d}{d\Phi}= a\frac{d}{d\lambda} + b \frac{d}{d\mu} . \end{equation}

At this point we can see that we have a one-to-one correspondence between the space of all tangent vectors at some point $P$ and the space of all derivatives along curves at $P$.

The directional derivatives, like $\frac{d}{d\lambda}$, behave like the usual vectors under addition and are therefore said to form a vector space. A basis for this vector space is always given by $\{ \frac{ \partial } { \partial x^i} \}$, because we have seen that any directional derivative can be written as a linear combination of the derivatives $\frac{ \partial } { \partial x^i} $, the derivatives along the coordinate lines. The components in this basis are $ \{ \frac{dx^i}{d \lambda} \}$.

The definition of a tangent vector that mathematicians prefer is that $\frac{d}{d\lambda}$ is the tangent vector to the curve $x^i(\lambda)$. This definition is advantageous, because it involves no displacements over finite separations and it makes no reference to coordinates. The definition makes sense because, as we can see above, $\frac{d}{d\lambda}$ has the components $ \{ \frac{dx^i}{d \lambda} \}$ once we pick some coordinate system, and these are exactly the numbers one naively associates with an “arrow” tangent to a curve.


Illustration of the tangent space for some point on the sphere

The subtle point is that we are no longer allowed to add vectors located at different points of the manifold. The vectors are said to live in the tangent space of $M$ at $P$, denoted $T_P M$. Therefore, vectors at different points live in different spaces and cannot be added together. Using the sphere as an example, one may picture this tangent space as a plane lying tangent to the sphere at this point.

Vector Fields and Integral Curves

Another important term is vector field. A vector field is a rule that defines a vector at each point of the manifold. We have one tangent space at each point of the manifold and a vector field is a rule that picks one vector from each tangent space.

As already noted above, every curve has a tangent vector at each point. We can turn this a bit upside-down. Given a vector field, i.e. a tangent vector at each point of the manifold, we can find, if we pick one point $P$, precisely one curve going through $P$ whose tangent vector, at every point the curve passes through, is exactly the vector the vector field picks there. Such a curve is called an integral curve, because given a vector field $V^i(P)$, in coordinates $V^i(P)=v^i(x^j)$, the statement that these vectors are tangent vectors to curves is mathematically

\begin{equation} \frac{dx^i}{d \lambda} = v^i(x^j) .\end{equation}

This is just a first-order differential equation for $x^i(\lambda)$, which, for a smooth vector field, always has a unique solution in some neighborhood of $P$.
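To see an integral curve concretely, the differential equation can be integrated numerically. The following is a rough sketch under stated assumptions: the vector field $v=(-y,x)$ on $R^2$, the starting point, and the Runge-Kutta helper `rk4_step` are all hypothetical choices made for illustration. For this particular field the integral curve through $(1,0)$ is known in closed form, $(\cos\lambda, \sin\lambda)$, which gives us something to compare against.

```python
import math

def v(x, y):
    """Hypothetical vector field on R^2: v = (-y, x), which generates rotations."""
    return (-y, x)

def rk4_step(x, y, h):
    """One classical Runge-Kutta step for dx^i/dlambda = v^i(x^j)."""
    k1 = v(x, y)
    k2 = v(x + 0.5 * h * k1[0], y + 0.5 * h * k1[1])
    k3 = v(x + 0.5 * h * k2[0], y + 0.5 * h * k2[1])
    k4 = v(x + h * k3[0], y + h * k3[1])
    x_new = x + h / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
    y_new = y + h / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
    return x_new, y_new

def integral_curve(p, lam, steps=1000):
    """Follow the integral curve through p up to parameter value lam."""
    x, y = p
    h = lam / steps
    for _ in range(steps):
        x, y = rk4_step(x, y, h)
    return x, y

end = integral_curve((1.0, 0.0), 1.0)
# Distance between the numerical result and the exact point (cos 1, sin 1).
error = math.hypot(end[0] - math.cos(1.0), end[1] - math.sin(1.0))
print(error < 1e-8)
```

Picking a different starting point gives a different circle around the origin, in line with the statement that exactly one integral curve passes through each point.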

The Exponential Function on a Manifold

We are now in the position to understand the tool that connects Lie algebra elements and Lie group elements: The exponential function. Given a vector field $ Y = \frac{d}{d \lambda}$, we can find the corresponding integral curve $x^i(\lambda)$ through some point. The coordinates of two points ($\lambda_0$ and $\lambda_0 + \epsilon$) on this curve are related by the Taylor series

\begin{align} x^i (\lambda_0 + \epsilon)
&= x^i(\lambda_0) + \epsilon \left( \frac{d x^i}{d \lambda} \right)_{\lambda_0} + \frac{1}{2!}\epsilon^2 \left( \frac{d^2 x^i}{d \lambda^2} \right)_{\lambda_0} + \ldots \\
&= \left( 1 + \epsilon \frac{d}{d \lambda} + \frac{1}{2!} \epsilon^2 \frac{d^2}{d \lambda^2} + \ldots \right) x^i \Big |_{\lambda_0} \\
&= \exp \left( \epsilon \frac{d}{d \lambda} \right) x^i \Big |_{\lambda_0} \end{align}

The $\exp$ notation here should be understood as shorthand for a series of differential operators that act on $x^i$, evaluated at $\lambda_0$.
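The series can be summed explicitly in a simple case. The sketch below assumes the hypothetical vector field $v=(-y,x)$ on $R^2$, for which $\frac{dx}{d\lambda}=-y$ and $\frac{dy}{d\lambda}=x$ along integral curves. Because this field is linear in the coordinates, the $k$-th derivative of the coordinates is obtained by applying the same linear rule $k$ times; that shortcut is specific to this example. The function names are made up for illustration.

```python
import math

def d_dlambda(p):
    """Derivative of the coordinates along integral curves of v = (-y, x)."""
    x, y = p
    return (-y, x)

def exp_series(p, eps, order=20):
    """Truncated series sum_k eps^k / k! * (d^k x^i / dlambda^k) at the point p."""
    total = [0.0, 0.0]
    term = p  # k-th derivative of the coordinates, starting with k = 0
    for k in range(order + 1):
        coeff = eps ** k / math.factorial(k)
        total[0] += coeff * term[0]
        total[1] += coeff * term[1]
        term = d_dlambda(term)  # valid because the field is linear
    return tuple(total)

start = (1.0, 0.0)
eps = 0.5
moved = exp_series(start, eps)

# The exact integral curve through (1, 0) is (cos(lambda), sin(lambda)).
error = math.hypot(moved[0] - math.cos(eps), moved[1] - math.sin(eps))
print(error < 1e-12)
```

The series carries the starting point a parameter distance `eps` along the integral curve, which is exactly what the $\exp$ shorthand expresses.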

The Lie algebra

The Lie algebra is defined as the tangent space at the identity $T_eG$, of the manifold $G$. We will now see that each element of $T_eG$, i.e., each tangent vector at the identity defines a unique vector field $V$ on $G$.

We can then find the corresponding integral curve for this vector field $V$ that goes through $e$ and has tangent vector $V_e$ at the identity. By using the $\exp$ operator, as defined above, we can get to any point on this curve. $V$ is completely determined by $V_e$ and therefore the points of $G$ on this curve can be denoted by

\begin{equation} g_{v_e(t)} = \exp{(t V)}|_e \end{equation}

This is how we are able to get Lie group elements from Lie algebra elements. We have a one-to-one correspondence between tangent vectors and differential operators and are therefore able to use the $\exp$ notation to describe the corresponding Taylor series. This Taylor series connects different points on the same curve and therefore we are able to get from each element of the Lie algebra $T_eG$, elements of the Lie group $G$.
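For matrix Lie groups, the exponential becomes the ordinary matrix exponential series. This is a standard fact that is not derived in this post, so the following is only a sketch under that assumption. It takes the hypothetical example $G=SO(2)$ with the Lie algebra element $X$ below and checks that $\exp(tX)$ is the rotation by the angle $t$. The helpers `matmul` and `mat_exp` are illustrative names.

```python
import math

def matmul(A, B):
    """Product of two square matrices given as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_exp(X, t, order=30):
    """Truncated series exp(tX) = sum_k (tX)^k / k! for a square matrix X."""
    n = len(X)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]  # holds (tX)^k / k!, starting with k = 0
    for k in range(1, order + 1):
        term = matmul(term, X)
        term = [[t * entry / k for entry in row] for row in term]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result

# Hypothetical tangent vector at the identity of SO(2).
X = [[0.0, -1.0], [1.0, 0.0]]
t = 0.3

g = mat_exp(X, t)
rotation = [[math.cos(t), -math.sin(t)], [math.sin(t), math.cos(t)]]
max_diff = max(abs(g[i][j] - rotation[i][j]) for i in range(2) for j in range(2))
print(max_diff < 1e-12)
```

Varying `t` moves `g` along the curve through the identity, so a single Lie algebra element yields a whole one-parameter family of group elements.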

We will now take a look at how each tangent vector at the identity defines a unique vector field.

The Lie algebra and left-invariant vector fields

Given a Lie group $G$, natural maps (diffeomorphisms) are left- and right-translations. That is, any element $g$ of $G$ defines a map from a point $h$ of the manifold $G$ onto some new point $gh$ (left-translation) or $hg$ (right-translation).

\begin{equation} h \rightarrow  gh \mathrm{ \  \ \ \  left-translation} \qquad \mathrm{ or } \qquad h \rightarrow  hg \mathrm{ \  \ \ \ right-translation} \end{equation}


Left-translations along g map a neighborhood of e onto a neighborhood of g. In addition, curves through e are mapped onto curves through g, and therefore tangent vectors at e onto tangent vectors at g.


Here the group acts on itself.  Of course we are able to examine how a group acts on different manifolds, too, and this idea is investigated further by representation theory. For now it suffices to have a look at these maps of $G$ onto itself.

Everything that follows can be done with right-translations, but it is conventional to use left-translations.

As always, we denote the identity element of the group by $e$. The left-translation map along a particular $g$ maps any neighborhood of $e$ onto a neighborhood of $g$ (see the figure). In addition, this map sends curves into curves and therefore tangent vectors into tangent vectors.

The map between tangent vectors is commonly denoted $L_g : T_e \rightarrow T_g$. A vector field  $V$ is said to be left-invariant if $L_g$ maps $V$ at $e$ to $V$ at $g$ (and not to some different vector field at $g$, for example, some $W$), for all $g$. Because of the group composition law this is equivalent to saying that a vector field is left-invariant if $L_g$ maps $V$ at $h$ to $V$ at $gh$, for any $h$.

Every vector in $T_e$ defines a left-invariant vector field. The rule for getting a vector at each point of the manifold is given by $L_g$.  By multiplication with some $g\in G$, we get from any vector in $T_eG$ a vector in $T_gG$, and by definition this is what we call a left-invariant vector field.
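For a matrix group this rule can be made concrete. The sketch below is an illustration under assumptions, not part of the original argument: it uses the hypothetical example of the group of matrices $\begin{pmatrix} a & b \\ 0 & 1 \end{pmatrix}$ with $a>0$, where left-translation is matrix multiplication and the left-invariant field generated by $X \in T_eG$ takes the form $V(g)=gX$. It left-translates a curve through $h$ whose tangent vector is $V(h)$ and checks that the translated curve has tangent vector $V(gh)$.

```python
def matmul(A, B):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def V(g, X):
    """Left-invariant field generated by X in T_e (matrix group): V(g) = g X."""
    return matmul(g, X)

# Hypothetical tangent vector at the identity and two group elements.
X = [[1.0, 0.5], [0.0, 0.0]]
g = [[2.0, 1.0], [0.0, 1.0]]
h = [[3.0, -1.0], [0.0, 1.0]]

def curve_through_h(t):
    """Curve h (1 + tX) in the group, with tangent vector V(h) at t = 0."""
    step = [[1.0 + t * X[0][0], t * X[0][1]], [0.0, 1.0]]
    return matmul(h, step)

def translated_curve(t):
    """The same curve after left-translation by g; it passes through gh."""
    return matmul(g, curve_through_h(t))

# Tangent vector of the translated curve at t = 0 via a central difference.
eps = 1e-6
plus, minus = translated_curve(eps), translated_curve(-eps)
pushed = [[(plus[i][j] - minus[i][j]) / (2 * eps) for j in range(2)]
          for i in range(2)]

expected = V(matmul(g, h), X)
max_diff = max(abs(pushed[i][j] - expected[i][j])
               for i in range(2) for j in range(2))
print(max_diff < 1e-6)
```

The group was chosen to be non-commutative so that left- and right-translations genuinely differ; the check would be less informative for a group like $SO(2)$.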

Therefore, the set of left-invariant vector fields is isomorphic to $T_eG$, i.e., the Lie algebra of $G$. Mathematicians say the set of left-invariant vector fields is the Lie algebra of $G$. (The defining feature of a Lie algebra, closure under the Lie bracket, can easily be shown using coordinates. One needs to show that $L_g$ maps $[V,W]$ at $e$ into $[V,W]$ at $g$, which means the vector field $[V,W]$ is again left-invariant.)

The distinction is necessary because of the abstract definition of a Lie algebra. The set of all vector fields on $G$ forms a Lie algebra, too, but this is not the Lie algebra one normally has in mind when talking about the Lie algebra of a given Lie group.


In the last section we saw how each element of $T_eG$ uniquely defines a vector field on $G$. Each element of the Lie algebra can be identified with a left-invariant vector field.

To each left-invariant vector field, and therefore to each element of $T_eG$ belongs one unique integral curve through the identity element $e$ of $G$.  We can get to points on this curve, by using the exponential map, which is shorthand for the Taylor series

\begin{equation} g_{v_e(t)} = \exp{(t V)}|_e. \end{equation}

Starting with a different element of $T_eG$, we get a different integral curve and are able to reach different points of $G$. (This does not hold for every pair of elements. Take, for example, two elements $v$ and $2v$ of $T_eG$: the corresponding integral curves go through the same points of $G$, and only the “speed” (the parameter value at each point) differs. Nevertheless, for other pairs of elements we do get to different points.)
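Both statements can be illustrated with matrices. The sketch below assumes, as a standard fact not derived here, that for matrix groups the exponential is the matrix exponential series. It uses the hypothetical example of the group of matrices $\begin{pmatrix} a & b \\ 0 & 1 \end{pmatrix}$ with $a>0$ and two of its Lie algebra elements `X` and `Y`; all names are illustrative.

```python
import math

def matmul(A, B):
    """Product of two square matrices given as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_exp(X, t, order=30):
    """Truncated series exp(tX) = sum_k (tX)^k / k! for a square matrix X."""
    n = len(X)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, order + 1):
        term = matmul(term, X)
        term = [[t * entry / k for entry in row] for row in term]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result

X = [[1.0, 0.0], [0.0, 0.0]]      # hypothetical element v of T_e G
two_X = [[2.0, 0.0], [0.0, 0.0]]  # the element 2v
Y = [[0.0, 1.0], [0.0, 0.0]]      # an element that is not a multiple of v

t = 0.4
# v and 2v: same points, reached at different parameter values.
slow = mat_exp(X, 2 * t)
fast = mat_exp(two_X, t)
same_point = max(abs(slow[i][j] - fast[i][j])
                 for i in range(2) for j in range(2)) < 1e-12

# Y leads away from the curve of X: its upper-right entry becomes nonzero,
# while that entry stays zero everywhere along exp(sX).
along_Y = mat_exp(Y, t)
along_X = mat_exp(X, t)
print(same_point, along_Y[0][1], along_X[0][1])
```

Here `exp(sX)` only rescales the entry $a$, while `exp(tY)` only shifts the entry $b$, so apart from the identity the two curves have no points in common.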

This is how the Lie algebra $T_eG$ of a group $G$ is able to describe the group $G$.



