What’s so special about the adjoint representation of a Lie group?

A representation is a map that sends each abstract group element to a matrix that acts on a vector space (see this post). At the beginning this can be quite confusing: if we can study the representation of any group on any vector space, where should we start?

Luckily, there exists exactly one distinguished representation, commonly called the adjoint representation.

First, recall some technicalities: the modern definition of a Lie group $G$ is that it’s a (smooth) manifold whose elements satisfy the group axioms. Consequently, in the neighborhood of any point (group element) the Lie group looks like flat Euclidean space $R^n$, because that’s how a manifold is defined.

Now recall that the Lie algebra of a group is defined as the tangent space at the identity element, $T_eG$. Lie algebras are important because, to quote from John Stillwell’s brilliant book Naive Lie Theory:

“The miracle of Lie theory is that a curved object, a Lie group G, can be almost completely captured by a flat one, the Tangent space $T_eG$ of $G$ at the identity.”

I’ve written a long post about how and why this works.

It’s often a good idea to look at the Lie algebra of a group to study its properties, because working with a vector space, like $T_eG$, is in general easier than working with some curved space, like $G$. An important theorem, called Ado’s Theorem, tells us that every finite-dimensional Lie algebra (over the real or complex numbers) is isomorphic to a matrix Lie algebra. This tells us that the knowledge of ordinary linear algebra is enough to study Lie algebras, because every such Lie algebra can be viewed as a set of matrices.

A natural idea is now to have a look at the representation of the group $G$ on the only distinguished vector space that comes automatically with each Lie group: The representation on its own tangent vector space at the identity $T_eG$, i.e. the Lie algebra of the group!

In other words, in principle, we can look at representations of a given group on any vector space. But there is exactly one distinguished vector space that comes automatically with each group: Its own Lie algebra. This representation is the adjoint representation.

In more technical terms, the adjoint representation is a map $T$ from $G$ to the space of invertible linear operators on the tangent space at the identity $T_eG$ that satisfies $T(gh)=T(g)T(h)$. A map with this property is called a homomorphism. What does this representation look like?

A group can act on itself by left- and right-translation, given by the usual group multiplication $h \rightarrow gh$ and $h \rightarrow hg$. Neither action is a homomorphism: denoting, for example, left-translation by $L_g$, i.e. $L_g(h) = gh$, we have in general $L_g(hj)=ghj \neq gh gj= L_g(h) L_g(j)$. Instead, a homomorphism is given by a combination of left-translation by $g$ and right-translation by $g^{-1}$, commonly denoted by $I_g(h)=ghg^{-1}$ and called the adjoint action. This is a homomorphism because

$I_g(h) I_g(j) = g h g^{-1} g j g^{-1} = g hj g^{-1} = I_g(hj) \quad \checkmark$ (using the definition of the inverse, $g^{-1} g = e$).
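The difference between the two actions can be checked numerically. The following is only a minimal sketch in plain Python for $2\times 2$ matrices; the helper names and the example matrices `g`, `h`, `j` are hypothetical choices made for illustration.

```python
# Minimal 2x2 matrix helpers (plain Python, no external libraries).
def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def inverse(a):
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det],
            [-a[1][0] / det, a[0][0] / det]]

def close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol
               for i in range(2) for j in range(2))

def left_translation(g, h):
    # L_g(h) = g h
    return matmul(g, h)

def conjugation(g, h):
    # I_g(h) = g h g^{-1}
    return matmul(matmul(g, h), inverse(g))

# Arbitrary invertible example matrices (hypothetical test data).
g = [[2.0, 1.0], [1.0, 1.0]]
h = [[1.0, 3.0], [0.0, 1.0]]
j = [[1.0, 0.0], [2.0, 1.0]]
hj = matmul(h, j)

# Does the map respect the product, i.e. map(hj) == map(h) map(j)?
conj_is_hom = close(conjugation(g, hj),
                    matmul(conjugation(g, h), conjugation(g, j)))
left_is_hom = close(left_translation(g, hj),
                    matmul(left_translation(g, h), left_translation(g, j)))
print(conj_is_hom, left_is_hom)
```

For these matrices the conjugation check succeeds while the left-translation check fails, in line with the argument above.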

We have found a homomorphism that maps group elements to new group elements, $I_g : G \rightarrow G$, but this is not a representation because $G$ is not a vector space. Such a homomorphism to an arbitrary space (not necessarily a vector space) is called a realization. Nevertheless, we can use this homomorphism to derive a homomorphism to a vector space.

Firstly, take note that this homomorphism maps the identity to the identity for every group element $g$:

$I_g(e)= g e g^{-1}= g g^{-1} =e.$

If you haven’t already, you should now read this post, because in the following we will need notions like curves and tangent vectors that are explained there in detail.

The property $I_g(e)=e$ means that any curve through $e$ on the manifold $G$ is mapped by this homomorphism to another (not necessarily the same) curve through $e$. Therefore the map induced by $I_g$ on tangent vectors sends any tangent vector (of a curve on $G$) in $T_eG$ to another tangent vector in $T_eG$. In contrast, left- (and right-) translations $L_g$ map tangent vectors in $T_eG$ to tangent vectors in $T_gG$.


Left-translations $L_g$ map the identity $e$ to the point $g$. Therefore any curve through $e$ is mapped by $L_g$ to a curve through $g$.

The map induced by $I_g$, which sends any tangent vector in $T_eG$ (an element of the Lie algebra) to another tangent vector in $T_eG$, is called the adjoint transformation of $T_eG$ induced by $g$ and is denoted $Ad_g$. This induced map defines a representation of the group $G$ on $T_eG$, because $T_eG$ is a vector space and $Ad_g$ acts linearly on it.
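As a short worked check, assume a matrix Lie group, where the induced map (written $Ad_g$ here) takes the form $Ad_g(X) = gXg^{-1}$ for $X \in T_eG$. Under that assumption the homomorphism property follows in one line:

\begin{equation} Ad_{gh}(X) = (gh)\, X\, (gh)^{-1} = g \big( h X h^{-1} \big) g^{-1} = Ad_g\big( Ad_h (X) \big) . \end{equation}

Linearity is equally direct, since $g(aX+bY)g^{-1} = a\, gXg^{-1} + b\, gYg^{-1}$ for any numbers $a$, $b$.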

In the same spirit, we can consider Lie algebra representations (in contrast to Lie group representations). Analogously, this means that the elements of the Lie algebra act on some vector space as linear transformations. Again, a distinguished representation is given by the action of the Lie algebra elements on the distinguished vector space $T_eG$ (the Lie algebra itself).

The corresponding homomorphism can be derived from the homomorphism that defined the representation of the group $G$ on $T_eG$. The idea goes as follows:

Consider a curve $\gamma(t)$ on the manifold $G$ with $\gamma(0)=e \in G$ and tangent vector $\gamma'(0)=X \in T_eG$. Furthermore, let the curve go through some arbitrary element $g \in G$, so that $g=\gamma(t)$ for some $t$. Using this curve we can rewrite the adjoint action on a Lie algebra element $Y \in T_eG$ (written here for a matrix Lie group)

$Ad_g (Y)= \underbrace{g}_{\in G} \underbrace{Y}_{\in T_eG} \underbrace{g^{-1}}_{\in G}$ as $Ad_g (Y)= Ad_{\gamma(t)}(Y) = \gamma(t) Y \gamma(t)^{-1} $

We get the Lie algebra homomorphism we are searching for, called $ad$ (small $a$!), by differentiating this map at $t=0$, i.e. at the identity. Differentiating with the product rule yields

$ \frac{d}{dt} Ad_{\gamma(t)}(Y) \big |_{t=0}= \frac{d}{dt} \gamma(t) Y \gamma(t)^{-1} \big |_{t=0} = \gamma'(0) Y \gamma(0)^{-1} + \gamma(0) Y \frac{d}{dt} \gamma(t)^{-1} \big |_{t=0}$

For a matrix Lie group we can easily calculate $ \frac{d}{dt} \gamma(t)^{-1}$, because of the matrix identity

$\frac{d}{dt} A^{-1}(t)=-A^{-1}(t) \big(\frac{d}{dt} A(t)\big) \,A^{-1}(t)$.

This identity follows from

$\frac{d}{dt} \big(A(t)\,A^{-1}(t)\big)=\frac{d}{dt} \big(e \big)=0$. Using the product rule

$\big( \frac{d}{dt} A(t) \big) A^{-1}(t) + A(t) \big( \frac{d}{dt} A^{-1}(t) \big) =0$ and multiplying from the left with $A^{-1}(t)$ yields

$ \rightarrow A^{-1}(t) \big( \frac{d}{dt} A(t) \big) A^{-1}(t) = - A^{-1}(t) A(t) \big( \frac{d}{dt} A^{-1}(t) \big) = - \big( \frac{d}{dt} A^{-1}(t) \big) \quad \checkmark $
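The identity for the derivative of the inverse can be sanity-checked with a finite difference. This is a sketch only: the matrix curve `A(t)` below is an arbitrary, hypothetical example, and the helpers are plain Python written for $2\times 2$ matrices.

```python
def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def inverse(a):
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det],
            [-a[1][0] / det, a[0][0] / det]]

def A(t):
    # Hypothetical smooth curve of invertible matrices near t = 0.3.
    return [[1.0 + t, t], [t * t, 1.0]]

def dA(t):
    # Entry-wise derivative of A(t), computed by hand.
    return [[1.0, 1.0], [2.0 * t, 0.0]]

def d_inverse_numeric(t, eps=1e-6):
    # Central difference approximation of d/dt A^{-1}(t).
    plus, minus = inverse(A(t + eps)), inverse(A(t - eps))
    return [[(plus[i][j] - minus[i][j]) / (2 * eps) for j in range(2)]
            for i in range(2)]

def d_inverse_formula(t):
    # Right-hand side of the identity: -A^{-1} (dA/dt) A^{-1}.
    a_inv = inverse(A(t))
    prod = matmul(matmul(a_inv, dA(t)), a_inv)
    return [[-prod[i][j] for j in range(2)] for i in range(2)]

t0 = 0.3
numeric, formula = d_inverse_numeric(t0), d_inverse_formula(t0)
max_error = max(abs(numeric[i][j] - formula[i][j])
                for i in range(2) for j in range(2))
print(max_error)
```

The two results should agree up to the small error of the finite-difference approximation.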

Therefore we have

$ \frac{d}{dt} Ad_{\gamma(t)}(Y) \big |_{t=0}= \frac{d}{dt} \gamma(t) Y \gamma(t)^{-1} \big |_{t=0} $

$= \gamma'(0) Y \gamma(0)^{-1} + \gamma(0) Y (-\gamma(0)^{-1} \gamma'(0) \gamma(0)^{-1}) = XY e - e Y e X e $

$ = XY-YX$, where we used the definitions for the curve we made above ($\gamma(0)=\gamma(0)^{-1}=e$ and $\gamma'(0)=X$).
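The result $XY-YX$ can also be checked numerically. The sketch below makes several assumptions that are not part of the derivation: it picks the specific curve $\gamma(t)=\exp(tX)$ (which satisfies $\gamma(0)=e$ and $\gamma'(0)=X$), uses a truncated power series for the matrix exponential, and chooses hypothetical example matrices `X` and `Y`.

```python
def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def expm(a, terms=30):
    # Truncated power series exp(a) = sum_k a^k / k!.
    n = len(a)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = scale(1.0 / k, matmul(term, a))
        result = add(result, term)
    return result

# Hypothetical example matrices.
X = [[0.0, 1.0], [-1.0, 0.0]]
Y = [[1.0, 2.0], [3.0, -1.0]]

def ad_curve(t):
    # Ad_{gamma(t)}(Y) = gamma(t) Y gamma(t)^{-1} with gamma(t) = exp(tX).
    return matmul(matmul(expm(scale(t, X)), Y), expm(scale(-t, X)))

eps = 1e-5
derivative = scale(1.0 / (2 * eps), add(ad_curve(eps), scale(-1.0, ad_curve(-eps))))
commutator = add(matmul(X, Y), scale(-1.0, matmul(Y, X)))
max_error = max(abs(derivative[i][j] - commutator[i][j])
                for i in range(2) for j in range(2))
print(commutator, max_error)
```

The central difference at $t=0$ should reproduce the commutator up to a small numerical error.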

We see that the adjoint action of the Lie algebra on itself is given by the commutator. Thus, this is a way of seeing that the Lie bracket is the natural product of the tangent space $T_eG$, i.e. of the Lie algebra. The representation of the Lie algebra on itself is given by the adjoint action $ad_X(Y)=[X,Y]$, i.e. by the Lie bracket! (Recall that a representation is, by definition, a map.)

This is a way of figuring out the Lie bracket of a given group. If we aren’t considering matrix Lie groups (for which the Lie bracket is the commutator), the Lie bracket may be something different, and the steps we followed above let us calculate the corresponding Lie bracket. Nevertheless, we know from Ado’s theorem that every finite-dimensional Lie algebra can be considered as a matrix Lie algebra with the commutator as Lie bracket.

In addition, the Lie algebra representation we derived above is often used as a model for all Lie algebra representations. A Lie algebra homomorphism (and therefore representation) can be defined as a map, respecting the adjoint action! To be precise:

A Lie algebra representation $(\Phi,V)$ of a Lie algebra $T_eG$ on some vector space $V$ is defined as a linear map
$\Phi$ that assigns to each Lie algebra element $X\in T_eG$ a linear transformation $\Phi(X)$ of the vector space $V$,
satisfying $\Phi([X,Y])= [\Phi(X), \Phi(Y)]$.

Just as a Lie group representation is defined as a map (to the linear transformations of some vector space) respecting the group element combination rule $\Phi(gh)=\Phi(g)\Phi(h)$, a Lie algebra representation is a linear map (to the linear transformations of some vector space) respecting the natural product of the Lie algebra, i.e. the Lie bracket.
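As a worked check that the adjoint action itself fits this definition, write $ad_X(Z) = [X,Z]$ and take $\Phi = ad$. The only ingredient assumed here is the Jacobi identity $[X,[Y,Z]] + [Y,[Z,X]] + [Z,[X,Y]] = 0$, which every Lie bracket satisfies:

\begin{align} [ad_X, ad_Y](Z) &= [X,[Y,Z]] - [Y,[X,Z]] \\
&= [X,[Y,Z]] + [Y,[Z,X]] \\
&= -[Z,[X,Y]] \\
&= [[X,Y],Z] = ad_{[X,Y]}(Z) . \end{align}

So $ad_{[X,Y]} = [ad_X, ad_Y]$, which is exactly the condition $\Phi([X,Y])= [\Phi(X), \Phi(Y)]$.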

How is a Lie Algebra able to describe a Group?

If you understand the idea Lie Group= Manifold, you can easily understand one of the most curious facts of Lie theory:

The Lie algebra $\frak{g}$, which is defined as the tangent space at the identity  $T_eG$, is able to tell us almost everything about a given Lie group $G$.

The connection between Lie algebra elements and Lie group elements is established by the exponential map. We can understand this easily from the “naive”, historical approach to Lie theory that deals with infinitesimal transformations. The differential geometry perspective enables us to understand this connection in a more pictorial way.

There are four simple concepts one needs to know in order to understand the connection between Lie algebra elements and Lie group elements: Curves, Functions, Vector Fields, and Integral Curves.



Illustration of a curve on M, as a map from $R^1$ into M. The point $\lambda$ of $R^1$ is mapped to P in M.

One way to think about a curve on a manifold $M$ is as a continuous series of points in $M$. The definition we want to use is that a curve is a mapping from an open set of $R^1$ into $M$. Therefore, a curve associates with each point $\lambda$ of $R^1$ (which is just a real number) a point in $M$. The curve is said to be parametrized by $\lambda$. The point in $M$ is called the image point of $\lambda$. Two curves can be different even though they have the same image points in $M$, if they assign different parameter values to the image points.



Illustration of a function on a manifold as a map from M into $R^1$

Functions are in some sense inverse to curves. A function on a manifold $M$ assigns a real number (= an element of $R^1$) to each point of $M$. To make sense of the word differentiable when talking about functions on $M$, it helps to think about differentiability in terms of coordinates.

Remember that a manifold is defined as a set for which a map from the neighborhood of any point into $R^n$ exists. In other words, we can assign coordinate values to the points in the neighborhood of any point. Therefore we can combine (the inverse of) this coordinate map with the map (function) from $M$ to $R^1$ to get a map from $R^n$ to $R^1$. A function on $M$ is said to be differentiable if this combined map is differentiable in $R^n$.

In the abstract sense a function is a map $f(P)$, where $P$ denotes some point in $M$. By assigning coordinate values to this point, we have a map $f(x^1,x^2,…,x^n)$. If this function is differentiable in its arguments, the function is said to be differentiable. The coordinate map itself gives us a function for each coordinate. For example $x^2(P)$ is a function that maps each point $P$ to the corresponding coordinate value $x^2$.

Tangent Vectors:

Now we head to something that is quite hard to grasp when stumbling upon it for the first time: the modern, abstract definition of a tangent vector.

We start with a curve that passes through some point $P$ of $M$. This curve is described, using the coordinate map, by the equations $x^i(\lambda)$. A differentiable function $f=f(x^1,x^2,…,x^n)$, abbreviated $f=f(x^i)$, on $M$ assigns a value to each point of the curve. To be precise we call this function, $g(\lambda)$, because $f$ is a function of $(x^1,x^2,…,x^n)$ and $g$ is a function of $\lambda$.

\begin{equation} g(\lambda) = f(x^1(\lambda),x^2(\lambda),…,x^n(\lambda))= f(x^i(\lambda)) \end{equation}

We can differentiate, which yields, using the chain rule

\begin{equation} \frac{dg}{d\lambda} = \sum_i \frac{dx^i}{d \lambda}\frac{ \partial f }{ \partial x^i} .\end{equation}

Because this equation holds for any function $f$, we can write

\begin{equation} \frac{d}{d\lambda} = \sum_i \frac{dx^i}{d \lambda}\frac{ \partial }{ \partial x^i} .\end{equation}

(If this seems strange to you, you may check out chapter 8.1 of the Feynman Lectures about Quantum Mechanics, which has a great explanation regarding this “removing objects from an equation, if it holds for arbitrary objects of this kind”.)

If we were talking exclusively about Euclidean space, we would interpret $\frac{dx^i}{d \lambda}$ as the components of a vector tangent to the curve. The $dx^i$ are infinitesimal displacements along the curve, and dividing them by the infinitesimal number $d\lambda$ gives the rate of change in this direction. Dividing by a number only changes the scale, not the direction of the displacement.


Illustration of two different curves having the same tangent vector at some point P

Every curve has a unique tangent vector at any point. In contrast, a tangent vector can be the tangent vector for infinitely many curves. For example, take a look at the simple curve

\begin{equation} x^i(\lambda)=\lambda a^i \end{equation}
with constants $a^i$. The tangent vector at the point $P$, where $\lambda=0$, is $\frac{dx^i}{d \lambda} = a^i$. A slightly different curve
\begin{equation} x^i(\mu)=\mu^2 b^i+ \mu a^i \end{equation}
goes through the same point $P$ for $\mu=0$ and has the same tangent vector at $P$, $\frac{dx^i}{d \mu} \big|_{\mu=0} = a^i$.

Illustration of two curves having the same path but different parameterizations.

Furthermore, we can reparametrize the first curve by $x^i(\nu)=(\nu^3 + \nu) a^i$. This curve goes through the same points, but has different values $\nu$ associated with them. At $\nu=0$, this curve goes again through $P$ and the tangent vector is $\frac{dx^i}{d \nu} = a^i$.
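The claim that these three curves share one tangent vector at $P$ can be checked with a simple finite difference. This is a minimal sketch; the numerical values chosen for the constants $a^i$ and $b^i$ are hypothetical.

```python
def derivative_at_zero(curve, eps=1e-6):
    # Central difference approximation of dx^i/d(parameter) at parameter 0.
    plus, minus = curve(eps), curve(-eps)
    return [(p - m) / (2 * eps) for p, m in zip(plus, minus)]

a = [1.0, -2.0, 0.5]   # hypothetical constants a^i
b = [3.0, 0.0, 4.0]    # hypothetical constants b^i

def straight(lam):
    # x^i(lambda) = lambda a^i
    return [lam * ai for ai in a]

def bent(mu):
    # x^i(mu) = mu^2 b^i + mu a^i
    return [mu ** 2 * bi + mu * ai for ai, bi in zip(a, b)]

def reparametrized(nu):
    # x^i(nu) = (nu^3 + nu) a^i
    return [(nu ** 3 + nu) * ai for ai in a]

tangents = [derivative_at_zero(c) for c in (straight, bent, reparametrized)]
for tangent in tangents:
    print(tangent)
```

All three printed vectors should agree with `a` up to a tiny numerical error, even though the curves themselves differ.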

We conclude: every vector belongs to a whole equivalence class of curves at this point. More dramatically, one can say: the vector characterizes a whole equivalence class of curves at this point. The defining feature of this equivalence class is the vector.

This observation motivates the modern definition of a tangent vector. In Euclidean space we have no problem with talking about displacements. A vector in Euclidean space points from one point to another. On a manifold there is, in general, no distance relation between points. Therefore, we need a more sophisticated idea to be able to talk about vectors.

Consider a curve $x^i=x^i(\mu)$ through $P$. Analogous to the considerations above we have

\begin{equation} \frac{d}{d\mu} = \sum_i \frac{dx^i}{d \mu}\frac{ \partial }{ \partial x^i} .\end{equation}

Using two numbers $a$ and $b$, we can take a look at the linear combination

\begin{equation} a\frac{d}{d\lambda} + b \frac{d}{d\mu} = \sum_i \left( a \frac{dx^i}{d \lambda} + b \frac{dx^i}{d \mu} \right) \frac{ \partial } { \partial x^i} .\end{equation}

$ a \frac{dx^i}{d \lambda} + b \frac{dx^i}{d \mu} $ are components of a new vector that is tangent to some curves through $P$. Therefore, for one of these curves that is parametrized by $\Phi$, we can write at $P$

\begin{equation} \frac{d}{d\Phi}= \sum_i \left( a \frac{dx^i}{d \lambda} + b \frac{dx^i}{d \mu} \right) \frac{ \partial } { \partial x^i} .\end{equation}

and therefore

\begin{equation} \frac{d}{d\Phi}= a\frac{d}{d\lambda} + b \frac{d}{d\mu} . \end{equation}

At this point we can see that we have a one-to-one correspondence between the space of all tangent vectors at some point $P$ and the space of all derivatives along curves at $P$.

The directional derivatives, like $\frac{d}{d\lambda}$, behave like the usual vectors under addition and are therefore said to form a vector space. A basis for this vector space is always given by $\{ \frac{ \partial } { \partial x^i} \}$, because we have seen that any directional derivative can be written as a linear combination of the derivatives $\frac{ \partial } { \partial x^i} $, the derivatives along the coordinate lines. The components in this basis are $ \{ \frac{dx^i}{d \lambda} \}$.

The definition of a tangent vector that mathematicians prefer is that $\frac{d}{d\lambda}$ is the tangent vector to the curve $x^i(\lambda)$. This definition is advantageous, because it involves no displacements over finite separations and it makes no reference to coordinates. The definition makes sense because, as we can see above, if we pick some coordinate system, $\frac{d}{d\lambda}$ has the components $ \{ \frac{dx^i}{d \lambda} \}$, which are exactly the numbers one associates naively with an “arrow” tangent to a curve.


Illustration of the tangent space for some point on the sphere

The subtle point is that we are no longer allowed to add vectors located at different points of the manifold. The vectors are said to live in the tangent space of $M$ at $P$, denoted $T_P M$. Therefore vectors at different points live in different spaces and are not allowed to be added together. Using the sphere as an example, one may picture this tangent space as a plane lying tangent to the sphere at this point.

Vector Fields and Integral Curves

Another important term is vector field. A vector field is a rule that defines a vector at each point of the manifold. We have one tangent space at each point of the manifold, and a vector field is a rule that picks one vector from each tangent space. As already noted above, every curve has a tangent vector at each point. We can turn this a bit upside-down. Given a vector field, i.e. a tangent vector at each point of the manifold, we can find, if we pick one point $P$, precisely one curve going through $P$ whose tangent vector, at every point the curve passes through, is exactly the vector the vector field picks there. Such a curve is called an integral curve, because given a vector field $V^i(P)$, in coordinates $V^i(P)=v^i(x^j)$, the statement that these vectors are tangent vectors to curves is mathematically

\begin{equation} \frac{dx^i}{d \lambda} = v^i(x^j) .\end{equation}

This is just a first-order differential equation for $x^i(\lambda)$ that (for sufficiently smooth $v^i$) always has a unique solution in some neighborhood of $P$.
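To make the notion of an integral curve concrete, the sketch below integrates $\frac{dx^i}{d \lambda} = v^i(x^j)$ numerically. Everything specific in it is an assumption made for illustration: the manifold is simply the plane, the vector field is the rotation field $v = (-y, x)$, and the integrator is a standard fourth-order Runge-Kutta scheme. For this field the exact integral curve through $(1,0)$ is $(\cos\lambda, \sin\lambda)$.

```python
import math

def vector_field(point):
    # Hypothetical example field v(x, y) = (-y, x) on the plane.
    x, y = point
    return (-y, x)

def integral_curve(start, lam_end, steps=1000):
    # Follow the field from `start` up to parameter value lam_end (RK4 steps).
    h = lam_end / steps
    x, y = start
    for _ in range(steps):
        k1 = vector_field((x, y))
        k2 = vector_field((x + 0.5 * h * k1[0], y + 0.5 * h * k1[1]))
        k3 = vector_field((x + 0.5 * h * k2[0], y + 0.5 * h * k2[1]))
        k4 = vector_field((x + h * k3[0], y + h * k3[1]))
        x += h * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6
        y += h * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6
    return (x, y)

# A quarter turn: the exact curve (cos, sin) ends at (0, 1).
end = integral_curve((1.0, 0.0), math.pi / 2)
print(end)
```

Starting from a different point gives a different circle, which illustrates that the integral curve is fixed once the vector field and the starting point are chosen.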

The Exponential Function on a Manifold

We are now in the position to understand the tool that connects Lie algebra elements and Lie group elements: The exponential function. Given a vector field $ Y = \frac{d}{d \lambda}$, we can find the corresponding integral curve $x^i(\lambda)$ through some point. The coordinates of two points ($\lambda_0$ and $\lambda_0 + \epsilon$) on this curve are related by the Taylor series

\begin{align} x^i (\lambda_0 + \epsilon)
&= x^i(\lambda_0) + \epsilon \left( \frac{d x^i}{d \lambda} \right)_{\lambda_0} + \frac{1}{2!}\epsilon^2 \left( \frac{d^2 x^i}{d \lambda^2} \right)_{\lambda_0} + \dots \\
&= \left( 1 + \epsilon \frac{d}{d \lambda} + \frac{1}{2!} \epsilon^2 \frac{d^2}{d \lambda^2} + \dots \right) x^i \Big |_{\lambda_0} \\
&= \exp \left( \epsilon \frac{d}{d \lambda} \right) x^i \Big |_{\lambda_0} \end{align}

The $\exp$ notation here should be understood as a shorthand notation for series of differential operators that act on $x^i$, evaluated at $\lambda_0$.
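A small numerical illustration of this shorthand: the sketch below applies the truncated series to one hypothetical coordinate function, $x(\lambda)=\sin\lambda$, chosen only because all of its derivatives are known in closed form.

```python
import math

def nth_derivative_of_sin(n, lam):
    # d^n/d(lambda)^n sin(lambda) = sin(lambda + n * pi / 2)
    return math.sin(lam + n * math.pi / 2)

def exp_shift(lam0, eps, terms=25):
    # Truncated series for exp(eps d/d(lambda)) acting on sin, evaluated at lam0.
    return sum(eps ** n / math.factorial(n) * nth_derivative_of_sin(n, lam0)
               for n in range(terms))

lam0, eps = 0.4, 0.7
shifted = exp_shift(lam0, eps)
exact = math.sin(lam0 + eps)
print(shifted, exact)
```

The series evaluated at $\lambda_0$ reproduces the value of the function at $\lambda_0 + \epsilon$, which is what the $\exp$ notation expresses.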

The Lie algebra

The Lie algebra is defined as the tangent space at the identity $T_eG$, of the manifold $G$. We will now see that each element of $T_eG$, i.e., each tangent vector at the identity defines a unique vector field $V$ on $G$.

We can then find the corresponding integral curve for this vector field $V$ that goes through $e$ and has tangent vector $V_e$ at the identity. By using the $\exp$ operator, as defined above, we can get to any point on this curve. $V$ is completely determined by $V_e$ and therefore the points of $G$ on this curve can be denoted by

\begin{equation} g_{v_e(t)} = \exp{t V}|_e \end{equation}

This is how we are able to get Lie group elements from Lie algebra elements. We have a one-to-one correspondence between tangent vectors and differential operators and are therefore able to use the $\exp$ notation to describe the corresponding Taylor series. This Taylor series connects different points on the same curve and therefore we are able to get from each element of the Lie algebra $T_eG$, elements of the Lie group $G$.
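For matrix Lie groups the same idea becomes the matrix exponential. The sketch below is an illustration under stated assumptions: it uses the group of rotations of the plane, picks the hypothetical generator `X` shown in the code as the tangent vector at the identity, and approximates the exponential by a truncated power series.

```python
import math

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def expm(a, terms=30):
    # Truncated power series exp(a) = sum_k a^k / k!.
    n = len(a)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[x / k for x in row] for row in matmul(term, a)]
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    return result

# Tangent vector at the identity of the rotation group (assumed example).
X = [[0.0, -1.0], [1.0, 0.0]]

def group_element(t):
    # Point on the curve through the identity with tangent vector X.
    return expm([[t * x for x in row] for row in X])

t = 0.9
g = group_element(t)
rotation = [[math.cos(t), -math.sin(t)], [math.sin(t), math.cos(t)]]
max_error = max(abs(g[i][j] - rotation[i][j])
                for i in range(2) for j in range(2))
print(g, max_error)
```

Exponentiating the single Lie algebra element `X` with different values of `t` yields the rotation matrices, i.e. elements of the group.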

We will now take a look at how each tangent vector at the identity defines a unique vector field.

The Lie algebra and left-invariant vector fields

Given a Lie group $G$, the natural maps (diffeomorphisms) of $G$ onto itself are left- and right-translations. That is, any element $g$ of $G$ defines a map from a point $h$ of the manifold $G$ onto some new point $gh$ (left-translation) or $hg$ (right-translation).

\begin{equation} h \rightarrow  gh \mathrm{ \  \ \ \  left-translation} \qquad \mathrm{ or } \qquad h \rightarrow  hg \mathrm{ \  \ \ \ right-translation} \end{equation}


Left-translation along g maps a neighborhood of e onto a neighborhood of g. In addition, curves through e are mapped onto curves through g, and therefore tangent vectors at e onto tangent vectors at g.


Here the group acts on itself.  Of course we are able to examine how a group acts on different manifolds, too, and this idea is investigated further by representation theory. For now it suffices to have a look at these maps of $G$ onto itself.

Everything that follows can be done with right-translations, but it’s conventional to use left-translations.

As always, we denote the identity element of the group by $e$. The left-translation map along a particular $g$ maps any neighborhood of $e$ onto a neighborhood of $g$ (see the figure). In addition, this map sends curves to curves and therefore tangent vectors to tangent vectors.

The map between tangent vectors is commonly denoted $L_g : T_e \rightarrow T_g$. A vector field  $V$ is said to be left-invariant if $L_g$ maps $V$ at $e$ to $V$ at $g$ (and not to some different vector field at $g$, for example, some $W$), for all $g$. Because of the group composition law this is equivalent to saying that a vector field is left-invariant if $L_g$ maps $V$ at $h$ to $V$ at $gh$, for any $h$.

Every vector in $T_e$ defines a left-invariant vector field. The rule for getting a vector at each point of the manifold is given by $L_g$.  By multiplication with some $g\in G$, we get from any vector in $T_eG$ a vector in $T_gG$, and by definition this is what we call a left-invariant vector field.

Therefore, the set of left-invariant vector fields is isomorphic to $T_eG$, i.e., the Lie algebra of $G$. Mathematicians say the set of left-invariant vector fields is the Lie algebra of $G$. (The defining feature of a Lie algebra, closure under the Lie bracket, can be easily proved using coordinates: one needs to show that $L_g$ maps $[V,W]$ at $e$ into $[V,W]$ at $g$, which means the vector field $[V,W]$ is again left-invariant.)

The distinction is necessary because of the abstract definition of a Lie algebra. The set of all vector fields on $G$ forms a Lie algebra, too, but this is not the Lie algebra one normally has in mind when talking about the Lie algebra of a given Lie group.
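For a matrix Lie group, left-invariance can be written out explicitly. The following is a sketch under that assumption, with the notation $V_X$ introduced here only for illustration: a tangent vector $X \in T_eG$ defines the vector field

\begin{equation} V_X(g) = g X \in T_gG . \end{equation}

Left-translation by $h$ is then just matrix multiplication, so it maps the vector $V_X(g)$ at $g$ to

\begin{equation} h \, V_X(g) = h g X = V_X(hg) , \end{equation}

which is the vector that the same field assigns to the point $hg$. This is the left-invariance property, and the field is fixed completely by its value $V_X(e)=X$ at the identity.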


In the last section we saw how each element of $T_eG$ uniquely defines a vector field on $G$. Each element of the Lie algebra can be identified with a left-invariant vector field.

To each left-invariant vector field, and therefore to each element of $T_eG$ belongs one unique integral curve through the identity element $e$ of $G$.  We can get to points on this curve, by using the exponential map, which is shorthand for the Taylor series

\begin{equation} g_{v_e(t)} = \exp{(t V)}|_e. \end{equation}

Starting with a different element of $T_eG$, we get a different integral curve and are, in general, able to reach different points of $G$. (This is not always the case: two different elements of $T_eG$ would be $v$ and $2v$, and the corresponding integral curves go through the same points of $G$; only the “speed” (parameter value) differs. Nevertheless, for other pairs of elements we do get to different points.)

This is how the Lie algebra $T_eG$ of a group $G$ is able to describe the group $G$.




Short Introduction to and Motivation for Representation Theory

What may seem at first glance like just another mathematical gimmick of group theory is of incredible importance in physics. One can consider the Poincaré group (the set of all transformations that leave the speed of light constant) and use the framework of representation theory to construct the irreducible representations of this group. (The irreducible representations are the basic building blocks all other representations can be built from.) The straightforward examination of the irreducible representations of the Poincaré group gives us physicists the appropriate mathematical tools needed to describe nature at the most fundamental level.

  1. The lowest-dimensional representation is trivial and called the scalar or spin $0$ representation, because the objects (scalars) the group acts on in this representation are used to describe elementary particles of spin $0$. (In this representation the group doesn’t change the objects in question at all.)
  2. The next higher-dimensional representation is called spin $\frac{1}{2}$ or spinor representation, because the objects (spinors) the group acts on in this representation are used to describe elementary particles of spin $\frac{1}{2}$.
  3. The third representation is called spin $1$ or vector representation, because the objects (vectors) the group acts on in this representation are used to describe elementary particles of spin $1$.

But what exactly is a representation?

For theoretical considerations it’s often useful to regard any group as an abstract group. This means defining the group by its manifold structure and the group operation. For example, $SU(2)$ is the three-sphere $S^3$, the elements of the group are points of the manifold, and the rule associating a product point $ab$ with any two points $b$ and $a$ satisfies the usual group axioms. In physical applications one is more interested in what the group actually does, i.e. the group action.

An important idea is that one group can act on many different kinds of objects (this will make much more sense in a moment). This idea motivates the definition of a representation: a representation is a map that assigns to each group element $g$ of a group $G$ a linear transformation $T(g)$ of some vector space $V$ in such a way that the group properties are preserved:

  1. $T(e)=I$ (The identity element of the group transforms nothing at all)
  2. $T(g^{-1})=\big ( T(g) \big )^{-1} $ (Every inverse element is mapped to the corresponding inverse transformation)
  3. $T(g)\circ T(h) = T(gh)$ (The combination of transformations corresponding to $g$ and $h$ is the same as the transformation corresponding to the point $gh$)
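The three properties can be checked on a simple example. The sketch below is an illustration under an assumed setup: the group is taken to be the rotation angles under addition (identity $0$, inverse $-\theta$), and `T` is the hypothetical map sending an angle to the corresponding $2\times 2$ rotation matrix.

```python
import math

def T(theta):
    # Rotation matrix assigned to the group element theta (assumed example).
    return [[math.cos(theta), -math.sin(theta)],
            [math.sin(theta), math.cos(theta)]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol
               for i in range(2) for j in range(2))

identity = [[1.0, 0.0], [0.0, 1.0]]
a, b = 0.4, 1.1

identity_ok = close(T(0.0), identity)                # property 1
inverse_ok = close(matmul(T(a), T(-a)), identity)    # property 2
product_ok = close(matmul(T(a), T(b)), T(a + b))     # property 3
print(identity_ok, inverse_ok, product_ok)
```

Here the group operation is addition of angles, so property 3 reads $T(a)T(b)=T(a+b)$.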

This concept can be formulated more generally if one accepts arbitrary (not necessarily linear) transformations of an arbitrary (not necessarily vector) space. Such a map is called a realization.

In physics one is concerned most of the time with linear transformations of objects living in some vector space (for example, Hilbert space in Quantum Mechanics or Minkowski space for Special Relativity), therefore the concept of a representation is more relevant to physics than the general concept called realization.

A representation identifies with each point (abstract group element) of the group manifold (the abstract group) a linear transformation of a vector space. The framework of representation theory enables one to examine the group action on very different vector spaces.

One of the most important examples in physics is $SU(2)$. For example, one can examine how $SU(2)$ acts on the complex vector space of dimension two, $C^2$ (the action on $C^1$ is trivial). The objects living in this space are complex vectors of dimension two, so $SU(2)$ acts on these objects as $2\times2$ matrices. The matrices (= linear transformations) acting on $C^2$ are just the usual matrices one identifies with $SU(2)$. But we can also examine how $SU(2)$ acts on $C^3$. There is a well-defined framework for constructing such representations, and as a result $SU(2)$ acts on complex vectors of dimension three as $3\times 3$ matrices whose generators (a basis of the corresponding Lie algebra representation) are given by

\begin{equation} J_1 = \frac{1}{\sqrt{2}}  \begin{pmatrix}  0& 1 & 0 \\ 1&0 & 1 \\ 0 & 1 & 0  \end{pmatrix}  , \qquad  J_2 = \frac{1}{\sqrt{2}}  \begin{pmatrix}  0& -i & 0 \\ i&0 & -i \\ 0 & i & 0  \end{pmatrix}  , \qquad J_3 =  \begin{pmatrix}  1& 0 & 0 \\ 0&0 & 0 \\ 0 & 0 & -1  \end{pmatrix}  \end{equation}
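A quick way to test whether a set of $3\times 3$ matrices really represents the $SU(2)$ Lie algebra is to check the commutation relations $[J_a, J_b] = i \epsilon_{abc} J_c$. The sketch below does this in plain Python for the standard spin-$1$ matrices, in which $J_3$ is the diagonal matrix with entries $1, 0, -1$. The helper names are hypothetical, and the convention with a factor $i$ on the right-hand side is the usual physics one, assumed here.

```python
import math

s = 1 / math.sqrt(2)
J1 = [[0, s, 0], [s, 0, s], [0, s, 0]]
J2 = [[0, -1j * s, 0], [1j * s, 0, -1j * s], [0, 1j * s, 0]]
J3 = [[1, 0, 0], [0, 0, 0], [0, 0, -1]]  # standard spin-1 J_3

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def commutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(ab, ba)]

def times_i(a):
    return [[1j * x for x in row] for row in a]

def close(a, b, tol=1e-12):
    return all(abs(x - y) < tol
               for ra, rb in zip(a, b) for x, y in zip(ra, rb))

relations_hold = (close(commutator(J1, J2), times_i(J3))
                  and close(commutator(J2, J3), times_i(J1))
                  and close(commutator(J3, J1), times_i(J2)))
print(relations_hold)
```

A multiple of the identity matrix in place of $J_3$ would fail this test, because it commutes with every matrix.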


One can go on and inspect how $SU(2)$ acts on higher-dimensional vectors. This can be quite confusing, and maybe it’s better to call the group $S^3$ instead of $SU(2)$, because usually $SU(2)$ is defined as the set of complex $2\times 2$ (!!) matrices satisfying

$U^\dagger U = 1$ and $\det(U)=1$

and now we write $SU(2)$ as $3 \times 3$ matrices. Therefore one must always keep in mind that one means the abstract group, instead of the $2 \times 2 $ definition, when one talks about higher dimensional representation of $SU(2)$ or any other group.

Typically a group is defined in the first place by a representation. This enables one to study the group properties concretely. After this initial study it’s often more helpful to regard the group as an abstract group, because it’s possible to find other useful representations of the group.