Introduction

Quaternions are well known to people working in robotics and aerospace. The unit quaternions, specifically, provide a smooth representation of attitude using only four numbers, in contrast to rotation matrices, which require nine, and Euler angles, which are not smooth. In this post, I will explore the quaternions from a slightly different perspective: the quaternions (excluding zero) form a Lie group under multiplication. We will not restrict ourselves to the unit quaternions, instead exploring the full four-dimensional Lie group.

Basic group properties

Throughout this article, we will write a quaternion $q \in \mathbb{H}$ as \(q = (r, u),\) where $r \in \mathbb{R}$ and $u \in \mathbb{R}^3$ represent the real and imaginary parts of $q$, respectively, with $(r, u) \neq (0, 0_3)$ since we exclude zero. The product is defined by

\[\begin{aligned} q_1 * q_2 &= (r_1, u_1) * (r_2, u_2) \\ &= (r_1 r_2 - u_1^\top u_2, \; r_1 u_2 + r_2 u_1 + u_1 \times u_2). \end{aligned}\]

The inverse of a quaternion is defined by

\[q^{-1} = (r^2 + \vert u \vert^2)^{-1} (r, -u).\]

The group identity is given by $ e := (1, 0_3) $.
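
To make these operations concrete, here is a minimal numpy sketch, with a quaternion stored as a pair `(r, u)`; the names `qmul` and `qinv` are my own, not from any library:

```python
import numpy as np

def qmul(q1, q2):
    # Hamilton product of q1 = (r1, u1) and q2 = (r2, u2).
    r1, u1 = q1
    r2, u2 = q2
    return (r1 * r2 - u1 @ u2, r1 * u2 + r2 * u1 + np.cross(u1, u2))

def qinv(q):
    # Inverse: (r, -u) divided by the squared norm r^2 + |u|^2.
    r, u = q
    n2 = r**2 + u @ u
    return (r / n2, -u / n2)

qid = (1.0, np.zeros(3))  # the group identity e = (1, 0_3)

# Sanity check: q * q^{-1} recovers the identity.
q = (0.3, np.array([1.0, -2.0, 0.5]))
print(qmul(q, qinv(q)))  # ~ (1.0, array([0., 0., 0.]))
```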

The quaternions act on themselves by conjugation. Specifically,

\[\begin{aligned} \mathrm{Cn}_{q_1}(q_2) &= q_1 * q_2 * q_1^{-1} \\ % ----- &= (r_1^2 + \vert u_1 \vert^2)^{-1} (r_1 r_2 - u_1^\top u_2, \; r_1 u_2 + r_2 u_1 + u_1 \times u_2) * (r_1, -u_1)\\ % ----- &= (r_1^2 + \vert u_1 \vert^2)^{-1} ((r_1 r_2 - u_1^\top u_2) r_1 + (r_1 u_2 + r_2 u_1 + u_1 \times u_2)^\top u_1, \\ &\hspace{1cm} r_1(r_1 u_2 + r_2 u_1 + u_1 \times u_2) - (r_1 r_2 - u_1^\top u_2)u_1 -(r_1 u_2 + r_2 u_1 + u_1 \times u_2) \times u_1 )\\ % ----- &= (r_1^2 + \vert u_1 \vert^2)^{-1} (r_1^2 r_2 - r_1 u_1^\top u_2 + r_1 u_2^\top u_1 + r_2 u_1^\top u_1 , \\ &\hspace{1cm} r_1^2 u_2 + r_1 r_2 u_1 + r_1 u_1 \times u_2 - r_1 r_2 u_1 + u_1 u_1^\top u_2 - r_1 u_2 \times u_1 - (u_1 \times u_2) \times u_1 )\\ % ----- &= (r_1^2 + \vert u_1 \vert^2)^{-1} (r_1^2 r_2 + r_2 u_1^\top u_1 , \\ &\hspace{1cm} r_1^2 u_2 + 2 r_1 u_1 \times u_2 + u_1 u_1^\top u_2 + u_1 \times (u_1 \times u_2) )\\ % ----- &= (r_2 , \; (r_1^2 + \vert u_1 \vert^2)^{-1}(r_1^2 u_2 + \vert u_1 \vert^2 u_2 + 2 r_1 u_1 \times u_2 + 2 u_1 \times (u_1 \times u_2)) )\\ % ----- &= (r_2 , \; u_2 + (2 r_1 u_1 \times u_2 + 2 u_1 \times (u_1 \times u_2))(r_1^2 + \vert u_1 \vert^2)^{-1}). \end{aligned}\]

Let us denote $ \vert q_1 \vert = \sqrt{r_1^2 + \vert u_1 \vert^2}$ and define $u_1^\times \in \mathbb{R}^{3\times 3}$ to be the ‘skew’ matrix such that $u_1^\times u_2 = u_1 \times u_2$. Then we end up with a nice and simple formula:

\[\mathrm{Cn}_{q_1}(q_2) = (r_2 , \; (I_3 + (2 r_1 u_1^\times + 2 (u_1^\times)^2 )\vert q_1 \vert^{-2})u_2 ).\]
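
As a numerical sanity check, here is a sketch comparing this closed form against computing $q_1 * q_2 * q_1^{-1}$ directly (the helper names are my own):

```python
import numpy as np

def qmul(q1, q2):
    r1, u1 = q1
    r2, u2 = q2
    return (r1 * r2 - u1 @ u2, r1 * u2 + r2 * u1 + np.cross(u1, u2))

def qinv(q):
    r, u = q
    n2 = r**2 + u @ u
    return (r / n2, -u / n2)

def skew(v):
    # v^x such that (v^x) w = v x w.
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

def conjugation(q1, q2):
    # Closed form: real part unchanged, imaginary part multiplied by
    # I_3 + (2 r1 u1^x + 2 (u1^x)^2) / |q1|^2.
    r1, u1 = q1
    r2, u2 = q2
    n2 = r1**2 + u1 @ u1
    R = np.eye(3) + (2.0 * r1 * skew(u1) + 2.0 * skew(u1) @ skew(u1)) / n2
    return (r2, R @ u2)

q1 = (0.7, np.array([0.2, -1.0, 0.4]))
q2 = (-1.1, np.array([0.5, 0.3, -0.8]))
direct = qmul(qmul(q1, q2), qinv(q1))
closed = conjugation(q1, q2)
print(np.isclose(direct[0], closed[0]), np.allclose(direct[1], closed[1]))  # True True
```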

The Quaternion Lie Algebra

There are many ways to think of the Lie algebra of a given Lie group. Since our main interest is computation, we will choose the one that is easiest to compute with. The Lie algebra $\mathfrak{h}$ of $\mathbb{H}$ can be identified with the tangent space at the identity $e$. This definition is abstract, so we assign some coordinates. A Lie algebra element $w$ is described in coordinates as $w^\vee := (s, v) \in \mathbb{R}^4$, where the $\vee$ operator maps the abstract Lie algebra to coordinates in $\mathbb{R}^4$. Near the identity, quaternion group elements can be written as

\[q = e + t w, \quad (r,u) = (1+t s, t v),\]

for small values of $t \in \mathbb{R}$.

Exponential and Logarithm

The exponential relates the Lie algebra to the Lie group. We will use the ‘1-parameter subgroup’ definition here. Given a Lie algebra element $w^\vee = (s,v)$, the exponential $\exp(w)$ is defined as the solution to the initial value problem

\[q(0) = e, \quad \dot{q}(t) = q(t) * w,\]

at $t = 1$. Let us evaluate the differential equation to find

\[\begin{aligned} \dot{q} &= q * w \\ &:= \left. \frac{\mathrm{d}}{\mathrm{d} t} \right\vert_{t=0} (r,u) * (1+t s, t v) \\ &= \left. \frac{\mathrm{d}}{\mathrm{d} t} \right\vert_{t=0} (r (1+ts) - t u^\top v, \; r t v + (1+ts) u + t u \times v) \\ (\dot{r}, \dot{u}) &= (r s - u^\top v, \; r v + s u + u \times v). \end{aligned}\]

This ODE is not straightforward to solve, unless we realise that this system is, in fact, linear! Writing $q$ as a vector in $\mathbb{R}^4$, we have

\[\begin{aligned} \frac{\mathrm{d}}{\mathrm{d} t} \begin{pmatrix} r \\ u \end{pmatrix} &= \begin{pmatrix} s & - v^\top \\ v & s I_3 - v^\times \end{pmatrix} \begin{pmatrix} r \\ u \end{pmatrix} = \begin{pmatrix} 0 & - v^\top \\ v & - v^\times \end{pmatrix} \begin{pmatrix} r \\ u \end{pmatrix} + s \begin{pmatrix} r \\ u \end{pmatrix}. \end{aligned}\]

Since $s$ acts as a scaling factor, we can pull it out of the equation for now, and solve the problem without it. Specifically,

\[\frac{\mathrm{d}}{\mathrm{d} t} \left( e^{-t s} q \right) = e^{-t s} \dot{q} - s e^{-t s} q = A e^{-t s} q,\]

so if we solve the problem while ignoring $s$, we can add it back in at the end. To solve the ODE now, we only have to compute the matrix exponential

\[\begin{aligned} A &:= \begin{pmatrix} 0 & - v^\top \\ v & - v^\times \end{pmatrix} & \exp(A) &= \sum_{k=0}^\infty \frac{1}{k!} A^k. \end{aligned}\]

Examining the first nontrivial power of $A$ reveals that

\[\begin{aligned} A^2 &= \begin{pmatrix} 0 & - v^\top \\ v & - v^\times \end{pmatrix}^2 = \begin{pmatrix} -\vert v \vert^2 & 0_{1\times 3} \\ 0_{3\times 1} & (v^\times)^2 - v v^\top \end{pmatrix} = \begin{pmatrix} -\vert v \vert^2 & 0_{1\times 3} \\ 0_{3\times 1} & - \vert v \vert^2 I_3 \end{pmatrix} = - \vert v \vert^2 I_4. \end{aligned}\]

Substituting this into the exponential formula yields

\[\begin{aligned} \exp(A) &= \sum_{k=0}^\infty \frac{1}{k!} A^k \\ &= \sum_{k=0}^\infty \frac{1}{(2k)!} A^{2k} + \sum_{k=0}^\infty \frac{1}{(2k+1)!} A^{2k+1} \\ &= \sum_{k=0}^\infty \frac{1}{(2k)!} (- \vert v \vert^2 I_4)^k + \sum_{k=0}^\infty \frac{1}{(2k+1)!} (- \vert v \vert^2 I_4)^{k} A \\ &= \sum_{k=0}^\infty \frac{(-1)^k}{(2k)!} \vert v \vert^{2k} I_4+ \vert v \vert^{-1} \sum_{k=0}^\infty \frac{(-1)^k}{(2k+1)!} \vert v \vert^{2k+1} A \\ &= \cos(\vert v \vert) I_4 + \frac{\sin(\vert v \vert)}{\vert v \vert} A. \end{aligned}\]
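
Before assembling the final answer, this closed form is easy to sanity-check against a generic matrix exponential; here is a minimal sketch using scipy.linalg.expm:

```python
import numpy as np
from scipy.linalg import expm

v = np.array([0.3, -0.7, 1.2])
vx = np.array([[0.0, -v[2], v[1]],
               [v[2], 0.0, -v[0]],
               [-v[1], v[0], 0.0]])
# A has the block structure [[0, -v^T], [v, -v^x]].
A = np.block([[np.zeros((1, 1)), -v[None, :]],
              [v[:, None], -vx]])

nv = np.linalg.norm(v)
closed = np.cos(nv) * np.eye(4) + (np.sin(nv) / nv) * A
print(np.allclose(expm(A), closed))  # True
```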

Therefore, we have our final solution,

\[\begin{aligned} \exp(w) &= e^s \exp(A) \begin{pmatrix} 1 \\ 0_3 \end{pmatrix} \\ &= \left( \cos(\vert v \vert) I_4 + \frac{\sin(\vert v \vert)}{\vert v \vert} A \right) \begin{pmatrix} e^s \\ 0_3 \end{pmatrix} \\ &= \cos(\vert v \vert)\begin{pmatrix} e^s \\ 0_3 \end{pmatrix} + \frac{\sin(\vert v \vert)}{\vert v \vert} \begin{pmatrix} 0 & - v^\top \\ v & - v^\times \end{pmatrix} \begin{pmatrix} e^s \\ 0_3 \end{pmatrix} \\ &= \begin{pmatrix} e^s \cos(\vert v \vert) \\ e^s \sin(\vert v \vert) \frac{v}{\vert v \vert} \end{pmatrix}. \end{aligned}\]

Note that, if $\vert v \vert = 0$, then the whole computation simplifies and the solution is simply $\exp(w) = ( e^s, 0_3)$.
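
Here is a sketch of the exponential that handles the $\vert v \vert = 0$ case without an explicit branch, using numpy's sinc:

```python
import numpy as np

def qexp(w):
    # exp(w) = (e^s cos|v|, e^s sin|v| v / |v|) for w = (s, v).
    s, v = w
    nv = np.linalg.norm(v)
    # np.sinc(x) = sin(pi x) / (pi x), so np.sinc(nv / pi) = sin(nv) / nv,
    # which takes the correct limit value 1 as |v| -> 0.
    return (np.exp(s) * np.cos(nv), np.exp(s) * np.sinc(nv / np.pi) * v)

print(qexp((0.5, np.zeros(3))))                      # (e^0.5, [0, 0, 0])
print(qexp((0.0, np.array([np.pi / 2, 0.0, 0.0]))))  # ~ (0, [1, 0, 0])
```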

The logarithm is found by inverting this formula, although there may be multiple solutions for a given $q \in \mathbb{H}$. Suppose that $q = \exp(w)$. Then we wish to determine the components of $w = (s, v)$ in terms of $q = (r, u)$. We have

\[\begin{aligned} q &= \exp(w), \\ (r, u) &= (e^s \cos(\vert v \vert), e^s \sin(\vert v \vert) \frac{v}{\vert v \vert}). \end{aligned}\]

Immediately, we see that $e^s = r / \cos(\vert v \vert)$. Substituting this into the $u$-component,

\[\begin{aligned} u &= r \tan(\vert v \vert) \frac{v}{\vert v \vert}, \\ \vert u \vert \frac{u}{\vert u \vert} &= r \tan(\vert v \vert) \frac{v}{\vert v \vert}, \\ r^{-1} \vert u \vert \frac{u}{\vert u \vert} &= \tan(\vert v \vert) \frac{v}{\vert v \vert}, \\ v &= \frac{\arctan(r^{-1} \vert u \vert)}{\vert u \vert} u. \end{aligned}\]

Rather than substitute this back into the formula for $e^s$, we observe that the norm of both sides of the original equation satisfies

\[\begin{aligned} \vert q \vert &= \vert \exp(w) \vert, \\ \sqrt{r^2 + \vert u \vert^2} &= e^{s} , \\ s &= \ln(\sqrt{r^2 + \vert u \vert^2}). \end{aligned}\]

In summary, we have thus found the logarithm to be

\[\begin{aligned} \log(q) &= \left( \frac{1}{2} \ln(r^2 + \vert u \vert^2), \; \frac{\arctan(r^{-1} \vert u \vert)}{\vert u \vert} u \right). \end{aligned}\]

Similarly to the exponential formula, we should note that, if $\vert u \vert = 0$ (and $r > 0$), the formula simplifies to $\log(q) = (\ln(r), 0_3)$. A negative real quaternion has no preferred direction for $v$, so it is not covered by this principal formula.
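
Here is a sketch of the logarithm; as an implementation choice (my assumption, not forced by the derivation), it uses atan2 in place of $\arctan(r^{-1} \vert u \vert)$, which picks the principal branch and also covers $r \leq 0$:

```python
import numpy as np

def qexp(w):
    s, v = w
    nv = np.linalg.norm(v)
    return (np.exp(s) * np.cos(nv), np.exp(s) * np.sinc(nv / np.pi) * v)

def qlog(q):
    # Principal logarithm. Using atan2(|u|, r) instead of arctan(|u| / r)
    # also covers r <= 0, except the negative reals (u = 0, r < 0),
    # where the direction of v is undetermined.
    r, u = q
    nu = np.linalg.norm(u)
    s = 0.5 * np.log(r**2 + nu**2)
    if nu == 0.0:
        return (s, np.zeros(3))
    return (s, (np.arctan2(nu, r) / nu) * u)

# Round trip: qlog(qexp(w)) recovers w whenever |v| < pi.
w = (0.3, np.array([0.4, -0.2, 0.9]))
s, v = qlog(qexp(w))
print(np.isclose(s, w[0]), np.allclose(v, w[1]))  # True True
```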

Adjoint Operators and Lie Bracket

The big and little Adjoint operators are another important aspect of the quaternion Lie algebra. The ‘big’ Adjoint operator $\mathrm{Ad} : \mathbb{H} \times \mathfrak{h} \to \mathfrak{h}$ is defined by

\[\begin{aligned} \mathrm{Ad}_q (w) &= \left. \frac{\mathrm{d}}{\mathrm{d} t} \right\vert_{t=0} \mathrm{Cn}_q(e + t w) \\ &= \left. \frac{\mathrm{d}}{\mathrm{d} t} \right\vert_{t=0} (1+ t s , \; (I_3 + (2 r u^\times + 2 (u^\times)^2 )\vert q \vert^{-2})(t v) ) \\ &= (s , \; (I_3 + (2 r u^\times + 2 (u^\times)^2 )\vert q \vert^{-2})v ). \end{aligned}\]

In matrix form,

\[\begin{aligned} \mathrm{Ad}_q \simeq \begin{pmatrix} 1 & 0_{1\times 3} \\ 0_{3\times 1} & I_3 + (2 r u^\times + 2 (u^\times)^2 )\vert q \vert^{-2} \end{pmatrix}. \end{aligned}\]

The ‘little’ adjoint operator $\mathrm{ad} : \mathfrak{h} \times \mathfrak{h} \to \mathfrak{h}$ is defined as the derivative of the big Adjoint operator,

\[\begin{aligned} \mathrm{ad}_{w_1} (w_2) &= \left. \frac{\mathrm{d}}{\mathrm{d} t} \right\vert_{t=0} \mathrm{Ad}_{e+ t w_1}w_2 \\ &= \left. \frac{\mathrm{d}}{\mathrm{d} t} \right\vert_{t=0} (s_2 , \; (I_3 + (2 (1+t s_1) (t v_1)^\times + 2 ((t v_1)^\times)^2 )\vert e + t w_1 \vert^{-2})v_2 ) \\ &= (0 , \; 2 v_1^\times v_2 ). \end{aligned}\]

Once more, in matrix form,

\[\begin{aligned} \mathrm{ad}_w \simeq \begin{pmatrix} 0 & 0_{1\times 3} \\ 0_{3\times 1} & 2v^\times \end{pmatrix}. \end{aligned}\]

The Lie bracket is given by the little adjoint operator, in the sense that

\[\begin{aligned} \left[w_1, w_2\right] := \mathrm{ad}_{w_1}(w_2) = (0 , \; 2 v_1^\times v_2 ). \end{aligned}\]
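
These formulas can be cross-checked against the standard Lie-theoretic identity $\mathrm{Ad}_{\exp(w)} = \mathrm{expm}(\mathrm{ad}_w)$; a minimal numpy sketch (function names are my own):

```python
import numpy as np
from scipy.linalg import expm

def skew(v):
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

def Ad(q):
    # Big Adjoint of q = (r, u) as a 4x4 matrix acting on (s, v).
    r, u = q
    M = np.eye(4)
    M[1:, 1:] += (2.0 * r * skew(u) + 2.0 * skew(u) @ skew(u)) / (r**2 + u @ u)
    return M

def ad(w):
    # Little adjoint of w = (s, v): block diag(0, 2 v^x).
    M = np.zeros((4, 4))
    M[1:, 1:] = 2.0 * skew(w[1])
    return M

def qexp(w):
    s, v = w
    nv = np.linalg.norm(v)
    return (np.exp(s) * np.cos(nv), np.exp(s) * np.sinc(nv / np.pi) * v)

w = (0.2, np.array([0.3, -0.5, 0.1]))
print(np.allclose(Ad(qexp(w)), expm(ad(w))))  # True
```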

Matrix Representation

The final topic of interest for computations is the matrix representation of $\mathbb{H}$. Matrix representations are rarely unique, but some are much nicer to work with than others. The matrix representation we consider is $\rho : \mathbb{H} \to \mathbf{GL}(4)$, given by

\[\begin{aligned} \rho(q) := \begin{pmatrix} r& u_1& u_2& u_3 \\ -u_1& r& -u_3& u_2 \\ -u_2& u_3& r& -u_1 \\ -u_3&-u_2&u_1&r \\ \end{pmatrix}. \end{aligned}\]

The matrix representation of the Lie algebra $\mathfrak{h}$ is essentially the same: $\mathrm{d}\rho(w)$ is the same matrix with $(s, v)$ in place of $(r, u)$. Verifying that these are indeed representations is a messy and time-consuming computation. However, working out a matrix representation is very rewarding in that it provides a way to check all the other computations we have done. Specifically, we can check identities like the inverse $\rho(q)^{-1} = \rho(q^{-1})$, the exponential $\mathrm{expm}(\mathrm{d}\rho(w)) = \rho(\exp(w))$, and the Adjoint operator $\mathrm{d}\rho(\mathrm{Ad}_q(w)) = \rho(q) \, \mathrm{d}\rho(w) \, \rho(q)^{-1}$.
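
Here is a sketch of those checks in numpy/scipy, under the conventions above (helper names are my own):

```python
import numpy as np
from scipy.linalg import expm

def rho(q):
    # 4x4 matrix representation; d(rho)(w) is the same matrix
    # with (s, v) in place of (r, u).
    r, u = q
    return np.array([[r, u[0], u[1], u[2]],
                     [-u[0], r, -u[2], u[1]],
                     [-u[1], u[2], r, -u[0]],
                     [-u[2], -u[1], u[0], r]])

def qmul(q1, q2):
    r1, u1 = q1
    r2, u2 = q2
    return (r1 * r2 - u1 @ u2, r1 * u2 + r2 * u1 + np.cross(u1, u2))

def qinv(q):
    r, u = q
    n2 = r**2 + u @ u
    return (r / n2, -u / n2)

def qexp(w):
    s, v = w
    nv = np.linalg.norm(v)
    return (np.exp(s) * np.cos(nv), np.exp(s) * np.sinc(nv / np.pi) * v)

q1 = (0.7, np.array([0.2, -1.0, 0.4]))
q2 = (-1.1, np.array([0.5, 0.3, -0.8]))
w = (0.2, np.array([0.3, -0.5, 0.1]))

print(np.allclose(rho(q1) @ rho(q2), rho(qmul(q1, q2))))   # homomorphism
print(np.allclose(np.linalg.inv(rho(q1)), rho(qinv(q1))))  # inverse
print(np.allclose(expm(rho(w)), rho(qexp(w))))             # exponential
```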

Summary

I decided to write this post when I needed these formulas for the $n$th time, and I realised that deriving them every time I needed them was taking too long. I hope they are helpful to anyone else who reads them, and please let me know if you spot any mistakes!