<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" ><generator uri="https://jekyllrb.com/" version="3.10.0">Jekyll</generator><link href="/feed.xml" rel="self" type="application/atom+xml" /><link href="/" rel="alternate" type="text/html" /><updated>2026-03-11T03:51:10+00:00</updated><id>/feed.xml</id><title type="html">Pieter van Goor</title><subtitle>This is my personal website I use for hosting news and updates regarding my research. If you want to contact me, please reach out by email or any of my linked social media.</subtitle><entry><title type="html">PhD Position in Geometric Estimation and Control for Aerial Robotics at the University of Sydney</title><link href="/group/2026/03/10/PhD_position_open.html" rel="alternate" type="text/html" title="PhD Position in Geometric Estimation and Control for Aerial Robotics at the University of Sydney" /><published>2026-03-10T00:00:00+00:00</published><updated>2026-03-10T00:00:00+00:00</updated><id>/group/2026/03/10/PhD_position_open</id><content type="html" xml:base="/group/2026/03/10/PhD_position_open.html"><![CDATA[<!-- https://talk.jekyllrb.com/t/jekyll-and-mathjax/5514 -->
<script type="text/x-mathjax-config">
    MathJax.Hub.Config({
        jax: ["input/TeX","input/MathML","output/SVG", "output/CommonHTML"],
    extensions: ["tex2jax.js","mml2jax.js","MathMenu.js","MathZoom.js", "CHTML-preview.js"],
    TeX: {
      extensions: ["AMSmath.js","AMSsymbols.js","noErrors.js","noUndefined.js"]
    },
      tex2jax: {
          inlineMath: [ ['$','$'], ["\\(","\\)"] ],
          displayMath: [ ['$$','$$'], ["\\[","\\]"] ],
          processEscapes: true,
          processEnvironments: true
        },
        "HTML-CSS": { availableFonts: ["TeX"] }
      });
</script>

<script id="MathJax-script" async="" src="https://cdn.jsdelivr.net/npm/mathjax@3/es5/tex-chtml.js"></script>

<h2 id="position-overview">Position Overview</h2>

<p>A fully funded PhD position is open at the Australian Centre for Robotics (ACFR) at the University of Sydney! The topic is on geometric techniques in the estimation and control of aerial robotic systems and will be supervised by Dr Pieter van Goor.</p>

<p>Aerial robotic system hardware has become increasingly capable of high precision tasks involving interaction with their environment, including delivery, maintenance, and (physical) inspection. However, outside of laboratory environments, many of these tasks require human control or intervention due to the sensitivity of automatic control algorithms to the accuracy of on-board measurements.</p>

<p>The aim of this project is to develop new estimation and control algorithms that are faster and more robust than the state of the art, and that can unlock the hardware’s full capability in real-world outdoor environments. Emerging theoretical techniques that exploit the mathematical symmetry of aerial robot dynamics provide a solution, but they are yet to be implemented in real-world systems. Depending on the candidate’s skills and interests, there are opportunities to explore:</p>

<ul>
  <li>Hardware implementations and testing of new control and estimation algorithms.</li>
  <li>Theoretical development of symmetry-based approaches.</li>
  <li>High-speed estimation using novel sensors (e.g. cameras, LIDAR, UWB range, etc.).</li>
</ul>

<p>The Australian Centre for Robotics (ACFR) is a part of the School of Aerospace, Mechanical, and Mechatronics Engineering (AMME). The ACFR offers specialised labs and facilities, robotic platforms, and robotic field labs on-campus and in nearby off-campus sites. You will have access to mechanical and electronics workshops and a pool of technical staff to help realise your research ambitions. The University of Sydney offers a rich academic setting in a world-class city, and the ACFR has strong ties to a network of nearby and international academic and industrial collaborators.</p>

<h2 id="desirable-qualities-in-a-candidate">Desirable Qualities in a Candidate</h2>

<ul>
  <li>Bachelor’s degree with honours or master’s degree in a relevant discipline, or evidence that a relevant degree will be obtained within the next six months.</li>
  <li>Interest in robotics and control.</li>
  <li>Strong background in mathematics and programming.</li>
  <li>Excellent communication and interpersonal skills.</li>
  <li>Hands-on experience with (aerial) robotic platforms, ROS, Python, C++.</li>
  <li>Creativity, curiosity, rigour and passion.</li>
</ul>

<h2 id="how-to-apply">How to Apply</h2>

<p>Your application must include:</p>

<ul>
  <li>A cover letter (at most 1 page in length) stating your prior experience and research interest in relation to the topic of geometric methods for aerial robotics.</li>
  <li>A curriculum vitae with three references and a GPA.</li>
  <li>A transcript of bachelor’s and master’s degree grades (if applicable).</li>
</ul>

<p>Please send an email to <a href="mailto:pieter.vangoor@sydney.edu.au">pieter.vangoor@sydney.edu.au</a> with the title [Aerial Robotics PhD Application] to submit your application or if you would like further information. The deadline for applications is Sunday 29 March 2026.</p>]]></content><author><name></name></author><category term="Group" /><summary type="html"><![CDATA[]]></summary></entry><entry><title type="html">The Special Euclidean Group SE(3)</title><link href="/mathematics/2026/01/27/special_euclidean_se3.html" rel="alternate" type="text/html" title="The Special Euclidean Group SE(3)" /><published>2026-01-27T00:00:00+00:00</published><updated>2026-01-27T00:00:00+00:00</updated><id>/mathematics/2026/01/27/special_euclidean_se3</id><content type="html" xml:base="/mathematics/2026/01/27/special_euclidean_se3.html"><![CDATA[<!-- https://talk.jekyllrb.com/t/jekyll-and-mathjax/5514 -->
<script type="text/x-mathjax-config">
    MathJax.Hub.Config({
        jax: ["input/TeX","input/MathML","output/SVG", "output/CommonHTML"],
    extensions: ["tex2jax.js","mml2jax.js","MathMenu.js","MathZoom.js", "CHTML-preview.js"],
    TeX: {
      extensions: ["AMSmath.js","AMSsymbols.js","noErrors.js","noUndefined.js"]
    },
      tex2jax: {
          inlineMath: [ ['$','$'], ["\\(","\\)"] ],
          displayMath: [ ['$$','$$'], ["\\[","\\]"] ],
          processEscapes: true,
          processEnvironments: true
        },
        "HTML-CSS": { availableFonts: ["TeX"] }
      });
</script>

<script id="MathJax-script" async="" src="https://cdn.jsdelivr.net/npm/mathjax@3/es5/tex-chtml.js"></script>

<h3 id="introduction">Introduction</h3>

<p>The special Euclidean group \(\mathbf{SE}(3)\) describes the symmetry of rigid bodies and frames of reference.
Aside from describing their symmetry, it can also be used to represent the position and orientation of a frame in one handy matrix.
While the elements of this group can be viewed as consisting of a rotation matrix and a position vector, most people prefer to use the \(4\times 4\) matrix form, sometimes called a <em>homogeneous matrix</em>.
Like my other articles on Lie groups, my goal here is to provide derivations of some of the most useful and important formulas and identities that come up for \(\mathbf{SE}(3)\).
That said, the previous <a href="/mathematics/2025/02/15/special_orthogonal_so3.html">post on the special orthogonal group</a> should be read before reading this one, since I will draw on a number of the results therein.
For an excellent explanation of the relationships between frames of reference, their velocities, and the special Euclidean group, I recommend this <a href="https://shiraz-k.com/posts/robotics/">blog post by Shiraz Khan</a>.</p>

<p>The special Euclidean group is defined by</p>

\[\begin{aligned}
    \mathbf{SE}(3) = \left\{
        P = \begin{pmatrix}
            R &amp; x \\ 0_{1\times 3} &amp; 1
        \end{pmatrix} \in \mathbb{R}^{4\times 4}
        \; \middle| \;
        R \in \mathbf{SO}(3), \; x \in \mathbb{R}^3
    \right\}.
\end{aligned}\]

<p>On first impressions, this looks as simple as just combining \(\mathbf{SO}(3)\) with \(\mathbb{R}^3\), but things get interesting when we look at the multiplication of two \(\mathbf{SE}(3)\) elements.
We will typically use the notation \(P \in \mathbf{SE}(3)\) to refer to the whole matrix, and the notations \(R \in \mathbf{SO}(3), x \in \mathbb{R}^3\) to refer to the rotation and translation components, respectively.
Given \(P_1, P_2 \in \mathbf{SE}(3)\), their product is given by</p>

\[\begin{aligned}
    P_1 P_2 &amp;=
        \begin{pmatrix} R_1 &amp; x_1 \\ 0_{1\times 3} &amp; 1 \end{pmatrix}
        \begin{pmatrix} R_2 &amp; x_2 \\ 0_{1\times 3} &amp; 1 \end{pmatrix}
        = \begin{pmatrix} R_1 R_2 &amp; x_1 + R_1 x_2 \\ 0_{1\times 3} &amp; 1 \end{pmatrix}.
\end{aligned}\]

<p>Note that the rotation components \(R_1,R_2\) are simply multiplied together, and their product is not affected by the translation component.
On the other hand, the translation components \(x_1,x_2\) are not simply summed together; rather, the second translation vector \(x_2\) is rotated by \(R_1\) before being added to \(x_1\).
This property, that one component of the product is affected by the other component, but not vice versa, is why \(\mathbf{SE}(3)\) is called a <strong>semi-direct</strong> product of \(\mathbf{SO}(3)\) and \(\mathbb{R}^3\), which is sometimes written as \(\mathbf{SE}(3) = \mathbf{SO}(3) \ltimes \mathbb{R}^3\).</p>

<p>The group properties are easily verified.
The identity matrix \(I_4\) lies in \(\mathbf{SE}(3)\), with its rotation given by \(I_3\) and its translation vector given by \(0_{3\times 1}\).
For matrix inversion, examining the product leads to the conclusion that</p>

\[\begin{aligned}
    P^{-1}
    = \begin{pmatrix} R &amp; x \\ 0_{1\times 3} &amp; 1 \end{pmatrix}^{-1}
    = \begin{pmatrix} R^\top &amp; - R^\top x \\ 0_{1\times 3} &amp; 1 \end{pmatrix}.
\end{aligned}\]

<p>This formula yields another element of \(\mathbf{SE}(3)\), and we should note that it is entirely expressed in terms of sums and products of the original matrix components.
This is useful, since it is usually much easier and computationally cheaper to take sums and products of vectors and matrices than it is to invert them.
All of this shows that \(\mathbf{SE}(3)\) is indeed a group.</p>
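
<p>As a quick numerical sanity check of the product and inverse formulas above, here is a minimal NumPy sketch. The helper names <code>se3_matrix</code>, <code>se3_compose</code>, <code>se3_inverse</code> and <code>rot_z</code> are hypothetical names chosen for illustration and are not taken from any particular library.</p>

```python
import numpy as np


def se3_matrix(R, x):
    """Assemble the 4x4 homogeneous matrix from a rotation R and a translation x."""
    P = np.eye(4)
    P[:3, :3] = R
    P[:3, 3] = x
    return P


def se3_compose(R1, x1, R2, x2):
    """Product of two SE(3) elements, written in terms of their components."""
    return R1 @ R2, x1 + R1 @ x2


def se3_inverse(R, x):
    """Inverse of an SE(3) element using only a transpose and a product."""
    return R.T, -(R.T @ x)


def rot_z(angle):
    """Rotation about the z-axis, used here only to build test data."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])


R1, x1 = rot_z(0.3), np.array([1.0, 2.0, 3.0])
R2, x2 = rot_z(-1.1), np.array([-0.5, 0.0, 4.0])

# The component formula agrees with the plain 4x4 matrix product.
R12, x12 = se3_compose(R1, x1, R2, x2)
product_ok = np.allclose(se3_matrix(R12, x12), se3_matrix(R1, x1) @ se3_matrix(R2, x2))

# The closed-form inverse agrees with a general-purpose matrix inverse.
inverse_ok = np.allclose(se3_matrix(*se3_inverse(R1, x1)), np.linalg.inv(se3_matrix(R1, x1)))
print(product_ok, inverse_ok)
```

<p>Working with the pair \((R, x)\) rather than the full matrix is a design choice made here for readability; both representations carry the same information.</p>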

<h3 id="lie-algebra">Lie algebra</h3>

<p>The Lie algebra \(\mathfrak{se}(3)\) is obtained by differentiating the defining conditions of the Lie group \(R \in \mathbf{SO}(3)\) and \(x \in \mathbb{R}^3\) near the identity,</p>

\[\begin{aligned}
    \mathfrak{se}(3)
    =  \left\{
        \begin{pmatrix} \omega^\times &amp; v \\ 0_{1\times 3} &amp; 0 \end{pmatrix} \in \mathbb{R}^{4\times 4}
        \; \middle| \;
        \omega^\times \in \mathfrak{so}(3), \; v \in \mathbb{R}^3
    \right\}.
\end{aligned}\]

<p>Recall that \(\mathfrak{so}(3)\) is the Lie algebra of \(\mathbf{SO}(3)\), defined by the property that \(\omega^\times \in \mathfrak{so}(3)\) if and only if \((\omega^\times)^\top = - \omega^\times\).
Recall also that the skew map \(\cdot^\times : \mathbb{R}^3 \to \mathbb{R}^{3\times 3}\) is defined such that \(\omega^\times\) is the unique matrix for which \(\omega^\times a = \omega \times a\) for all vectors \(a\in \mathbb{R}^3\), that is,</p>

\[\begin{aligned}
\omega^\times
:= \begin{pmatrix}
    0 &amp; -\omega_3 &amp; \omega_2 \\
    \omega_3 &amp; 0 &amp; -\omega_1 \\
    -\omega_2 &amp; \omega_1 &amp; 0
    \end{pmatrix}.
\end{aligned}\]

<p>From here, we define the basis of \(\mathfrak{se}(3)\) to be</p>

\[\begin{aligned}
    E_1 &amp;= \begin{pmatrix}
    0 &amp; 0 &amp; 0 &amp; 0 \\
    0 &amp; 0 &amp; -1 &amp; 0 \\
    0 &amp; 1 &amp; 0 &amp; 0 \\
    0 &amp; 0 &amp; 0 &amp; 0
    \end{pmatrix}, &amp;
    E_2 &amp;= \begin{pmatrix}
    0 &amp; 0 &amp; 1 &amp; 0 \\
    0 &amp; 0 &amp; 0 &amp; 0 \\
    -1 &amp; 0 &amp; 0 &amp; 0 \\
    0 &amp; 0 &amp; 0 &amp; 0
    \end{pmatrix}, &amp;
    E_3 &amp;= \begin{pmatrix}
    0 &amp; -1 &amp; 0 &amp; 0 \\
    1 &amp; 0 &amp; 0 &amp; 0 \\
    0 &amp; 0 &amp; 0 &amp; 0 \\
    0 &amp; 0 &amp; 0 &amp; 0
    \end{pmatrix}, \\
    E_4 &amp;= \begin{pmatrix}
    0 &amp; 0 &amp; 0 &amp; 1 \\
    0 &amp; 0 &amp; 0 &amp; 0 \\
    0 &amp; 0 &amp; 0 &amp; 0 \\
    0 &amp; 0 &amp; 0 &amp; 0
    \end{pmatrix}, &amp;
    E_5 &amp;= \begin{pmatrix}
    0 &amp; 0 &amp; 0 &amp; 0 \\
    0 &amp; 0 &amp; 0 &amp; 1 \\
    0 &amp; 0 &amp; 0 &amp; 0 \\
    0 &amp; 0 &amp; 0 &amp; 0
    \end{pmatrix}, &amp;
    E_6 &amp;= \begin{pmatrix}
    0 &amp; 0 &amp; 0 &amp; 0 \\
    0 &amp; 0 &amp; 0 &amp; 0 \\
    0 &amp; 0 &amp; 0 &amp; 1 \\
    0 &amp; 0 &amp; 0 &amp; 0
    \end{pmatrix}.
\end{aligned}\]

<p>Having defined this basis, we can also write down the wedge map \(\cdot^\wedge : \mathbb{R}^6 \to \mathfrak{se}(3)\).
For convenience of notation, we will typically write our elements of \(\mathbb{R}^6\) as consisting of two concatenated \(\mathbb{R}^3\) vectors.
Specifically, given \(\omega, v \in \mathbb{R}^3\), the wedge map is given by</p>

\[\begin{aligned}
\begin{pmatrix} \omega \\ v \end{pmatrix}^\wedge
&amp;:= \omega_1 E_1 + \omega_2 E_2 + \omega_3 E_3
+ v_1 E_4 + v_2 E_5 + v_3 E_6 \\
&amp;= \begin{pmatrix}
    0 &amp; -\omega_3 &amp; \omega_2 &amp; v_1 \\
    \omega_3 &amp; 0 &amp; -\omega_1 &amp; v_2 \\
    -\omega_2 &amp; \omega_1 &amp; 0 &amp; v_3 \\
    0 &amp; 0 &amp; 0 &amp; 0
    \end{pmatrix}
= \begin{pmatrix}
    \omega^\times &amp; v \\ 0_{1\times 3} &amp; 0
    \end{pmatrix}.
\end{aligned}\]

<p>The final compact form is particularly useful, but if it ever looks confusing, just remember that it is only a convenient way to write things down.
The object we are describing here is ultimately a \(4 \times 4\) matrix, even if it has an elegant \(2 \times 2\) block matrix structure.</p>
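
<p>The wedge map and its inverse can be sketched in a few lines of NumPy. This is only an illustration under the ordering \((\omega, v)\) used in this post; the function names <code>skew</code>, <code>wedge</code> and <code>vee</code> are my own choices here, not a reference to a specific library API.</p>

```python
import numpy as np


def skew(w):
    """Map a 3-vector to the skew-symmetric matrix w^x."""
    return np.array([[0.0, -w[2], w[1]],
                     [w[2], 0.0, -w[0]],
                     [-w[1], w[0], 0.0]])


def wedge(xi):
    """Map xi = (omega, v) in R^6 to the 4x4 matrix in se(3)."""
    U = np.zeros((4, 4))
    U[:3, :3] = skew(xi[:3])
    U[:3, 3] = xi[3:]
    return U


def vee(U):
    """Inverse of wedge: read (omega, v) back out of a matrix in se(3)."""
    return np.array([U[2, 1], U[0, 2], U[1, 0], U[0, 3], U[1, 3], U[2, 3]])


# Build the basis E_1, ..., E_6 as the wedge of the standard basis of R^6.
basis = [wedge(e) for e in np.eye(6)]

xi = np.array([0.1, -0.2, 0.3, 1.0, 2.0, 3.0])
# wedge(xi) is the linear combination of the basis with coefficients xi.
combo = sum(c * E for c, E in zip(xi, basis))
wedge_ok = np.allclose(wedge(xi), combo)
roundtrip_ok = np.allclose(vee(wedge(xi)), xi)
print(wedge_ok, roundtrip_ok)
```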

<h4 id="adjoint-and-lie-bracket">Adjoint and Lie bracket</h4>

<p>To study the adjoint maps, we will denote arbitrary elements by \(P \in \mathbf{SE}(3)\) and \(U \in \mathfrak{se}(3)\).
The rotation and translation components of \(P\) are written \(R\) and \(x\), respectively, while the rotation and translation components of \(U\) are written \(\omega^\times\) and \(v\), respectively.
Then, the adjoint operator is defined by</p>

\[\begin{aligned}
\mathrm{Ad}_{P}(U)
&amp;= P U P^{-1} \\
&amp;= \begin{pmatrix} R &amp; x \\ 0_{1\times 3} &amp; 1 \end{pmatrix}
\begin{pmatrix} \omega^\times &amp; v \\ 0_{1\times 3} &amp; 0 \end{pmatrix}
\begin{pmatrix} R^\top &amp; - R^\top x \\ 0_{1\times 3} &amp; 1 \end{pmatrix} \\
&amp;= \begin{pmatrix} R \omega^\times &amp; R v \\ 0_{1\times 3} &amp; 0 \end{pmatrix}
\begin{pmatrix} R^\top &amp; - R^\top x \\ 0_{1\times 3} &amp; 1 \end{pmatrix} \\
&amp;= \begin{pmatrix} R \omega^\times R^\top &amp; R \omega^\times (- R^\top x) + R v \\ 0_{1\times 3} &amp; 0 \end{pmatrix} \\
&amp;= \begin{pmatrix} (R \omega)^\times &amp; - (R \omega)^\times x + R v \\ 0_{1\times 3} &amp; 0 \end{pmatrix} \\
&amp;= \begin{pmatrix} (R \omega)^\times &amp; x^\times R \omega + R v \\ 0_{1\times 3} &amp; 0 \end{pmatrix}.
\end{aligned}\]

<p>Here, we have used a few cross-product identities that were derived in the <a href="/mathematics/2025/02/15/special_orthogonal_so3.html">post on \(\mathbf{SO}(3)\)</a>.
From simply reading the entries of the result, the matrix form of the Adjoint operator is found to be</p>

\[\begin{aligned}
\mathrm{Ad}_{P}^\vee
&amp;= \begin{pmatrix}
    R &amp; 0_{3\times 3} \\ x^\times R &amp; R
\end{pmatrix} \in \mathbb{R}^{6\times 6}
\end{aligned}\]
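
<p>The matrix form above can be checked numerically by comparing it against the conjugation \(P U P^{-1}\). The following is a minimal NumPy sketch; the helper names (<code>skew</code>, <code>wedge</code>, <code>vee</code>, <code>adjoint_matrix</code>) are hypothetical and the ordering \((\omega, v)\) of the 6-vector is the one assumed throughout this post.</p>

```python
import numpy as np


def skew(w):
    """Map a 3-vector to the skew-symmetric matrix w^x."""
    return np.array([[0.0, -w[2], w[1]],
                     [w[2], 0.0, -w[0]],
                     [-w[1], w[0], 0.0]])


def wedge(xi):
    """Map xi = (omega, v) in R^6 to the 4x4 matrix in se(3)."""
    U = np.zeros((4, 4))
    U[:3, :3] = skew(xi[:3])
    U[:3, 3] = xi[3:]
    return U


def vee(U):
    """Inverse of wedge: read (omega, v) back out of a matrix in se(3)."""
    return np.array([U[2, 1], U[0, 2], U[1, 0], U[0, 3], U[1, 3], U[2, 3]])


def adjoint_matrix(R, x):
    """6x6 matrix of Ad_P for the ordering xi = (omega, v)."""
    Ad = np.zeros((6, 6))
    Ad[:3, :3] = R
    Ad[3:, :3] = skew(x) @ R
    Ad[3:, 3:] = R
    return Ad


# Test data: a rotation about the z-axis and an arbitrary translation.
c, s = np.cos(0.7), np.sin(0.7)
R = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
x = np.array([1.0, -2.0, 0.5])
P = np.eye(4)
P[:3, :3] = R
P[:3, 3] = x

xi = np.array([0.3, 0.1, -0.4, 2.0, 0.0, -1.0])
# Conjugation P U P^{-1}, read back as a 6-vector, should equal Ad_P^vee xi.
lhs = vee(P @ wedge(xi) @ np.linalg.inv(P))
rhs = adjoint_matrix(R, x) @ xi
adjoint_ok = np.allclose(lhs, rhs)
print(adjoint_ok)
```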

<p>The little adjoint matrix and the Lie bracket are easily obtained by differentiating \(\mathrm{Ad}_P^\vee\) with respect to \(P\) at the identity \(I_4\),</p>

\[\begin{aligned}
\mathrm{ad}_{U}^\vee
&amp;= \begin{pmatrix}
    \omega^\times &amp; 0_{3\times 3} \\ v^\times &amp; \omega^\times
\end{pmatrix} \in \mathbb{R}^{6\times 6}, \\
[U_1, U_2] &amp;= \begin{pmatrix} (\omega_1 \times \omega_2)^\times &amp; v_1\times \omega_2 + \omega_1 \times v_2 \\ 0_{1\times 3} &amp; 0 \end{pmatrix}
\end{aligned}\]

<p>As always, the Lie bracket can also be obtained by computing the matrix commutator \([U_1, U_2] = U_1 U_2 - U_2 U_1\).
It is useful to be aware of these equivalences so that you can pick the technique you find easiest in your situation.</p>
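
<p>For completeness, here is a sketch of that commutator computation. It relies on the identity \(a^\times b^\times - b^\times a^\times = (a \times b)^\times\) for \(a, b \in \mathbb{R}^3\), which is the Jacobi identity for the cross product written in matrix form.</p>

\[\begin{aligned}
[U_1, U_2]
&amp;= \begin{pmatrix} \omega_1^\times &amp; v_1 \\ 0_{1\times 3} &amp; 0 \end{pmatrix}
\begin{pmatrix} \omega_2^\times &amp; v_2 \\ 0_{1\times 3} &amp; 0 \end{pmatrix}
- \begin{pmatrix} \omega_2^\times &amp; v_2 \\ 0_{1\times 3} &amp; 0 \end{pmatrix}
\begin{pmatrix} \omega_1^\times &amp; v_1 \\ 0_{1\times 3} &amp; 0 \end{pmatrix} \\
&amp;= \begin{pmatrix} \omega_1^\times \omega_2^\times &amp; \omega_1^\times v_2 \\ 0_{1\times 3} &amp; 0 \end{pmatrix}
- \begin{pmatrix} \omega_2^\times \omega_1^\times &amp; \omega_2^\times v_1 \\ 0_{1\times 3} &amp; 0 \end{pmatrix} \\
&amp;= \begin{pmatrix} \omega_1^\times \omega_2^\times - \omega_2^\times \omega_1^\times &amp; \omega_1 \times v_2 - \omega_2 \times v_1 \\ 0_{1\times 3} &amp; 0 \end{pmatrix} \\
&amp;= \begin{pmatrix} (\omega_1 \times \omega_2)^\times &amp; v_1 \times \omega_2 + \omega_1 \times v_2 \\ 0_{1\times 3} &amp; 0 \end{pmatrix}.
\end{aligned}\]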

<h4 id="exponential-and-logarithm">Exponential and Logarithm</h4>

<p>The exponential and logarithm of \(\mathbf{SE}(3)\) admit elegant closed-form expressions involving trigonometric functions, similarly to \(\mathbf{SE}(2)\).
Recall from the previous post on \(\mathbf{SO}(3)\) that, for any \(\omega \in \mathbb{R}^3\),</p>

\[\begin{aligned}
(\omega^\times)^2 &amp;= \omega \omega^\top - \omega^\top \omega I_3, \\
(\omega^\times)^3 &amp;= - \vert \omega \vert^2 \omega^\times.
\end{aligned}\]

<p>Now consider an arbitrary element \(U \in \mathfrak{se}(3)\) with rotational and translational components \(\omega\) and \(v\).
Looking at the powers of \(U\), we find</p>

\[\begin{aligned}
U^2 &amp;= \begin{pmatrix} \omega^\times &amp; v \\ 0_{1\times 3} &amp; 0 \end{pmatrix}^2
= \begin{pmatrix} (\omega^\times)^2 &amp; \omega^\times v \\ 0_{1\times 3} &amp; 0 \end{pmatrix}, \\
U^3 &amp;= \begin{pmatrix} \omega^\times &amp; v \\ 0_{1\times 3} &amp; 0 \end{pmatrix}^3
= \begin{pmatrix} \omega^\times &amp; v \\ 0_{1\times 3} &amp; 0 \end{pmatrix} \begin{pmatrix} (\omega^\times)^2 &amp; \omega^\times v \\ 0_{1\times 3} &amp; 0 \end{pmatrix}
= \begin{pmatrix} (\omega^\times)^3 &amp; (\omega^\times)^2 v \\ 0_{1\times 3} &amp; 0 \end{pmatrix}.
\end{aligned}\]

<p>From here we can prove the pattern by induction.
Assume that</p>

\[\begin{aligned}
U^n &amp;= \begin{pmatrix} (\omega^\times)^n &amp; (\omega^\times)^{n-1} v \\ 0_{1\times 3} &amp; 0 \end{pmatrix},
\end{aligned}\]

<p>for a given \(n \geq 2\). Then</p>

\[\begin{aligned}
U^{n+1} &amp;= \begin{pmatrix} \omega^\times &amp; v \\ 0_{1\times 3} &amp; 0 \end{pmatrix} \begin{pmatrix} (\omega^\times)^n &amp; (\omega^\times)^{n-1} v \\ 0_{1\times 3} &amp; 0 \end{pmatrix}
= \begin{pmatrix} (\omega^\times)^{n+1} &amp; (\omega^\times)^{(n+1)-1} v \\ 0_{1\times 3} &amp; 0 \end{pmatrix}.
\end{aligned}\]

<p>Since the formula holds for \(n = 2\), we have therefore shown that it holds for all \(n \geq 2\) by induction.
From here we can compute the matrix exponential explicitly. By definition,</p>

\[\begin{aligned}
\exp(U)
    &amp;= \sum_{n=0}^\infty \frac{1}{n!} U^n \\
    &amp;= I_4 + U + \sum_{n=2}^\infty \frac{1}{n!} U^n \\
    &amp;= I_4 + U + \sum_{n=2}^\infty \frac{1}{n!} \begin{pmatrix} (\omega^\times)^n &amp; (\omega^\times)^{n-1} v \\ 0_{1\times 3} &amp; 0 \end{pmatrix} \\
% ------------
    &amp;= \begin{pmatrix} I_3 &amp; 0_{3\times 1} \\ 0_{1\times 3} &amp; 1 \end{pmatrix} 
    + \begin{pmatrix} \omega^\times &amp; v \\ 0_{1\times 3} &amp; 0 \end{pmatrix} 
    + \begin{pmatrix} \sum_{n=2}^\infty \frac{1}{n!} (\omega^\times)^n &amp; \sum_{n=2}^\infty \frac{1}{n!} (\omega^\times)^{n-1} v \\ 0_{1\times 3} &amp; 0 \end{pmatrix} \\
% ------------
    &amp;= \begin{pmatrix} I_3 + \omega^\times + \sum_{n=2}^\infty \frac{1}{n!} (\omega^\times)^n &amp; 
    v + \sum_{n=2}^\infty \frac{1}{n!} (\omega^\times)^{n-1} v \\ 
    0_{1\times 3} &amp; 1 \end{pmatrix} \\
% ------------
    &amp;= \begin{pmatrix} \sum_{n=0}^\infty \frac{1}{n!} (\omega^\times)^n &amp; 
    \sum_{n=1}^\infty \frac{1}{n!} (\omega^\times)^{n-1} v \\ 
    0_{1\times 3} &amp; 1 \end{pmatrix} \\
% ------------
    &amp;= \begin{pmatrix} \exp(\omega^\times) &amp; 
    \left( \sum_{n=1}^\infty \frac{1}{n!} (\omega^\times)^{n-1} \right) v \\ 
    0_{1\times 3} &amp; 1 \end{pmatrix} \\
\end{aligned}\]

<p>Let \(P = \exp(U)\) denote the result, with rotation and translation terms \(R \in \mathbf{SO}(3)\) and \(x \in \mathbb{R}^3\), respectively.
Then \(R = \exp(\omega^\times)\), which is simply the \(\mathbf{SO}(3)\) exponential computed previously.
To determine \(x = \left( \sum_{n=1}^\infty \frac{1}{n!} (\omega^\times)^{n-1} \right) v\), we first define \(M(\omega) = \sum_{n=1}^\infty \frac{1}{n!} (\omega^\times)^{n-1}\) and then attempt to simplify it using <a href="https://en.wikipedia.org/wiki/Trigonometric_functions#Power_series_expansion">trigonometric power series expansions</a>.
Assume for now that \(\omega \neq 0\). Then</p>

\[\begin{aligned}
M(\omega)
&amp;= \sum_{n=1}^\infty \frac{1}{n!} (\omega^\times)^{n-1} \\
&amp;= I_3 + \sum_{n=2}^\infty \frac{1}{n!} (\omega^\times)^{n-1} \\
&amp;= I_3 + \sum_{k=1}^\infty \frac{1}{(2k)!} (\omega^\times)^{2k-1} + \sum_{k=1}^\infty \frac{1}{(2k+1)!} (\omega^\times)^{2k} \\
&amp;= I_3 + \sum_{k=1}^\infty \frac{(-1)^{k-1}}{(2k)!} \vert \omega \vert^{2k-2} \omega^\times + \sum_{k=1}^\infty \frac{(-1)^{k-1}}{(2k+1)!} \vert \omega \vert^{2k-2} (\omega^\times)^{2} \\
% ------------
&amp;= I_3
- \vert \omega \vert^{-2} \sum_{k=1}^\infty \frac{(-1)^{k}}{(2k)!} \vert \omega \vert^{2k} \omega^\times
- \vert \omega \vert^{-3} \sum_{k=1}^\infty \frac{(-1)^{k}}{(2k+1)!} \vert \omega \vert^{2k+1} (\omega^\times)^{2} \\
% ------------
&amp;= I_3
- \vert \omega \vert^{-2} \left(-1 + \sum_{k=0}^\infty \frac{(-1)^{k}}{(2k)!} \vert \omega \vert^{2k} \right) \omega^\times
- \vert \omega \vert^{-3} \left(-\vert \omega \vert + \sum_{k=0}^\infty \frac{(-1)^{k}}{(2k+1)!} \vert \omega \vert^{2k+1} \right) (\omega^\times)^{2} \\
% ------------
&amp;= I_3
- \vert \omega \vert^{-2} \left(-1 + \cos( \vert \omega \vert ) \right) \omega^\times
- \vert \omega \vert^{-3} \left(-\vert \omega \vert + \sin( \vert \omega \vert ) \right) (\omega^\times)^{2} \\
% ------------
&amp;= I_3
+ \frac{ 1 - \cos(\vert \omega \vert) }{ \vert \omega \vert^2 } \omega^\times
+ \frac{ \vert \omega \vert - \sin( \vert \omega \vert ) }{ \vert \omega \vert^3 } (\omega^\times)^{2}.
\end{aligned}\]

<p>In summary, the full exponential is given by</p>

\[\begin{aligned}
\exp(U)
% ------------
&amp;= \begin{pmatrix} \exp(\omega^\times) &amp;
M(\omega) v \\
0_{1\times 3} &amp; 1 \end{pmatrix}, \\
% ------------
\exp(\omega^\times)
&amp;= I_3
+ \frac{\sin(\vert \omega \vert)}{\vert \omega \vert} \omega^\times
+ \frac{1 - \cos(\vert \omega \vert)}{\vert \omega \vert^2} (\omega^\times)^{2}, \\
% ------------
M(\omega)
&amp;= I_3
+ \frac{ 1 - \cos(\vert \omega \vert) }{ \vert \omega \vert^2 } \omega^\times
+ \frac{ \vert \omega \vert - \sin( \vert \omega \vert ) }{ \vert \omega \vert^3 } (\omega^\times)^{2}.
\end{aligned}\]

<p>In the case where \(\omega = 0\), the result is simply</p>

\[\begin{aligned}
\exp(U)
% ------------
&amp;= \exp \begin{pmatrix} 0_{3\times 3} &amp;
v \\
0_{1\times 3} &amp; 0 \end{pmatrix}
% ------------
= \begin{pmatrix} I_3 &amp;
v \\
0_{1\times 3} &amp; 1 \end{pmatrix}.
\end{aligned}\]

<p>The logarithm is simply the inverse of the exponential, so suppose that \(\exp(U) = P\).
Then \(\exp(\omega^\times) = R\), which we already know how to solve using</p>

\[\begin{aligned}
\vert \omega \vert &amp;= \cos^{-1}\left( \frac{\mathrm{tr}(R) - 1}{2} \right) \\
\omega  &amp;= \frac{\vert \omega \vert}{2 \sin(\vert \omega \vert)}(R - R^\top)^\vee,
\end{aligned}\]

<p>for \(\vert \omega \vert \in (0, \pi)\).
In this case, we have that \(x = M(\omega) v\), which is solved by \(v = M(\omega)^{-1} x\), as long as \(M(\omega)\) is invertible.
To see when this holds, suppose that \(M(\omega)\) is not invertible.
Then there must be a non-zero vector \(v\) for which \(M(\omega) v = 0\).
This would mean that</p>

\[\begin{aligned}
0 &amp;= M(\omega) v \\
&amp;= v
+ \frac{ 1 - \cos(\vert \omega \vert) }{ \vert \omega \vert^2 } \omega^\times v
+ \frac{ \vert \omega \vert - \sin( \vert \omega \vert ) }{ \vert \omega \vert^3 } (\omega^\times)^{2} v.
\end{aligned}\]

<p>If \(v\) is not parallel to \(\omega\), then the second term \(\frac{ 1 - \cos(\vert \omega \vert) }{ \vert \omega \vert^2 } \omega^\times v\) is non-zero (recall that \(\vert \omega \vert \in (0, \pi)\), so \(\cos(\vert \omega \vert) \neq 1\)) and is also orthogonal to the other terms, so the full expression cannot be zero.
Concretely, we can premultiply both sides by \((\omega^\times v)^\top\) to obtain</p>

\[\begin{aligned}
0 &amp;= (\omega^\times v)^\top M(\omega) v \\
&amp;= (\omega^\times v)^\top v
+ \frac{ 1 - \cos(\vert \omega \vert) }{ \vert \omega \vert^2 } (\omega^\times v)^\top \omega^\times v
+ \frac{ \vert \omega \vert - \sin( \vert \omega \vert ) }{ \vert \omega \vert^3 } (\omega^\times v)^\top (\omega^\times)^{2} v \\
&amp;= - v^\top \omega^\times v
+ \frac{ 1 - \cos(\vert \omega \vert) }{ \vert \omega \vert^2 } \vert \omega^\times v \vert^2
+ \frac{ \vert \omega \vert - \sin( \vert \omega \vert ) }{ \vert \omega \vert^3 } (\omega^\times v)^\top \omega^\times (\omega^\times v) \\
&amp;= \frac{ 1 - \cos(\vert \omega \vert) }{ \vert \omega \vert^2 } \vert \omega^\times v \vert^2.
\end{aligned}\]

<p>Since \(\cos(\vert \omega \vert) \neq 1\) for \(\vert \omega \vert \in (0, \pi)\), this is zero if and only if \(\omega^\times v = 0\), that is, if and only if \(v\) is parallel to \(\omega\).
In the case that \(v = a \omega\) for some scalar \(a\), we would require</p>

\[\begin{aligned}
0 &amp;= M(\omega) (a\omega) \\
&amp;= a \omega
+ a \frac{ 1 - \cos(\vert \omega \vert) }{ \vert \omega \vert^2 } \omega^\times \omega
+ a \frac{ \vert \omega \vert - \sin( \vert \omega \vert ) }{ \vert \omega \vert^3 } (\omega^\times)^{2} \omega \\
&amp;= a \omega,
\end{aligned}\]

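<p>which is only solved by \(a \omega = 0\), that is, by \(v = 0\).
What this all means is that \(M(\omega)\) is invertible for every \(\vert \omega \vert \in (0, \pi)\), which is exactly the range covered by the rotation logarithm above. The argument only breaks down when \(\cos(\vert \omega \vert) = 1\) with \(\omega \neq 0\), that is, when \(\vert \omega \vert\) is a non-zero multiple of \(2\pi\).
Thus the full logarithm is given by</p>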

\[\begin{aligned}
 \log \begin{pmatrix} R &amp; x \\ 0_{1\times 3} &amp; 1 \end{pmatrix} 
 &amp;= \begin{pmatrix} \log(R) &amp; M(\log(R))^{-1} x \\ 0_{1\times 3} &amp; 0 \end{pmatrix}, \\
\log(R)  &amp;= \frac{\theta}{2 \sin(\theta)}(R - R^\top)^\vee, \hspace{2cm}
\theta := \cos^{-1}\left( \frac{\mathrm{tr}(R) - 1}{2} \right), \\
M(\omega) &amp;:= I_3
+ \frac{ 1 - \cos(\vert \omega \vert) }{ \vert \omega \vert^2 } \omega^\times
+ \frac{ \vert \omega \vert - \sin( \vert \omega \vert ) }{ \vert \omega \vert^3 } (\omega^\times)^{2}.
\end{aligned}\]
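
<p>A round trip through the exponential and the logarithm makes a convenient test of the formulas. The NumPy sketch below assumes a rotation angle strictly inside \((0, \pi)\), uses hypothetical helper names (<code>skew</code>, <code>m_matrix</code>, <code>rotation_log</code>, <code>se3_log</code>), and takes the \((\omega^\times)^2\) coefficient of \(M(\omega)\) to be \((\vert \omega \vert - \sin \vert \omega \vert) / \vert \omega \vert^3\), consistent with the \(1/3!\) term of its defining series.</p>

```python
import numpy as np


def skew(w):
    """Map a 3-vector to the skew-symmetric matrix w^x."""
    return np.array([[0.0, -w[2], w[1]],
                     [w[2], 0.0, -w[0]],
                     [-w[1], w[0], 0.0]])


def m_matrix(omega):
    """Closed form of M(omega) for omega != 0."""
    theta = np.linalg.norm(omega)
    W = skew(omega)
    return (np.eye(3)
            + (1.0 - np.cos(theta)) / theta**2 * W
            + (theta - np.sin(theta)) / theta**3 * (W @ W))


def rotation_log(R):
    """Rotation vector omega with exp(omega^x) = R, for rotation angles in (0, pi)."""
    theta = np.arccos(np.clip((np.trace(R) - 1.0) / 2.0, -1.0, 1.0))
    A = R - R.T
    return theta / (2.0 * np.sin(theta)) * np.array([A[2, 1], A[0, 2], A[1, 0]])


def se3_log(P):
    """Components (omega, v) of log(P), assuming the rotation angle lies in (0, pi)."""
    omega = rotation_log(P[:3, :3])
    # Solve M(omega) v = x rather than forming the inverse explicitly.
    v = np.linalg.solve(m_matrix(omega), P[:3, 3])
    return omega, v


# Build a test element with the closed-form exponential, then invert it.
omega = np.array([0.4, -0.3, 0.9])
v = np.array([1.0, 2.0, -1.0])
theta = np.linalg.norm(omega)
W = skew(omega)
P = np.eye(4)
P[:3, :3] = np.eye(3) + np.sin(theta) / theta * W + (1.0 - np.cos(theta)) / theta**2 * (W @ W)
P[:3, 3] = m_matrix(omega) @ v

omega_rec, v_rec = se3_log(P)
log_ok = bool(np.allclose(omega_rec, omega) and np.allclose(v_rec, v))
print(log_ok)
```

<p>Using <code>np.linalg.solve</code> instead of an explicit matrix inverse is a deliberate choice in this sketch, since solving the linear system is generally the more numerically stable option.</p>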

<h3 id="conclusion">Conclusion</h3>

<p>The 3D special Euclidean group is an important Lie group to understand for robotics, where it is used to describe transformations between frames of reference of different rigid bodies.
The goal of this post is not to explain this application, but rather to detail how the important formulas for \(\mathbf{SE}(3)\) can be derived.
If you want to look at an implementation of some of the functions I have described, I suggest looking at <a href="https://github.com/pvangoor/pylie">pylie</a>.
Remember that a lot of the formulas here are derived from the formulas for \(\mathbf{SO}(3)\), which you can find in my <a href="/mathematics/2025/02/15/special_orthogonal_so3.html">previous post</a>.
Finally, I will leave a summary of the formulas below for quick reference.</p>

<h3 id="quick-reference">Quick Reference</h3>

<p>The 3D special Euclidean group and its Lie algebra</p>

\[\begin{aligned}
        \mathbf{SE}(3) &amp;= \left\{
        P = \begin{pmatrix}
            R &amp; x \\ 0_{1\times 3} &amp; 1
        \end{pmatrix} \in \mathbb{R}^{4\times 4}
        \; \middle| \;
        R \in \mathbf{SO}(3), \; x \in \mathbb{R}^3
    \right\}, \\
        \mathfrak{se}(3)
    &amp;=  \left\{
        \begin{pmatrix} \omega^\times &amp; v \\ 0_{1\times 3} &amp; 0 \end{pmatrix} \in \mathbb{R}^{4\times 4}
        \; \middle| \;
        \omega^\times \in \mathfrak{so}(3), \; v \in \mathbb{R}^3
    \right\}.
\end{aligned}\]

<p>The Lie algebra wedge map</p>

\[\begin{aligned}
    \cdot^\wedge &amp;: \mathbb{R}^6 \to \mathfrak{se}(3), \\
    \begin{pmatrix} \omega \\ v \end{pmatrix}^\wedge
    &amp;= \begin{pmatrix}
    \omega^\times &amp; v \\ 0_{1\times 3} &amp; 0
    \end{pmatrix}, \\
    \omega^\times &amp;= \begin{pmatrix}
        \omega_1 \\ \omega_2 \\ \omega_3
    \end{pmatrix}^\times
    = \begin{pmatrix}
        0 &amp; -\omega_3 &amp; \omega_2 \\
        \omega_3 &amp; 0 &amp; -\omega_1 \\
        -\omega_2 &amp; \omega_1 &amp; 0
    \end{pmatrix}.
\end{aligned}\]

<p>Adjoint matrices</p>

\[\begin{aligned}
\mathrm{Ad}_{P}^\vee
&amp;= \begin{pmatrix}
    R &amp; 0_{3\times 3} \\ x^\times R &amp; R
\end{pmatrix} \in \mathbb{R}^{6\times 6}, \\
\mathrm{ad}_{U}^\vee
&amp;= \begin{pmatrix}
    \omega^\times &amp; 0_{3\times 3} \\ v^\times &amp; \omega^\times
\end{pmatrix} \in \mathbb{R}^{6\times 6}.
\end{aligned}\]

<p>Exponential formula (for \(\omega \neq 0\))</p>

\[\begin{aligned}
\exp \begin{pmatrix} \omega^\times &amp;
v \\
0_{1\times 3} &amp; 0 \end{pmatrix}
% ------------
&amp;= \begin{pmatrix} \exp(\omega^\times) &amp;
M(\omega) v \\
0_{1\times 3} &amp; 1 \end{pmatrix}, \\
% ------------
\exp(\omega^\times)
&amp;= I_3
+ \frac{\sin(\vert \omega \vert)}{\vert \omega \vert} \omega^\times
+ \frac{1 - \cos(\vert \omega \vert)}{\vert \omega \vert^2} (\omega^\times)^{2}, \\
% ------------
M(\omega)
&amp;:= I_3
+ \frac{ 1 - \cos(\vert \omega \vert) }{ \vert \omega \vert^2 } \omega^\times
+ \frac{ \vert \omega \vert - \sin( \vert \omega \vert ) }{ \vert \omega \vert^3 } (\omega^\times)^{2}.
\end{aligned}\]

<p>Logarithm formula (for \(R \neq R^\top\))</p>

\[\begin{aligned}
 \log \begin{pmatrix} R &amp; x \\ 0_{1\times 3} &amp; 1 \end{pmatrix} 
 &amp;= \begin{pmatrix} \log(R) &amp; M(\log(R))^{-1} x \\ 0_{1\times 3} &amp; 0 \end{pmatrix}, \\
\log(R)  &amp;= \frac{\theta}{2 \sin(\theta)}(R - R^\top)^\vee, \hspace{2cm}
\theta := \cos^{-1}\left( \frac{\mathrm{tr}(R) - 1}{2} \right).
\end{aligned}\]]]></content><author><name></name></author><category term="Mathematics" /><summary type="html"><![CDATA[]]></summary></entry><entry><title type="html">Example of the Computation of EqF Matrices</title><link href="/papers/2025/06/26/EqF_derivation_example.html" rel="alternate" type="text/html" title="Example of the Computation of EqF Matrices" /><published>2025-06-26T00:00:00+00:00</published><updated>2025-06-26T00:00:00+00:00</updated><id>/papers/2025/06/26/EqF_derivation_example</id><content type="html" xml:base="/papers/2025/06/26/EqF_derivation_example.html"><![CDATA[<!-- https://talk.jekyllrb.com/t/jekyll-and-mathjax/5514 -->
<script type="text/x-mathjax-config">
    MathJax.Hub.Config({
        jax: ["input/TeX","input/MathML","output/SVG", "output/CommonHTML"],
    extensions: ["tex2jax.js","mml2jax.js","MathMenu.js","MathZoom.js", "CHTML-preview.js"],
    TeX: {
      extensions: ["AMSmath.js","AMSsymbols.js","noErrors.js","noUndefined.js"]
    },
      tex2jax: {
          inlineMath: [ ['$','$'], ["\\(","\\)"] ],
          displayMath: [ ['$$','$$'], ["\\[","\\]"] ],
          processEscapes: true,
          processEnvironments: true
        },
        "HTML-CSS": { availableFonts: ["TeX"] }
      });
</script>

<script id="MathJax-script" async="" src="https://cdn.jsdelivr.net/npm/mathjax@3/es5/tex-chtml.js"></script>

<h2 id="introduction">Introduction</h2>

<p>This post is inspired by a PhD student who emailed me for help in deriving the \(C\) and \(C^\star\) matrices in the Equivariant Filter equations.
The Equivariant Filter (EqF) is an approach to state estimation of systems with Lie group symmetries (see the paper here: <a href="https://arxiv.org/abs/2010.14666">https://arxiv.org/abs/2010.14666</a>).
Such systems are very common in robotics and aerospace, particularly due to the need to consider 3D rotations.
In this post, I will explain in detail the example also presented in the <a href="https://arxiv.org/abs/2010.14666">paper</a>, which is the problem of estimating a single bearing.
The main purpose is to expand some of the derivations from the original paper, and to provide an example of how to do practical computations with manifolds and Lie groups.</p>

<h2 id="problem-description-and-set-up">Problem Description and Set-up</h2>

<p>Before we can apply the EqF methodology, we need to understand the problem we are studying.
We consider a robot (e.g. a quadrotor) equipped with a gyroscope and a magnetometer.
The gyroscope measures the angular velocity \(\Omega \in \mathbb{R}^3\) of the robot and the magnetometer measures the direction of the Earth’s magnetic field \(\eta \in S^2\).
Both measurements are taken in the body-fixed frame of the robot.
The kinematics of the magnetic field direction can then be expressed as</p>

\[\begin{aligned}
\dot{\eta} = -\Omega \times \eta = -\Omega^\times \eta,
\end{aligned}\]

<p>where \(\Omega^\times \in \mathbb{R}^{3\times 3}\) is the skew-symmetric matrix defined by</p>

\[\begin{aligned}
    \Omega^\times :=
    \begin{pmatrix}
        0 &amp; -\Omega_3 &amp; \Omega_2 \\
        \Omega_3 &amp; 0 &amp; -\Omega_1 \\
        -\Omega_2 &amp; \Omega_1 &amp; 0
    \end{pmatrix}.
\end{aligned}\]
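<p>As a quick numerical sanity check of the skew-symmetric matrix and the kinematics above, here is a minimal Python sketch, assuming NumPy is available. The helper name <code>skew</code> and the example values are illustrative choices and do not come from the paper.</p>

```python
import numpy as np


def skew(omega):
    """Return the skew-symmetric matrix omega^x, so that skew(omega) @ v equals np.cross(omega, v)."""
    return np.array([
        [0.0, -omega[2], omega[1]],
        [omega[2], 0.0, -omega[0]],
        [-omega[1], omega[0], 0.0],
    ])


# Illustrative values only: a made-up gyroscope reading and a unit bearing.
omega = np.array([0.1, -0.2, 0.3])
eta = np.array([1.0, 2.0, 2.0]) / 3.0

# Kinematics of the bearing: eta_dot = -Omega x eta = -Omega^x eta.
eta_dot = -skew(omega) @ eta

print(np.allclose(eta_dot, -np.cross(omega, eta)))  # both forms of the kinematics agree
print(np.isclose(eta @ eta_dot, 0.0))  # eta_dot is tangent to the sphere at eta
```

<p>Since \(\dot{\eta}\) is orthogonal to \(\eta\), the norm of \(\eta\) is preserved along trajectories, which is consistent with treating \(S^2\) as the state space.</p>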

<p>The measurement function is simply \(y = h(\eta) = c_m\eta\), where \(c_m &gt; 0\) is the magnetic field strength.</p>

<p>With this set-up out of the way, we can start to apply the EqF methodology.
We will follow the same steps as in the <a href="https://arxiv.org/abs/2010.14666">paper</a>, but I will give a bit more detail at each step.</p>

<h2 id="eqf-design-procedure">EqF Design Procedure</h2>

<h3 id="state-space-symmetry">State Space Symmetry</h3>

<p>To design an EqF, the first and most fundamental ingredient is a Lie group that acts transitively on the state space of the system we are studying.
In this case, the state space is the sphere \(S^2\), and a natural choice for the Lie group is the set of 3D rotations, \(\mathbf{SO}(3)\).
This is a matrix Lie group and is defined by</p>

\[\begin{aligned}
    \mathbf{SO}(3) = \left\{
        R \in \mathbb{R}^{3\times 3}
        \; \middle| \;
        R^\top R = I_3, \; \det(R) = 1
    \right\}.
\end{aligned}\]

<p>This group has a right-handed group action on \(S^2\) defined by</p>

\[\begin{aligned}
    \phi : \mathbf{SO}(3) \times S^2 \to S^2, &amp;&amp; \phi(R, \eta) := R^\top \eta.
\end{aligned}\]

<p>To verify that this is a group action, we just need to check that the <strong>identity</strong> and <strong>compatibility</strong> properties are satisfied:</p>

\[\begin{gathered}
    \phi(I, \eta) = I^\top \eta = I \eta = \eta, \\
    \phi(R_2, \phi(R_1, \eta)) = \phi(R_2, R_1^\top \eta) = R_2^\top R_1^\top \eta = (R_1 R_2)^\top \eta = \phi(R_1 R_2, \eta).
\end{gathered}\]

<p>In other words, the group action applied with the identity matrix does nothing, and the group action applied twice with different group elements is the same as the group action applied once with the product of those elements.
A group action is called transitive if, for any \(\eta_1, \eta_2 \in S^2\), there exists a matrix \(R \in \mathbf{SO}(3)\) such that \(\phi(R, \eta_1) = \eta_2\).
This is certainly the case here, although the proof is a little bit more detailed than I would like to include in this post.</p>
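<p>The properties above can also be checked numerically. The following is a minimal Python sketch, assuming NumPy is available, and it is not a proof. The helper names (<code>skew</code>, <code>exp_so3</code>, <code>phi</code>, <code>rotation_taking</code>) and the test values are my own illustrative choices. The function <code>rotation_taking</code> shows one possible construction of a rotation that maps one bearing to another, and it deliberately leaves out the antipodal case.</p>

```python
import numpy as np


def skew(w):
    """Return the skew-symmetric matrix w^x."""
    return np.array([
        [0.0, -w[2], w[1]],
        [w[2], 0.0, -w[0]],
        [-w[1], w[0], 0.0],
    ])


def exp_so3(w):
    """Matrix exponential of skew(w), computed with Rodrigues' formula."""
    angle = np.linalg.norm(w)
    if np.isclose(angle, 0.0):
        return np.eye(3) + skew(w)  # first-order approximation for tiny angles
    K = skew(w / angle)
    return np.eye(3) + np.sin(angle) * K + (1.0 - np.cos(angle)) * (K @ K)


def phi(R, eta):
    """Right action of SO(3) on the sphere: phi(R, eta) = R^T eta."""
    return R.T @ eta


def rotation_taking(eta1, eta2):
    """Return one R with phi(R, eta1) = eta2; the antipodal case eta2 = -eta1 is not handled."""
    axis = np.cross(eta2, eta1)
    sin_angle = np.linalg.norm(axis)
    if np.isclose(sin_angle, 0.0):
        return np.eye(3)  # valid only when eta1 equals eta2
    angle = np.arctan2(sin_angle, eta1 @ eta2)
    return exp_so3(angle * axis / sin_angle)


# Illustrative rotations and unit vectors.
R1 = exp_so3(np.array([0.3, -0.1, 0.2]))
R2 = exp_so3(np.array([-0.4, 0.5, 0.1]))
eta1 = np.array([1.0, 2.0, 2.0]) / 3.0
eta2 = np.array([0.0, 0.6, -0.8])

print(np.allclose(phi(np.eye(3), eta1), eta1))  # identity property
print(np.allclose(phi(R2, phi(R1, eta1)), phi(R1 @ R2, eta1)))  # compatibility property
print(np.allclose(phi(rotation_taking(eta1, eta2), eta1), eta2))  # transitivity for this pair
```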

<h3 id="equivariant-lift">Equivariant Lift</h3>

<p>The lift of the system is a map \(\Lambda : S^2 \times \mathbb{R}^3 \to \mathfrak{so}(3)\) from the state space and the input space to the Lie algebra of our chosen symmetry group.
It enables the trajectories of the system to be replicated on the Lie group.
To ensure that trajectories generated on the group by the lift will match trajectories on the original state space, we need to verify the <strong>lift condition</strong>:</p>

\[\begin{aligned}
    \mathrm{D}_R|_I \phi(R, \eta) [\Lambda(\eta, \Omega)] = f_\Omega(\eta) = - \Omega^\times \eta.
\end{aligned}\]

<p>To find a solution to this, we expand the left-hand side of the lift condition as</p>

\[\begin{aligned}
    \mathrm{D}_R|_I \phi(R, \eta) [\Lambda(\eta, \Omega)]
    &amp;= \left. \frac{\mathrm{d}}{\mathrm{d}s} \right|_{s=0}  \phi(\exp(s \Lambda(\eta, \Omega)), \eta) \\
    &amp;= \left. \frac{\mathrm{d}}{\mathrm{d}s} \right|_{s=0}  \exp(s \Lambda(\eta, \Omega))^\top \eta \\
    &amp;= \Lambda(\eta, \Omega)^\top \eta \\
    &amp;= - \Lambda(\eta, \Omega) \eta.
\end{aligned}\]

<p>The last line is due to the fact that \(\Lambda(\eta, \Omega) \in \mathfrak{so}(3)\).
Now the lift condition is simplified to</p>

\[\begin{aligned}
    - \Lambda(\eta, \Omega) \eta = -\Omega^\times \eta.
\end{aligned}\]

<p>The general solution to this is given by</p>

\[\begin{aligned}
    \Lambda(\eta, \Omega) = \Omega^\times + \alpha(\eta, \Omega) \eta^\times,
\end{aligned}\]

<p>where \(\alpha(\eta, \Omega) \in \mathbb{R}\) is arbitrary, since \(\eta^\times \eta = \eta \times \eta = 0\).
For simplicity, we will choose \(\alpha \equiv 0\), so that the lift becomes</p>

\[\begin{aligned}
    \Lambda(\eta, \Omega) = \Omega^\times,
\end{aligned}\]

<p>for all \(\eta \in S^2\) and \(\Omega \in \mathbb{R}^3\).
It is worth noting that this is quite a special case.
It is not always possible to find a lift independent of the state \(\eta\), but when it is, this is usually desirable because it greatly simplifies some of the later computations.</p>
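<p>The lift condition can be checked numerically with a finite difference. The following is a minimal Python sketch, assuming NumPy is available. The helper names, the step size, and the test values are illustrative assumptions. Lie algebra elements are stored as 3-vectors \(w\) with \(\Lambda = w^\times\), and the second value of <code>alpha</code> illustrates that adding a multiple of \(\eta^\times\) does not change the left-hand side of the lift condition.</p>

```python
import numpy as np


def skew(w):
    """Return the skew-symmetric matrix w^x."""
    return np.array([
        [0.0, -w[2], w[1]],
        [w[2], 0.0, -w[0]],
        [-w[1], w[0], 0.0],
    ])


def exp_so3(w):
    """Matrix exponential of skew(w), computed with Rodrigues' formula."""
    angle = np.linalg.norm(w)
    if np.isclose(angle, 0.0):
        return np.eye(3) + skew(w)  # first-order approximation for tiny angles
    K = skew(w / angle)
    return np.eye(3) + np.sin(angle) * K + (1.0 - np.cos(angle)) * (K @ K)


def lift_vector(eta, omega, alpha=0.0):
    """Vector w such that Lambda(eta, Omega) = skew(w); alpha = 0 gives Lambda = Omega^x."""
    return omega + alpha * eta


def action_derivative(eta, w, step=1e-6):
    """Central finite difference of s -> exp(s * skew(w))^T eta at s = 0."""
    forward = exp_so3(step * w).T @ eta
    backward = exp_so3(-step * w).T @ eta
    return (forward - backward) / (2.0 * step)


# Illustrative input and state.
omega = np.array([0.1, -0.2, 0.3])
eta = np.array([1.0, 2.0, 2.0]) / 3.0

f_omega = -skew(omega) @ eta  # right-hand side of the lift condition
for alpha in (0.0, 2.5):
    lhs = action_derivative(eta, lift_vector(eta, omega, alpha))
    print(alpha, np.allclose(lhs, f_omega, atol=1e-8))
```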

<h3 id="output-equivariance">Output Equivariance</h3>

<p>Sometimes the measurement function \(h\) associated with a system exhibits equivariance with respect to the action \(\phi\) on the state space.
To find out whether this is the case, we can evaluate \(h(\phi(R, \eta))\) and see if it is possible to express the result in terms of \(R\) and \(h(\eta)\):</p>

\[\begin{aligned}
    h(\phi(R, \eta)) = h(R^\top \eta) = c_m R^\top \eta = R^\top (c_m \eta) = R^\top h(\eta).
\end{aligned}\]

<p>So, indeed, the measurement function \(h\) is compatible with the group action \(\phi\).
Formally, this is expressed as another group action on the output space:</p>

\[\begin{aligned}
    \rho : \mathbf{SO}(3) \times \mathbb{R}^3 \to \mathbb{R}^3, &amp;&amp;
    \rho(R, y) := R^\top y.
\end{aligned}\]

<p>It is not required for this group action to be transitive, but it should also be right-handed.
Output equivariance is desirable because of the ability to use it in deriving the \(C^\star\) matrix, which improves filter performance.</p>
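<p>As a small illustration, output equivariance can be verified numerically for one rotation. This is a minimal Python sketch, assuming NumPy is available. The value of <code>C_M</code>, the rotation angle, and the bearing are made-up example values.</p>

```python
import numpy as np

C_M = 0.5  # hypothetical magnetic field strength


def h(eta):
    """Measurement function h(eta) = c_m * eta."""
    return C_M * eta


def phi(R, eta):
    """State action phi(R, eta) = R^T eta."""
    return R.T @ eta


def rho(R, y):
    """Output action rho(R, y) = R^T y."""
    return R.T @ y


# Example rotation about the third axis and an example bearing.
angle = 0.7
R = np.array([
    [np.cos(angle), -np.sin(angle), 0.0],
    [np.sin(angle), np.cos(angle), 0.0],
    [0.0, 0.0, 1.0],
])
eta = np.array([1.0, 2.0, 2.0]) / 3.0

print(np.allclose(h(phi(R, eta)), rho(R, h(eta))))  # h(phi(R, eta)) equals rho(R, h(eta))
```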

<h3 id="origin-and-state-error">Origin and State Error</h3>

<p>Implementing the EqF requires us to choose an origin on the state space \(S^2\).
This choice is arbitrary, although it is generally useful to pick something easy to work with.
For this example, we choose the origin to be the unit vector \(\mathbf{e}_1 = (1,0,0) \in S^2\).</p>

<p>Next, we need to choose a subspace of the Lie algebra \(\mathfrak{so}(3)\) to define normal coordinates on the manifold.
This is needed because the Lie group \(\mathbf{SO}(3)\) has a larger dimension than the state space \(S^2\), and the normal coordinates are a way to choose a two-dimensional subset of \(\mathbf{SO}(3)\) that represents the state space \(S^2\) (at least near the chosen origin \(\mathbf{e}_1\)).
The subspace we choose will be labelled \(\mathfrak{m} \subset \mathfrak{so}(3)\), and it must satisfy \(\mathrm{D}_R|_I \phi(R, \mathbf{e}_1)[\mathfrak{m}] = \mathrm{T}_{\mathbf{e}_1}S^2\).
In other words, the differential map \(\mathrm{D}_R|_I \phi(R, \mathbf{e}_1) : \mathfrak{so}(3) \to \mathrm{T}_{\mathbf{e}_1}S^2\) must be full rank when constrained to \(\mathfrak{m}\).
For any \(\Omega^\times \in \mathfrak{so}(3)\), we have</p>

\[\begin{aligned}
\mathrm{D}_R|_I \phi(R, \mathbf{e}_1)[\Omega^\times]
&amp;= \left. \frac{\mathrm{d}}{\mathrm{d}s} \right|_{s=0}  \phi(\exp(s \Omega^\times), \mathbf{e}_1) \\
&amp;= \left. \frac{\mathrm{d}}{\mathrm{d}s} \right|_{s=0} \exp(s \Omega^\times)^\top \mathbf{e}_1 \\
&amp;= -\Omega^\times \mathbf{e}_1.
\end{aligned}\]

<p>This is nonzero as long as \(\Omega\) is not a scalar multiple of \(\mathbf{e}_1\).
Thus, to define \(\mathfrak{m}\) it suffices to choose a basis of two elements \(\Omega_1^\times, \Omega_2^\times \in \mathfrak{so}(3)\) such that \(\Omega_1, \Omega_2, \mathbf{e}_1\) are linearly independent, so that no nonzero element of \(\mathfrak{m}\) is a scalar multiple of \(\mathbf{e}_1^\times\).
A very simple choice is then to select the unit vectors \(\Omega_1 = \mathbf{e}_2, \Omega_2 = \mathbf{e}_3\), so that \(\mathfrak{m}\) is defined by</p>

\[\begin{aligned}
    \mathfrak{m} &amp;= \left\{ v^\wedge \in \mathfrak{so}(3) \; \middle| \; v \in \mathbb{R}^2  \right\}, \\
    (v_1, v_2)^\wedge &amp;:= (0, v_1, v_2)^\times.
\end{aligned}\]

<p>Having made this choice, we define the <strong>normal coordinates</strong> \(\vartheta : \mathcal{U} \subset S^2 \to \mathcal{V} \subset \mathbb{R}^2\) for \(S^2\) about \(\mathbf{e}_1\) by</p>

\[\begin{aligned}
    \vartheta(e) &amp;:= -\mathrm{atan2}(\vert \mathbf{e}_1 \times e \vert, \mathbf{e}_1^\top e) \begin{pmatrix} 0_{2\times 1} &amp; I_2 \end{pmatrix} \frac{\mathbf{e}_1 \times e}{\vert \mathbf{e}_1 \times e \vert}, \\
    \vartheta^{-1}(\varepsilon) &amp;:= \phi(\exp(\varepsilon^\wedge), \mathbf{e}_1).
\end{aligned}\]

<p>While \(\vartheta\) looks complicated, \(\vartheta^{-1}\) is very simple in terms of the Lie group exponential and the group action \(\phi\).
Also, while it can be useful for analysis, the expressions for \(\vartheta\) and \(\vartheta^{-1}\) are not strictly necessary for implementation.
Only the derivatives at \(e=\mathbf{e}_1\) and \(\varepsilon = 0_{2\times 1}\), respectively, are required.
The derivative of \(\vartheta^{-1}\) is computed by letting \(\varepsilon\) be arbitrary and computing the directional derivative</p>

\[\begin{aligned}
    \mathrm{D} \vartheta^{-1}(0)[\varepsilon]
    &amp;= \left. \frac{\mathrm{d}}{\mathrm{d}s} \right|_{s=0}  \vartheta^{-1}(s \varepsilon) \\
    &amp;= \left. \frac{\mathrm{d}}{\mathrm{d}s} \right|_{s=0}  \phi(\exp(s\varepsilon^\wedge), \mathbf{e}_1) \\
    &amp;= \left. \frac{\mathrm{d}}{\mathrm{d}s} \right|_{s=0} \exp(s\varepsilon^\wedge)^\top \mathbf{e}_1 \\
    &amp;= -(0, \varepsilon_1, \varepsilon_2)^\times \mathbf{e}_1 \\
    &amp;= \mathbf{e}_1^\times (0, \varepsilon_1, \varepsilon_2) \\
    &amp;= \begin{pmatrix}
        0 &amp; 0 &amp; 0 \\ 0 &amp; 0 &amp; -1 \\ 0 &amp; 1 &amp; 0
    \end{pmatrix} \begin{pmatrix}
        0 \\ \varepsilon_1 \\ \varepsilon_2
    \end{pmatrix} \\
    &amp;= \begin{pmatrix}
        0 &amp; 0 \\ 0 &amp; -1 \\ 1 &amp; 0
    \end{pmatrix} \begin{pmatrix}
        \varepsilon_1 \\ \varepsilon_2
    \end{pmatrix}.
\end{aligned}\]

<p>In other words, we can express the differential \(\mathrm{D} \vartheta^{-1}(0)\) as the \(3\times 2\) matrix shown.
The differential \(\mathrm{D} \vartheta(\mathbf{e}_1)\) can be obtained as the left-inverse of \(\mathrm{D} \vartheta^{-1}(0)\), i.e.</p>

\[\begin{aligned}
    \mathrm{D} \vartheta(\mathbf{e}_1) &amp;= \begin{pmatrix}
        0 &amp; 0 &amp; 1 \\ 0 &amp; -1 &amp; 0
    \end{pmatrix},
\end{aligned}\]

<p>which satisfies \(\mathrm{D} \vartheta(\mathbf{e}_1) \cdot \mathrm{D} \vartheta^{-1}(0) = I_2\).</p>
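<p>The normal coordinates and their differentials can be checked numerically. The following is a minimal Python sketch, assuming NumPy is available. The function names (<code>normal_coords</code>, <code>normal_coords_inv</code>, <code>jacobian_at_zero</code>), the step size, and the test point are illustrative assumptions, and the chart is only evaluated away from \(-\mathbf{e}_1\).</p>

```python
import numpy as np

E1 = np.array([1.0, 0.0, 0.0])  # origin chosen in the post


def skew(w):
    """Return the skew-symmetric matrix w^x."""
    return np.array([
        [0.0, -w[2], w[1]],
        [w[2], 0.0, -w[0]],
        [-w[1], w[0], 0.0],
    ])


def exp_so3(w):
    """Matrix exponential of skew(w), computed with Rodrigues' formula."""
    angle = np.linalg.norm(w)
    if np.isclose(angle, 0.0):
        return np.eye(3) + skew(w)  # first-order approximation for tiny angles
    K = skew(w / angle)
    return np.eye(3) + np.sin(angle) * K + (1.0 - np.cos(angle)) * (K @ K)


def normal_coords_inv(eps):
    """vartheta^{-1}(eps) = phi(exp(eps^wedge), e1) = exp(eps^wedge)^T e1."""
    return exp_so3(np.array([0.0, eps[0], eps[1]])).T @ E1


def normal_coords(e):
    """vartheta(e) from the formula above; assumes e is not the antipode -e1."""
    cross = np.cross(E1, e)
    norm = np.linalg.norm(cross)
    if np.isclose(norm, 0.0):
        return np.zeros(2)  # e equals e1, whose coordinates are zero
    return -np.arctan2(norm, E1 @ e) * cross[1:] / norm


def jacobian_at_zero(step=1e-6):
    """Central finite-difference Jacobian of vartheta^{-1} at eps = 0 (a 3x2 matrix)."""
    columns = []
    for k in range(2):
        d = np.zeros(2)
        d[k] = step
        columns.append((normal_coords_inv(d) - normal_coords_inv(-d)) / (2.0 * step))
    return np.column_stack(columns)


D_INV = np.array([[0.0, 0.0], [0.0, -1.0], [1.0, 0.0]])  # D vartheta^{-1}(0)
D_FWD = np.array([[0.0, 0.0, 1.0], [0.0, -1.0, 0.0]])  # D vartheta(e1)

eps = np.array([0.3, -0.4])  # illustrative coordinates
print(np.allclose(normal_coords(normal_coords_inv(eps)), eps))  # round trip
print(np.allclose(jacobian_at_zero(), D_INV, atol=1e-8))  # matches the 3x2 matrix
print(np.allclose(D_FWD @ D_INV, np.eye(2)))  # left-inverse property
```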

<h3 id="eqf-matrix-computation">EqF Matrix Computation</h3>

<p>The EqF matrices \(\mathring{A}_t, B_t, C_t, C^\star_t\) can now be computed by specialising their formulas.</p>

<h4 id="a-matrix">A Matrix</h4>

<p>Beginning with \(\mathring{A}_t\), we use the formula (51) in the paper <a href="https://arxiv.org/abs/2010.14666">[EqF]</a>:</p>

\[\begin{aligned}
    \mathring{A}_t
    = \mathrm{D}_e|_{\mathbf{e}_1} \vartheta(e)
    \cdot \mathrm{D}_\eta|_{\hat{\eta}} \phi(\hat{R}^{-1}, \eta)
    \cdot \mathrm{D}_E|_{I} \phi(E, \hat{\eta})
    \cdot \mathrm{D}_\eta|_{\hat{\eta}} \Lambda(\eta, \Omega)
    \cdot \mathrm{D}_e|_{\mathbf{e}_1} \phi(\hat{R}, e)
    \cdot \mathrm{D}_\varepsilon|_0 \vartheta^{-1}(\varepsilon).
\end{aligned}\]

<p>This formula can be intimidating, but in our case it simplifies greatly due to the fact that the lift \(\Lambda\) is independent of \(\eta\).
This means that the fourth term in the formula, \(\mathrm{D}_\eta|_{\hat{\eta}} \Lambda(\eta, \Omega)\), is zero and therefore the whole matrix \(\mathring{A}_t\) is zero as well.</p>

<h4 id="b-matrix">B Matrix</h4>

<p>The \(B_t\) matrix is used in tuning the filter gains (i.e. the covariance of the process noise) and is given by formula (42) in <a href="https://arxiv.org/abs/2010.14666">[EqF]</a>:</p>

\[\begin{aligned}
    B_t
    = \mathrm{D}_e|_{\mathbf{e}_1} \vartheta(e)
    \cdot \mathrm{D}_E|_{I} \phi(E, \mathbf{e}_1)
    \cdot \mathrm{Ad}_{\hat{R}}
    \cdot \mathrm{D}_\Omega|_{\Omega_m} \Lambda(\hat{\eta}, \Omega).
\end{aligned}\]

<p>There are many ways to perform this computation.
I will demonstrate here the method using an arbitrary input vector \(\mu\).
Let \(\mu \in \mathbb{R}^3\) and compute:</p>

\[\begin{aligned}
    B_t \mu
    = \mathrm{D}_e|_{\mathbf{e}_1} \vartheta(e)
    \cdot \mathrm{D}_E|_{I} \phi(E, \mathbf{e}_1)
    \cdot \mathrm{Ad}_{\hat{R}}
    \cdot \mathrm{D}_\Omega|_{\Omega_m} \Lambda(\hat{\eta}, \Omega)[\mu].
\end{aligned}\]

<p>To keep this computation from becoming extremely long, we can work in stages starting from the right-most term.
In other words, we start by working out \(\mathrm{D}_\Omega|_{\Omega_m} \Lambda(\hat{\eta}, \Omega)[\mu]\) and then carry on to the next term until we reach the end:</p>

\[\begin{aligned}
    \mathrm{D}_\Omega|_{\Omega_m} \Lambda(\hat{\eta}, \Omega)[\mu]
    &amp;= \left. \frac{\mathrm{d}}{\mathrm{d}s} \right|_{s=0} \Lambda(\hat{\eta}, \Omega_m + s \mu) \\
    &amp;= \left. \frac{\mathrm{d}}{\mathrm{d}s} \right|_{s=0} (\Omega_m + s \mu)^\times \\
    &amp;= \left. \frac{\mathrm{d}}{\mathrm{d}s} \right|_{s=0} \left( \Omega_m^\times + s \mu^\times \right) \\
    &amp;= \mu^\times.
\end{aligned}\]

<p>Now we compute the formula up to the second-to-last term:</p>

\[\begin{aligned}
\mathrm{Ad}_{\hat{R}}
    \cdot \mathrm{D}_\Omega|_{\Omega_m} \Lambda(\hat{\eta}, \Omega)[\mu]
    &amp;= \mathrm{Ad}_{\hat{R}}[\mu^\times] \\
    &amp;= \hat{R} \mu^\times \hat{R}^\top \\
    &amp;= (\hat{R} \mu)^\times.
\end{aligned}\]

<p>We carry on like this for the next term:</p>

\[\begin{aligned}
    \mathrm{D}_E|_{I} \phi(E, \mathbf{e}_1)
    \cdot \mathrm{Ad}_{\hat{R}}
    \cdot \mathrm{D}_\Omega|_{\Omega_m} \Lambda(\hat{\eta}, \Omega)[\mu]
    &amp;= \mathrm{D}_E|_{I} \phi(E, \mathbf{e}_1)[(\hat{R} \mu)^\times] \\
    &amp;= \left. \frac{\mathrm{d}}{\mathrm{d}s} \right|_{s=0} \phi(\exp(s(\hat{R} \mu)^\times), \mathbf{e}_1) \\
    &amp;= \left. \frac{\mathrm{d}}{\mathrm{d}s} \right|_{s=0} \exp(s(\hat{R} \mu)^\times)^\top \mathbf{e}_1 \\
    &amp;= -(\hat{R} \mu)^\times \mathbf{e}_1 \\
    &amp;= \mathbf{e}_1^\times \hat{R} \mu.
\end{aligned}\]

<p>Note that I could have stopped at the second-to-last line, but I chose to rewrite the expression with \(\mu\) as the right-most factor. This extra step is helpful in simplifying everything at the end.
With these computations, we arrive at the final step:</p>

\[\begin{aligned}
    B_t \mu
    &amp;= \mathrm{D}_e|_{\mathbf{e}_1} \vartheta(e)
    \cdot \mathrm{D}_E|_{I} \phi(E, \mathbf{e}_1)
    \cdot \mathrm{Ad}_{\hat{R}}
    \cdot \mathrm{D}_\Omega|_{\Omega_m} \Lambda(\hat{\eta}, \Omega)[\mu] \\
    &amp;= \mathrm{D}_e|_{\mathbf{e}_1} \vartheta(e)[\mathbf{e}_1^\times \hat{R} \mu] \\
    &amp;= \begin{pmatrix} 0 &amp; 0 &amp; 1 \\ 0 &amp; -1 &amp; 0 \end{pmatrix}
    \mathbf{e}_1^\times \hat{R} \mu \\
    &amp;= \begin{pmatrix} 0 &amp; 0 &amp; 1 \\ 0 &amp; -1 &amp; 0 \end{pmatrix}
    \begin{pmatrix} 0 &amp; 0 &amp; 0 \\ 0 &amp; 0 &amp; -1 \\ 0 &amp; 1 &amp; 0 \end{pmatrix}
    \hat{R} \mu \\
    &amp;= \begin{pmatrix} 0 &amp; 1 &amp; 0 \\ 0 &amp; 0 &amp; 1 \end{pmatrix}
    \hat{R} \mu \\
    &amp;= \begin{pmatrix} 0_{2\times 1} &amp; I_2 \end{pmatrix}
    \hat{R} \mu.
\end{aligned}\]

<p>Since this all applies for an arbitrary direction \(\mu\), we obtain the resulting \(B_t\) matrix, which is</p>

\[\begin{aligned}
    B_t &amp;= \begin{pmatrix} 0_{2\times 1} &amp; I_2 \end{pmatrix}\hat{R}.
\end{aligned}\]
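<p>The staged computation of \(B_t\) can be mirrored in code. The following is a minimal Python sketch, assuming NumPy is available. The function names and the example values of \(\hat{R}\) and \(\mu\) are illustrative assumptions. It applies the four factors from right to left and compares the result with the closed form.</p>

```python
import numpy as np

E1 = np.array([1.0, 0.0, 0.0])
D_FWD = np.array([[0.0, 0.0, 1.0], [0.0, -1.0, 0.0]])  # D vartheta(e1)


def skew(w):
    """Return the skew-symmetric matrix w^x."""
    return np.array([
        [0.0, -w[2], w[1]],
        [w[2], 0.0, -w[0]],
        [-w[1], w[0], 0.0],
    ])


def exp_so3(w):
    """Matrix exponential of skew(w), computed with Rodrigues' formula."""
    angle = np.linalg.norm(w)
    if np.isclose(angle, 0.0):
        return np.eye(3) + skew(w)  # first-order approximation for tiny angles
    K = skew(w / angle)
    return np.eye(3) + np.sin(angle) * K + (1.0 - np.cos(angle)) * (K @ K)


def B_matrix(R_hat):
    """Closed form B_t = [0_{2x1}  I_2] R_hat, i.e. the last two rows of R_hat."""
    return R_hat[1:, :]


def B_times_mu_stepwise(R_hat, mu):
    """Apply the four factors of B_t to mu from right to left, as in the derivation."""
    lifted = mu  # D_Omega Lambda [mu] = mu^x, stored as the vector mu
    adjoint = R_hat @ lifted  # Ad_{R_hat} mu^x = (R_hat mu)^x
    tangent = -skew(adjoint) @ E1  # D_E phi(E, e1) applied to (R_hat mu)^x
    return D_FWD @ tangent  # D vartheta(e1)


# Illustrative observer state and input direction.
R_hat = exp_so3(np.array([0.3, -0.1, 0.2]))
mu = np.array([0.5, -1.0, 2.0])

print(np.allclose(B_times_mu_stepwise(R_hat, mu), B_matrix(R_hat) @ mu))
```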

<h4 id="c-matrix">C Matrix</h4>

<p>The \(C_t\) matrix can be obtained in a similar fashion to the \(B_t\) matrix.
We first recall formula (33) from <a href="https://arxiv.org/abs/2010.14666">[EqF]</a>, specialised to the notation we are using for this example:</p>

\[\begin{aligned}
    C_t
    := \mathrm{D}_{\eta} |_{\hat{\eta}} h(\eta)
    \cdot \mathrm{D}_e|_{\mathbf{e}_1} \phi(\hat{R}, e)
    \cdot \mathrm{D}_\varepsilon|_0 \vartheta^{-1}(\varepsilon).
\end{aligned}\]

<p>Then, we can compute \(C_t \varepsilon\) for an arbitrary \(\varepsilon \in \mathbb{R}^2\), although this time we will do it without breaking it up into parts.</p>

\[\begin{aligned}
    C_t \varepsilon
    &amp;= \mathrm{D}_{\eta} |_{\hat{\eta}} h(\eta)
    \cdot \mathrm{D}_e|_{\mathbf{e}_1} \phi(\hat{R}, e)
    \cdot \mathrm{D} \vartheta^{-1}(0)[\varepsilon] \\
    &amp;= \mathrm{D}_{\eta} |_{\hat{\eta}} h(\eta)
    \cdot \mathrm{D}_e|_{\mathbf{e}_1} \phi(\hat{R}, e)
    \left[
        \begin{pmatrix} 0 &amp; 0 \\ 0 &amp; -1 \\ 1 &amp; 0 \end{pmatrix} \varepsilon
    \right] \\
    &amp;= \mathrm{D}_{\eta} |_{\hat{\eta}} h(\eta)\left[
    \left. \frac{\mathrm{d}}{\mathrm{d}s} \right|_{s=0} \phi\left( \hat{R}, \mathbf{e}_1 + s \begin{pmatrix} 0 &amp; 0 \\ 0 &amp; -1 \\ 1 &amp; 0 \end{pmatrix} \varepsilon \right)
    \right]\\
    &amp;= \mathrm{D}_{\eta} |_{\hat{\eta}} h(\eta)\left[
    \left. \frac{\mathrm{d}}{\mathrm{d}s} \right|_{s=0} \hat{R}^\top \left( \mathbf{e}_1 + s \begin{pmatrix} 0 &amp; 0 \\ 0 &amp; -1 \\ 1 &amp; 0 \end{pmatrix} \varepsilon \right)
    \right] \\
    &amp;= \mathrm{D}_{\eta} |_{\hat{\eta}} h(\eta)\left[
    \hat{R}^\top \begin{pmatrix} 0 &amp; 0 \\ 0 &amp; -1 \\ 1 &amp; 0 \end{pmatrix} \varepsilon
    \right]\\
    &amp;= \left. \frac{\mathrm{d}}{\mathrm{d}s} \right|_{s=0}  
    h\left(\hat{\eta} + s\hat{R}^\top \begin{pmatrix} 0 &amp; 0 \\ 0 &amp; -1 \\ 1 &amp; 0 \end{pmatrix} \varepsilon \right) \\
    &amp;= \left. \frac{\mathrm{d}}{\mathrm{d}s} \right|_{s=0}  
    c_m\left(\hat{\eta} + s\hat{R}^\top \begin{pmatrix} 0 &amp; 0 \\ 0 &amp; -1 \\ 1 &amp; 0 \end{pmatrix} \varepsilon \right) \\
    &amp;=
    c_m \hat{R}^\top \begin{pmatrix} 0 &amp; 0 \\ 0 &amp; -1 \\ 1 &amp; 0 \end{pmatrix} \varepsilon.
\end{aligned}\]

<p>Alternative expressions can be found by further manipulations. We have that</p>

\[\begin{aligned}
    C_t
    &amp;= c_m \hat{R}^\top \begin{pmatrix} 0 &amp; 0 \\ 0 &amp; -1 \\ 1 &amp; 0 \end{pmatrix} \\
    &amp;= c_m \hat{R}^\top \mathbf{e}_1^\times \begin{pmatrix} 0_{1\times 2} \\ I_2 \end{pmatrix} \\
    &amp;= c_m \hat{R}^\top \mathbf{e}_1^\times \hat{R} \hat{R}^\top \begin{pmatrix} 0_{1\times 2} \\ I_2 \end{pmatrix} \\
    &amp;= c_m (\hat{R}^\top \mathbf{e}_1)^\times \hat{R}^\top \begin{pmatrix} 0_{1\times 2} \\ I_2 \end{pmatrix} \\
    &amp;= c_m \hat{\eta}^\times \hat{R}^\top \begin{pmatrix} 0_{1\times 2} \\ I_2 \end{pmatrix} \\
    &amp;= \hat{y}^\times \hat{R}^\top \begin{pmatrix} 0_{1\times 2} \\ I_2 \end{pmatrix}.
\end{aligned}\]

<p>Each of these expressions is equally valid, although the last one is most closely related to the \(C_t^\star\) matrix we will derive next.</p>

<h4 id="c-matrix-1">C* Matrix</h4>

<p>When a system exhibits output equivariance, we can replace the \(C_t\) matrix with the \(C_t^\star\) matrix to obtain improved performance.
The formula is given by (35) in <a href="https://arxiv.org/abs/2010.14666">[EqF]</a>.
For any \(\varepsilon \in \mathbb{R}^2\),</p>

\[\begin{aligned}
    C_t^\star \varepsilon
    := \frac{1}{2} \left(
        \mathrm{D}_{E} |_I \rho(E, y)
        + \mathrm{D}_{E} |_I \rho(E, \hat{y})
    \right)
    \cdot \mathrm{Ad}_{\hat{R}^\top} \varepsilon^\wedge.
\end{aligned}\]

<p>Note that this formula relies on the output action \(\rho\), and also involves the measurement \(y\) taken from the real system.
This ‘additional information’ is what leads to the performance advantage from using \(C_t^\star\) over \(C_t\).</p>

<p>To compute \(C_t^\star\), we follow similar steps as when computing the \(B_t\) matrix.
First,</p>

\[\begin{aligned}
    \mathrm{Ad}_{\hat{R}^\top} \varepsilon^\wedge
    &amp;= \mathrm{Ad}_{\hat{R}^\top} \left(
        \begin{pmatrix} 0_{1\times 2} \\ I_2 \end{pmatrix} \varepsilon
    \right)^\times \\
    &amp;= \left(
        \hat{R}^\top \begin{pmatrix} 0_{1\times 2} \\ I_2 \end{pmatrix} \varepsilon
    \right)^\times.
\end{aligned}\]

<p>Now consider an arbitrary \(\bar{y} \in \mathbb{R}^3\) and \(\Delta^\times \in \mathfrak{so}(3)\) and compute</p>

\[\begin{aligned}
    \mathrm{D}_{E} |_I \rho(E, \bar{y})[\Delta^\times]
    &amp;= \left. \frac{\mathrm{d}}{\mathrm{d}s} \right|_{s=0} \rho(\exp(s \Delta^\times), \bar{y}) \\
    &amp;= \left. \frac{\mathrm{d}}{\mathrm{d}s} \right|_{s=0} \exp(s \Delta^\times)^\top \bar{y} \\
    &amp;= -\Delta^\times \bar{y} \\
    &amp;= \bar{y}^\times \Delta.
\end{aligned}\]

<p>Then it follows that</p>

\[\begin{aligned}
    C_t^\star \varepsilon
    &amp;= \frac{1}{2} \left(
        \mathrm{D}_{E} |_I \rho(E, y)
        + \mathrm{D}_{E} |_I \rho(E, \hat{y})
    \right)
    \cdot \mathrm{Ad}_{\hat{R}^\top} \varepsilon^\wedge \\
    &amp;= \frac{1}{2} \left(
        \mathrm{D}_{E} |_I \rho(E, y)
        + \mathrm{D}_{E} |_I \rho(E, \hat{y})
    \right) \left[ \left(
        \hat{R}^\top \begin{pmatrix} 0_{1\times 2} \\ I_2 \end{pmatrix} \varepsilon
    \right)^\times \right] \\
    &amp;= \frac{1}{2} \left(
        \mathrm{D}_{E} |_I \rho(E, y) \left[ \left(
        \hat{R}^\top \begin{pmatrix} 0_{1\times 2} \\ I_2 \end{pmatrix} \varepsilon
    \right)^\times \right]
        + \mathrm{D}_{E} |_I \rho(E, \hat{y}) \left[ \left(
        \hat{R}^\top \begin{pmatrix} 0_{1\times 2} \\ I_2 \end{pmatrix} \varepsilon
    \right)^\times \right]
    \right) \\
    &amp;= \frac{1}{2} \left(
        y^\times
        \hat{R}^\top \begin{pmatrix} 0_{1\times 2} \\ I_2 \end{pmatrix} \varepsilon
        + \hat{y}^\times
        \hat{R}^\top \begin{pmatrix} 0_{1\times 2} \\ I_2 \end{pmatrix} \varepsilon
    \right) \\
    &amp;= \frac{1}{2} \left(
        y^\times + \hat{y}^\times \right)
        \hat{R}^\top \begin{pmatrix} 0_{1\times 2} \\ I_2 \end{pmatrix} \varepsilon.
\end{aligned}\]

<p>This gives us the formula for \(C^\star_t\) as</p>

\[\begin{aligned}
    C_t^\star
    &amp;= \frac{1}{2} \left(
        y^\times + \hat{y}^\times \right)
        \hat{R}^\top \begin{pmatrix} 0_{1\times 2} \\ I_2 \end{pmatrix}.
\end{aligned}\]
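<p>For implementation, both output matrices reduce to a few lines. The following is a minimal Python sketch, assuming NumPy is available. The function names, the value of <code>c_m</code>, and the simulated measurement <code>y</code> are illustrative assumptions. It checks that the first and last expressions for \(C_t\) agree, and that \(C_t^\star\) reduces to \(C_t\) when \(y = \hat{y}\).</p>

```python
import numpy as np

E1 = np.array([1.0, 0.0, 0.0])
S = np.vstack([np.zeros((1, 2)), np.eye(2)])  # the 3x2 matrix [0_{1x2}; I_2]


def skew(w):
    """Return the skew-symmetric matrix w^x."""
    return np.array([
        [0.0, -w[2], w[1]],
        [w[2], 0.0, -w[0]],
        [-w[1], w[0], 0.0],
    ])


def exp_so3(w):
    """Matrix exponential of skew(w), computed with Rodrigues' formula."""
    angle = np.linalg.norm(w)
    if np.isclose(angle, 0.0):
        return np.eye(3) + skew(w)  # first-order approximation for tiny angles
    K = skew(w / angle)
    return np.eye(3) + np.sin(angle) * K + (1.0 - np.cos(angle)) * (K @ K)


def C_matrix(R_hat, y_hat):
    """C_t = y_hat^x R_hat^T [0; I_2]."""
    return skew(y_hat) @ R_hat.T @ S


def C_star_matrix(R_hat, y, y_hat):
    """C_t^star = 0.5 * (y^x + y_hat^x) R_hat^T [0; I_2]."""
    return 0.5 * (skew(y) + skew(y_hat)) @ R_hat.T @ S


# Illustrative values: field strength, observer state, and a nearby measurement.
c_m = 0.5
R_hat = exp_so3(np.array([0.3, -0.1, 0.2]))
eta_hat = R_hat.T @ E1  # estimated bearing phi(R_hat, e1)
y_hat = c_m * eta_hat  # estimated output h(eta_hat)
y = c_m * exp_so3(np.array([0.05, -0.02, 0.03])).T @ eta_hat  # hypothetical measurement

C_first_form = c_m * R_hat.T @ np.array([[0.0, 0.0], [0.0, -1.0], [1.0, 0.0]])
print(np.allclose(C_matrix(R_hat, y_hat), C_first_form))  # first and last forms of C_t agree
print(np.allclose(C_star_matrix(R_hat, y_hat, y_hat), C_matrix(R_hat, y_hat)))  # y equal to y_hat
print(C_star_matrix(R_hat, y, y_hat).shape)
```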

<h2 id="summary">Summary</h2>

<p>This post should be read as a supplement to the original Equivariant Filter (EqF) paper <a href="https://arxiv.org/abs/2010.14666">[EqF]</a>.
I have tried to provide some context so that things are relatively self-contained, but to understand the motivation behind everything I derived here, I strongly suggest reading through the paper.
If there is anything unclear from my computations, however, or if you suspect a mistake, please reach out and let me know!
I know that it can be overwhelming to start using the mathematics needed for dealing with Lie groups and manifolds, so I hope this post provides some help for understanding how to use it for computations.</p>]]></content><author><name></name></author><category term="Papers" /><summary type="html"><![CDATA[]]></summary></entry><entry><title type="html">New Paper: Synchronous observer design for Inertial Navigation Systems with almost-global convergence</title><link href="/papers/2025/04/25/ins_automatica_paper.html" rel="alternate" type="text/html" title="New Paper: Synchronous observer design for Inertial Navigation Systems with almost-global convergence" /><published>2025-04-25T00:00:00+00:00</published><updated>2025-04-25T00:00:00+00:00</updated><id>/papers/2025/04/25/ins_automatica_paper</id><content type="html" xml:base="/papers/2025/04/25/ins_automatica_paper.html"><![CDATA[<!-- https://talk.jekyllrb.com/t/jekyll-and-mathjax/5514 -->
<script type="text/x-mathjax-config">
    MathJax.Hub.Config({
        jax: ["input/TeX","input/MathML","output/SVG", "output/CommonHTML"],
    extensions: ["tex2jax.js","mml2jax.js","MathMenu.js","MathZoom.js", "CHTML-preview.js"],
    TeX: {
      extensions: ["AMSmath.js","AMSsymbols.js","noErrors.js","noUndefined.js"]
    },
      tex2jax: {
          inlineMath: [ ['$','$'], ["\\(","\\)"] ],
          displayMath: [ ['$$','$$'], ["\\[","\\]"] ],
          processEscapes: true,
          processEnvironments: true
        },
        "HTML-CSS": { availableFonts: ["TeX"] }
      });
</script>

<script id="MathJax-script" async="" src="https://cdn.jsdelivr.net/npm/mathjax@3/es5/tex-chtml.js"></script>

<p>I am very happy to announce that our new work on observer design for inertial navigation with GNSS and magnetometer measurements has been published in Automatica!
An Inertial Navigation System (INS) is used to estimate the attitude, velocity, and position of a vehicle by combining measurements from an Inertial Measurement Unit (IMU) with other supporting sensors, such as GNSS, magnetometer, etc.
The main idea behind this paper was to use the fact that the dynamics of INS are “group-affine” to create a completely novel observer architecture, which could then be leveraged to design correction terms that guarantee almost-global asymptotic stability of the observer error.
In other words, from practically any initial condition, the estimate provided by the observer we designed in this paper is guaranteed to converge to the true state of the system.
This contrasts with existing methods, which either have a limited domain of convergence, or require high gains to increase the domain of convergence, leading to high sensitivity to noise.</p>

<p>I would like to thank my coauthors Tarek Hamel and Robert Mahony for their contributions to this work.
It really came about from a collaborative process of sharing ideas and identifying and overcoming difficulties together.
Finally, if you are interested in this work or have questions, please don’t hesitate to reach out!</p>

<p><strong>Paper</strong>:
<a href="https://www.sciencedirect.com/science/article/pii/S0947358024001079">https://www.sciencedirect.com/science/article/pii/S0947358024001079</a></p>

<p><strong>Preprint</strong>:
<a href="https://arxiv.org/abs/2311.02234">https://arxiv.org/abs/2311.02234</a></p>

<p><strong>Code</strong>:
<a href="https://github.com/pvangoor/synchronous_INS">https://github.com/pvangoor/synchronous_INS</a></p>

<p><strong>Abstract</strong>:
Inertial Navigation Systems (INS) estimate a vehicle’s navigation states (attitude, velocity, and position) by combining measurements from an Inertial Measurement Unit (IMU) with other supporting sensors, typically including a Global Navigation Satellite System (GNSS) and a magnetometer. Recent nonlinear observer designs for INS provide powerful stability guarantees but do not account for some of the real-world challenges of INS. One of the key challenges is to account for the time-delay characteristic of GNSS measurements. This paper addresses this question by extending recent work on synchronous observer design for INS. The delayed GNSS measurements are related to the state at the current time using recursively-computable delay matrices, and this is used to design correction terms that leads to almost-globally asymptotic and locally exponential stability of the error. Simulation results verify the proposed observer and show that the compensation of time-delay is key to both transient and steady-state performance.</p>]]></content><author><name></name></author><category term="Papers" /><summary type="html"><![CDATA[]]></summary></entry><entry><title type="html">The Special Orthogonal Group SO(3)</title><link href="/mathematics/2025/02/15/special_orthogonal_so3.html" rel="alternate" type="text/html" title="The Special Orthogonal Group SO(3)" /><published>2025-02-15T00:00:00+00:00</published><updated>2025-02-15T00:00:00+00:00</updated><id>/mathematics/2025/02/15/special_orthogonal_so3</id><content type="html" xml:base="/mathematics/2025/02/15/special_orthogonal_so3.html"><![CDATA[<!-- https://talk.jekyllrb.com/t/jekyll-and-mathjax/5514 -->
<script type="text/x-mathjax-config">
    MathJax.Hub.Config({
        jax: ["input/TeX","input/MathML","output/SVG", "output/CommonHTML"],
    extensions: ["tex2jax.js","mml2jax.js","MathMenu.js","MathZoom.js", "CHTML-preview.js"],
    TeX: {
      extensions: ["AMSmath.js","AMSsymbols.js","noErrors.js","noUndefined.js"]
    },
      tex2jax: {
          inlineMath: [ ['$','$'], ["\\(","\\)"] ],
          displayMath: [ ['$$','$$'], ["\\[","\\]"] ],
          processEscapes: true,
          processEnvironments: true
        },
        "HTML-CSS": { availableFonts: ["TeX"] }
      });
</script>

<script id="MathJax-script" async="" src="https://cdn.jsdelivr.net/npm/mathjax@3/es5/tex-chtml.js"></script>

<h3 id="introduction">Introduction</h3>

<p>The special orthogonal group \(\mathbf{SO}(3)\) is one of the most important Lie groups encountered in robotics.
It is the most natural way to represent rotations and orientations of rigid bodies in 3D.
Because of how important 3D rotations are, there are many ways to understand and approach them.
I already wrote about one way in my previous post on unit quaternions, but in this post I will focus just on the space of rotation matrices itself, and its Lie algebra.
My main goal in writing this article is to provide derivations of some of the most important formulas when dealing with rotation matrices.</p>

<p>The special orthogonal group is defined by</p>

\[\begin{aligned}
    \mathbf{SO}(3) = \left\{
        R \in \mathbb{R}^{3\times 3}
        \; \middle| \;
        R^\top R = I_3, \; \det(R) = 1
    \right\}.
\end{aligned}\]

<p>The first condition \(R^\top R = I_3\) is equivalent to saying that each column of \(R\) must have length \(1\) and be orthogonal (perpendicular) to every other column.
The second condition, \(\det(R)=1\), concerns the order of the columns of \(R\). It enforces a ‘right-handed’ orientation of each rotation matrix, and it is what makes them <em>special</em> orthogonal rather than simply orthogonal matrices.</p>

<p>We can verify the group properties as follows.
The identity matrix clearly satisfies \(I^\top I=I\) and \(\det(I)=1\), so \(I \in \mathbf{SO}(3)\).
As for matrix inversion, note that the first condition \(R^\top R = I_3\) means that \(R^{-1} = R^\top\).
Therefore, for any \(R \in \mathbf{SO}(3)\), we have that \((R^\top)^\top (R^\top) = R R^{-1} = I_3\), and that \(\det(R^\top) = \det(R) = 1\).
So the group is indeed closed under inversion.
Finally, to see that the group is closed under matrix products, if \(R_1, R_2 \in \mathbf{SO}(3)\), then</p>

\[\begin{aligned}
    (R_1 R_2)^\top (R_1 R_2)
    &amp;= R_2^\top R_1^\top R_1 R_2
    = R_2^\top  R_2
    = I_3, \\
    \det(R_1 R_2) &amp;= \det(R_1) \det(R_2) = 1.
\end{aligned}\]

<p>Thus \(\mathbf{SO}(3)\) is indeed a group.
What this means intuitively is that composing two or more rotations always leads to another rotation, and that any rotation can be undone by its inverse rotation.</p>
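<p>These group properties are easy to check numerically. The following is a minimal Python sketch, assuming NumPy is available. The helper names (<code>rot_x</code>, <code>rot_z</code>, <code>in_so3</code>) and the angles are illustrative choices. The last line uses a reflection to show a matrix that is orthogonal but fails the determinant condition.</p>

```python
import numpy as np


def rot_x(angle):
    """Rotation by the given angle (radians) about the first axis."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]])


def rot_z(angle):
    """Rotation by the given angle (radians) about the third axis."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])


def in_so3(R):
    """Check R^T R = I_3 and det(R) = 1 up to floating-point tolerance."""
    return bool(np.allclose(R.T @ R, np.eye(3)) and np.isclose(np.linalg.det(R), 1.0))


R1, R2 = rot_x(0.4), rot_z(-1.1)  # illustrative rotations

print(in_so3(np.eye(3)))  # identity
print(in_so3(R1.T))  # inverse of a rotation
print(in_so3(R1 @ R2))  # product of two rotations
print(in_so3(np.diag([1.0, 1.0, -1.0])))  # a reflection: orthogonal, but det is -1
```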

<h3 id="lie-algebra">Lie algebra</h3>

<p>The simplest way of obtaining the Lie algebra \(\mathfrak{so}(3)\) is to differentiate the condition \(R^\top R = I_3\) around \(R \approx I + \Omega\).
Doing so leads to</p>

\[\begin{aligned}
    0_{3\times 3} = R^\top R - I_3
    \approx (I + \Omega)^\top (I + \Omega) - I_3
    \approx \Omega^\top + \Omega,
\end{aligned}\]

<p>where all second-order terms have been removed.
Differentiating the condition \(\det(R) = 1\) leads only to \(\mathrm{tr}(\Omega) = 0\), which is already guaranteed by the fact that \(\Omega + \Omega^\top = 0\).
Hence, the Lie algebra of \(\mathbf{SO}(3)\) is</p>

\[\begin{aligned}
    \mathfrak{so}(3)
    =  \left\{
        \Omega \in \mathbb{R}^{3\times 3}
        \; \middle| \;
        \Omega^\top + \Omega = 0_{3\times 3}
    \right\}.
\end{aligned}\]
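<p>To spell out the determinant step mentioned above, one way to see it is through the standard first-order expansion of the determinant around the identity. Here \(s\) is a small real parameter that I introduce only for this sketch:</p>

\[\begin{aligned}
    \det(I + s\Omega)
    &amp;= 1 + s \, \mathrm{tr}(\Omega) + O(s^2), \\
    \left. \frac{\mathrm{d}}{\mathrm{d}s} \right|_{s=0} \det(I + s\Omega)
    &amp;= \mathrm{tr}(\Omega)
    = \tfrac{1}{2} \mathrm{tr}(\Omega + \Omega^\top)
    = 0.
\end{aligned}\]

<p>The middle equality uses \(\mathrm{tr}(\Omega^\top) = \mathrm{tr}(\Omega)\), so the determinant condition adds no constraint beyond skew-symmetry.</p>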

<p>Next, we will define a basis for this Lie algebra.
The condition \(\Omega^\top + \Omega = 0_{3\times 3}\) is exactly saying that \(\Omega\) is a ‘skew-symmetric’ matrix.
Let us study this condition in terms of the coefficients of \(\Omega\).
We have that</p>

\[\begin{aligned}
\Omega + \Omega^\top
&amp;= \begin{pmatrix}
        \Omega_{11} &amp; \Omega_{12} &amp; \Omega_{13} \\
        \Omega_{21} &amp; \Omega_{22} &amp; \Omega_{23} \\
        \Omega_{31} &amp; \Omega_{32} &amp; \Omega_{33}
    \end{pmatrix}
    +  \begin{pmatrix}
        \Omega_{11} &amp; \Omega_{21} &amp; \Omega_{31} \\
        \Omega_{12} &amp; \Omega_{22} &amp; \Omega_{32} \\
        \Omega_{13} &amp; \Omega_{23} &amp; \Omega_{33}
    \end{pmatrix} \\
&amp;= \begin{pmatrix}
        \Omega_{11} + \Omega_{11} &amp; \Omega_{12} + \Omega_{21} &amp; \Omega_{13} + \Omega_{31} \\
        \Omega_{21} + \Omega_{12} &amp; \Omega_{22} + \Omega_{22} &amp; \Omega_{23} + \Omega_{32} \\
        \Omega_{31} + \Omega_{13} &amp; \Omega_{32} + \Omega_{23} &amp; \Omega_{33} + \Omega_{33}
    \end{pmatrix}
    = 0_{3\times 3}.
\end{aligned}\]

<p>There are nine equations here, which can be reduced to</p>

\[\begin{aligned}
    \Omega_{11} &amp;= \Omega_{22} = \Omega_{33} = 0, \\
    \Omega_{12} &amp;= -\Omega_{21}, \\
    \Omega_{13} &amp;= -\Omega_{31}, \\
    \Omega_{23} &amp;= -\Omega_{32}.
\end{aligned}\]

<p>In other words, there are three degrees of freedom, or equivalently, the Lie algebra is a 3-dimensional vector space.
We will choose the following basis for the Lie algebra:</p>

\[\begin{aligned}
    E_1 &amp;= \begin{pmatrix}
    0 &amp; 0 &amp; 0 \\
    0 &amp; 0 &amp; -1 \\
    0 &amp; 1 &amp; 0
    \end{pmatrix}, &amp;
    E_2 &amp;= \begin{pmatrix}
    0 &amp; 0 &amp; 1 \\
    0 &amp; 0 &amp; 0 \\
    -1 &amp; 0 &amp; 0
    \end{pmatrix}, &amp;
    E_3 &amp;= \begin{pmatrix}
    0 &amp; -1 &amp; 0 \\
    1 &amp; 0 &amp; 0 \\
    0 &amp; 0 &amp; 0
    \end{pmatrix}.
\end{aligned}\]

<p>Using these definitions, any element of the Lie algebra may be uniquely identified with a vector \(\omega \in \mathbb{R}^3\) by defining the ‘skew map’ \(\cdot^\times : \mathbb{R}^3 \to \mathfrak{so}(3)\) as</p>

\[\begin{aligned}
\omega^\times &amp;:= \omega_1 E_1 + \omega_2 E_2 + \omega_3 E_3
= \begin{pmatrix}
    0 &amp; -\omega_3 &amp; \omega_2 \\
    \omega_3 &amp; 0 &amp; -\omega_1 \\
    -\omega_2 &amp; \omega_1 &amp; 0
    \end{pmatrix}.
\end{aligned}\]

<p>We chose this particular basis and notation for a good reason, namely, that it relates the Lie algebra \(\mathfrak{so}(3)\) with the classic vector cross product.
That is also why we use the notation \(a^\times\) rather than the notation \(a^\wedge\) used for other Lie algebras, although different authors have different conventions.
The reason for these particular basis matrices is that if we have a vector \(a \in \mathbb{R}^3\), then the matrix \(a^\times \in \mathfrak{so}(3)\) is the unique matrix such that \(a \times b = a^\times b\) for all \(b \in \mathbb{R}^3\).</p>
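
<p>The skew map and its inverse are short to write down in code. The following is a minimal sketch using NumPy; the function names <code>skew</code> and <code>unskew</code> are assumptions made for this example rather than names from any particular library.</p>

```python
import numpy as np


def skew(w):
    """Map a vector in R^3 to the skew-symmetric matrix w^x in so(3)."""
    return np.array([
        [0.0, -w[2], w[1]],
        [w[2], 0.0, -w[0]],
        [-w[1], w[0], 0.0],
    ])


def unskew(W):
    """Inverse of `skew`: read the three free coefficients back out."""
    return np.array([W[2, 1], W[0, 2], W[1, 0]])


a = np.array([1.0, -2.0, 0.5])
b = np.array([0.3, 4.0, -1.0])
cross_via_matrix = skew(a) @ b   # a^x b
cross_direct = np.cross(a, b)    # a x b
```

<p>The two cross products should agree, and <code>unskew(skew(a))</code> should return <code>a</code>.</p>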

<h4 id="an-important-cross-product-identity">An Important Cross Product Identity</h4>

<p>The following identity will be vital to understanding some of the later manipulations.
Let \(a,b \in \mathbb{R}^3\) be arbitrary.
Then we can compute that</p>

\[\begin{aligned}
(a^\times b)^\times
&amp;= \left( \begin{pmatrix}
    0 &amp; -a_3 &amp; a_2 \\
    a_3 &amp; 0 &amp; -a_1 \\
    -a_2 &amp; a_1 &amp; 0
    \end{pmatrix}
    \begin{pmatrix}
    b_1 \\ b_2 \\ b_3
    \end{pmatrix}
    \right)^\times \\
&amp;= \begin{pmatrix}
    -a_3 b_2 + a_2 b_3 \\
    a_3 b_1 - a_1 b_3 \\
    -a_2 b_1 + a_1 b_2
    \end{pmatrix}^\times \\
&amp;= \begin{pmatrix}
    0 &amp; -(-a_2 b_1 + a_1 b_2) &amp; (a_3 b_1 - a_1 b_3) \\
    (-a_2 b_1 + a_1 b_2) &amp; 0 &amp; -(-a_3 b_2 + a_2 b_3) \\
    -(a_3 b_1 - a_1 b_3) &amp; (-a_3 b_2 + a_2 b_3) &amp; 0
    \end{pmatrix} \\
&amp;= \begin{pmatrix}
    b_1 a_1 - a_1 b_1 &amp; b_1 a_2 - a_1 b_2 &amp; b_1 a_3 - a_1 b_3 \\
    b_2 a_1 - a_2 b_1 &amp; b_2 a_2 - a_2 b_2 &amp; b_2 a_3 - a_2 b_3 \\
    b_3 a_1 - a_3 b_1 &amp; b_3 a_2 - a_3 b_2 &amp; b_3 a_3 - a_3 b_3
    \end{pmatrix} \\
&amp;= \begin{pmatrix}
    b_1 a_1 &amp; b_1 a_2 &amp; b_1 a_3 \\
    b_2 a_1 &amp; b_2 a_2 &amp; b_2 a_3 \\
    b_3 a_1 &amp; b_3 a_2 &amp; b_3 a_3
    \end{pmatrix}
    - \begin{pmatrix}
    a_1 b_1 &amp; a_1 b_2 &amp; a_1 b_3 \\
    a_2 b_1 &amp; a_2 b_2 &amp; a_2 b_3 \\
    a_3 b_1 &amp; a_3 b_2 &amp; a_3 b_3
    \end{pmatrix} \\
&amp;= \begin{pmatrix} b_1 \\ b_2 \\ b_3 \end{pmatrix}
\begin{pmatrix} a_1 &amp; a_2 &amp; a_3 \end{pmatrix}
- \begin{pmatrix} a_1 \\ a_2 \\ a_3 \end{pmatrix}
\begin{pmatrix} b_1 &amp; b_2 &amp; b_3 \end{pmatrix} \\
&amp;= b a^\top - a b^\top.
\end{aligned}\]

<p>This identity is extremely useful in manipulating expressions involving cross products and skew matrices.
In the next section, we will use it in deriving the adjoint matrix.</p>
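
<p>Identities like this are easy to get wrong by a sign, so it can be reassuring to check them numerically. Here is a small sketch, assuming NumPy and a hypothetical helper <code>skew</code> implementing the skew map defined above.</p>

```python
import numpy as np


def skew(w):
    """Map a vector in R^3 to the skew-symmetric matrix w^x."""
    return np.array([
        [0.0, -w[2], w[1]],
        [w[2], 0.0, -w[0]],
        [-w[1], w[0], 0.0],
    ])


rng = np.random.default_rng(0)
a, b = rng.standard_normal(3), rng.standard_normal(3)

lhs = skew(skew(a) @ b)                  # (a^x b)^x
rhs = np.outer(b, a) - np.outer(a, b)    # b a^T - a b^T
```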

<h4 id="adjoint-and-lie-bracket">Adjoint and Lie bracket</h4>

<p>To study the adjoint maps, let \(R \in \mathbf{SO}(3)\) be arbitrary, and denote by \(R_1,R_2,R_3 \in \mathbb{R}^3\) the columns of \(R\).
Then for the basis matrix \(E_1 = e_1^\times \in \mathfrak{so}(3)\),</p>

\[\begin{aligned}
\mathrm{Ad}_{R}(e_1^\times)
&amp;= R e_1^\times R^\top \\
&amp;= \begin{pmatrix}
        R_1 &amp; R_2 &amp; R_3
    \end{pmatrix}
    \begin{pmatrix}
    0 &amp; 0 &amp; 0 \\
    0 &amp; 0 &amp; -1 \\
    0 &amp; 1 &amp; 0
    \end{pmatrix}
    R^\top \\
&amp;= \begin{pmatrix}
        R_1 &amp; R_2 &amp; R_3
    \end{pmatrix}
    \begin{pmatrix}
    0_{1\times 3} \\
    -e_3^\top \\
    e_2^\top
    \end{pmatrix}
    R^\top \\
&amp;= (- R_2 e_3^\top + R_3 e_2^\top)
    R^\top \\
&amp;= - R_2 (R e_3)^\top  + R_3 (R e_2)^\top \\
&amp;= - R_2 R_3^\top  + R_3 R_2^\top \\
&amp;= (R_2 \times R_3)^\times \\
&amp;= R_1^\times \\
&amp;= (R e_1)^\times.
\end{aligned}\]

<p>The fact that \(R_2\times R_3 = R_1\) follows from the fact that \(R_1,R_2,R_3\) are orthonormal and that \(\det(R) = 1\).
For the other basis vectors, the same process leads to \(\mathrm{Ad}_R(e_2^\times) = (Re_2)^\times\) and \(\mathrm{Ad}_R(e_3^\times) = (Re_3)^\times\).
Therefore, for an arbitrary \(\omega^\times \in \mathfrak{so}(3)\),</p>

\[\begin{aligned}
\mathrm{Ad}_R(\omega^\times)
&amp;= \mathrm{Ad}_R(\omega_1 e_1^\times + \omega_2 e_2^\times + \omega_3 e_3^\times) \\
&amp;= \omega_1 \mathrm{Ad}_R(e_1^\times)
+ \omega_2 \mathrm{Ad}_R(e_2^\times)
+ \omega_3 \mathrm{Ad}_R(e_3^\times) \\
&amp;= \omega_1 (R e_1)^\times
+ \omega_2 (R e_2)^\times
+ \omega_3 (R e_3)^\times \\
&amp;=  (\omega_1 R e_1
+ \omega_2 R e_2
+ \omega_3 R e_3)^\times \\
&amp;= (R \omega)^\times.
\end{aligned}\]

<p>So, while it may be some work to get there, we end up with a very clean result: the Adjoint applied to the skew matrix of a vector is the same as rotating the vector before applying the skew operator.
Thus the matrix form of the Adjoint operator is simply</p>

\[\begin{aligned}
\mathrm{Ad}_{R}^\vee
&amp;= R
\end{aligned}\]

<p>Differentiating this matrix in terms of the Lie group element \(R\) at the identity is an easy way to obtain the “little” adjoint matrix and, equivalently, the Lie bracket.</p>

\[\begin{aligned}
\mathrm{ad}_{\omega}^\vee
&amp;= \omega^\times, \\
[\omega_1, \omega_2] &amp;= \omega_1 \times \omega_2.
\end{aligned}\]

<p>Lastly, this Lie bracket provides us with another useful skew operator identity, namely,</p>

\[\begin{aligned}
(\omega_1^\times \omega_2)^\times
&amp;= [\omega_1, \omega_2]^\times
= [\omega_1^\times, \omega_2^\times]
= \omega_1^\times \omega_2^\times - \omega_2^\times \omega_1^\times.
\end{aligned}\]
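
<p>Both the Adjoint formula and the bracket identity can be sanity-checked numerically. The sketch below assumes NumPy; the random rotation is built from a QR factorisation, which is just one convenient choice for this example, and <code>skew</code> is a hypothetical helper for the skew map.</p>

```python
import numpy as np


def skew(w):
    """Map a vector in R^3 to the skew-symmetric matrix w^x."""
    return np.array([
        [0.0, -w[2], w[1]],
        [w[2], 0.0, -w[0]],
        [-w[1], w[0], 0.0],
    ])


rng = np.random.default_rng(1)

# QR of a random matrix gives an orthogonal factor; flipping one column
# when the determinant is -1 turns it into a rotation.
R, _ = np.linalg.qr(rng.standard_normal((3, 3)))
R[:, 0] *= np.sign(np.linalg.det(R))

w1, w2 = rng.standard_normal(3), rng.standard_normal(3)

Ad_lhs = R @ skew(w1) @ R.T    # Ad_R(w1^x) = R w1^x R^T
Ad_rhs = skew(R @ w1)          # (R w1)^x

bracket_matrix = skew(w1) @ skew(w2) - skew(w2) @ skew(w1)   # [w1^x, w2^x]
bracket_vector = skew(np.cross(w1, w2))                      # (w1 x w2)^x
```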

<h4 id="exponential-and-logarithm">Exponential and Logarithm</h4>

<p>Next are the exponential and logarithm, which also have nice forms.
The exponential formula for \(\mathbf{SO}(3)\) is often referred to as the Rodrigues formula.
To derive the exponential formula, we let \(\omega^\times \in \mathfrak{so}(3)\) be an arbitrary skew-symmetric matrix.
Then we notice the following important identity:</p>

\[\begin{aligned}
(\omega^\times)^2
&amp;= \omega^\times \omega^\times \\
&amp;= \begin{pmatrix}
    0 &amp; -\omega_3 &amp; \omega_2 \\
    \omega_3 &amp; 0 &amp; -\omega_1 \\
    -\omega_2 &amp; \omega_1 &amp; 0
    \end{pmatrix}
    \begin{pmatrix}
    0 &amp; -\omega_3 &amp; \omega_2 \\
    \omega_3 &amp; 0 &amp; -\omega_1 \\
    -\omega_2 &amp; \omega_1 &amp; 0
    \end{pmatrix} \\
    &amp;= \begin{pmatrix}
    -\omega_3^2 - \omega_2^2 &amp;
    \omega_2 \omega_1 &amp;
    \omega_3 \omega_1 \\
    \omega_1 \omega_2 &amp;
    -\omega_3^2 - \omega_1^2 &amp;
    \omega_3 \omega_2 \\
    \omega_1 \omega_3 &amp;
    \omega_2 \omega_3 &amp;
    -\omega_2^2 - \omega_1^2
    \end{pmatrix} \\
    &amp;= \begin{pmatrix}
    \omega_1^2 &amp;
    \omega_2 \omega_1 &amp;
    \omega_3 \omega_1 \\
    \omega_1 \omega_2 &amp;
    \omega_2^2  &amp;
    \omega_3 \omega_2 \\
    \omega_1 \omega_3 &amp;
    \omega_2 \omega_3 &amp;
    \omega_3^2
    \end{pmatrix}
    - (\omega_1^2 + \omega_2^2 + \omega_3^2) \begin{pmatrix}
    1 &amp; 0 &amp; 0 \\
    0 &amp; 1 &amp; 0 \\
    0 &amp; 0 &amp; 1
    \end{pmatrix} \\
    &amp;= \omega \omega^\top - \omega^\top \omega I_3.
\end{aligned}\]

<p>A particular consequence of this is that</p>

\[\begin{aligned}
(\omega^\times)^3
    &amp;= \omega^\times (\omega \omega^\top - \omega^\top \omega I_3) \\
    &amp;= 0 - \omega^\top \omega \omega^\times \\
    &amp;= - \vert \omega \vert^2 \omega^\times.
\end{aligned}\]

<p>With this identity available to us, the exponential becomes relatively straightforward to compute.
Assume first that \(\omega \neq 0\).
Then, from the definition of the matrix exponential, we have</p>

\[\begin{aligned}
\exp(\omega^\times)
    &amp;= \sum_{n=0}^\infty \frac{1}{n!} (\omega^\times)^n \\
    &amp;= I_3
    + \sum_{n=0}^\infty \frac{1}{(2n+1)!} (\omega^\times)^{2n+1}
    + \sum_{n=1}^\infty \frac{1}{(2n)!} (\omega^\times)^{2n} \\
% ------------
    &amp;= I_3
    + \sum_{n=0}^\infty \frac{(-1)^n}{(2n+1)!} \vert \omega \vert^{2n} \omega^\times
    + \sum_{n=1}^\infty \frac{(-1)^{n-1}}{(2n)!} \vert \omega \vert^{2n-2} (\omega^\times)^{2} \\
% ------------
    &amp;= I_3
    + \frac{1}{\vert \omega \vert} \left(\sum_{n=0}^\infty \frac{(-1)^n}{(2n+1)!} \vert \omega \vert^{2n+1} \right) \omega^\times
    - \frac{1}{\vert \omega \vert^2} \left( \sum_{n=1}^\infty \frac{(-1)^{n}}{(2n)!} \vert \omega \vert^{2n} \right) (\omega^\times)^{2} \\
% ------------
    &amp;= I_3
    + \frac{1}{\vert \omega \vert} \left(\sum_{n=0}^\infty \frac{(-1)^n}{(2n+1)!} \vert \omega \vert^{2n+1} \right) \omega^\times
    - \frac{1}{\vert \omega \vert^2} \left(-1 + \sum_{n=0}^\infty \frac{(-1)^{n}}{(2n)!} \vert \omega \vert^{2n} \right) (\omega^\times)^{2} \\
% ------------
    &amp;= I_3
    + \frac{1}{\vert \omega \vert} \sin(\vert \omega \vert) \omega^\times
    - \frac{1}{\vert \omega \vert^2} \left(-1 + \cos(\vert \omega \vert) \right) (\omega^\times)^{2} \\
% ------------
    &amp;= I_3
    + \frac{\sin(\vert \omega \vert)}{\vert \omega \vert} \omega^\times
    + \frac{1 - \cos(\vert \omega \vert)}{\vert \omega \vert^2} (\omega^\times)^{2}.
\end{aligned}\]

<p>To see how the infinite sums collapsed, have a look at the <a href="https://en.wikipedia.org/wiki/Trigonometric_functions#Power_series_expansion">power series expansions of sine and cosine</a>.
The fractions above are well-defined since we assumed \(\omega \neq 0\).
In the case that \(\omega = 0\), then the exponential is simply \(\exp(0^\times) = I_3\).</p>
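
<p>The exponential formula translates directly into code. The following is a minimal sketch assuming NumPy, with hypothetical function names <code>skew</code>, <code>exp_so3</code> and <code>exp_series</code>. The small-angle branch, which replaces the two coefficients by their leading Taylor terms, is my own addition for numerical robustness and is not part of the derivation above.</p>

```python
import numpy as np


def skew(w):
    """Map a vector in R^3 to the skew-symmetric matrix w^x."""
    return np.array([
        [0.0, -w[2], w[1]],
        [w[2], 0.0, -w[0]],
        [-w[1], w[0], 0.0],
    ])


def exp_so3(w, eps=1e-8):
    """Rodrigues formula for exp(w^x)."""
    theta = np.linalg.norm(w)
    W = skew(w)
    if theta > eps:
        a = np.sin(theta) / theta
        b = (1.0 - np.cos(theta)) / theta**2
    else:
        # Assumption: near zero, use the leading Taylor terms of both coefficients.
        a = 1.0 - theta**2 / 6.0
        b = 0.5 - theta**2 / 24.0
    return np.eye(3) + a * W + b * (W @ W)


def exp_series(W, terms=30):
    """Truncated power series of the matrix exponential, for comparison only."""
    result, term = np.eye(3), np.eye(3)
    for n in range(1, terms):
        term = term @ W / n
        result = result + term
    return result


w = np.array([0.4, -1.1, 0.7])
R = exp_so3(w)
R_series = exp_series(skew(w))
```

<p>The closed form and the truncated series should agree to within floating-point error, and the result should satisfy the defining conditions of \(\mathbf{SO}(3)\).</p>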

<p>Computing the logarithm is a matter of inverting the exponential formula, although we need to be careful since there is not always a unique logarithm defined!
Suppose that \(R = \exp(\omega^\times)\) for some \(\omega \in \mathbb{R}^3\).
Then by taking the trace of both sides,</p>

\[\begin{aligned}
\mathrm{tr}(R)
&amp;= \mathrm{tr}(I_3
    + \frac{\sin(\vert \omega \vert)}{\vert \omega \vert} \omega^\times
    + \frac{1 - \cos(\vert \omega \vert)}{\vert \omega \vert^2} (\omega^\times)^{2}) \\
% ------------
&amp;= \mathrm{tr}(I_3) + \frac{\sin(\vert \omega \vert)}{\vert \omega \vert} \mathrm{tr}(\omega^\times)
    + \frac{1 - \cos(\vert \omega \vert)}{\vert \omega \vert^2} \mathrm{tr}((\omega^\times)^{2}) \\
% ------------
&amp;= 3 + 0
    + \frac{1 - \cos(\vert \omega \vert)}{\vert \omega \vert^2} \mathrm{tr}(\omega \omega^\top - \omega^\top \omega I_3) \\
% ------------
&amp;= 3 + \frac{1 - \cos(\vert \omega \vert)}{\vert \omega \vert^2} ( \omega^\top \omega - 3 \omega^\top \omega ) \\
% ------------
&amp;= 3 - 2 (1 - \cos(\vert \omega \vert)) \\
% ------------
&amp;= 1 + 2 \cos(\vert \omega \vert).
\end{aligned}\]

<p>We will assume that \(\vert \omega \vert &lt; \pi\).
Therefore, since \(\cos\) is invertible over the domain \([0,\pi]\), we obtain</p>

\[\vert \omega \vert = \cos^{-1}\left( \frac{\mathrm{tr}(R) - 1}{2} \right)\]

<p>Then, the vector direction of \(\omega\) can be extracted from the skew-symmetric part of \(R\).
Observe that</p>

\[\begin{aligned}
R - R^\top
    &amp;= (I_3
    + \frac{\sin(\vert \omega \vert)}{\vert \omega \vert} \omega^\times
    + \frac{1 - \cos(\vert \omega \vert)}{\vert \omega \vert^2} (\omega^\times)^{2})
    - (I_3
    + \frac{\sin(\vert \omega \vert)}{\vert \omega \vert} \omega^\times
    + \frac{1 - \cos(\vert \omega \vert)}{\vert \omega \vert^2} (\omega^\times)^{2})^\top \\
% ------------
&amp;= 2 \frac{\sin(\vert \omega \vert)}{\vert \omega \vert} \omega^\times.
% ------------
\end{aligned}\]

<p>Since we already know \(\vert \omega \vert\) from the trace formula, we now obtain</p>

\[\begin{aligned}
\omega  = \frac{\vert \omega \vert}{2 \sin(\vert \omega \vert)}(R - R^\top)^\vee.
\end{aligned}\]

<p>This formula applies when \(\vert\omega\vert \in (0, \pi)\), but what about the other cases?
If \(\vert\omega\vert = 0\) then it simply means that \(R = I_3\) and thus \(\omega = 0\) as well.
If, on the other hand, \(\vert\omega\vert = \pi\), then the logarithm is not uniquely defined.
In this case, the skew-symmetric part of \(R\) disappears, and we must look to the symmetric part of \(R\) to find a solution.
Under the condition that \(\vert \omega \vert = \pi\), we have</p>

\[\begin{aligned}
R + R^\top
&amp;= (I_3
    + \frac{\sin(\vert \omega \vert)}{\vert \omega \vert} \omega^\times
    + \frac{1 - \cos(\vert \omega \vert)}{\vert \omega \vert^2} (\omega^\times)^{2})
    + (I_3
    + \frac{\sin(\vert \omega \vert)}{\vert \omega \vert} \omega^\times
    + \frac{1 - \cos(\vert \omega \vert)}{\vert \omega \vert^2} (\omega^\times)^{2})^\top \\
&amp;= 2 I_3
    + 2 \frac{1 - \cos(\vert \omega \vert)}{\vert \omega \vert^2} (\omega^\times)^{2} \\
&amp;= 2 I_3 + \frac{4}{\vert \omega \vert^2} (\omega^\times)^{2} \\
&amp;= 2 I_3 + \frac{4}{\vert \omega \vert^2} (\omega \omega^\top - \omega^\top \omega I_3 ) \\
&amp;= 2 I_3 + 4 (\frac{\omega \omega^\top}{\vert \omega \vert^2} -  I_3 ) \\
&amp;= 4 \frac{\omega \omega^\top}{\vert \omega \vert^2} - 2  I_3.
\end{aligned}\]

<p>Multiplying both sides by \(\omega\) leads to</p>

\[\begin{aligned}
(R + R^\top) \omega
&amp;= 4 \frac{\omega \omega^\top \omega}{\vert \omega \vert^2} -2\omega \\
(R + R^\top) \omega
&amp;= 2 \omega.
\end{aligned}\]

<p>In other words, \(\omega\) is an eigenvector of \(R + R^\top\) with eigenvalue 2, and it is not difficult to see that it is unique up to change of sign.
Therefore, if \(\vert \omega \vert = \pi\), then \(\omega\) is given by solving</p>

\[\begin{aligned}
(R + R^\top - 2I_3) \omega
&amp;= 0, &amp; \vert \omega \vert = \pi,
\end{aligned}\]

<p>and is unique up to the choice of sign \(\pm \omega\).</p>
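
<p>The three cases of the logarithm can be collected into a single function. This is a sketch under a few assumptions: NumPy is available, the names <code>exp_so3</code> and <code>log_so3</code> are hypothetical, the threshold <code>eps</code> is an arbitrary choice, and near \(\vert \omega \vert = \pi\) the sign of the returned vector is whatever the eigenvector routine happens to produce.</p>

```python
import numpy as np


def skew(w):
    """Map a vector in R^3 to the skew-symmetric matrix w^x."""
    return np.array([
        [0.0, -w[2], w[1]],
        [w[2], 0.0, -w[0]],
        [-w[1], w[0], 0.0],
    ])


def exp_so3(w):
    """Rodrigues formula; assumes w is not the zero vector."""
    theta = np.linalg.norm(w)
    W = skew(w)
    return (np.eye(3) + np.sin(theta) / theta * W
            + (1.0 - np.cos(theta)) / theta**2 * (W @ W))


def log_so3(R, eps=1e-6):
    """Return w with exp(w^x) = R and |w| in [0, pi]."""
    cos_theta = np.clip((np.trace(R) - 1.0) / 2.0, -1.0, 1.0)
    theta = np.arccos(cos_theta)
    S = R - R.T                                   # equals 2 sin(theta)/theta * w^x
    vee = np.array([S[2, 1], S[0, 2], S[1, 0]])
    if eps > theta:
        # theta / (2 sin(theta)) tends to 1/2 as theta tends to 0
        return 0.5 * vee
    if np.pi - theta > eps:
        return theta / (2.0 * np.sin(theta)) * vee
    # |w| = pi: w is an eigenvector of R + R^T with eigenvalue 2
    vals, vecs = np.linalg.eigh(R + R.T)
    axis = vecs[:, np.argmax(vals)]               # unit norm, arbitrary sign
    return np.pi * axis


w = np.array([0.4, -1.1, 0.7])
w_back = log_so3(exp_so3(w))

w_pi = np.pi * np.array([0.0, 0.6, 0.8])
w_pi_back = log_so3(exp_so3(w_pi))                # recovers w_pi up to sign
```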

<h3 id="further-useful-formulas-projections">Further Useful Formulas: Projections</h3>

<p>We often use \(\mathbf{SO}(3)\) and its Lie algebra in their matrix form, and it is useful to be able to understand their relationship with the ambient space of matrices in which they are typically represented.</p>

<h4 id="lie-algebra-projection">Lie Algebra Projection</h4>

<p>For the Lie algebra \(\mathfrak{so}(3)\), the projection from \(\mathbb{R}^{3\times 3}\) to \(\mathfrak{so}(3)\) is often denoted as \(\mathbb{P}_{\mathfrak{so}(3)}\), and it is defined by</p>

\[\mathbb{P}_{\mathfrak{so}(3)}(M)
:= \mathrm{argmin}_{\Omega \in \mathfrak{so}(3)} \vert M - \Omega \vert^2.\]

<p>Here, the norm is taken as the matrix (Frobenius) norm, so \(\vert A \vert^2 := \mathrm{tr}(A^\top A)\).
The solution to the minimisation is unique and can be obtained by differentiation.
For any \(M \in \mathbb{R}^{3\times 3}\) and \(\Omega \in \mathfrak{so}(3)\), we write the cost function</p>

\[l(\Omega)
:= \frac{1}{2}\vert M - \Omega \vert^2.\]

<p>Then, differentiating this with respect to \(\Omega\) in an arbitrary direction \(\Delta \in \mathfrak{so}(3)\) yields</p>

\[\begin{aligned}
\mathrm{D} l(\Omega)[\Delta]
&amp;= \langle \Omega - M, \Delta \rangle \\
&amp;= \langle \Omega, \Delta \rangle - \langle M, \Delta \rangle \\
&amp;= \langle \Omega, \Delta \rangle - \frac{1}{2} \left( \langle M, \Delta \rangle + \langle M^\top, \Delta^\top \rangle \right) \\
&amp;= \langle \Omega, \Delta \rangle - \frac{1}{2} \left( \langle M, \Delta \rangle + \langle - M^\top, \Delta \rangle \right) \\
&amp;= \langle \Omega, \Delta \rangle - \frac{1}{2} \langle M- M^\top , \Delta \rangle \\
&amp;= \langle \Omega - \frac{1}{2} (M - M^\top), \Delta \rangle.
\end{aligned}\]

<p>This is zero for all \(\Delta \in \mathfrak{so}(3)\) exactly when \(\Omega = \frac{1}{2}(M - M^\top)\), which is also a skew-symmetric matrix (as required).
Therefore, to summarise,</p>

\[\mathbb{P}_{\mathfrak{so}(3)}(M)
= \frac{1}{2} (M - M^\top).\]
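
<p>As a small illustration of this projection, the sketch below compares the projected matrix against nearby skew-symmetric candidates. It assumes NumPy, and the name <code>project_so3_algebra</code> is hypothetical. A random comparison is of course not a proof, only a sanity check.</p>

```python
import numpy as np


def project_so3_algebra(M):
    """Closest skew-symmetric matrix to M in the Frobenius norm."""
    return 0.5 * (M - M.T)


rng = np.random.default_rng(2)
M = rng.standard_normal((3, 3))
P = project_so3_algebra(M)
best = np.linalg.norm(M - P)   # for 2-D arrays the default is the Frobenius norm

# Perturb P along random skew-symmetric directions and record how much
# further from M each candidate is.
gaps = []
for _ in range(100):
    D = rng.standard_normal((3, 3))
    candidate = P + 0.1 * (D - D.T)
    gaps.append(np.linalg.norm(M - candidate) - best)
smallest_gap = min(gaps)
```

<p>The residual <code>M - P</code> is the symmetric part of <code>M</code>, which is orthogonal to every skew-symmetric matrix in the Frobenius inner product, so <code>smallest_gap</code> should never be negative.</p>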

<p>The projection onto the Lie group \(\mathbf{SO}(3)\) is similarly defined, but less straightforward to obtain.
For any \(M \in \mathbb{R}^{3\times 3}\),</p>

\[\mathbb{P}_{\mathbf{SO}(3)}(M)
:= \mathrm{argmin}_{R \in \mathbf{SO}(3)} \vert M - R \vert^2.\]

<p>To compute this, define \(l(R) = \frac{1}{2}\vert R - M \vert^2\) and differentiate to obtain</p>

\[\begin{aligned}
\mathrm{D}l(R)[R \Omega]
&amp;= \langle R - M, R \Omega \rangle \\
&amp;= \langle I - R^\top M, \Omega \rangle \\
&amp;= \frac{1}{2}\langle M^\top R - R^\top M, \Omega \rangle.
\end{aligned}\]

<p>This is zero for all \(\Omega \in \mathfrak{so}(3)\) if and only if \(M^\top R = R^\top M\).
Next, compute the singular value decomposition, \(M = U S V^\top\), where \(S\) is a diagonal matrix with the singular values of \(M\) in descending order.
Then the previous condition is equivalent to</p>

\[\begin{aligned}
M^\top R &amp;= R^\top M, \\
V S U^\top R &amp;= R^\top U S V^\top, \\
S U^\top R V &amp;= V^\top R^\top U S , \\
S Q &amp;= Q^\top S,
\end{aligned}\]

<p>where \(Q = U^\top R V\).
Note that \(Q\) is also an orthogonal (not necessarily special orthogonal) matrix.
We will now assume that the singular values of \(M\) are distinct, so the values along the diagonal of \(S\) are strictly descending \(s_1 &gt; s_2 &gt; s_3 \geq 0\).
Then, for the unit vector \(e_1\),</p>

\[\begin{aligned}
S Q e_1 &amp;= Q^\top S e_1, \\
S (Q e_1) &amp;= s_1 Q^\top e_1, \\
\vert S (Q e_1) \vert^2 &amp;= s_1^2 \vert Q^\top e_1 \vert^2, \\
\sum_{i=1}^3 s_i^2 Q_{i1}^2 &amp;= s_1^2.
\end{aligned}\]

<p>Since \(Q\) is orthogonal and the \(s_i\) are descending, the sum on the left can be at most equal to \(s_1^2\) and this happens exactly when \(Q_{11} = \pm 1\) (meaning also that \(Q_{i1} = 0\) for \(i \neq 1\)).
This moreover implies that \(Q_{1i} = 0\) for all \(i \neq 1\), and then the argument can be repeated for the second singular value to obtain the result that</p>

\[Q = \mathrm{diag}(\pm 1, \pm 1, \pm 1).\]

<p>If \(Q\) were only orthogonal, then any of the eight combinations of signs would be possible.
However, since \(Q = U^\top R V\), we have \(\det(Q) = \det(U^\top)\det(R) \det(V) = \det(U^\top)\det(V)\), since \(\det(R) = 1\) necessarily.
Thus \(Q_{33}\) is determined by the other two values and the fixed determinant.</p>

<p>Returning to the original problem, expanding \(l\) yields</p>

\[\begin{aligned}
l(R) &amp;= \frac{1}{2} \vert R - M \vert^2 \\
&amp;= \frac{1}{2} \vert R - U S V^\top \vert^2 \\
&amp;= \frac{1}{2} \vert U^\top R V - S \vert^2 \\
&amp;= \frac{1}{2} \vert Q - S \vert^2 \\
&amp;= \frac{1}{2} \left( (Q_{11} - s_1)^2 + (Q_{22} - s_2)^2 + (Q_{33} - s_3)^2 \right).
\end{aligned}\]

<p>Since the \(s_i\) are descending, it is clear that the minimiser (among the possible \(Q\)) is given by \(Q = \mathrm{diag}(1,1,\det(U^\top) \det(V))\).
Then, finally, \(R\) is recovered as \(R = U Q V^\top\), or succinctly,</p>

\[\begin{aligned}
R = U \mathrm{diag}(1,1,\det(U^\top V)) V^\top.
\end{aligned}\]

<p>This argument relied on the assumption that the singular values \(s_i\) were distinct, but a similar argument is possible when they are not.
In any case, the procedure provided here will return a matrix \(R \in \mathbf{SO}(3)\) that minimises the Euclidean (Frobenius) distance to \(M\), although this matrix may not actually be a unique solution to the original problem.</p>
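
<p>The SVD-based projection is only a few lines in code. The following sketch assumes NumPy, whose <code>numpy.linalg.svd</code> returns \(U\), the singular values in descending order, and \(V^\top\). The function name <code>project_SO3</code> is hypothetical, and rounding the determinant with <code>np.sign</code> is my own choice to keep the result exactly orthogonal up to floating-point error.</p>

```python
import numpy as np


def project_SO3(M):
    """Closest rotation matrix to M in the Frobenius norm, via the SVD."""
    U, s, Vt = np.linalg.svd(M)               # M = U @ diag(s) @ Vt
    d = np.sign(np.linalg.det(U @ Vt))        # det(U^T V), which is +1 or -1
    return U @ np.diag([1.0, 1.0, d]) @ Vt


rng = np.random.default_rng(3)
M = rng.standard_normal((3, 3))
R = project_SO3(M)

# A matrix with negative determinant exercises the sign correction.
R_reflected = project_SO3(np.diag([3.0, 2.0, -1.0]))
```

<p>For the diagonal example, the closest rotation is the identity, which is what the formula returns. The stationarity condition \(M^\top R = R^\top M\) from the derivation can also be checked on the random example.</p>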

<h3 id="conclusion">Conclusion</h3>

<p>The special orthogonal group in 3D is a beautiful and classical example of a Lie group, and it is of interest for both theoretical and practical reasons.
My view is that, if you have a problem involving 3D rotations, your first choice should be to use the group \(\mathbf{SO}(3)\).
It is sometimes worth using Euler angles or unit quaternions, but without some motivation for this, I would stick to the matrix Lie group.
If you want to look at an implementation of some of the functions I have described, I suggest you look at <a href="https://github.com/pvangoor/pylie">pylie</a>.
Finally, I will leave a summary of the formulas below for quick reference.</p>

<h3 id="quick-reference">Quick Reference</h3>

<p>The 3D special orthogonal group and Lie algebra</p>

\[\begin{aligned}
    \mathbf{SO}(3) &amp;= \left\{
        R \in \mathbb{R}^{3\times 3}
        \; \middle| \;
        R^\top R = I_3, \; \det(R) = 1
    \right\}, \\
        \mathfrak{so}(3)
    &amp;=  \left\{
        \Omega \in \mathbb{R}^{3\times 3}
        \; \middle| \;
        \Omega^\top + \Omega = 0_{3\times 3}
    \right\}.
\end{aligned}\]

<p>The Lie algebra wedge or ‘skew’ map</p>

\[\begin{aligned}
    \cdot^\times &amp;: \mathbb{R}^3 \to \mathfrak{so}(3), \\
    \omega^\times &amp;= \begin{pmatrix}
        \omega_1 \\ \omega_2 \\ \omega_3
    \end{pmatrix}^\times
    = \begin{pmatrix}
        0 &amp; -\omega_3 &amp; \omega_2 \\
        \omega_3 &amp; 0 &amp; -\omega_1 \\
        -\omega_2 &amp; \omega_1 &amp; 0
    \end{pmatrix}.
\end{aligned}\]

<p>Cross product and skew matrix identities</p>

\[\begin{aligned}
    (a^\times b)^\times &amp;= b a^\top - a b^\top, \\
    \omega^\times \omega^\times &amp;= \omega \omega^\top - \omega^\top \omega I_3
\end{aligned}\]

<p>Adjoint matrices</p>

\[\begin{aligned}
    \mathrm{Ad}_R^\vee &amp;= R, \\
    \mathrm{ad}_\omega^\vee &amp;= \omega^\times.
\end{aligned}\]

<p>Exponential formula (for \(\omega \neq 0\))</p>

\[\begin{aligned}
    \exp(\omega^\times) &amp;= I_3
    + \frac{\sin(\vert \omega \vert)}{\vert \omega \vert} \omega^\times
    + \frac{1 - \cos(\vert \omega \vert)}{\vert \omega \vert^2} (\omega^\times)^{2}.
\end{aligned}\]

<p>Logarithm formula (for \(R \neq R^\top\))</p>

\[\begin{aligned}
\log(R)^\vee  &amp;= \frac{\theta}{2 \sin(\theta)}(R - R^\top)^\vee, &amp;
\theta &amp;:= \cos^{-1}\left( \frac{\mathrm{tr}(R) - 1}{2} \right).
\end{aligned}\]

<p>Projection maps</p>

\[\begin{aligned}
\mathbb{P}_{\mathfrak{so}(3)}(M)  &amp;= \frac{1}{2} (M - M^\top), \\
\mathbb{P}_{\mathbf{SO}(3)}(M) &amp;= U \mathrm{diag}(1,1,\det(U^\top V)) V^\top, &amp;
M &amp;= U S V^\top.
\end{aligned}\]]]></content><author><name></name></author><category term="Mathematics" /><summary type="html"><![CDATA[]]></summary></entry><entry><title type="html">The Indefinite Orthogonal Group O(1,2)</title><link href="/mathematics/2024/07/09/indefinite_orthogonal_o1_2.html" rel="alternate" type="text/html" title="The Indefinite Orthogonal Group O(1,2)" /><published>2024-07-09T00:00:00+00:00</published><updated>2024-07-09T00:00:00+00:00</updated><id>/mathematics/2024/07/09/indefinite_orthogonal_o1_2</id><content type="html" xml:base="/mathematics/2024/07/09/indefinite_orthogonal_o1_2.html"><![CDATA[<!-- https://talk.jekyllrb.com/t/jekyll-and-mathjax/5514 -->
<script type="text/x-mathjax-config">
    MathJax.Hub.Config({
        jax: ["input/TeX","input/MathML","output/SVG", "output/CommonHTML"],
    extensions: ["tex2jax.js","mml2jax.js","MathMenu.js","MathZoom.js", "CHTML-preview.js"],
    TeX: {
      extensions: ["AMSmath.js","AMSsymbols.js","noErrors.js","noUndefined.js"]
    },
      tex2jax: {
          inlineMath: [ ['$','$'], ["\\(","\\)"] ],
          displayMath: [ ['$$','$$'], ["\\[","\\]"] ],
          processEscapes: true,
          processEnvironments: true
        },
        "HTML-CSS": { availableFonts: ["TeX"] }
      });
</script>

<script id="MathJax-script" async="" src="https://cdn.jsdelivr.net/npm/mathjax@3/es5/tex-chtml.js"></script>

<h3 id="introduction">Introduction</h3>

<p>The indefinite orthogonal group \(\mathbf{O}(1,2)\) is the set of \(3\times 3\) matrices preserving the \((1,2)\) indefinite form.
Specifically, define the indefinite \((1,2)\) form to be</p>

\[\begin{aligned}
    g = \begin{pmatrix}
            1 &amp; 0_{1\times 2} \\ 0_{2\times 1} &amp; -I_2
        \end{pmatrix} \in \mathbb{R}^{3\times 3}
\end{aligned}\]

<p>Then the indefinite orthogonal group is defined to be</p>

\[\begin{aligned}
    \mathbf{O}(1,2) = \left\{
        L \in \mathbb{R}^{3\times 3}
        \; \middle| \;
        L^\top g L = g
    \right\}
\end{aligned}\]

<p>In applications, the indefinite form is important in studying the geometry of physics, where spacetime vectors have a structure like \((t,x,y,z)\).
The relevant group in that case is \(\mathbf{O}(1,3)\), but the lower-dimensional \(\mathbf{O}(1,2)\) serves as a nice toy model.
The elements of this group can also be thought of as ‘hyperbolic rotations’. This is because a hyperboloid in \(\mathbb{R}^3\) can be written as a set \(H = \{ \xi \in \mathbb{R}^3 \; | \; \xi^\top g \xi = c \}\), where \(c\) is a constant. Since \((L \xi)^\top g (L \xi) = \xi^\top g \xi\) for every \(L \in \mathbf{O}(1,2)\), the indefinite orthogonal group preserves \(H\), much like how the standard orthogonal group preserves spheres.</p>

<p>The group properties are easily verified. 
The identity \(I_3\) is clearly in \(\mathbf{O}(1,2)\), and the matrix product and inverse preserve the group.
Specifically, if \(L_1,L_2 \in \mathbf{O}(1,2)\), then</p>

\[\begin{aligned}
    (L_1 L_2)^\top g (L_1 L_2) &amp;= L_2^\top L_1^\top g L_1 L_2 = L_2^\top g L_2 = g, \\
    (L_1^{-1})^\top g L_1^{-1} &amp;= (L_1^{-1})^\top (L_1^\top g L_1) L_1^{-1} = g.
\end{aligned}\]
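
<p>To make this concrete, here is a small numerical sketch, assuming NumPy. Elements of \(\mathbf{O}(1,2)\) are built from two simple families, which I call <code>boost</code> and <code>rotation</code> here; these names and the helper <code>in_O12</code> are assumptions for this example only.</p>

```python
import numpy as np

g = np.diag([1.0, -1.0, -1.0])   # the (1,2) indefinite form


def boost(t):
    """Hyperbolic rotation mixing the first two coordinates."""
    ch, sh = np.cosh(t), np.sinh(t)
    return np.array([[ch, sh, 0.0], [sh, ch, 0.0], [0.0, 0.0, 1.0]])


def rotation(angle):
    """Ordinary rotation of the last two coordinates."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]])


def in_O12(L):
    """Check the defining condition L^T g L = g up to rounding."""
    return bool(np.allclose(L.T @ g @ L, g))


L1 = boost(0.7) @ rotation(0.4)
L2 = rotation(-1.1) @ boost(-0.3)

xi = np.array([2.0, 0.5, -1.0])
form_before = xi @ g @ xi
form_after = (L1 @ xi) @ g @ (L1 @ xi)   # the indefinite form is preserved
```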

<p>It is useful to dissect the definition of the Lie group a bit. Let</p>

\[\begin{aligned}
    L =  \begin{pmatrix} d &amp; b^\top \\ c &amp; A \end{pmatrix} 
    \in \mathbf{O}(1,2) \subset \mathbb{R}^{3\times 3},
\end{aligned}\]

<p>where \(A \in \mathbb{R}^{2\times 2}\) is the lower right block, \(d \in \mathbb{R}\), and \(b,c \in \mathbb{R}^2\).
Then, by the definition of the group,</p>

\[\begin{aligned}
    g &amp;= L^\top g L \\
    % -------
    &amp;= \begin{pmatrix} d &amp; c^\top \\ b &amp; A^\top \end{pmatrix}
    \begin{pmatrix} 1 &amp; 0_{1\times 2} \\ 0_{2\times1} &amp; -I_2 \end{pmatrix}
    \begin{pmatrix} d &amp; b^\top \\ c &amp; A \end{pmatrix} \\
    % -------
    &amp;= \begin{pmatrix} d &amp; c^\top \\ b &amp; A^\top \end{pmatrix}
    \begin{pmatrix} d &amp; b^\top \\ -c &amp; -A \end{pmatrix} \\
    % -------
    &amp;= \begin{pmatrix} 
    d^2 - c^\top c &amp;
    d b^\top - c^\top A \\
    db - A^\top c &amp;
    b b^\top - A^\top A
    \end{pmatrix}.
\end{aligned}\]

<p>This leads to three equations,</p>

\[\begin{aligned}
    d^2 - c^\top c &amp;= 1, &amp;
    A^\top c &amp;= d b, &amp;
    b b^\top - A^\top A &amp;= -I_2 \\
    % ------
    d &amp;= \pm\sqrt{1 + c^\top c}, &amp;
    b &amp;= d^{-1} A^\top c, &amp;
    A^\top A &amp;= b b^\top + I_2 \\
    % ------
    &amp;&amp;
    c &amp;= d A^{-\top} b, &amp;
    &amp;
\end{aligned}\]

<p>For compactness of notation, it is not always useful to write out matrix \(L\) into these components, but the relationships are important for simplification of later formulas.</p>

<p>We will use these relations to compute the inverse of the matrix.
The formula for the inverse of a \(2\times 2\) block matrix gives</p>

\[\begin{aligned}
    L^{-1} = \begin{pmatrix} d &amp; b^\top \\ c &amp; A \end{pmatrix}^{-1}
    &amp;= \begin{pmatrix} d^{-1} + d^{-2} b^\top S^{-1} c &amp;
    -d^{-1} b^\top S^{-1} \\
    -d^{-1} S^{-1} c &amp; S^{-1} \end{pmatrix}, &amp;
    S &amp;:= A - d^{-1} c b^\top
\end{aligned}\]

<p>Since the inverse of \(L\) also belongs to \(\mathbf{O}(1,2)\), we only need to compute the upper- and lower-left terms. The remaining terms are determined by the relationships outlined above.
For the matrix \(S\), we have</p>

\[\begin{aligned}
    S &amp;= A - d^{-1} c b^\top \\
    &amp;= A - A^{-\top} b b^\top \\
    &amp;= A - A^{-\top} (A^\top A - I_2) \\
    &amp;= A^{-\top}
\end{aligned}\]

<p>Thus \(S^{-1} = A^\top\).
This is a nice simplification and helps to compute the remaining terms.
For the top-left term,</p>

\[\begin{aligned}
    d^{-1} + d^{-2} b^\top S^{-1} c
    &amp;= d^{-1} + d^{-2} b^\top A^\top c \\
    &amp;= d^{-1} + d^{-1} b^\top b \\
    &amp;= d^{-1} (1 + b^\top b) \\
    &amp;= d
\end{aligned}\]

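<p>The last step uses \(d^2 = 1 + b^\top b\), which holds because \(L^\top\) also belongs to \(\mathbf{O}(1,2)\), that is, \(L g L^\top = g\).
For the bottom-left term,</p>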

\[\begin{aligned}
    - d^{-1} S^{-1} c
    &amp;= - d^{-1} A^\top c = -b.
\end{aligned}\]

<p>Then by the fact that \((L^{-1})^{-1} = L\), the top-right term must be \(-c^\top\).
In summary, the inverse of \(L\) can be greatly simplified from a general \(3\times 3\) matrix inverse to</p>

\[\begin{aligned}
    L^{-1} &amp;= \begin{pmatrix} d &amp; b^\top \\ c &amp; A \end{pmatrix}^{-1}
    = \begin{pmatrix} d &amp; -c^\top \\ -b &amp; A^\top \end{pmatrix}
\end{aligned}\]
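
<p>The block relations and the simplified inverse can be checked numerically. The sketch below assumes NumPy and builds a test element from hypothetical helpers <code>boost</code> and <code>rotation</code>; the function name <code>inverse_O12</code> is likewise an assumption for this example.</p>

```python
import numpy as np

g = np.diag([1.0, -1.0, -1.0])


def boost(t):
    """Hyperbolic rotation mixing the first two coordinates."""
    ch, sh = np.cosh(t), np.sinh(t)
    return np.array([[ch, sh, 0.0], [sh, ch, 0.0], [0.0, 0.0, 1.0]])


def rotation(angle):
    """Ordinary rotation of the last two coordinates."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]])


def inverse_O12(L):
    """Inverse of L in O(1,2), assembled from its blocks without solving."""
    d, b = L[0, 0], L[0, 1:]      # top-left scalar and top-right row b^T
    c, A = L[1:, 0], L[1:, 1:]    # bottom-left column and lower-right block
    L_inv = np.empty((3, 3))
    L_inv[0, 0] = d
    L_inv[0, 1:] = -c
    L_inv[1:, 0] = -b
    L_inv[1:, 1:] = A.T
    return L_inv


L = rotation(0.9) @ boost(0.7) @ rotation(-0.4)
d, b, c, A = L[0, 0], L[0, 1:], L[1:, 0], L[1:, 1:]
L_inv = inverse_O12(L)
```

<p>The same matrix can be written as \(g L^\top g\), which follows from \(L^\top g L = g\) and \(g^2 = I_3\), and gives a second way to check the result.</p>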

<h3 id="lie-algebra">Lie algebra</h3>

<p>The Lie algebra can be obtained by differentiating the condition \(L^\top g L = g\) at the identity.
Computing this, we obtain</p>

\[\begin{aligned}
    \mathfrak{o}(1,2)
    &amp;= \left\{
        U \in \mathbb{R}^{3\times 3}
        \; \middle| \;
        U^\top g + g U = 0
    \right\}.
\end{aligned}\]

<p>We will choose a basis for this Lie algebra in order to map between it and \(\mathbb{R}^3\). To do this, let us decompose a Lie algebra element into its parts and check the condition. We have</p>

\[\begin{aligned}
    0_{3\times 3} &amp;= U^\top g + g U \\
    % ------
    &amp;= \begin{pmatrix} U_{11} &amp; U_{12} \\ U_{21} &amp; U_{22} \end{pmatrix}^\top
    \begin{pmatrix} 1 &amp; 0_{1\times 2} \\ 0_{2\times 1} &amp; -I_2 \end{pmatrix}
    + \begin{pmatrix} 1 &amp; 0_{1\times 2} \\ 0_{2\times 1} &amp; -I_2 \end{pmatrix}
    \begin{pmatrix} U_{11} &amp; U_{12} \\ U_{21} &amp; U_{22} \end{pmatrix} \\
    % ------
    &amp;= \begin{pmatrix} U_{11} &amp; -U_{21}^\top \\ U_{12}^\top &amp; -U_{22}^\top \end{pmatrix}
    + \begin{pmatrix} U_{11} &amp; U_{12} \\ -U_{21} &amp; -U_{22} \end{pmatrix} \\
    &amp;= \begin{pmatrix} 2 U_{11} &amp; U_{12} - U_{21}^\top \\ U_{12}^\top - U_{21} &amp; -U_{22}-U_{22}^\top \end{pmatrix},
\end{aligned}\]

<p>where \(U_{11} \in \mathbb{R}\), \(U_{12} \in \mathbb{R}^{1\times 2}\), \(U_{21} \in \mathbb{R}^{2\times 1}\simeq \mathbb{R}^2\), and \(U_{22} \in \mathbb{R}^{2\times 2}\).
It follows that</p>

\[\begin{aligned}
    U_{11} &amp;= 0, &amp;
    U_{21} &amp;= U_{12}^\top \in \mathbb{R}^2, &amp;
    U_{22} &amp;= - U_{22}^\top \in \mathfrak{so}(2).
\end{aligned}\]

<p>In other words, there are three degrees of freedom, and we define the basis of \(\mathfrak{o}(1,2)\) to be</p>

\[\begin{aligned}
    E_1 &amp;:= \begin{pmatrix}
        0 &amp; 0 &amp; 0 \\ 0 &amp; 0 &amp; -1 \\ 0 &amp; 1 &amp; 0
    \end{pmatrix}, &amp;
    E_2 &amp;:= \begin{pmatrix}
        0 &amp; 1 &amp; 0 \\ 1 &amp; 0 &amp; 0 \\ 0 &amp; 0 &amp; 0
    \end{pmatrix}, &amp;
    E_3 &amp;:= \begin{pmatrix}
        0 &amp; 0 &amp; 1 \\ 0 &amp; 0 &amp; 0 \\ 1 &amp; 0 &amp; 0
    \end{pmatrix}.
\end{aligned}\]

<p>For convenience, we will also adopt the notation</p>

\[\omega^\times = \begin{pmatrix}
        0 &amp; -\omega \\ \omega &amp; 0
\end{pmatrix}\]

<p>for any \(\omega \in \mathbb{R}\).
Then we may write the wedge map \(\cdot^\wedge : \mathbb{R}^3 \to \mathfrak{o}(1,2)\) as</p>

\[\begin{aligned}
        \begin{pmatrix}
        \omega \\ u_1 \\ u_2
    \end{pmatrix}^\wedge &amp;:= \omega E_1 + u_1 E_2 + u_2 E_3
    = \begin{pmatrix} 0 &amp; u^\top \\ u &amp; \omega^\times \end{pmatrix} \in \mathbb{R}^{3\times 3},
\end{aligned}\]

<p>where \(u = (u_1,u_2) \in \mathbb{R}^2\).</p>

<p>The inverse of the wedge map is the ‘vee’ map \(\cdot^\vee : \mathfrak{o}(1,2) \to \mathbb{R}^3\).</p>
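
<p>As a quick sanity check, the wedge and vee maps can be sketched in a few lines of plain Python. The function names (<code>wedge</code>, <code>vee</code>, <code>algebra_residual</code>) are illustrative assumptions rather than pylie's API, and matrices are stored as nested lists to keep the sketch free of dependencies.</p>

```python
# Metric preserved by O(1,2): J = diag(1, -1, -1).
J = [[1.0, 0.0, 0.0], [0.0, -1.0, 0.0], [0.0, 0.0, -1.0]]

def wedge(omega, u1, u2):
    """Map (omega, u1, u2) to omega*E1 + u1*E2 + u2*E3."""
    return [[0.0, u1, u2],
            [u1, 0.0, -omega],
            [u2, omega, 0.0]]

def vee(U):
    """Inverse of wedge: read the coordinates back out of the matrix."""
    return (U[2][1], U[0][1], U[0][2])

def matmul(X, Y):
    """Product of two small matrices stored as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def algebra_residual(U):
    """Largest absolute entry of U^T J + J U, which vanishes on o(1,2)."""
    Ut = [list(row) for row in zip(*U)]
    left, right = matmul(Ut, J), matmul(J, U)
    return max(abs(left[i][j] + right[i][j])
               for i in range(3) for j in range(3))

U = wedge(0.3, -1.2, 0.5)
print(vee(U))               # the coordinates passed to wedge
print(algebra_residual(U))  # zero up to rounding
```

<p>The residual check is simply the defining condition of the Lie algebra evaluated numerically, so it should vanish for every output of <code>wedge</code>.</p>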

<h4 id="adjoint-and-lie-bracket">Adjoint and Lie bracket</h4>

<p>With the wedge and vee operators defined, we can get to work computing matrix representations of the adjoint operators and the Lie bracket.
The adjoint operator is computed by</p>

\[\begin{aligned}
\mathrm{Ad}_{L}(U)
&amp;= L U L^{-1} \\
% ------------
&amp;= \begin{pmatrix} d &amp; b^\top \\ c &amp; A \end{pmatrix}
\begin{pmatrix} 0 &amp; u^\top \\ u &amp; \omega^\times \end{pmatrix}
\begin{pmatrix} d &amp; b^\top \\ c &amp; A \end{pmatrix}^{-1} \\
% ------------
&amp;= \begin{pmatrix} b^\top u &amp; d u^\top + b^\top \omega^\times \\ 
A u &amp; c u^\top + A \omega^\times \end{pmatrix}
\begin{pmatrix} d &amp; -c^\top \\ -b &amp; A^\top \end{pmatrix} \\
% ------------
&amp;= \begin{pmatrix} d b^\top u - (d u^\top + b^\top \omega^\times) b &amp;
- b^\top u c^\top + (d u^\top + b^\top \omega^\times) A^\top \\
d A u - (c u^\top + A \omega^\times) b &amp;
-A u c^\top + (c u^\top + A \omega^\times) A^\top \end{pmatrix} \\
% ------------
&amp;= \begin{pmatrix} d b^\top u - d u^\top b - b^\top \omega^\times b &amp;
- u^\top b c^\top + d u^\top A^\top + b^\top \omega^\times A^\top \\
d A u - c u^\top b - A \omega^\times b &amp;
-A u c^\top + c u^\top A^\top + A \omega^\times A^\top \end{pmatrix} \\
% ------------
&amp;= \begin{pmatrix} 0 &amp;
u^\top (d A^\top - b c^\top) + \omega b^\top 1^\times A^\top \\
(d A - c b^\top) u - A 1^\times b \omega &amp;
c (Au)^\top - (A u) c^\top + A 1^\times A^\top \omega \end{pmatrix} \\
% ------------
&amp;= \begin{pmatrix} 0 &amp;
((d A - c b^\top) u - A 1^\times b \omega)^\top \\
(d A - c b^\top) u - A 1^\times b \omega &amp;
1^\times c^\top 1^\times A u - \frac{1}{2}\mathrm{tr}(1^\times A 1^\times A^\top) 1^\times \omega \end{pmatrix} \\
\end{aligned}\]

<p>We obtain a matrix expression for \(\mathrm{Ad}_L\) by using the wedge and vee isomorphisms.
From the computations above, we obtain the Adjoint matrix by extracting the coefficients of \(\omega,u\):</p>

\[\begin{aligned}
\mathrm{Ad}_{L}^\vee
&amp;= \begin{pmatrix}
    -\frac{1}{2}\mathrm{tr}(1^\times A 1^\times A^\top) &amp;
    c^\top 1^\times A \\
    -A 1^\times b &amp;
    d A - c b^\top
\end{pmatrix}
\end{aligned}\]

<p>Differentiating this matrix in terms of the Lie group element \(L\) at the identity provides the “little” adjoint matrix (and the Lie bracket)</p>

\[\begin{aligned}
\mathrm{ad}_{U}^\vee
&amp;= \begin{pmatrix}
    0 &amp;
    u^\top 1^\times \\
    -1^\times u &amp;
    \omega^\times
\end{pmatrix}, \\
[U_1, U_2] &amp;= \mathrm{ad}_{U_1}(U_2).
\end{aligned}\]
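
<p>The little adjoint matrix can be checked against the matrix commutator directly. The following is a minimal sketch, assuming the coordinate order \((\omega, u_1, u_2)\) used above; the helper names are hypothetical and the block form is expanded using \(u^\top 1^\times = (u_2, -u_1)\) and \(-1^\times u = (u_2, -u_1)^\top\).</p>

```python
def wedge(x):
    """Coordinates (omega, u1, u2) to a matrix in o(1,2)."""
    omega, u1, u2 = x
    return [[0.0, u1, u2], [u1, 0.0, -omega], [u2, omega, 0.0]]

def vee(U):
    """Inverse of wedge."""
    return (U[2][1], U[0][1], U[0][2])

def matmul(X, Y):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def ad_matrix(x):
    """Matrix of ad_U acting on (omega, u1, u2) coordinates."""
    omega, u1, u2 = x
    return [[0.0, u2, -u1],
            [u2, 0.0, -omega],
            [-u1, omega, 0.0]]

def bracket(x, y):
    """Coordinates of the commutator of wedge(x) and wedge(y)."""
    U, V = wedge(x), wedge(y)
    UV, VU = matmul(U, V), matmul(V, U)
    return vee([[UV[i][j] - VU[i][j] for j in range(3)] for i in range(3)])

def apply(M, y):
    """Matrix-vector product for a 3x3 matrix and a 3-tuple."""
    return tuple(sum(M[i][k] * y[k] for k in range(3)) for i in range(3))

x, y = (0.7, -0.2, 1.1), (-0.4, 0.9, 0.3)
print(apply(ad_matrix(x), y))  # ad matrix applied to the coordinates of y
print(bracket(x, y))           # commutator computed from the matrices
```

<p>The two printed tuples should agree up to rounding, which is a convenient regression test for the signs in the matrix above.</p>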

<h4 id="exponential-and-logarithm">Exponential and Logarithm</h4>

<p>As in any matrix Lie group, the exponential is given by the matrix exponential.
However, this is an expensive computation, so we will try to find a simplified form that can be computed quickly.
Let \(U \in \mathfrak{o}(1,2)\).
Looking at the characteristic polynomial of \(U\), we have</p>

\[\begin{aligned}
    p(s) &amp;:= \det(U - sI_3) \\
    &amp;= \det \begin{pmatrix} -s &amp; u^\top \\ u &amp; \omega^\times - s I_2 \end{pmatrix} \\
    % ----------
    &amp;= \det \begin{pmatrix} -s &amp; u_1 &amp; u_2 \\ 
    u_1 &amp; -s &amp; -\omega \\
    u_2 &amp; \omega &amp; -s\end{pmatrix} \\
    % ----------
    &amp;= -s \det \begin{pmatrix} -s &amp; -\omega \\ \omega &amp; -s \end{pmatrix}
    -u_1 \det \begin{pmatrix} u_1 &amp; -\omega \\ u_2 &amp; -s \end{pmatrix}
    +u_2 \det \begin{pmatrix} u_1 &amp; -s \\ u_2 &amp; \omega \end{pmatrix} \\
    % ----------
    &amp;= -s (s^2 + \omega^2)
    -u_1 (-u_1 s + \omega u_2)
    +u_2 (u_1 \omega + s u_2) \\
    % ----------
    &amp;= -s^3 - s\omega^2
    +u_1^2 s - \omega u_1 u_2
    + \omega u_1 u_2 + s u_2^2 \\
    % ----------
    &amp;= -s^3 - s\omega^2
    +s u_1^2 + s u_2^2 \\
    % ----------
    &amp;= -s^3 + s(u_1^2 + u_2^2 -\omega^2).
\end{aligned}\]

<p>From here, we can apply the Cayley-Hamilton theorem, which says that every matrix satisfies its own characteristic equation. In other words,</p>

\[\begin{gathered}
p(U) = -U^3 + U (u_1^2 + u_2^2 -\omega^2) = 0_{3\times 3}, \\
U^3 = U (u_1^2 + u_2^2 -\omega^2).
\end{gathered}\]

<p>This is a very useful relationship in simplifying the exponential map.
Let 
\(q^2 = u_1^2 + u_2^2 -\omega^2,\)
noting that \(q\) may be either real or pure imaginary.
Then
\(U^3 = q^2 U,\)
and therefore</p>

\[\begin{aligned}
U^{2k+1} &amp;= q^2 U^{2(k-1) + 1} = \cdots
= q^{2k} U \\
U^{2k} &amp;= q^2 U^{2(k-1)} = \cdots = q^{2(k-1)} U^2,
\end{aligned}\]

<p>for all \(k =1,2,3,\ldots\).
We are now ready to derive the exponential formula.
We have</p>

\[\begin{aligned}
\exp(U) 
&amp;= \sum_{n=0}^\infty \frac{1}{n!} U^n \\
% ----------------
&amp;= I_3
+ \sum_{k=0}^\infty \frac{1}{(2k+1)!} U^{2k+1}
+ \sum_{k=1}^\infty \frac{1}{(2k)!} U^{2k} \\
% ----------------
&amp;= I_3
+ \sum_{k=0}^\infty \frac{1}{(2k+1)!} q^{2k} U
+ \sum_{k=1}^\infty \frac{1}{(2k)!} q^{2(k-1)} U^2 \\
% ----------------
&amp;= I_3
+ \frac{1}{q}\left( \sum_{k=0}^\infty \frac{ q^{2k+1}}{(2k+1)!} \right) U
+ \frac{1}{q^2}\left(-1 + \sum_{k=0}^\infty \frac{1}{(2k)!} q^{2k} \right) U^2 \\
% ----------------
&amp;= I_3
+ \frac{\sinh(q)}{q} U
+ \frac{\cosh(q) - 1}{q^2} U^2.
\end{aligned}\]

<p>This is already a very nice formula, but we need to be careful with the possibility that \(q=0\) or \(q\) is imaginary.
If \(q^2 &gt; 0\) then the formula can be applied directly as shown.
If \(q^2 = 0\) then the formula simplifies to
\(\exp(U) = I_3 + U + \frac{1}{2} U^2.\)
If \(q^2 &lt;0\) then we may use the identities \(\sinh(x) = - i \sin(ix)\) and \(\cosh(x) = \cos(ix)\) to obtain</p>

\[\begin{aligned}
\exp(U) &amp;= I_3
+ \frac{\sinh(q)}{q} U
+ \frac{\cosh(q) - 1}{q^2} U^2 \\
% ----------------
&amp;= I_3
+ \frac{-i\sin(i q)}{q} U
+ \frac{\cos(i q) - 1}{q^2} U^2 \\
% ----------------
&amp;= I_3
+ \frac{\sin(i q)}{i q} U
+ \frac{\cos(i q) - 1}{q^2} U^2 \\
% ----------------
&amp;= I_3
+ \frac{\sin(\sqrt{-q^2})}{\sqrt{-q^2}} U
+ \frac{\cos(\sqrt{-q^2}) - 1}{q^2} U^2.
\end{aligned}\]
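
<p>The closed form can be sketched in Python as follows. This is an illustration under a few assumptions rather than a reference implementation: the names <code>exp_o12</code> and <code>exp_series</code> are hypothetical, the complex square root from <code>cmath</code> is used so that real and imaginary \(q\) are handled by the same two lines, and a two-term series is substituted near \(q = 0\) as a numerical safeguard that is not part of the derivation.</p>

```python
import cmath
import math

def wedge(omega, u1, u2):
    """Coordinates (omega, u1, u2) to a matrix in o(1,2)."""
    return [[0.0, u1, u2], [u1, 0.0, -omega], [u2, omega, 0.0]]

def matmul(X, Y):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def exp_o12(omega, u1, u2):
    """Closed-form exponential I + t1 U + t2 U^2 on o(1,2)."""
    U = wedge(omega, u1, u2)
    U2 = matmul(U, U)
    q2 = u1 * u1 + u2 * u2 - omega * omega
    if math.isclose(q2, 0.0, abs_tol=1e-6):
        # Near q = 0, keep the first two terms of each power series.
        t1, t2 = 1.0 + q2 / 6.0, 0.5 + q2 / 24.0
    else:
        # q is real when q2 is positive and purely imaginary when q2 is
        # negative; in both cases the two coefficients are real numbers.
        q = cmath.sqrt(q2)
        t1 = (cmath.sinh(q) / q).real
        t2 = ((cmath.cosh(q) - 1.0) / q2).real
    return [[float(i == j) + t1 * U[i][j] + t2 * U2[i][j]
             for j in range(3)] for i in range(3)]

def exp_series(U, terms=40):
    """Truncated power series for exp(U), used only as a cross-check."""
    total = [[float(i == j) for j in range(3)] for i in range(3)]
    term = [row[:] for row in total]
    for k in range(1, terms):
        term = [[v / k for v in row] for row in matmul(term, U)]
        total = [[total[i][j] + term[i][j] for j in range(3)]
                 for i in range(3)]
    return total

print(exp_o12(0.3, 1.0, -0.5))             # closed form
print(exp_series(wedge(0.3, 1.0, -0.5)))   # power series, for comparison
```

<p>Comparing against the truncated series for one input from each of the three cases is a cheap way to catch sign errors in the coefficients.</p>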

<p>Finding an expression for the logarithm is a matter of inverting the exponential.
We therefore start by supposing that \(L = \exp(U)\) and then solve for the components of \(U\) (implicitly assuming that there is such a solution!).
Let \(t_1 = \frac{\sinh(q)}{q}\) and \(t_2 = \frac{\cosh(q)-1}{q^2}\) as shorthands.
Then,</p>

\[\begin{aligned}
L &amp;= \exp(U), \\
\begin{pmatrix} d &amp; b^\top \\ c &amp; A \end{pmatrix}
&amp;= \begin{pmatrix} 1 &amp; 0_{1\times 2} \\ 0_{2\times 1} &amp; I_2 \end{pmatrix}
+ t_1 \begin{pmatrix} 0 &amp; u^\top \\ u &amp; \omega^\times \end{pmatrix}
+ t_2 \begin{pmatrix} u^\top u &amp; u^\top \omega^\times \\ \omega^\times u &amp; u u^\top + (\omega^\times)^2 \end{pmatrix}.
\end{aligned}\]

<p>Taking the symmetric projection of both sides, i.e. mapping \(M \mapsto \frac{1}{2} (M + M^\top)\) yields</p>

\[\begin{aligned}
\frac{1}{2}\begin{pmatrix} 2d &amp; (b+c)^\top \\ b+c &amp; A+A^\top \end{pmatrix}
&amp;= \begin{pmatrix} 1 &amp; 0_{1\times 2} \\ 0_{2\times 1} &amp; I_2 \end{pmatrix}
+ t_1 \begin{pmatrix} 0 &amp; u^\top \\ u &amp; 0_{2\times 2} \end{pmatrix}
+ t_2 \begin{pmatrix} u^\top u &amp; 0_{1\times 2} \\ 0_{2\times 1} &amp; u u^\top + (\omega^\times)^2 \end{pmatrix}.
\end{aligned}\]

<p>Likewise, taking the antisymmetric projection \(M \mapsto \frac{1}{2} (M - M^\top)\) yields</p>

\[\begin{aligned}
\frac{1}{2}\begin{pmatrix} 0 &amp; (b-c)^\top \\ c-b &amp; A-A^\top \end{pmatrix}
&amp;= t_1 \begin{pmatrix} 0 &amp; 0_{1\times 2} \\  0_{2\times 1} &amp; \omega^\times \end{pmatrix}
+ t_2 \begin{pmatrix} 0 &amp; u^\top \omega^\times \\ \omega^\times u &amp; 0_{2\times 2} \end{pmatrix}.
\end{aligned}\]

<p>Extracting the bottom-left component of the symmetric projection equation, we can derive</p>

\[\begin{aligned}
\frac{1}{2}(b+c) &amp;= t_1 u, \\
\left\vert \frac{b+c}{2} \right\vert^2  &amp;= \vert u \vert^2 t_1^2.
\end{aligned}\]

<p>Extracting the bottom-right component of the antisymmetric projection equation, we also have</p>

\[\begin{aligned}
\frac{1}{2}(A-A^\top) &amp;= t_1 \omega^\times, \\
\left\vert \frac{1}{2}(A-A^\top) \right\vert^2 &amp;= t_1^2 \vert \omega^\times \vert^2, \\
\frac{1}{2}\left\vert \frac{1}{2}(A-A^\top) \right\vert^2 &amp;= t_1^2 \vert \omega \vert^2, \\
\frac{1}{4}( A_{12} - A_{21} )^2 &amp;= t_1^2 \omega^2.
\end{aligned}\]

<p>Recall now that \(q^2 = \vert u \vert^2 - \omega^2\), so combining these equations we have</p>

\[\begin{aligned}
\left\vert \frac{b+c}{2} \right\vert^2 - \frac{1}{4}( A_{12} - A_{21} )^2 
&amp;= t_1^2 \vert u \vert^2 - t_1^2 \omega^2
= t_1^2 q^2
= \sinh(q)^2.
\end{aligned}\]

<p>This now leads us to a solution for \(q\). If we allow \(q\) to be imaginary then</p>

\[\begin{aligned}
q = \sinh^{-1}\left( \sqrt{ \left\vert \frac{b+c}{2} \right\vert^2 - \frac{1}{4}( A_{12} - A_{21} )^2 } \right)
\end{aligned}\]

<p>If we do not want to use imaginary numbers, then we need to consider the case that the quantity under the square root is negative. In that case, we use the identity \(\sinh(x) = -i\sin(ix)\) to obtain</p>

\[\begin{aligned}
\sinh(q)^2 &amp;= \left\vert \frac{b+c}{2} \right\vert^2 - \frac{1}{4}( A_{12} - A_{21} )^2, \\
(-i\sin(iq))^2 &amp;= \left\vert \frac{b+c}{2} \right\vert^2 - \frac{1}{4}( A_{12} - A_{21} )^2, \\
(\sin(iq))^2 &amp;= -\left( \left\vert \frac{b+c}{2} \right\vert^2 - \frac{1}{4}( A_{12} - A_{21} )^2 \right), \\
iq &amp;= \sin^{-1} \left( \sqrt{-\left( \left\vert \frac{b+c}{2} \right\vert^2 - \frac{1}{4}( A_{12} - A_{21} )^2 \right)} \right).
\end{aligned}\]

<p>The important thing is that we are able to recover \(q\) or at least \(iq\), from which we can then immediately compute \(t_1\).
Once we have \(t_1\), then we simply use the formulas obtained previously to compute</p>

\[\begin{aligned}
u &amp;= \frac{b+c}{2 t_1}, &amp;
\omega^\times &amp;= \frac{A-A^\top}{2 t_1}, &amp;
\omega &amp;= \frac{A_{21}-A_{12}}{2 t_1}.
\end{aligned}\]

<p>This completes the computation of the logarithm.</p>
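
<p>The steps above can be sketched as a round trip in Python. The names <code>exp_o12</code> and <code>log_o12</code> are hypothetical, and the sketch assumes the principal branches of the inverse functions: \(\sinh^{-1}\) recovers a real \(q \geq 0\), while in the imaginary case \(\sin^{-1}\) only recovers \(\vert q \vert \leq \pi/2\), so larger rotation angles are not handled here.</p>

```python
import cmath
import math

def exp_o12(omega, u1, u2):
    """exp(U) = I + t1 U + t2 U^2, with q allowed to be imaginary."""
    U = [[0.0, u1, u2], [u1, 0.0, -omega], [u2, omega, 0.0]]
    U2 = [[sum(U[i][k] * U[k][j] for k in range(3)) for j in range(3)]
          for i in range(3)]
    q2 = u1 * u1 + u2 * u2 - omega * omega
    if math.isclose(q2, 0.0, abs_tol=1e-6):
        t1, t2 = 1.0 + q2 / 6.0, 0.5 + q2 / 24.0   # series near q = 0
    else:
        q = cmath.sqrt(q2)
        t1 = (cmath.sinh(q) / q).real
        t2 = ((cmath.cosh(q) - 1.0) / q2).real
    return [[float(i == j) + t1 * U[i][j] + t2 * U2[i][j]
             for j in range(3)] for i in range(3)]

def log_o12(L):
    """Recover (omega, u1, u2) from L = exp(U) on the principal branch."""
    # Symmetric part of the off-diagonal blocks: (b + c) / 2 = t1 * u.
    sym = ((L[0][1] + L[1][0]) / 2.0, (L[0][2] + L[2][0]) / 2.0)
    # Antisymmetric part of A: (A21 - A12) / 2 = t1 * omega.
    skew = (L[2][1] - L[1][2]) / 2.0
    # Combining the two gives sinh(q)^2, which may be negative.
    s2 = sym[0] ** 2 + sym[1] ** 2 - skew ** 2
    if s2 == 0.0:
        t1 = 1.0
    else:
        s = cmath.sqrt(s2)               # sinh(q), possibly imaginary
        t1 = (s / cmath.asinh(s)).real   # sinh(q) / q
    return (skew / t1, sym[0] / t1, sym[1] / t1)

x = (0.3, 1.0, -0.5)
print(log_o12(exp_o12(*x)))  # should reproduce x up to rounding
```

<p>Using the complex inverse hyperbolic sine keeps the two cases in one code path; splitting into real \(\sinh^{-1}\) and \(\sin^{-1}\) branches, as in the derivation, is equally valid.</p>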

<h3 id="conclusion">Conclusion</h3>

<p>The indefinite orthogonal group is not often seen in robotics or control theory, but is highly relevant to physics.
It is also a good chance to explore Lie group theory with a less familiar group, and particularly a group that is not overly similar to \(\mathbf{SO}(3)\) or \(\mathbf{SE}(3)\).
The formulas presented in this post have all been added to my Python Lie group library <a href="https://github.com/pvangoor/pylie">pylie</a> as well, where they have been tested for correctness.
If you spot any mistakes in this post or have any feedback, I would love to hear it!</p>]]></content><author><name></name></author><category term="Mathematics" /><summary type="html"><![CDATA[]]></summary></entry><entry><title type="html">The 2D Special Linear Group SL(2)</title><link href="/mathematics/2024/05/05/special_linear_sl2.html" rel="alternate" type="text/html" title="The 2D Special Linear Group SL(2)" /><published>2024-05-05T00:00:00+00:00</published><updated>2024-05-05T00:00:00+00:00</updated><id>/mathematics/2024/05/05/special_linear_sl2</id><content type="html" xml:base="/mathematics/2024/05/05/special_linear_sl2.html"><![CDATA[<!-- https://talk.jekyllrb.com/t/jekyll-and-mathjax/5514 -->
<script type="text/x-mathjax-config">
    MathJax.Hub.Config({
        jax: ["input/TeX","input/MathML","output/SVG", "output/CommonHTML"],
    extensions: ["tex2jax.js","mml2jax.js","MathMenu.js","MathZoom.js", "CHTML-preview.js"],
    TeX: {
      extensions: ["AMSmath.js","AMSsymbols.js","noErrors.js","noUndefined.js"]
    },
      tex2jax: {
          inlineMath: [ ['$','$'], ["\\(","\\)"] ],
          displayMath: [ ['$$','$$'], ["\\[","\\]"] ],
          processEscapes: true,
          processEnvironments: true
        },
        "HTML-CSS": { availableFonts: ["TeX"] }
      });
</script>

<script id="MathJax-script" async="" src="https://cdn.jsdelivr.net/npm/mathjax@3/es5/tex-chtml.js"></script>

<h3 id="introduction">Introduction</h3>

<p>The special linear group \(\mathbf{SL}(n)\) is the set of \(n\times n\) matrices with determinant \(1\):</p>

\[\begin{aligned}
    \mathbf{SL}(n) = \left\{
        H \in \mathbb{R}^{n\times n}
        \; \middle| \;
        \det(H) = 1
    \right\}
\end{aligned}\]

<p>Geometrically, these matrices represent linear transformations of \(n\)-dimensional space that preserve (oriented) volume.
In this post, we will focus solely on the case \(n=2\).
This group, \(\mathbf{SL}(2)\) thus represents the linear transformations of 2D space that preserve oriented area.
Using the definition of the determinant, the 2D Special Linear group is</p>

\[\begin{aligned}
    \mathbf{SL}(2) &amp;:= \left\{
        H = \begin{pmatrix}
            a &amp; b \\ c &amp; d
        \end{pmatrix} \in \mathbb{R}^{2\times 2}
        \; \middle| \;
        ad-bc=1
    \right\}.
\end{aligned}\]

<p>The group properties are easily verified. The identity is \(I_2 \in \mathbf{SL}(2)\) and the product and inverse are just the matrix product and inverse, which satisfy</p>

\[\begin{aligned}
    \det(H_1 H_2) &amp;= \det(H_1) \det(H_2) = 1, \\
    \det(H^{-1}) &amp;= \det(H)^{-1} = 1,
\end{aligned}\]

<p>for all \(H,H_1,H_2 \in \mathbf{SL}(2)\).
One useful consequence is that the inverse can be expressed simply as</p>

\[\begin{aligned}
    H^{-1} &amp;= 
    \begin{pmatrix}
        a &amp; b \\ c &amp; d
    \end{pmatrix}^{-1}
    = \frac{1}{\det(H)}\begin{pmatrix}
        d &amp; -b \\ -c &amp; a
    \end{pmatrix}
    = \begin{pmatrix}
        d &amp; -b \\ -c &amp; a
    \end{pmatrix}
\end{aligned}\]

<p>The conjugation can thus be written as</p>

\[\begin{aligned}
    \mathrm{Cn}_{H_1}(H_2)
    &amp;=
    H_1 H_2 H_1^{-1} \\
    &amp;=
    \begin{pmatrix}
        a_1 &amp; b_1 \\ c_1 &amp; d_1
    \end{pmatrix}
    \begin{pmatrix}
        a_2 &amp; b_2 \\ c_2 &amp; d_2
    \end{pmatrix}
    \begin{pmatrix}
        a_1 &amp; b_1 \\ c_1 &amp; d_1
    \end{pmatrix}^{-1} \\
    &amp;=
    \begin{pmatrix}
        a_1 a_2 + b_1 c_2 &amp;
        a_1 b_2 + b_1 d_2 \\
        c_1 a_2 + d_1 c_2 &amp;
        c_1 b_2 + d_1 d_2
    \end{pmatrix}
    \begin{pmatrix}
        d_1 &amp; -b_1 \\ -c_1 &amp; a_1
    \end{pmatrix} \\
    &amp;=
    \begin{pmatrix}
        (a_1 a_2 + b_1 c_2)d_1 - (a_1 b_2 + b_1 d_2) c_1&amp;
        -(a_1 a_2 + b_1 c_2)b_1 + (a_1 b_2 + b_1 d_2)a_1\\
        (c_1 a_2 + d_1 c_2)d_1 - (c_1 b_2 + d_1 d_2) c_1&amp;
        -(c_1 a_2 + d_1 c_2)b_1 + (c_1 b_2 + d_1 d_2)a_1
    \end{pmatrix}.
\end{aligned}\]

<p>Realistically, it is probably simpler just to use matrix operations if you ever need to compute this.</p>

<h3 id="lie-algebra">Lie algebra</h3>

<p>The Lie algebra can be obtained by differentiating the condition \(\det(H) = 1\) at the identity.
It turns out that the Lie algebra is exactly the matrices \(U \in \mathbb{R}^{2\times 2}\) with trace zero.
We write</p>

\[\begin{aligned}
    \mathfrak{sl}(2)
    &amp;= \left\{
        U = \begin{pmatrix}
            u_1 &amp; u_2 \\ u_3 &amp; -u_1
        \end{pmatrix} \in \mathbb{R}^{2\times 2}
        \; \middle| \;
        u_1,u_2,u_3 \in \mathbb{R}
    \right\}.
\end{aligned}\]

<p>This Lie algebra is a vector space just like any other Lie algebra, and we can relate it to \(\mathbb{R}^3\) by choosing a basis, or equivalently by defining a “wedge” map \(\cdot^\wedge : \mathbb{R}^3 \to \mathfrak{sl}(2)\).
This map is required to be an isomorphism in the vector space sense, so it must be linear and invertible.
We have already suggested the wedge map in our definition of the Lie algebra, but formally,</p>

\[\begin{aligned}
    \begin{pmatrix}
        u_1 \\ u_2 \\ u_3
    \end{pmatrix}^\wedge
    &amp;:= \begin{pmatrix}
        u_1 &amp; u_2 \\ u_3 &amp; -u_1
    \end{pmatrix},
\end{aligned}\]

<p>and the ‘vee’ map \(\cdot^\vee : \mathfrak{sl}(2) \to \mathbb{R}^3\) is simply the inverse.
This choice defines a basis of \(\mathfrak{sl}(2)\) by
\(\begin{aligned}
    E_1 &amp;:= \begin{pmatrix}
        1 &amp; 0 \\ 0 &amp; -1
    \end{pmatrix}, &amp;
    E_2 &amp;:= \begin{pmatrix}
        0 &amp; 1 \\ 0 &amp; 0
    \end{pmatrix}, &amp;
    E_3 &amp;:= \begin{pmatrix}
        0 &amp; 0 \\ 1 &amp; 0
    \end{pmatrix}.
\end{aligned}\)</p>

<h4 id="adjoint-and-lie-bracket">Adjoint and Lie bracket</h4>

<p>Using the wedge and vee operators, we can obtain expressions for the Adjoint operator and Lie bracket.
While the expression for the conjugation operation was a bit complicated, the derivative (which gives the Adjoint operator) is a bit simpler.</p>

\[\begin{aligned}
\mathrm{Ad}_{H}(U)
&amp;= \mathrm{D}_Z |_{I_2} \mathrm{Cn}_{H}(Z)[U] \\
&amp;= \begin{pmatrix}
        (a u_1 + b u_3)d - (a u_2 + b (-u_1)) c&amp;
        -(a u_1 + b u_3)b + (a u_2 + b (-u_1))a\\
        (c u_1 + d u_3)d - (c u_2 + d (-u_1)) c&amp;
        -(c u_1 + d u_3)b + (c u_2 + d (-u_1))a
    \end{pmatrix} \\
&amp;= \begin{pmatrix}
    a d u_1 + b d u_3 - a c u_2 + b c u_1&amp;
    - a b u_1 - b^2 u_3 + a^2 u_2 - a b u_1\\
    c d u_1 + d^2 u_3 - c^2 u_2 + c d u_1&amp;
    -b c u_1 - b d u_3 + a c u_2 - a d u_1
\end{pmatrix} \\
&amp;= \begin{pmatrix}
    (a d + b c) u_1 - a c u_2 + b d u_3&amp;
    - 2 a b u_1 + a^2 u_2 - b^2 u_3\\
    2 c d u_1 - c^2 u_2 + d^2 u_3&amp;
    -(ad + b c) u_1 + a c u_2 - b d u_3
\end{pmatrix} \\
&amp;= \begin{pmatrix}
    (2 b c + 1) u_1 - a c u_2 + b d u_3&amp;
    - 2 a b u_1 + a^2 u_2 - b^2 u_3\\
    2 c d u_1 - c^2 u_2 + d^2 u_3&amp;
    -(2 b c + 1) u_1 + a c u_2 - b d u_3
\end{pmatrix},
\end{aligned}\]

<p>where the last line follows from the fact that \(ad-bc = 1\), so \(ad = bc+1\).
We obtain a matrix expression for \(\mathrm{Ad}_H\) by using the wedge and vee isomorphisms.
From the computations above, we obtain the Adjoint matrix by extracting the coefficients of \(u_1,u_2,u_3\):</p>

\[\begin{aligned}
\mathrm{Ad}_{H}^\vee
&amp;= \begin{pmatrix}
    2 b c + 1 &amp; - a c &amp; b d \\
    - 2 a b &amp; a^2 &amp; - b^2 \\
     2 c d &amp; - c^2 &amp; d^2
\end{pmatrix}
\end{aligned}\]

<p>Differentiating this matrix in terms of the variable \(H\) at the identity provides the “little” adjoint matrix (and the Lie bracket)</p>

\[\begin{aligned}
\mathrm{ad}_{U}^\vee
&amp;= \begin{pmatrix}
    0 &amp; - u_3 &amp; u_2 \\
    - 2 u_2 &amp; 2 u_1 &amp; 0 \\
     2 u_3 &amp; 0 &amp; -2 u_1
\end{pmatrix}, \\
[U, V] &amp;= \mathrm{ad}_{U}(V)
= \begin{pmatrix}
    -u_3 v_2 + u_2 v_3 \\
    -2 u_2 v_1 + 2 u_1 v_2 \\
    2 u_3 v_1 - 2 u_1 v_3
\end{pmatrix}^\wedge.
\end{aligned}\]
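
<p>Both matrices are easy to verify numerically. The sketch below, with hypothetical helper names and nested lists standing in for matrices, compares the Adjoint matrix with \(H U H^{-1}\) and the little adjoint matrix with the commutator.</p>

```python
def wedge(u):
    """Coordinates (u1, u2, u3) to the trace-free matrix in sl(2)."""
    u1, u2, u3 = u
    return [[u1, u2], [u3, -u1]]

def vee(U):
    """Inverse of wedge."""
    return (U[0][0], U[0][1], U[1][0])

def matmul(X, Y):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def apply(M, u):
    """Matrix-vector product for a 3x3 matrix and a 3-tuple."""
    return tuple(sum(M[i][k] * u[k] for k in range(3)) for i in range(3))

def Ad_matrix(H):
    """Adjoint matrix of H = [[a, b], [c, d]] with ad - bc = 1."""
    (a, b), (c, d) = H
    return [[2 * b * c + 1, -a * c, b * d],
            [-2 * a * b, a * a, -b * b],
            [2 * c * d, -c * c, d * d]]

def ad_matrix(u):
    """Little adjoint matrix of U = wedge(u)."""
    u1, u2, u3 = u
    return [[0.0, -u3, u2],
            [-2 * u2, 2 * u1, 0.0],
            [2 * u3, 0.0, -2 * u1]]

def Ad_direct(H, u):
    """vee(H U H^-1), using the closed-form inverse of H."""
    (a, b), (c, d) = H
    H_inv = [[d, -b], [-c, a]]
    return vee(matmul(matmul(H, wedge(u)), H_inv))

def bracket(u, v):
    """Coordinates of the commutator of wedge(u) and wedge(v)."""
    U, V = wedge(u), wedge(v)
    UV, VU = matmul(U, V), matmul(V, U)
    return vee([[UV[i][j] - VU[i][j] for j in range(2)] for i in range(2)])

H = [[2.0, 1.0], [3.0, 2.0]]   # determinant 2*2 - 1*3 = 1
u, v = (0.5, -1.0, 2.0), (1.5, 0.25, -0.75)
print(apply(Ad_matrix(H), u), Ad_direct(H, u))
print(apply(ad_matrix(u), v), bracket(u, v))
```

<p>Each printed pair should agree up to rounding.</p>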

<h4 id="exponential-and-logarithm">Exponential and Logarithm</h4>

<p>The exponential is given by the matrix exponential.
This, by definition, involves an infinite power series, but we will try to ‘hide’ this inside some standard functions (in this case, \(\sinh\) and \(\cosh\)).
Let \(U \in \mathfrak{sl}(2)\). 
Following the steps in my previous post on why the Lie exponential is not surjective, we obtain</p>

\[\begin{aligned}
    \exp(U)
    &amp;= \cosh(\sqrt{\theta})  I_2 + \frac{\sinh(\sqrt{\theta})}{\sqrt{\theta}} U,\\
    \theta &amp;= u_1^2 + u_2 u_3.
\end{aligned}\]

<p>There are two important things to note in this formula.
First, the fraction \(\frac{\sinh(\sqrt{\theta})}{\sqrt{\theta}}\) must be understood as a power series! Formally, we should define</p>

\[\mathrm{sinhc}(x) := \begin{cases}
    \frac{\sinh(x)}{x} &amp; \text{if } x \neq 0 \\
    1 &amp; \text{if } x = 0
\end{cases}\]

<p>and then write \(\exp(U) = \cosh(\sqrt{\theta}) I_2 + \mathrm{sinhc}(\sqrt{\theta}) U\).
The second tricky part of this formula is that the square root of \(\theta\) may not be real!
That is to say, the value of \(\theta = u_1^2 + u_2 u_3\) inside the square root may, in fact, be negative, meaning that \(\sqrt{\theta}\) is an imaginary number.
One way to deal with this is to simply choose one of the imaginary square roots, and apply \(\cosh\) and \(\sinh\) to the imaginary number.
This is okay, but may be a little unsatisfying or cumbersome to implement.
Fortunately, there is an alternative using the fact that
\(\begin{aligned}
\sinh(x) &amp;= -i\sin(ix), &amp;
\cosh(x) &amp;= \cos(ix)
\end{aligned}\)
for all \(x \in \mathbb{C}\).
Suppose that \(\theta &lt; 0\). Then,</p>

\[\begin{aligned}
    \frac{\sinh(\sqrt{\theta})}{\sqrt{\theta}}
    &amp;= \frac{-i\sin(i\sqrt{\theta})}{\sqrt{\theta}}
    = \frac{-i\sin(i^2\sqrt{-\theta})}{i\sqrt{-\theta}}
    % = \frac{-\sin(-\sqrt{-\theta})}{\sqrt{-\theta}}
    = \frac{\sin(\sqrt{-\theta})}{\sqrt{-\theta}}, \\
    \cosh(\sqrt{\theta})
    &amp;= \cos(i \sqrt{\theta})
    = \cos(\sqrt{-\theta})
\end{aligned}\]

<p>This leads to the following final expression for the exponential.
The exponential of \(U\) is given by</p>

\[\begin{aligned}
\exp(U) &amp;= \begin{cases}
    \cosh(\sqrt{\theta}) I_2 + \frac{\sinh(\sqrt{\theta})}{\sqrt{\theta}} U &amp; \text{if } \theta &gt; 0 \\
    \cos(\sqrt{-\theta}) I_2 + \frac{\sin(\sqrt{-\theta})}{\sqrt{-\theta}}U &amp; \text{if } \theta &lt; 0 \\
    I_2 + U &amp; \text{if } \theta = 0
\end{cases}\\
\text{where } \theta &amp;:= u_1^2 + u_2 u_3.
\end{aligned}\]

<p>Note that we got rid of the \(\mathrm{sinhc}\) by specifying the third case.</p>
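
<p>As an illustration, here is a minimal Python sketch of the exponential. It takes the first route mentioned above, choosing a complex square root of \(\theta\) with <code>cmath</code>, so the \(\theta &gt; 0\) and \(\theta &lt; 0\) cases share one code path; the names <code>exp_sl2</code> and <code>det2</code> are assumptions made for this example.</p>

```python
import cmath
import math

def exp_sl2(u1, u2, u3):
    """Exponential of U = [[u1, u2], [u3, -u1]] in closed form."""
    theta = u1 * u1 + u2 * u3
    if theta == 0.0:
        c, s = 1.0, 1.0            # the theta = 0 case: exp(U) = I + U
    else:
        r = cmath.sqrt(theta)      # real or purely imaginary
        c = cmath.cosh(r).real     # cosh or cos, depending on the sign
        s = (cmath.sinh(r) / r).real
    return [[c + s * u1, s * u2], [s * u3, c - s * u1]]

def det2(H):
    """Determinant of a 2x2 matrix stored as nested lists."""
    return H[0][0] * H[1][1] - H[0][1] * H[1][0]

for u in [(0.5, 1.0, 0.3), (0.2, 1.0, -1.0), (0.0, 1.0, 0.0)]:
    H = exp_sl2(*u)
    print(H, math.isclose(det2(H), 1.0))  # every result lies in SL(2)
```

<p>Checking that the determinant is one is a simple test that the output really lies in \(\mathbf{SL}(2)\).</p>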

<p>Reversing this formula tells us how to find the logarithm as well.
Consider a matrix \(H = \exp(U) \in \mathbf{SL}(2)\).
From the formula for the exponential, and the fact that \(\mathrm{trace}(U)=0\), we find that</p>

\[\mathrm{trace}(H) = \begin{cases}
    2\cosh(\sqrt{\theta}) &amp; \text{if } \theta &gt; 0 \\
    2\cos(\sqrt{-\theta}) &amp; \text{if } \theta &lt; 0 \\
    2 &amp; \text{if } \theta = 0
\end{cases}\]

<p>This makes it easy to distinguish which case we are dealing with, since \(\cosh(x) &gt; 1\) for all \(x \neq 0\) and \(\cos(x) &lt; 1\) for \(x \neq 2 k \pi\).
Let \(\alpha = \frac{1}{2} \mathrm{trace}(H)\). Then</p>

\[\theta = \begin{cases}
    \cosh^{-1}(\alpha)^2 &amp; \text{if } \alpha \geq 1 \\
    -\cos^{-1}(\alpha)^2 &amp; \text{if } \alpha &lt; 1
\end{cases}\]

<p>Now that we have recovered \(\theta\), we can recover \(U\) easily.
In the case where \(\theta = 0\), then \(U = H - I_2\).
In the case where \(\theta &gt; 0\), then</p>

\[\begin{aligned}
    H &amp;= \cosh(\sqrt{\theta}) I_2 + \frac{\sinh(\sqrt{\theta})}{\sqrt{\theta}} U, \\
    \frac{\sinh(\sqrt{\theta})}{\sqrt{\theta}} U
    &amp;= H - \cosh(\sqrt{\theta}) I_2, \\
    U
    &amp;= \frac{\sqrt{\theta}}{\sinh(\sqrt{\theta})} (H - \cosh(\sqrt{\theta}) I_2),
\end{aligned}\]

<p>and likewise, in the case that \(\theta &lt; 0\),</p>

\[\begin{aligned}
    U
    &amp;= \frac{\sqrt{-\theta}}{\sin(\sqrt{-\theta})} (H - \cos(\sqrt{-\theta}) I_2).
\end{aligned}\]

<p>Substituting \(\theta\) in terms of \(\alpha\) into these equations yields the final formula for the logarithm:</p>

\[\begin{aligned}
    \log(H)
    &amp;= \begin{cases}
    \frac{\cosh^{-1}(\alpha)}{\sinh(\cosh^{-1}(\alpha))} (H - \alpha I_2)
    &amp; \text{if } \alpha &gt; 1 \\
    \frac{\cos^{-1}(\alpha)}{\sin(\cos^{-1}(\alpha))} (H - \alpha I_2)
    &amp; \text{if } \alpha &lt; 1 \\
    H - \alpha I_2
    &amp; \text{if } \alpha = 1
\end{cases}\\
\text{where } \alpha &amp;:= \frac{1}{2} \mathrm{trace}(H).
\end{aligned}\]
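
<p>A round trip through the exponential is a natural test of this formula. The sketch below uses hypothetical names and assumes \(\mathrm{trace}(H) &gt; -2\), so that a real logarithm on the principal branch exists; it also makes no attempt to be numerically careful when \(\alpha\) is very close to, but not equal to, \(1\).</p>

```python
import cmath
import math

def exp_sl2(u1, u2, u3):
    """Exponential of U = [[u1, u2], [u3, -u1]] in closed form."""
    theta = u1 * u1 + u2 * u3
    if theta == 0.0:
        c, s = 1.0, 1.0
    else:
        r = cmath.sqrt(theta)
        c = cmath.cosh(r).real
        s = (cmath.sinh(r) / r).real
    return [[c + s * u1, s * u2], [s * u3, c - s * u1]]

def log_sl2(H):
    """Coordinates (u1, u2, u3) of log(H), assuming trace(H) exceeds -2."""
    alpha = 0.5 * (H[0][0] + H[1][1])
    if alpha == 1.0:
        factor = 1.0
    else:
        # acosh(alpha) is real when alpha exceeds 1 and equals
        # i * acos(alpha) when alpha lies between -1 and 1.
        r = cmath.acosh(alpha)
        factor = (r / cmath.sinh(r)).real
    # log(H) = factor * (H - alpha I), read off in vee coordinates.
    return (factor * (H[0][0] - alpha), factor * H[0][1], factor * H[1][0])

for u in [(0.5, 1.0, 0.3), (0.2, 1.0, -1.0), (0.0, 1.0, 0.0)]:
    print(u, log_sl2(exp_sl2(*u)))  # each pair should agree up to rounding
```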

<h3 id="conclusion">Conclusion</h3>

<p>The special linear group comes up far less in robotics than the quaternions or the special Euclidean group.
However, it does have some beautiful formulas, and the exercise of working them out is never a wasted effort.
The formulas here have all been implemented in <a href="https://github.com/pvangoor/pylie">pylie</a>, so I hope people find them helpful.
As always, please let me know if you have any suggestions for improvement!</p>]]></content><author><name></name></author><category term="Mathematics" /><summary type="html"><![CDATA[]]></summary></entry><entry><title type="html">The 2D Special Euclidean Group SE(2)</title><link href="/mathematics/2024/03/12/special_euclidean_se2.html" rel="alternate" type="text/html" title="The 2D Special Euclidean Group SE(2)" /><published>2024-03-12T00:00:00+00:00</published><updated>2024-03-12T00:00:00+00:00</updated><id>/mathematics/2024/03/12/special_euclidean_se2</id><content type="html" xml:base="/mathematics/2024/03/12/special_euclidean_se2.html"><![CDATA[<!-- https://talk.jekyllrb.com/t/jekyll-and-mathjax/5514 -->
<script type="text/x-mathjax-config">
    MathJax.Hub.Config({
        jax: ["input/TeX","input/MathML","output/SVG", "output/CommonHTML"],
    extensions: ["tex2jax.js","mml2jax.js","MathMenu.js","MathZoom.js", "CHTML-preview.js"],
    TeX: {
      extensions: ["AMSmath.js","AMSsymbols.js","noErrors.js","noUndefined.js"]
    },
      tex2jax: {
          inlineMath: [ ['$','$'], ["\\(","\\)"] ],
          displayMath: [ ['$$','$$'], ["\\[","\\]"] ],
          processEscapes: true,
          processEnvironments: true
        },
        "HTML-CSS": { availableFonts: ["TeX"] }
      });
</script>

<script id="MathJax-script" async="" src="https://cdn.jsdelivr.net/npm/mathjax@3/es5/tex-chtml.js"></script>

<h3 id="introduction">Introduction</h3>

<p>The Special Euclidean group \(\mathbf{SE}(n)\) describes the orientation-preserving isometries of the Euclidean space \(\mathbb{R}^n\), where \(n \in \mathbb{N}\) is a positive integer.
This group can be represented using matrices, as</p>

\[\begin{aligned}
    \mathbf{SE}(n) = \left\{
        X = \begin{pmatrix}
            R &amp; p \\ 0_{1 \times n} &amp; 1
        \end{pmatrix} \in \mathbb{R}^{(n+1) \times (n+1)}
        \; \middle| \;
        R^\top R = I_n, \;
        \det(R) = 1, \;
        p \in \mathbb{R}^n
    \right\}
\end{aligned}\]

<p>In this post, we will focus solely on the 2D case.
This is particularly relevant for ground-based robotics, where the group reflects the invariance of the robot’s dynamics.
The 2D Special Euclidean group is given by</p>

\[\begin{aligned}
    \mathbf{SE}(2) &amp;:= \left\{
        X = \begin{pmatrix}
            R(\theta) &amp; p \\ 0_{1 \times 2} &amp; 1
        \end{pmatrix} \in \mathbb{R}^{3\times 3}
        \; \middle| \;
        \theta \in (-\pi,\pi], \;
        p \in \mathbb{R}^2
    \right\}, \\
    R(\theta) &amp;:=
        \begin{pmatrix}
            \cos(\theta) &amp; - \sin(\theta) \\
            \sin(\theta) &amp; \cos(\theta)
        \end{pmatrix}
\end{aligned}\]

<p>The group properties are easily verified. The identity is \(I_3 \in \mathbf{SE}(2)\) and the product and inverse are</p>

\[\begin{aligned}
    X_1 X_2 &amp;= \begin{pmatrix}
        R(\theta_1) &amp; p_1 \\ 0_{1 \times 2} &amp; 1
    \end{pmatrix}
    \begin{pmatrix}
        R(\theta_2) &amp; p_2 \\ 0_{1 \times 2} &amp; 1
    \end{pmatrix}
    = 
    \begin{pmatrix}
        R(\theta_1 + \theta_2) &amp; p_1 + R(\theta_1) p_2 \\ 0_{1 \times 2} &amp; 1
    \end{pmatrix}, \\
    X^{-1} &amp;= \begin{pmatrix}
        R(\theta) &amp; p \\ 0_{1 \times 2} &amp; 1
    \end{pmatrix}^{-1}
    = \begin{pmatrix}
        R(-\theta) &amp; - R(-\theta) p \\ 0_{1 \times 2} &amp; 1
    \end{pmatrix}
\end{aligned}\]
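
<p>These two formulas mean that an element of \(\mathbf{SE}(2)\) can be stored as a pair \((\theta, p)\) without ever forming the \(3\times 3\) matrix. A minimal sketch in Python follows; the names <code>rot</code>, <code>compose</code> and <code>inverse</code> are hypothetical, and the angle is deliberately left unwrapped rather than reduced to \((-\pi,\pi]\).</p>

```python
import math

def rot(theta, p):
    """Apply the rotation R(theta) to a 2-vector p."""
    c, s = math.cos(theta), math.sin(theta)
    return (c * p[0] - s * p[1], s * p[0] + c * p[1])

def compose(X1, X2):
    """Group product: (theta1 + theta2, p1 + R(theta1) p2)."""
    (theta1, p1), (theta2, p2) = X1, X2
    q = rot(theta1, p2)
    return (theta1 + theta2, (p1[0] + q[0], p1[1] + q[1]))

def inverse(X):
    """Group inverse: (-theta, -R(-theta) p)."""
    theta, p = X
    q = rot(-theta, p)
    return (-theta, (-q[0], -q[1]))

X = (0.4, (1.0, -2.0))
print(compose(X, inverse(X)))  # the identity (0, (0, 0)) up to rounding
```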

<p>Putting these together, we obtain the conjugation</p>

\[\begin{aligned}
    \mathrm{Cn}_{X_1}(X_2)
    &amp;=
    X_1 X_2 X_1^{-1} \\
    &amp;=
    \begin{pmatrix}
        R(\theta_1) &amp; p_1 \\ 0_{1 \times 2} &amp; 1
    \end{pmatrix}
    \begin{pmatrix}
        R(\theta_2) &amp; p_2 \\ 0_{1 \times 2} &amp; 1
    \end{pmatrix}
    \begin{pmatrix}
        R(\theta_1) &amp; p_1 \\ 0_{1 \times 2} &amp; 1
    \end{pmatrix}^{-1} \\
    &amp;=
    \begin{pmatrix}
        R(\theta_1 + \theta_2) &amp; p_1 + R(\theta_1) p_2 \\ 0_{1 \times 2} &amp; 1
    \end{pmatrix}
    \begin{pmatrix}
        R(-\theta_1) &amp; - R(-\theta_1) p_1 \\ 0_{1 \times 2} &amp; 1
    \end{pmatrix} \\
    &amp;=
    \begin{pmatrix}
        R(\theta_1 + \theta_2 - \theta_1) &amp; p_1 + R(\theta_1) p_2  - R(\theta_1 + \theta_2) R(-\theta_1) p_1 \\ 0_{1 \times 2} &amp; 1
    \end{pmatrix} \\
    &amp;=
    \begin{pmatrix}
        R(\theta_2) &amp; (I_2 - R(\theta_2))p_1 + R(\theta_1) p_2  \\ 0_{1 \times 2} &amp; 1
    \end{pmatrix}
\end{aligned}\]

<h3 id="lie-algebra">Lie algebra</h3>

<p>The Lie algebra can be obtained by differentiating the matrix \(X\) at \(\theta = 0\) and \(p = 0_{2\times 1}\).
We get that</p>

\[\begin{aligned}
    \mathfrak{se}(2)
    &amp;= \left\{
        U = \begin{pmatrix}
            \omega^\times &amp; v \\ 0_{1 \times 2} &amp; 0
        \end{pmatrix} \in \mathbb{R}^{3\times 3}
        \; \middle| \;
        \omega \in \mathbb{R}, \;
        v \in \mathbb{R}^2
    \right\}, \\
    \omega^\times &amp;:=
        \begin{pmatrix}
            0 &amp; - \omega \\
            \omega &amp; 0
        \end{pmatrix}.
\end{aligned}\]

<p>The notation \(\omega^\times\) is a very useful one.
The operator \(\cdot^\times : \mathbb{R} \to \mathbb{R}^{2\times 2}\) is linear, so \((\omega_1 + \omega_2)^\times = \omega_1^\times + \omega_2^\times\) and, in particular, \(\omega^\times = \omega 1^\times\).
This is a particularly nice feature since \(1^\times = R(\pi/2)\).</p>

<p>The Lie algebra \(\mathfrak{se}(2)\) is a vector space by definition, and we can relate it to \(\mathbb{R}^3\) by choosing a basis, or simply by defining a “wedge” map \(\cdot^\wedge : \mathbb{R}^3 \to \mathfrak{se}(2)\).
This map is required to be an isomorphism in the vector space sense, so it must be linear and invertible.
We choose the wedge map and its inverse, the vee map, to be</p>

\[\begin{aligned}
    \begin{pmatrix}
        \omega \\ v
    \end{pmatrix}^\wedge
    &amp;:= \begin{pmatrix}
        \omega^\times &amp; v \\
        0_{1\times 2} &amp; 0
    \end{pmatrix}, &amp;
    \begin{pmatrix}
        \omega^\times &amp; v \\
        0_{1\times 2} &amp; 0
    \end{pmatrix}^\vee
    &amp;:= \begin{pmatrix}
        \omega \\ v
    \end{pmatrix},
\end{aligned}\]

<p>where \(\omega \in \mathbb{R}\) and \(v \in \mathbb{R}^2\).
In other words, we define a basis of \(\mathfrak{se}(2)\) by
\(\begin{aligned}
    E_1 &amp;:= \begin{pmatrix}
        1^\times &amp; 0_{2 \times 1} \\
        0_{1\times 2} &amp; 0
    \end{pmatrix}, &amp;
    E_2 &amp;:= \begin{pmatrix}
        0_{2\times 2} &amp; \mathbf{e}_1 \\
        0_{1\times 2} &amp; 0
    \end{pmatrix}, &amp;
    E_3 &amp;:= \begin{pmatrix}
        0_{2\times 2} &amp; \mathbf{e}_2 \\
        0_{1\times 2} &amp; 0
    \end{pmatrix},
\end{aligned}\)
where \(\mathbf{e}_1, \mathbf{e}_2 \in \mathbb{R}^2\) are the standard basis vectors.</p>

<h4 id="adjoint-and-lie-bracket">Adjoint and Lie bracket</h4>

<p>Using the wedge and vee operators, we can obtain expressions for the Adjoint operator and Lie bracket.
The simplest way is to differentiate the conjugation operation.
We have</p>

\[\begin{aligned}
\mathrm{Ad}_{X}(U)
&amp;= \mathrm{D}_Z |_{I_3} \mathrm{Cn}_{X}(Z)[U] \\
&amp;= \begin{pmatrix}
        \omega^\times &amp; -\omega^\times p + R(\theta) v  \\ 0_{1 \times 2} &amp; 0
    \end{pmatrix}.
\end{aligned}\]

<p>We obtain a matrix expression for \(\mathrm{Ad}_X\) by using the wedge and vee isomorphisms. Specifically, for any linear operator \(L : \mathfrak{se}(2) \to \mathfrak{se}(2)\), we may define \(L^\vee \in \mathbb{R}^{3\times 3}\) to be the matrix such that \(L^\vee u = L(u^\wedge)^\vee\) for all \(u \in \mathbb{R}^3\).
From this definition, we obtain the Adjoint matrix</p>

\[\begin{aligned}
\mathrm{Ad}_{X}^\vee (U^\vee)
&amp;= \begin{pmatrix}
        \omega^\times &amp; -\omega^\times p + R(\theta) v  \\ 0_{1 \times 2} &amp; 0
    \end{pmatrix}^\vee \\
&amp;= \begin{pmatrix}
        \omega \\ -1^\times p \omega + R(\theta) v
    \end{pmatrix} \\
&amp;= \begin{pmatrix}
    1 &amp; 0_{1\times 2} \\ -1^\times p &amp; R(\theta)
\end{pmatrix}
\begin{pmatrix}
    \omega \\ v
\end{pmatrix}, \\
\mathrm{Ad}_X^\vee &amp;= \begin{pmatrix}
    1 &amp; 0_{1\times 2} \\ -1^\times p &amp; R(\theta)
\end{pmatrix}
\end{aligned}\]

<p>Differentiating this matrix in terms of the variable \(X\) at the identity provides the “little” adjoint matrix and the Lie bracket</p>

\[\begin{aligned}
\mathrm{ad}_{U}^\vee&amp;= \begin{pmatrix}
    0 &amp; 0_{1\times 2} \\ -1^\times v &amp; \omega^\times
\end{pmatrix}, \\
[U_1, U_2] &amp;= \mathrm{ad}_{U_1}(U_2)
= \begin{pmatrix}
    0 \\ -\omega_2^\times v_1 + \omega_1^\times v_2
\end{pmatrix}^\wedge.
\end{aligned}\]
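<p>The little adjoint matrix can be checked the same way, by comparing it with the matrix commutator \(U_1 U_2 - U_2 U_1\). Again this is only an illustrative sketch with hypothetical helper names, using coordinates \(u = (\omega, v_1, v_2)\).</p>

```python
import math

def matmul(A, B):
    """Product of two matrices stored as nested lists."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def wedge(u):
    """Map coordinates (omega, v1, v2) to the matrix U in se(2)."""
    omega, v1, v2 = u
    return [[0.0, -omega, v1], [omega, 0.0, v2], [0.0, 0.0, 0.0]]

def vee(U):
    """Inverse of wedge."""
    return [U[1][0], U[0][2], U[1][2]]

def little_adjoint(u):
    """ad_U^vee = [[0, 0], [-1^x v, omega^x]], using -1^x v = (v2, -v1)."""
    omega, v1, v2 = u
    return [[0.0, 0.0, 0.0], [v2, 0.0, -omega], [-v1, omega, 0.0]]

def bracket_via_ad(u1, u2):
    """Lie bracket computed as ad_{U1}^vee applied to the coordinates of U2."""
    return [sum(a * b for a, b in zip(row, u2)) for row in little_adjoint(u1)]

def bracket_via_commutator(u1, u2):
    """Lie bracket computed as the matrix commutator U1 U2 - U2 U1."""
    U1, U2 = wedge(u1), wedge(u2)
    P, Q = matmul(U1, U2), matmul(U2, U1)
    return vee([[a - b for a, b in zip(rp, rq)] for rp, rq in zip(P, Q)])
```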

<h4 id="exponential-and-logarithm">Exponential and Logarithm</h4>

<p>The exponential is “simply” given by the matrix exponential.
However, it is nice to have formulas that do not rely on evaluating an infinite power series, or that at least hide the series inside well-known elementary functions like \(\sin\) and \(\cos\).
Let \(U \in \mathfrak{se}(2)\). Then, we have that</p>

\[\begin{aligned}
U^2 &amp;= \begin{pmatrix}
            \omega^\times &amp; v \\ 0_{1 \times 2} &amp; 0
        \end{pmatrix}^2
    = \begin{pmatrix}
            (\omega^\times)^2 &amp; \omega^\times v \\ 0_{1 \times 2} &amp; 0
        \end{pmatrix}
    = \begin{pmatrix}
        -\omega^2 I_2 &amp; \omega^\times v \\ 0_{1 \times 2} &amp; 0
    \end{pmatrix}, \\
U^3 &amp;= \begin{pmatrix}
            \omega^\times &amp; v \\ 0_{1 \times 2} &amp; 0
        \end{pmatrix}^3
    = \begin{pmatrix}
        -\omega^2 \omega^\times &amp; \omega^\times \omega^\times v \\ 0_{1 \times 2} &amp; 0
    \end{pmatrix}
    = \begin{pmatrix}
        -\omega^2 \omega^\times &amp; - \omega^2 v \\ 0_{1 \times 2} &amp; 0
    \end{pmatrix}
    = -\omega^2 U.
\end{aligned}\]

<p>This is the property that lets us simplify the exponential formula.
It follows by induction that \(U^{2k+1} = -\omega^2 U^{2k-1}\) for all \(k \geq 1\), and hence \(U^{2k+1} = (-1)^k\omega^{2k} U\) for all \(k \geq 0\).
We now solve the matrix exponential. We have that</p>

\[\begin{aligned}
\exp(U) &amp;= \sum_{k=0}^\infty \frac{1}{k!} U^k, \\
&amp;=
I_3 + \sum_{k=1}^\infty \frac{1}{(2k)!} U^{2k}
+ \sum_{k=0}^\infty \frac{1}{(2k+1)!} U^{2k+1}, \\
&amp;=
I_3 + \left(\sum_{k=1}^\infty \frac{1}{(2k)!} U^{2k-1}\right) U
+ \sum_{k=0}^\infty \frac{1}{(2k+1)!} U^{2k+1}, \\
&amp;=
I_3 + \left(\sum_{k=1}^\infty \frac{(-1)^{k-1}}{(2k)!} \omega^{2k-2} U\right) U
+ \sum_{k=0}^\infty \frac{(-1)^{k}}{(2k+1)!} \omega^{2k} U, \\
&amp;=
I_3 - \left(\sum_{k=1}^\infty \frac{(-1)^{k}}{(2k)!} \omega^{2k} \right) \omega^{-2}U^2
+ \left(\sum_{k=0}^\infty \frac{(-1)^{k}}{(2k+1)!} \omega^{2k+1}\right) \omega^{-1}U, \\
&amp;=
I_3 - \left(\cos(\omega) - 1 \right) \omega^{-2}U^2
+ \sin(\omega) \omega^{-1}U, \\
&amp;=
I_3 + \frac{\sin(\omega)}{\omega} U + \frac{1 - \cos(\omega)}{\omega^2} U^2.
\end{aligned}\]

<p>Written in terms of the expanded matrix, we get</p>

\[\begin{aligned}
\exp(U) &amp;= 
I_3 + \frac{\sin(\omega)}{\omega} U + \frac{1 - \cos(\omega)}{\omega^2} U^2 \\
&amp;= 
\begin{pmatrix} I_2 &amp; 0_{2\times 1} \\ 0_{1\times 2} &amp; 1 \end{pmatrix}
+ \frac{\sin(\omega)}{\omega} \begin{pmatrix} \omega^\times &amp; v \\ 0_{1\times 2} &amp; 0 \end{pmatrix}
+ \frac{1 - \cos(\omega)}{\omega^2}
\begin{pmatrix} -\omega^2 I_2 &amp; \omega^\times v \\ 0_{1\times 2} &amp; 0 \end{pmatrix} \\
&amp;= 
\begin{pmatrix}
I_2
+ \frac{\sin(\omega)}{\omega} \omega^\times
-\omega^2 \frac{1 - \cos(\omega)}{\omega^2} I_2
&amp;
\frac{\sin(\omega)}{\omega} v 
+ \frac{1 - \cos(\omega)}{\omega^2} \omega^\times v \\
0_{1\times 2} &amp; 1 \end{pmatrix} \\
&amp;= 
\begin{pmatrix}
\sin(\omega) 1^\times
+ \cos(\omega) I_2 &amp;
\frac{1}{\omega} (\sin(\omega)I_2 + (1 - \cos(\omega))1^\times ) v \\
0_{1\times 2} &amp; 1 \end{pmatrix} \\
&amp;= 
\begin{pmatrix}
R(\omega) &amp;
\frac{1}{\omega} (- \sin(\omega)1^\times + I_2 - \cos(\omega)I_2 ) 1^\times v \\
0_{1\times 2} &amp; 1 \end{pmatrix} \\
&amp;= 
\begin{pmatrix}
R(\omega) &amp;
\frac{I_2 - R(\omega)}{\omega} 1^\times v \\
0_{1\times 2} &amp; 1 \end{pmatrix} \\
\end{aligned}\]

<p>When \(\omega = 0\), the closed form above is undefined, but \(U^2 = 0\) in that case, so the series terminates and the exponential is simply</p>

\[\begin{aligned}
\exp(U) = I_3 + U.
\end{aligned}\]
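<p>A minimal sketch of the closed-form exponential, checked against a truncated power series, might look as follows. The function names are hypothetical and this is not the pylie code; the exact test <code>omega == 0.0</code> is a simplification, and a careful implementation would probably switch to a Taylor expansion for small \(\omega\).</p>

```python
import math

def matmul(A, B):
    """Product of two matrices stored as nested lists."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def wedge(u):
    """Map coordinates (omega, v1, v2) to the matrix U in se(2)."""
    omega, v1, v2 = u
    return [[0.0, -omega, v1], [omega, 0.0, v2], [0.0, 0.0, 0.0]]

def se2_exp(u):
    """Closed-form exponential of u = (omega, v1, v2) as a 3x3 matrix."""
    omega, v1, v2 = u
    if omega == 0.0:
        # Pure translation: exp(U) = I_3 + U.
        return [[1.0, 0.0, v1], [0.0, 1.0, v2], [0.0, 0.0, 1.0]]
    c, s = math.cos(omega), math.sin(omega)
    # Translation part ((I_2 - R(omega)) / omega) 1^x v, written out by components.
    p1 = (s * v1 - (1.0 - c) * v2) / omega
    p2 = ((1.0 - c) * v1 + s * v2) / omega
    return [[c, -s, p1], [s, c, p2], [0.0, 0.0, 1.0]]

def exp_series(U, terms=30):
    """Truncated series sum_k U^k / k!, used only as a reference value."""
    n = len(U)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[x / k for x in row] for row in matmul(term, U)]
        result = [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(result, term)]
    return result
```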

<p>The expanded formula tells us how to take the logarithm as well.
Given a matrix \(X \in \mathbf{SE}(2)\), we match the terms in \(\exp(U)= X\) to obtain</p>

\[\begin{aligned}
R(\theta) &amp;= R(\omega), &amp;
p &amp;= \frac{I_2 - R(\omega)}{\omega} 1^\times v.
\end{aligned}\]

<p>The first equation is solved by \(\omega = \theta + 2k \pi\) for any \(k \in \mathbb{Z}\), so we choose the solution with \(\omega \in (-\pi, \pi]\), which is the range returned by \(\mathrm{atan2}\), as the standard one.
The second term is then given by solving</p>

\[\begin{aligned}
p &amp;= \frac{I_2 - R(\omega)}{\omega} 1^\times v, \\
p &amp;= \frac{I_2 - R(\omega)}{\omega} R(\pi/2) v, \\
v &amp;= \omega R(-\pi/2) (I_2 - R(\omega) )^{-1} p.
\end{aligned}\]

<p>Observe that \((I_2 - R(\omega) ) (I_2 - R(-\omega) ) = 2(1-\cos(\omega))I_2\). Thus,</p>

\[\begin{aligned}
v &amp;= \omega R(-\pi/2) (I_2 - R(\omega) )^{-1} p \\
&amp;= \omega R(-\pi/2) \frac{(I_2 - R(-\omega) )}{2 (1-\cos(\omega))} p \\
&amp;= \frac{\omega}{2 (1-\cos(\omega))} (I_2 - R(-\omega)) R(-\pi/2) p.
\end{aligned}\]

<p>In summary, for \(\omega \neq 0\) (when \(\omega = 0\), the logarithm is simply \(v = p\)),</p>

\[\begin{aligned}
X &amp;= \begin{pmatrix}
    R &amp; p \\ 0_{1\times 2} &amp; 1
\end{pmatrix}, \qquad
\log(X) = \begin{pmatrix}
    \omega^\times &amp; v \\ 0_{1\times 2} &amp; 0
\end{pmatrix}, \\
\omega &amp;:= \mathrm{atan2}(R_{2,1}, R_{1,1}) = \mathrm{atan2}(\sin(\theta), \cos(\theta)) = \theta, \\
v &amp;:= \frac{\omega}{2 (1-\cos(\omega))} (I_2 - R(-\omega)) R(-\pi/2) p
\end{aligned}\]
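<p>The logarithm can be sketched in the same style and tested by a round trip through the exponential. The names below are hypothetical, the branch for \(\omega = 0\) uses the limit \(v = p\), and the exact comparison with zero is again a simplification.</p>

```python
import math

def se2_exp(u):
    """Closed-form exponential of u = (omega, v1, v2) as a 3x3 matrix."""
    omega, v1, v2 = u
    if omega == 0.0:
        return [[1.0, 0.0, v1], [0.0, 1.0, v2], [0.0, 0.0, 1.0]]
    c, s = math.cos(omega), math.sin(omega)
    p1 = (s * v1 - (1.0 - c) * v2) / omega
    p2 = ((1.0 - c) * v1 + s * v2) / omega
    return [[c, -s, p1], [s, c, p2], [0.0, 0.0, 1.0]]

def se2_log(X):
    """Logarithm of X = (R, p), returned as coordinates (omega, v1, v2)."""
    omega = math.atan2(X[1][0], X[0][0])
    p1, p2 = X[0][2], X[1][2]
    if omega == 0.0:
        # No rotation: the translation passes through unchanged.
        return (0.0, p1, p2)
    c, s = math.cos(omega), math.sin(omega)
    a, b = p2, -p1                      # R(-pi/2) p
    k = omega / (2.0 * (1.0 - c))
    v1 = k * ((1.0 - c) * a - s * b)    # first row of I_2 - R(-omega)
    v2 = k * (s * a + (1.0 - c) * b)    # second row of I_2 - R(-omega)
    return (omega, v1, v2)
```

<p>With these definitions, <code>se2_log(se2_exp(u))</code> should return <code>u</code> up to rounding whenever \(\omega\) lies strictly between \(-\pi\) and \(\pi\).</p>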

<h3 id="conclusion">Conclusion</h3>

<p>The formulas presented in this summary are intended to be practical for implementation, and I have implemented them in the <a href="https://github.com/pvangoor/pylie">pylie</a> library.
I hope you find it helpful too, and please let me know if you find any issues or mistakes, or have suggestions for improvement!</p>]]></content><author><name></name></author><category term="Mathematics" /><summary type="html"><![CDATA[]]></summary></entry><entry><title type="html">The Non-Zero Quaternions as a Lie Group</title><link href="/mathematics/2024/02/12/quaternions.html" rel="alternate" type="text/html" title="The Non-Zero Quaternions as a Lie Group" /><published>2024-02-12T00:00:00+00:00</published><updated>2024-02-12T00:00:00+00:00</updated><id>/mathematics/2024/02/12/quaternions</id><content type="html" xml:base="/mathematics/2024/02/12/quaternions.html"><![CDATA[<!-- https://talk.jekyllrb.com/t/jekyll-and-mathjax/5514 -->
<script type="text/x-mathjax-config">
    MathJax.Hub.Config({
        jax: ["input/TeX","input/MathML","output/SVG", "output/CommonHTML"],
    extensions: ["tex2jax.js","mml2jax.js","MathMenu.js","MathZoom.js", "CHTML-preview.js"],
    TeX: {
      extensions: ["AMSmath.js","AMSsymbols.js","noErrors.js","noUndefined.js"]
    },
      tex2jax: {
          inlineMath: [ ['$','$'], ["\\(","\\)"] ],
          displayMath: [ ['$$','$$'], ["\\[","\\]"] ],
          processEscapes: true,
          processEnvironments: true
        },
        "HTML-CSS": { availableFonts: ["TeX"] }
      });
</script>

<script id="MathJax-script" async="" src="https://cdn.jsdelivr.net/npm/mathjax@3/es5/tex-chtml.js"></script>

<h3 id="introduction">Introduction</h3>

<p>Quaternions are well-known to people working in robotics and aerospace.
They (the unit quaternions, specifically) provide a smooth representation of attitude using only four numbers, in contrast to rotation matrices, which require nine, and Euler angles, which are not smooth.
In this post, I will explore the quaternions from a slightly different perspective: the quaternions (excluding zero) form a Lie group under multiplication.
We will not restrict ourselves to the unit quaternions, instead exploring the full four-dimensional Lie group.</p>

<h3 id="basic-group-properties">Basic group properties</h3>

<p>Throughout this article, we will write a quaternion \(q \in \mathbb{H}\) as
\(q = (r, u),\)
where \(r \in \mathbb{R}\) and \(u \in \mathbb{R}^3\) represent the real and imaginary parts of \(q\), respectively, and \((r, u) \neq (0, 0_3)\) because we exclude the zero quaternion.
The product is defined by</p>

\[\begin{aligned}
q_1 * q_2 &amp;= (r_1, u_1) * (r_2, u_2) \\
&amp;= (r_1 r_2 - u_1^\top u_2, \; r_1 u_2 + r_2 u_1 + u_1 \times u_2).
\end{aligned}\]

<p>The inverse of a quaternion is defined by</p>

\[q^{-1} =  (r^2 + \vert u \vert^2)^{-1} (r, -u).\]

<p>And the group identity is given by \(e := (1, 0_3)\).</p>
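<p>These three definitions translate almost directly into code. The sketch below stores a quaternion as a pair <code>(r, u)</code> with <code>u</code> a 3-tuple; the representation and the function names are choices made here for illustration only.</p>

```python
import math

def dot(a, b):
    """Euclidean inner product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    """Cross product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def q_mul(q1, q2):
    """Quaternion product (r1 r2 - u1.u2, r1 u2 + r2 u1 + u1 x u2)."""
    (r1, u1), (r2, u2) = q1, q2
    c = cross(u1, u2)
    return (r1 * r2 - dot(u1, u2),
            tuple(r1 * b + r2 * a + k for a, b, k in zip(u1, u2, c)))

def q_inv(q):
    """Inverse (r, -u) / (r^2 + |u|^2) of a non-zero quaternion."""
    r, u = q
    n2 = r * r + dot(u, u)
    return (r / n2, tuple(-x / n2 for x in u))

IDENTITY = (1.0, (0.0, 0.0, 0.0))
```

<p>For example, <code>q_mul(q, q_inv(q))</code> should reproduce <code>IDENTITY</code> up to rounding for any non-zero <code>q</code>.</p>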

<p>The quaternions act on themselves by conjugation. Specifically,</p>

\[\begin{aligned}
\mathrm{Cn}_{q_1}(q_2)
&amp;= q_1 * q_2 * q_1^{-1} \\
% -----
&amp;= (r_1^2 + \vert u_1 \vert^2)^{-1}
(r_1 r_2 - u_1^\top u_2, \; r_1 u_2 + r_2 u_1 + u_1 \times u_2) * (r_1, -u_1)\\
% -----
&amp;= (r_1^2 + \vert u_1 \vert^2)^{-1}
((r_1 r_2 - u_1^\top u_2) r_1 + (r_1 u_2 + r_2 u_1 + u_1 \times u_2)^\top u_1, \\
&amp;\hspace{1cm}
r_1(r_1 u_2 + r_2 u_1 + u_1 \times u_2) - (r_1 r_2 - u_1^\top u_2)u_1 -(r_1 u_2 + r_2 u_1 + u_1 \times u_2) \times u_1 )\\
% -----
&amp;= (r_1^2 + \vert u_1 \vert^2)^{-1}
(r_1^2 r_2 - r_1 u_1^\top u_2 + r_1 u_2^\top u_1 + r_2 u_1^\top u_1 , \\
&amp;\hspace{1cm}
r_1^2 u_2 + r_1 r_2 u_1 + r_1 u_1 \times u_2 - r_1 r_2 u_1 + u_1 u_1^\top u_2 - r_1 u_2 \times u_1 - (u_1 \times u_2) \times u_1 )\\
% -----
&amp;= (r_1^2 + \vert u_1 \vert^2)^{-1}
(r_1^2 r_2 + r_2 u_1^\top u_1 , \\
&amp;\hspace{1cm}
r_1^2 u_2 + 2 r_1 u_1 \times u_2 + u_1 u_1^\top u_2 + u_1 \times (u_1 \times u_2) )\\
% -----
&amp;= (r_2 , \;
(r_1^2 + \vert u_1 \vert^2)^{-1}(r_1^2 u_2 + \vert u_1 \vert^2 u_2 + 2 r_1 u_1 \times u_2 + 2 u_1 \times (u_1 \times u_2)) )\\
% -----
&amp;= (r_2 , \;
u_2 + (2 r_1 u_1 \times u_2 + 2 u_1 \times (u_1 \times u_2))(r_1^2 + \vert u_1 \vert^2)^{-1}).
\end{aligned}\]

<p>Let us denote \(\vert q_1 \vert = \sqrt{r_1^2 + \vert u_1 \vert^2}\) and define \(u_1^\times \in \mathbb{R}^{3\times 3}\) to be the ‘skew’ matrix such that \(u_1^\times u_2 = u_1 \times u_2\).
Then we end up with a nice and simple formula:</p>

\[\mathrm{Cn}_{q_1}(q_2)
= (r_2 , \;
(I_3 + (2 r_1 u_1^\times + 2 (u_1^\times)^2 )\vert q_1 \vert^{-2})u_2 ).\]
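<p>Since the derivation above is long, it is worth checking the final formula numerically against \(q_1 * q_2 * q_1^{-1}\). The following is a small sketch with hypothetical function names, storing a quaternion as a pair <code>(r, u)</code>.</p>

```python
import math

def dot(a, b):
    """Euclidean inner product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    """Cross product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def q_mul(q1, q2):
    """Quaternion product."""
    (r1, u1), (r2, u2) = q1, q2
    c = cross(u1, u2)
    return (r1 * r2 - dot(u1, u2),
            tuple(r1 * b + r2 * a + k for a, b, k in zip(u1, u2, c)))

def q_inv(q):
    """Inverse of a non-zero quaternion."""
    r, u = q
    n2 = r * r + dot(u, u)
    return (r / n2, tuple(-x / n2 for x in u))

def conj_direct(q1, q2):
    """Conjugation computed literally as q1 * q2 * q1^{-1}."""
    return q_mul(q_mul(q1, q2), q_inv(q1))

def conj_formula(q1, q2):
    """Conjugation via (r2, u2 + (2 r1 u1 x u2 + 2 u1 x (u1 x u2)) / |q1|^2)."""
    (r1, u1), (r2, u2) = q1, q2
    n2 = r1 * r1 + dot(u1, u1)
    a = cross(u1, u2)
    b = cross(u1, a)
    return (r2, tuple(x + (2.0 * r1 * y + 2.0 * z) / n2 for x, y, z in zip(u2, a, b)))
```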

<h3 id="the-quaternion-lie-algebra">The Quaternion Lie Algebra</h3>

<p>There are many ways to think of the Lie algebra of a given Lie group.
Since our main interest is computation, we will choose the way that is easiest to work with for computation.
The Lie algebra \(\mathfrak{h}\) of \(\mathbb{H}\) can be identified with the tangent space at the identity \(e\).
This definition is abstract, so we assign some coordinates.
A Lie algebra element is described as \(w^\vee := (s, v) \in \mathbb{R}^4\), where the \(\vee\) operator is the map from the abstract Lie algebra to the coordinates in \(\mathbb{R}^4\).
Near the identity, quaternion group elements can be written as</p>

\[q = e + t w, \quad (r,u) = (1+t s, t v),\]

<p>for small values of \(t \in \mathbb{R}\).</p>

<h4 id="exponential-and-logarithm">Exponential and Logarithm</h4>

<p>The exponential relates the Lie algebra to the Lie group.
We will use the ‘1-parameter subgroup’ definition here.
Given a Lie algebra element \(w^\vee = (s,v)\), the exponential \(\exp(w)\) is defined as the solution to the initial value problem</p>

\[q(0) = e, \quad \dot{q}(t) = q(t) * w,\]

<p>at \(t = 1\). Let us evaluate the differential equation to find</p>

\[\begin{aligned}
\dot{q} &amp;= q * w \\
&amp;:= \left. \frac{\mathrm{d}}{\mathrm{d} t} \right\vert_{t=0} (r,u) * (1+t s, t v) \\
&amp;= \left. \frac{\mathrm{d}}{\mathrm{d} t} \right\vert_{t=0}
(r (1+ts) - t u^\top v, \; r t v + (1+ts) u + t u \times v) \\
(\dot{r}, \dot{u})
&amp;=
(r s - u^\top v, \; r v + s u + u \times v).
\end{aligned}\]

<p>This ODE is not straightforward to solve, unless we realise that this system is, in fact, linear!
Writing \(q\) as a vector in \(\mathbb{R}^4\), we have</p>

\[\begin{aligned}
\frac{\mathrm{d}}{\mathrm{d} t}
\begin{pmatrix} r \\ u \end{pmatrix}
&amp;= \begin{pmatrix} s &amp; - v^\top \\ v &amp; s I_3 - v^\times \end{pmatrix}
\begin{pmatrix} r \\ u \end{pmatrix}
= \begin{pmatrix} 0 &amp; - v^\top \\ v &amp; - v^\times \end{pmatrix}
\begin{pmatrix} r \\ u \end{pmatrix} + s \begin{pmatrix} r \\ u \end{pmatrix}.
\end{aligned}\]

<p>Since \(s\) acts as a scaling factor, we can pull it out of the equation for now, and solve the problem without it.
Specifically, writing \(A\) for the matrix that multiplies \((r, u)\) in the first term above, so that \(\dot{q} = A q + s q\), we have</p>

\[\frac{\mathrm{d}}{\mathrm{d} t} \left( e^{-t s} q \right) = e^{-t s} \dot{q} - s e^{-t s} q = A e^{-t s} q,\]

<p>so if we solve the problem while ignoring \(s\), we can add it back in at the end.
To solve the ODE now, we only have to compute the matrix exponential</p>

\[\begin{aligned}
A &amp;:= \begin{pmatrix} 0 &amp; - v^\top \\ v &amp; - v^\times \end{pmatrix} &amp;
\exp(A) &amp;= \sum_{k=0}^\infty \frac{1}{k!} A^k.
\end{aligned}\]

<p>Examining the first nontrivial power of \(A\) reveals that</p>

\[\begin{aligned}
A^2 &amp;= \begin{pmatrix} 0 &amp; - v^\top \\ v &amp; - v^\times \end{pmatrix}^2
= \begin{pmatrix} -\vert v \vert^2 &amp; 0_{1\times 3} \\ 0_{3\times 1} &amp; (v^\times)^2 - v v^\top \end{pmatrix}
= \begin{pmatrix} -\vert v \vert^2 &amp; 0_{1\times 3} \\ 0_{3\times 1} &amp; - \vert v \vert^2 I_3 \end{pmatrix}
= - \vert v \vert^2 I_4.
\end{aligned}\]

<p>Substituting this into the exponential formula yields</p>

\[\begin{aligned}
\exp(A) &amp;= \sum_{k=0}^\infty \frac{1}{k!} A^k \\
&amp;= \sum_{k=0}^\infty \frac{1}{(2k)!} A^{2k} + \sum_{k=0}^\infty \frac{1}{(2k+1)!} A^{2k+1} \\
&amp;= \sum_{k=0}^\infty \frac{1}{(2k)!} (- \vert v \vert^2 I_4)^k + \sum_{k=0}^\infty \frac{1}{(2k+1)!} (- \vert v \vert^2 I_4)^{k} A \\
&amp;= \sum_{k=0}^\infty \frac{(-1)^k}{(2k)!} \vert v \vert^{2k} I_4+ \vert v \vert^{-1} \sum_{k=0}^\infty \frac{(-1)^k}{(2k+1)!} \vert v \vert^{2k+1} A \\
&amp;= \cos(\vert v \vert) I_4 + \frac{\sin(\vert v \vert)}{\vert v \vert} A.
\end{aligned}\]

<p>Therefore, we have our final solution,</p>

\[\begin{aligned}
\exp(w) &amp;= e^s \exp(A) \begin{pmatrix} 1 \\ 0_3 \end{pmatrix} \\
&amp;= \left( \cos(\vert v \vert) I_4 + \frac{\sin(\vert v \vert)}{\vert v \vert} A \right) \begin{pmatrix} e^s \\ 0_3 \end{pmatrix} \\
&amp;= \cos(\vert v \vert)\begin{pmatrix} e^s  \\ 0_3 \end{pmatrix} + \frac{\sin(\vert v \vert)}{\vert v \vert} \begin{pmatrix} 0 &amp; - v^\top \\ v &amp; - v^\times \end{pmatrix}  \begin{pmatrix} e^s \\ 0_3 \end{pmatrix} \\
&amp;= \begin{pmatrix} e^s \cos(\vert v \vert) \\ e^s \sin(\vert v \vert) \frac{v}{\vert v \vert} \end{pmatrix}.
\end{aligned}\]

<p>Note that, if \(\vert v \vert = 0\), then the whole computation simplifies and the solution is simply \(\exp(w) = ( e^s, 0_3)\).</p>
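<p>The closed form is short enough to write down directly. The sketch below compares it with a truncated power series \(\sum_k w^k / k!\) taken with respect to the quaternion product, which solves the same initial value problem; treating \(w = (s, v)\) as a quaternion for this purpose, and all function names, are assumptions of the sketch.</p>

```python
import math

def dot(a, b):
    """Euclidean inner product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    """Cross product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def q_mul(q1, q2):
    """Quaternion product."""
    (r1, u1), (r2, u2) = q1, q2
    c = cross(u1, u2)
    return (r1 * r2 - dot(u1, u2),
            tuple(r1 * b + r2 * a + k for a, b, k in zip(u1, u2, c)))

def q_exp(w):
    """Closed form exp(w) = e^s (cos|v|, sin|v| v / |v|) for w = (s, v)."""
    s, v = w
    theta = math.sqrt(dot(v, v))
    if theta == 0.0:
        return (math.exp(s), (0.0, 0.0, 0.0))
    k = math.exp(s) * math.sin(theta) / theta
    return (math.exp(s) * math.cos(theta), tuple(k * x for x in v))

def q_exp_series(w, terms=40):
    """Truncated series sum_k w^k / k!, used only as a reference value."""
    total = (1.0, (0.0, 0.0, 0.0))
    term = total
    for k in range(1, terms):
        r, u = q_mul(term, w)
        term = (r / k, tuple(x / k for x in u))
        total = (total[0] + term[0], tuple(a + b for a, b in zip(total[1], term[1])))
    return total
```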

<p>The logarithm is found by inverting this formula, although there may be multiple solutions for a given \(q \in \mathbb{H}\).
Suppose that \(q = \exp(w)\). Then we wish to determine the components of \(w = (s, v)\) in terms of \(q = (r, u)\). We have</p>

\[\begin{aligned}
q &amp;= \exp(w), \\
(r, u) &amp;= (e^s \cos(\vert v \vert), e^s \sin(\vert v \vert) \frac{v}{\vert v \vert}).
\end{aligned}\]

<p>Immediately, we see that \(e^s = r / \cos(\vert v \vert)\). Substituting this into the \(u\)-component,</p>

\[\begin{aligned}
u &amp;= r \tan(\vert v \vert) \frac{v}{\vert v \vert}, \\
\frac{u}{\vert u \vert} \vert u \vert &amp;= r \tan(\vert v \vert) \frac{v}{\vert v \vert}, \\
r^{-1} \vert u \vert \frac{u}{\vert u \vert} &amp;= \tan(\vert v \vert) \frac{v}{\vert v \vert}, \\
v &amp;=  \frac{\arctan(r^{-1} \vert u \vert)}{\vert u \vert} u
\end{aligned}\]

<p>Rather than substitute this back into the formula for \(e^s\), we observe that the norm of both sides of the original equation satisfies</p>

\[\begin{aligned}
\vert q \vert &amp;= \vert \exp(w) \vert, \\
\sqrt{r^2 + \vert u \vert^2} &amp;= e^{s} , \\
s &amp;= \ln(\sqrt{r^2 + \vert u \vert^2}).
\end{aligned}\]

<p>In summary, we have thus found the logarithm to be</p>

\[\begin{aligned}
\log(q) &amp;= \left( \frac{1}{2} \ln(r^2 + \vert u \vert^2), \; \frac{\arctan(r^{-1} \vert u \vert)}{\vert u \vert} u \right).
\end{aligned}\]

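<p>Similarly to the exponential formula, we should note that, if \(\vert u \vert = 0\) and \(r &gt; 0\), the formula simplifies to \(\log(q) = (\ln(r), 0_3)\).
Note also that the \(\arctan\) form assumes \(r &gt; 0\); replacing \(\arctan(r^{-1} \vert u \vert)\) with \(\mathrm{atan2}(\vert u \vert, r) \in [0, \pi]\) gives the same value in that case and also covers \(r \leq 0\) with \(u \neq 0\).</p>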
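<p>A round-trip test is a convenient way to check the logarithm. Note that the sketch below deliberately uses \(\mathrm{atan2}(\vert u \vert, r)\) in place of \(\arctan(r^{-1} \vert u \vert)\); the two agree for \(r &gt; 0\), and the former also handles \(r \leq 0\). The function names are hypothetical.</p>

```python
import math

def dot(a, b):
    """Euclidean inner product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))

def q_exp(w):
    """Closed form exp(w) = e^s (cos|v|, sin|v| v / |v|) for w = (s, v)."""
    s, v = w
    theta = math.sqrt(dot(v, v))
    if theta == 0.0:
        return (math.exp(s), (0.0, 0.0, 0.0))
    k = math.exp(s) * math.sin(theta) / theta
    return (math.exp(s) * math.cos(theta), tuple(k * x for x in v))

def q_log(q):
    """Logarithm of q = (r, u), with rotation angle |v| in [0, pi]."""
    r, u = q
    norm_u = math.sqrt(dot(u, u))
    s = 0.5 * math.log(r * r + norm_u * norm_u)
    if norm_u == 0.0:
        if r > 0.0:
            return (s, (0.0, 0.0, 0.0))
        # On the negative real axis every unit axis with angle pi works.
        raise ValueError("logarithm is not unique on the negative real axis")
    theta = math.atan2(norm_u, r)   # assumption: atan2 replaces arctan(|u| / r)
    return (s, tuple(theta * x / norm_u for x in u))
```

<p>With these definitions, <code>q_log(q_exp(w))</code> should return <code>w</code> up to rounding whenever \(\vert v \vert\) is less than \(\pi\).</p>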

<h4 id="adjoint-operators-and-lie-bracket">Adjoint Operators and Lie Bracket</h4>

<p>The big and little Adjoint operators are another important aspect of the Quaternion Lie algebra.
The ‘big’ Adjoint operator \(\mathrm{Ad} : \mathbb{H} \times \mathfrak{h} \to \mathfrak{h}\) is defined, for \(q = (r, u)\) and \(w^\vee = (s, v)\), by</p>

\[\begin{aligned}
\mathrm{Ad}_q (w)
&amp;= \left. \frac{\mathrm{d}}{\mathrm{d} t} \right\vert_{t=0} \mathrm{Cn}_q(e + t w) \\
&amp;= \left. \frac{\mathrm{d}}{\mathrm{d} t} \right\vert_{t=0}
(1+ t s , \; (I_3 + (2 r u^\times + 2 (u^\times)^2 )\vert q \vert^{-2})(t v) ) \\
&amp;= (s , \; (I_3 + (2 r u^\times + 2 (u^\times)^2 )\vert q \vert^{-2})v ).
\end{aligned}\]

<p>In matrix form,</p>

\[\begin{aligned}
\mathrm{Ad}_q \simeq  \begin{pmatrix}
    1 &amp; 0_{1\times 3} \\
    0_{3\times 1} &amp; I_3 + (2 r u^\times + 2 (u^\times)^2 )\vert q \vert^{-2}
\end{pmatrix}.
\end{aligned}\]

<p>The ‘little’ adjoint operator \(\mathrm{ad} : \mathfrak{h} \times \mathfrak{h} \to \mathfrak{h}\) is defined as the derivative of the big Adjoint operator,</p>

\[\begin{aligned}
\mathrm{ad}_{w_1} (w_2)
&amp;= \left. \frac{\mathrm{d}}{\mathrm{d} t} \right\vert_{t=0} \mathrm{Ad}_{e+ t w_1}w_2 \\
&amp;= \left. \frac{\mathrm{d}}{\mathrm{d} t} \right\vert_{t=0}
(s_2 , \; (I_3 + (2 (1+t s_1) (t v_1)^\times + 2 ((t v_1)^\times)^2 )\vert e + t w_1 \vert^{-2})v_2 ) \\
&amp;= (0 , \; 2 v_1^\times v_2 ).
\end{aligned}\]

<p>Once more, in matrix form,</p>

\[\begin{aligned}
\mathrm{ad}_w \simeq  \begin{pmatrix}
    0 &amp; 0_{1\times 3} \\
    0_{3\times 1} &amp; 2v^\times
\end{pmatrix}.
\end{aligned}\]

<p>The Lie bracket is equivalent to the adjoint operator, in the sense that</p>

\[\begin{aligned}
\left[w_1, w_2\right] := \mathrm{ad}_{w_1}(w_2) = (0 , \; 2 v_1^\times v_2 ).
\end{aligned}\]
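<p>As a check, the bracket formula should coincide with the commutator \(w_1 * w_2 - w_2 * w_1\) when Lie algebra elements are multiplied like quaternions. The sketch below assumes that identification of \(\mathfrak{h}\) with \(\mathbb{R}^4\) and uses made-up function names.</p>

```python
import math

def dot(a, b):
    """Euclidean inner product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    """Cross product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def q_mul(q1, q2):
    """Quaternion product, also used here for Lie algebra elements (s, v)."""
    (r1, u1), (r2, u2) = q1, q2
    c = cross(u1, u2)
    return (r1 * r2 - dot(u1, u2),
            tuple(r1 * b + r2 * a + k for a, b, k in zip(u1, u2, c)))

def bracket_formula(w1, w2):
    """Lie bracket (0, 2 v1 x v2)."""
    (_, v1), (_, v2) = w1, w2
    return (0.0, tuple(2.0 * x for x in cross(v1, v2)))

def commutator(w1, w2):
    """w1 * w2 - w2 * w1, computed with the quaternion product."""
    a, b = q_mul(w1, w2), q_mul(w2, w1)
    return (a[0] - b[0], tuple(x - y for x, y in zip(a[1], b[1])))
```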

<h3 id="matrix-representation">Matrix Representation</h3>

<p>The final topic of interest for computations is the matrix representation of \(\mathbb{H}\).
Matrix representations are rarely unique, but sometimes can be nice.
The matrix representation we consider is \(\rho : \mathbb{H} \to \mathbf{GL}(4)\), given, for \(q = (r, u)\) with components \(u = (u_1, u_2, u_3)\), by</p>

\[\begin{aligned}
\rho(q) := \begin{pmatrix}
    r&amp; u_1&amp; u_2&amp; u_3 \\
    -u_1&amp; r&amp; -u_3&amp; u_2 \\
    -u_2&amp; u_3&amp; r&amp; -u_1 \\
    -u_3&amp;-u_2&amp;u_1&amp;r \\
\end{pmatrix}.
\end{aligned}\]

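<p>The matrix representation \(\mathrm{d}\rho\) of the Lie algebra \(\mathfrak{h}\) has the same form: since \(\rho\) is linear in \((r, u)\), \(\mathrm{d}\rho(w)\) is obtained by substituting \((s, v)\) for \((r, u)\).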
Verifying that these are indeed representations is a messy and time-consuming computation.
However, working out a matrix representation is very rewarding in that it provides a way to check all the other computations we have done.
Specifically, we can check things like the inverse \(\rho(q)^{-1} = \rho(q^{-1})\), the exponential \(\mathrm{expm}(\mathrm{d}\rho(w)) = \rho(\exp(w))\), and the adjoint operators \(\mathrm{d}\rho(\mathrm{Ad}_q w) = \rho(q) \mathrm{d}\rho(w) \rho(q)^{-1}\).</p>
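<p>The homomorphism property itself is also easy to test numerically, which avoids the messy hand computation. A minimal sketch (hypothetical names, quaternions stored as pairs <code>(r, u)</code>):</p>

```python
import math

def dot(a, b):
    """Euclidean inner product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    """Cross product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def q_mul(q1, q2):
    """Quaternion product."""
    (r1, u1), (r2, u2) = q1, q2
    c = cross(u1, u2)
    return (r1 * r2 - dot(u1, u2),
            tuple(r1 * b + r2 * a + k for a, b, k in zip(u1, u2, c)))

def matmul(A, B):
    """Product of two matrices stored as nested lists."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def rho(q):
    """The 4x4 matrix representation of q = (r, (u1, u2, u3))."""
    r, (u1, u2, u3) = q
    return [[r, u1, u2, u3],
            [-u1, r, -u3, u2],
            [-u2, u3, r, -u1],
            [-u3, -u2, u1, r]]

def same(A, B, tol=1e-12):
    """Entrywise comparison of two matrices up to an absolute tolerance."""
    return all(math.isclose(a, b, abs_tol=tol)
               for ra, rb in zip(A, B) for a, b in zip(ra, rb))
```

<p>For any two quaternions, <code>same(rho(q_mul(q1, q2)), matmul(rho(q1), rho(q2)))</code> should then evaluate to <code>True</code>.</p>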

<h3 id="summary">Summary</h3>

<p>I decided to write this post when I needed these formulas for the \(n\)th time, and I realised that deriving them every time I needed them was taking too long.
I hope they are helpful to anyone else who reads them, and please let me know if you spot any mistakes!</p>]]></content><author><name></name></author><category term="Mathematics" /><summary type="html"><![CDATA[]]></summary></entry><entry><title type="html">The Unscented Transform</title><link href="/mathematics/2022/10/11/unscented_transform.html" rel="alternate" type="text/html" title="The Unscented Transform" /><published>2022-10-11T00:00:00+00:00</published><updated>2022-10-11T00:00:00+00:00</updated><id>/mathematics/2022/10/11/unscented_transform</id><content type="html" xml:base="/mathematics/2022/10/11/unscented_transform.html"><![CDATA[<!-- https://talk.jekyllrb.com/t/jekyll-and-mathjax/5514 -->
<script type="text/x-mathjax-config">
    MathJax.Hub.Config({
        jax: ["input/TeX","input/MathML","output/SVG", "output/CommonHTML"],
    extensions: ["tex2jax.js","mml2jax.js","MathMenu.js","MathZoom.js", "CHTML-preview.js"],
    TeX: {
      extensions: ["AMSmath.js","AMSsymbols.js","noErrors.js","noUndefined.js"]
    },
      tex2jax: {
          inlineMath: [ ['$','$'], ["\\(","\\)"] ],
          displayMath: [ ['$$','$$'], ["\\[","\\]"] ],
          processEscapes: true,
          processEnvironments: true
        },
        "HTML-CSS": { availableFonts: ["TeX"] }
      });
</script>

<script id="MathJax-script" async="" src="https://cdn.jsdelivr.net/npm/mathjax@3/es5/tex-chtml.js"></script>

<h3 id="introduction">Introduction</h3>

<p>The unscented transform is a way to approximate the probability distribution of a normally distributed variable after it has been put through a nonlinear function.
Concretely, let’s suppose that \(x \sim N(\mu, \Sigma)\), and let \(f : \mathbb{R}^n \to \mathbb{R}^m\) be some nonlinear function.
Then we might ask: how can we approximate the distribution of \(f(x)\) by another normal distribution?</p>

<p>The ‘traditional’ approach is through linearisation.
We know that if \(f\) is a linear or affine mapping, \(f(x) = A x + b\), then the distribution of \(f(x)\) is given by</p>

\[Ax+b \sim N(A\mu + b, A\Sigma A^\top)\]

<p>So one way to approximate the distribution of \(f(x)\) for a nonlinear \(f\) is to simply say</p>

\[f(x) 
\approx f(\mu) + D f(\mu)[x-\mu]
\sim N(f(\mu), D f(\mu) \Sigma D f(\mu)^\top),\]

<p>where \(Df(\mu)\) is the differential (Jacobian) of \(f\) at \(\mu\).
The problem is that this may not be a very good approximation, so what else can we do?</p>

<h3 id="sigma-points">Sigma Points</h3>

<p>What if, instead of trying to approximate \(f\) and then applying it to the distribution, we approximate the distribution and then apply \(f\)?
This is the idea of the unscented transform.
But how can we approximate the distribution?
The answer is ‘sigma points’.
A set of sigma points \(S\) for the distribution \(N(\mu, \Sigma)\) consists of points \(x^{(i)}\) and weights \(w^{(i)}\) so that</p>
<ol>
  <li>\(\sum_i w^{(i)} = 1\),</li>
  <li>\(\sum_i w^{(i)} x^{(i)} = \mu\),</li>
  <li>\(\sum_i w^{(i)} (x^{(i)} - \mu)(x^{(i)} - \mu)^\top = \Sigma\).</li>
</ol>

<p>This approximates the original distribution if you interpret it as a discrete probability distribution, which requires the weights to be non-negative.
Let \(\Omega = \{ x^{(i)} \mid i=0,1,...,N \}\), define a probability mass function \(p: \Omega \to \mathbb{R}\) by \(p(x^{(i)}) = w^{(i)}\), and let \(x\) be a random variable distributed according to \(p\). Then,</p>
<ol>
  <li>\(\sum_{x \in \Omega} p(x) = \sum_i w^{(i)} = 1\),</li>
  <li>\(\mathbb{E}[x] = \sum_i w^{(i)} x^{(i)} = \mu\),</li>
  <li>\(\mathrm{cov}(x,x) = \mathbb{E}\left[ (x - \mathbb{E}[x])(x - \mathbb{E}[x])^\top \right] = \sum_i w^{(i)} (x^{(i)} - \mu)(x^{(i)} - \mu)^\top = \Sigma\).</li>
</ol>

<p>Now, this is only possible if there are at least \(n+1\) points (assuming \(\Sigma\) has full rank), and there is no unique choice of sigma points.
However, the original choice by Uhlmann gives us a ‘canonical’ choice; for \(i = 1,..., 2n\) define</p>

\[\begin{aligned}
w^{(i)} &amp;= \frac{1}{2n}, &amp;
x^{(i)} &amp;= \mu + 
\begin{cases} 
(\sqrt{n \Sigma})_i &amp; \text{if }  i \leq n \\
-(\sqrt{n \Sigma})_{i-n} &amp; \text{if } i &gt; n
\end{cases}
\end{aligned}\]

<p>Here \((\sqrt{n\Sigma})_i\) denotes the \(i\)th column of a matrix square root of \(n\Sigma\).
Specifically, if \(A = \sqrt{n \Sigma}\), then \(A A^\top = n \Sigma\).</p>
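<p>As a sketch of how the canonical sigma points might be generated, the code below uses a Cholesky factor as the matrix square root. That choice, the plain-list representation, and the function names are all assumptions made here; any \(A\) with \(A A^\top = n\Sigma\) would do.</p>

```python
import math

def cholesky(M):
    """Lower-triangular L with L L^T = M, for symmetric positive-definite M."""
    n = len(M)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            acc = M[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(acc) if i == j else acc / L[j][j]
    return L

def sigma_points(mu, Sigma):
    """Return the 2n canonical sigma points and their equal weights 1/(2n)."""
    n = len(mu)
    A = cholesky([[n * x for x in row] for row in Sigma])   # A A^T = n Sigma
    columns = list(zip(*A))
    points = [[m + a for m, a in zip(mu, col)] for col in columns]
    points += [[m - a for m, a in zip(mu, col)] for col in columns]
    weights = [1.0 / (2 * n)] * (2 * n)
    return points, weights
```

<p>The weighted sample mean and covariance of the returned points should reproduce \(\mu\) and \(\Sigma\) up to rounding.</p>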

<h3 id="unscented-transform">Unscented Transform</h3>

<p>Now that we have a set of sigma points that approximate the original distribution, we need to apply the nonlinear function \(f\).
This time, instead of approximating \(f\), we simply apply it to our sigma points and gather statistics at the end.
Let \(x\) be a random variable distributed according to the sigma points.
Then we compute the mean and covariance of \(f(x)\) as follows.</p>

\[\begin{aligned}
\hat{\eta} 
&amp;:= \mathbb{E} \left[ f(x) \right], \\
&amp;= \sum_i w^{(i)} f(x^{(i)}), \\
%-----------------------------
\hat{\Sigma} 
&amp;:= \mathbb{E} \left[ (f(x) - \mathbb{E} \left[ f(x) \right])(f(x) - \mathbb{E} \left[ f(x) \right])^\top \right], \\
&amp;= \mathbb{E} \left[ (f(x) - \hat{\eta})(f(x) - \hat{\eta})^\top \right], \\
&amp;= \sum_i w^{(i)} (f(x^{(i)}) - \hat{\eta})(f(x^{(i)}) - \hat{\eta})^\top.
\end{aligned}\]

<p>This is how we now approximate \(f(x)\): we say that, approximately, \(f(x) \sim N(\hat{\eta}, \hat{\Sigma})\).
This is quite different from the linearisation approach, but is it any better?
That depends on the function \(f\), the choice of sigma points, and also on what you mean by “better”.</p>
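<p>Put together, the whole transform is only a few lines. The following is a minimal sketch under the same assumptions as before (canonical sigma points, a Cholesky factor as the matrix square root, hypothetical function names), not a reference implementation.</p>

```python
import math

def cholesky(M):
    """Lower-triangular L with L L^T = M, for symmetric positive-definite M."""
    n = len(M)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            acc = M[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(acc) if i == j else acc / L[j][j]
    return L

def sigma_points(mu, Sigma):
    """Return the 2n canonical sigma points and their equal weights 1/(2n)."""
    n = len(mu)
    A = cholesky([[n * x for x in row] for row in Sigma])
    columns = list(zip(*A))
    points = [[m + a for m, a in zip(mu, col)] for col in columns]
    points += [[m - a for m, a in zip(mu, col)] for col in columns]
    return points, [1.0 / (2 * n)] * (2 * n)

def unscented_transform(f, mu, Sigma):
    """Approximate the mean and covariance of f(x) for x ~ N(mu, Sigma)."""
    points, weights = sigma_points(mu, Sigma)
    images = [f(x) for x in points]            # push each sigma point through f
    m = len(images[0])
    eta = [sum(w * y[i] for w, y in zip(weights, images)) for i in range(m)]
    cov = [[sum(w * (y[i] - eta[i]) * (y[j] - eta[j]) for w, y in zip(weights, images))
            for j in range(m)] for i in range(m)]
    return eta, cov
```

<p>For an affine map \(f(x) = Ax + b\), the result should coincide with \(N(A\mu + b, A \Sigma A^\top)\), which makes a convenient test case.</p>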

<h3 id="analysis">Analysis</h3>

<p>How does the unscented transform compare to the true distribution of \(f(x)\)?
Let \(x \sim N(\mu, \Sigma)\), and let \(s = x - \mu \sim N(0, \Sigma)\).
Then we can calculate the expected value of \(f(x)\) by using a Taylor expansion,</p>

\[\begin{aligned}
\mathbb{E} \left[ f(x) \right]
&amp;= \mathbb{E} \left[ f(\mu + s) \right], \\
&amp;= \mathbb{E} \left[
    f(\mu) + Df(\mu)[s] + \frac{1}{2} D^2 f(\mu)[s,s] + \cdots
\right], \\
&amp;= \mathbb{E} [f(\mu)]
    + Df(\mu) \mathbb{E} [s]
    + \frac{1}{2} D^2 f(\mu) \mathbb{E}[s s^\top]
    + \cdots, \\
&amp;= f(\mu)
    + Df(\mu) [0]
    + \frac{1}{2} D^2 f(\mu) \mathrm{cov}(s,s)
    + \cdots, \\
&amp;= f(\mu) + \frac{1}{2} D^2 f(\mu) \Sigma
    + O(\mathbb{E}[\vert s \vert^3]).
\end{aligned}\]

<p>In other words, the expected value of \(f(x)\) is determined by the mean and covariance of \(x\) up to third order deviations from the mean.
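In particular, this means that the unscented transform gives the correct mean of the transformed distribution for all functions \(f\) where \(D^k f = 0\) for \(k \geq 3\); that is, functions \(f\) that are polynomials of degree at most 2.
In fact, using the canonical choice of sigma points, this is true for polynomials of degree 3 as well, because those points are symmetric about \(\mu\) and so their third-order central moments vanish, just as they do for the normal distribution.</p>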

<p>What about the covariance obtained from the unscented transform?
It turns out that this is guaranteed to equal the true covariance of \(f(x)\) only when \(f\) is a first-order (linear-affine) function.
However, practical experience has shown that it can offer better performance in a range of applications.</p>
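<p>A small worked example, chosen here purely for illustration, makes both statements concrete. Take \(n = 1\), \(x \sim N(\mu, \sigma^2)\) and \(f(x) = x^2\). The canonical sigma points are \(\mu \pm \sigma\) with weights \(\frac{1}{2}\), so</p>

\[\begin{aligned}
\hat{\eta} &amp;= \tfrac{1}{2}(\mu + \sigma)^2 + \tfrac{1}{2}(\mu - \sigma)^2 = \mu^2 + \sigma^2 = \mathbb{E}[x^2], \\
\hat{\Sigma} &amp;= \tfrac{1}{2}(2\mu\sigma)^2 + \tfrac{1}{2}(-2\mu\sigma)^2 = 4\mu^2\sigma^2, \\
\mathrm{var}(x^2) &amp;= \mathbb{E}[x^4] - \mathbb{E}[x^2]^2 = 4\mu^2\sigma^2 + 2\sigma^4.
\end{aligned}\]

<p>The mean is exact, whereas linearisation would give \(f(\mu) = \mu^2\). The variance agrees with the linearised value \(4\mu^2\sigma^2\) and misses the \(2\sigma^4\) term.</p>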

<h3 id="conclusion">Conclusion</h3>

<p>The unscented transform provides a different way to propagate uncertainty through nonlinear functions than the “standard” approach of linearisation.
Rather than approximate the function as a linear function, it approximates the probability distribution as a discrete probability function.
This has the advantage that there is no need to compute the derivative of the function, and it propagates the mean more accurately than the linearisation approach.
In practice, a number of applications show that the unscented transform can outperform the linearisation approach.
Overall, in my view it is a very interesting perspective on approximations to probability that is not widely enough understood.</p>]]></content><author><name></name></author><category term="Mathematics" /><summary type="html"><![CDATA[]]></summary></entry></feed>