I wanted to see what would happen if I applied the Daleckii-Krein Theorem to a quaternion function via its $4\times 4$ matrix representation.
$\def\h{\odot}\def\k{\otimes}\def\bb{\mathbb}\def\bbC#1{{\bb C}^{#1}}\def\bbH#1{{\bb H}^{#1}}\def\bbR#1{{\bb R}^{#1}}\def\a{\alpha} \def\b{\beta}\def\g{\gamma} \def\l{\lambda}\def\o{{\tt1}} \def\z{{\bf 0}}\def\e{{\bf e}}\def\n{{\bf n}}\def\CMR#1#2{\left\lbrace #1 \; \middle| \; #2 \right\rbrace}\def\CR#1{\left\lbrace #1\right\rbrace}\def\LR#1{\left(#1\right)}\def\lR#1{\Big(#1\Big)}\def\BR#1{\left[#1\right]}\def\bR#1{\Big[#1\Big]}\def\op#1{\operatorname{#1}}\def\xn#1#2{\left\| #2 \right\|_{\small#1}}\def\frob#1{\xn F{#1}}\def\Real#1{\op{\sf Real}\LR{#1}}\def\Imag#1{\op{\sf Imag}\LR{#1}}\def\Diag#1{\op{Diag}\LR{#1}}\def\vc#1{\op{vec}\LR{#1}}\def\quat#1{\op{Quat}\LR{#1}}\def\trace#1{\op{Tr}\LR{#1}}\def\p{\partial} \def\grad#1#2{\frac{\p #1}{\p #2}}\def\m#1{\left[\begin{array}{r}#1\end{array}\right]}\def\mmm#1{\left[\begin{array}{rrr|rrr|rrr}#1\end{array}\right]}\def\mmmm#1{\left[\begin{array}{rrrr|rrrr|rrrr|rrrr}#1\end{array}\right]}\def\mc#1{\left[\begin{array}{c|c}#1\end{array}\right]}\def\mq#1{\left[\begin{array}{r|rrr}#1\end{array}\right]}\def\mqq#1{\left[\begin{array}{rr|rr}#1\end{array}\right]}\def\BEvec#1{\begin{Bmatrix}#1\end{Bmatrix}}\def\fracbR#1#2{\bR{\frac{#1}{#2}}}\def\fracLR#1#2{\LR{\frac{#1}{#2}}}\def\q{\quad} \def\qq{\qquad}\def\qiq{\q\implies\q} \def\qif{\q\iff\q}\def\T{{\sf T}}$It is well known that a quaternion can be represented by a real matrix$$\eqalign{a = \BEvec{a_0\\ a_1\\ a_2\\ a_3},\qq A = \quat{a} = \mq{a_0 & -a_1 & -a_2 & -a_3 \\\hline a_1 & a_0 & -a_3 & a_2 \\a_2 & a_3 & a_0 & -a_1 \\a_3 & -a_2 & a_1 & a_0}}$$There are many other ways to map a quaternion to a $4\times 4$ matrix, but I like this particular representation because the first column contains the components of the quaternion in order and with the proper sign.
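For anyone who wants to experiment, here is a minimal numpy sketch of this representation (the helper name `quat` mirrors the $\quat{\cdot}$ map above but is otherwise my own choice):

```python
import numpy as np

def quat(a):
    """4x4 real matrix representation; the first column is the quaternion itself."""
    a0, a1, a2, a3 = a
    return np.array([[a0, -a1, -a2, -a3],
                     [a1,  a0, -a3,  a2],
                     [a2,  a3,  a0, -a1],
                     [a3, -a2,  a1,  a0]])

a = np.array([1.0, 2.0, 3.0, 4.0])
b = np.array([0.5, -1.0, 2.0, 0.0])

A, B = quat(a), quat(b)
assert np.allclose(A[:, 0], a)           # first column recovers the quaternion
assert np.allclose(A + B, quat(a + b))   # closed under addition
AB = A @ B                               # matrix product = Hamilton product
assert np.allclose(AB, quat(AB[:, 0]))   # closed under multiplication
```

The last assertion checks that the product of two such matrices is again a matrix of the same form, i.e. the representation is a homomorphism of the quaternion algebra.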
I'm not really a fan of quaternions per se. I think they are overhyped and misused. But the matrix representation is nicely closed under addition and multiplication, and has other interesting properties$$\eqalign{\a^2 &\equiv \frob{a}^2 = \tfrac14\,\frob{A}^2 = {\sqrt{\det(A)}} \\\g^2 &\equiv a_1^2 + a_2^2 + a_3^2 \\\l &= a_0 + i\g \qq {\{ {\rm eigenvalue} \}} \\\l^*\l &= \a^2 \\A^\T A &= \a^2 I \qiq A^{-1} = \a^{-2}A^\T \\}$$The eigenvalue $\l$ is associated with two independent eigenvectors, e.g.$$\eqalign{v = \m{a_1a_2+i\g a_3 \\a_1a_3-i\g a_2 \\0 \\a_2^2 + a_3^2 \\},\qq w = \m{-a_1a_3+i\g a_2 \\a_1a_2+i\g a_3 \\a_2^2 + a_3^2 \\0 \\}}$$The remaining eigenpairs are simply complex conjugates of these.
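These identities are easy to spot-check numerically. A small numpy sketch (the `quat` helper below is my own naming, built from the matrix displayed above):

```python
import numpy as np

def quat(a):
    a0, a1, a2, a3 = a
    return np.array([[a0, -a1, -a2, -a3],
                     [a1,  a0, -a3,  a2],
                     [a2,  a3,  a0, -a1],
                     [a3, -a2,  a1,  a0]])

a = np.array([1.0, -2.0, 3.0, 4.0])
A = quat(a)

alpha2 = a @ a                      # alpha^2 = |a|^2
gamma  = np.linalg.norm(a[1:])      # gamma^2 = a1^2 + a2^2 + a3^2

assert np.isclose(alpha2, 0.25 * np.linalg.norm(A, 'fro')**2)
assert np.isclose(alpha2, np.sqrt(np.linalg.det(A)))
assert np.allclose(A.T @ A, alpha2 * np.eye(4))   # so A^{-1} = A^T / alpha^2

lam = np.linalg.eigvals(A)          # a0 +/- i*gamma, each with multiplicity 2
assert np.allclose(np.sort(lam.imag), [-gamma, -gamma, gamma, gamma])
assert np.allclose(lam.real, a[0])
```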
The following matrix$$\small\eqalign{H^\T = \mmmm{\o & 0 & 0 & 0 & 0 &\o & 0 & 0 & 0 & 0 &\o & 0 & 0 & 0 & 0 &\o \\ 0 &\o & 0 & 0 &-\o & 0 & 0 & 0 & 0 & 0 & 0 &\o & 0 & 0 &-\o & 0 \\ 0 & 0 &\o & 0 & 0 & 0 & 0 &-\o &-\o & 0 & 0 & 0 & 0 &\o & 0 & 0 \\ 0 & 0 & 0 &\o & 0 & 0 &\o & 0 & 0 &-\o & 0 & 0 &-\o & 0 & 0 & 0 \\} \\}$$can be used to vectorize/devectorize the matrix representation$$\eqalign{\vc{A} = Ha \qif a = \tfrac14H^\T\vc{A} \\}$$
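Rather than typing $H$ in by hand, one can build it column by column as $\vc{\quat{e_k}}$ for the standard basis quaternions, assuming the usual column-stacking $\op{vec}$ (a numpy sketch; `quat` is my own helper from the representation above):

```python
import numpy as np

def quat(a):
    a0, a1, a2, a3 = a
    return np.array([[a0, -a1, -a2, -a3],
                     [a1,  a0, -a3,  a2],
                     [a2,  a3,  a0, -a1],
                     [a3, -a2,  a1,  a0]])

# Column k of H is vec(Quat(e_k)); order='F' gives column-stacking vec
E = np.eye(4)
H = np.column_stack([quat(E[:, k]).flatten(order='F') for k in range(4)])

a = np.array([1.0, -2.0, 3.0, 4.0])
A = quat(a)
assert np.allclose(H @ a, A.flatten(order='F'))           # vec(A) = H a
assert np.allclose(0.25 * H.T @ A.flatten(order='F'), a)  # a = (1/4) H^T vec(A)
assert np.allclose(H.T @ H, 4.0 * np.eye(4))              # orthogonal columns
```

The factor $\tfrac14$ in the devectorization is exactly the $H^\T H = 4I$ in the last assertion.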
With the preliminaries out of the way, we've arrived at the main topic of this post.
Given a differentiable function$$\eqalign{\def\f{\phi}\f = \f(\l) \qiq \psi = \frac{d\f}{d\l} \\}$$the DK Theorem states that if we apply the function to a square matrix, we can calculate its Fréchet derivative like so$$\eqalign{\def\DVR{\Diag{\vc{R}}}\def\R{\;{\large\cal R}\;}\def\V{V^{-1}}L &= \Diag{\l,\l,\l^*,\l^*}, \q V=\m{v\;w\;v^*\;w^*} \\A &= VL\V \\F &= \f(A) = \quat{f} \qif f=\f(a) \\dF &= V\BR{R\h\LR{\V\:dA\:V}}\V \\}$$where $\h$ denotes the Hadamard product and the components of the $R$ matrix are$$\eqalign{R_{jk} &= \begin{cases}\dfrac{\f(\l_j)-\f(\l_k)}{\l_j-\l_k} & {\rm if}\;\;\l_j\ne\l_k \\\psi(\l_k) & {\rm otherwise}\end{cases} \\\\\R &= \DVR \\}$$Due to the nature of the eigenvalues in this problem, $R$ is a symmetric block matrix$$\eqalign{\b &= \frac{\f(\l)-\f(\l^*)}{\l-\l^*} = \frac{\Imag{\f(\l)}}{\g} \\R &= \mqq{\psi & \psi & \b & \b \\\psi & \psi & \b & \b \\\hline\b & \b & \psi^* & \psi^* \\\b & \b & \psi^* & \psi^* \\} = R^\T \\}$$By vectorizing the Fréchet derivative and utilizing the $H$ matrix, we can easily derive an expression for the Jacobian of the quaternion function $\;f=\f(a)$$$\eqalign{\vc{dF} &= \vc{V\BR{R\h\LR{\V\:dA\:V}}\V} \\H\,df &= \LR{\V\k V^\T}^\T\R \LR{V^\T\k\V}\,\vc{dA} \\df &= \frac14\,H^\T\LR{\V\k V^\T}^\T\R \LR{V^\T\k\V} H\,da\qq\qq\\J\,\equiv\,\grad fa&= \frac14\,H^\T\LR{\V\k V^\T}^\T\R \LR{V^\T\k\V} H \\}$$The interesting thing is that, after testing with a variety of functions and quaternions, the Jacobian always takes the form of a $\sf Real$ block-diagonal matrix$$\eqalign{J = \mq{\b & 0 & 0 & 0 \\\hline0 & * & * & * \\0 & * & * & * \\0 & * & * & * \\} \in \bbR{4\times 4}}$$Aside from the $\sf Real$ aspect, it's not clear (to me) why it takes this form. The lower $3\times 3$ block appears to be random, with no obvious structure (skew, symmetric, orthogonal, etc).
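The whole calculation is straightforward to reproduce numerically. Here is a numpy sketch that assembles $J$ from the DK formula and cross-checks it against a finite-difference Jacobian of $f = \tfrac14 H^\T\vc{\f(A)}$ (the helper names and the test function $\f=\exp$ are my own choices, and I let `np.linalg.eig` pick the eigenvector basis rather than using the explicit $v,w$ above; the formula is basis-independent):

```python
import numpy as np

def quat(a):
    a0, a1, a2, a3 = a
    return np.array([[a0, -a1, -a2, -a3],
                     [a1,  a0, -a3,  a2],
                     [a2,  a3,  a0, -a1],
                     [a3, -a2,  a1,  a0]])

phi = np.exp    # test function phi(lam) = exp(lam)
psi = np.exp    # its derivative psi = phi'

a = np.array([0.3, -1.1, 0.7, 0.4])
A = quat(a)
lam, V = np.linalg.eig(A)
Vinv = np.linalg.inv(V)

# Divided-difference (Loewner) matrix R from the DK theorem
R = np.empty((4, 4), dtype=complex)
for j in range(4):
    for k in range(4):
        if np.isclose(lam[j], lam[k]):
            R[j, k] = psi(lam[k])
        else:
            R[j, k] = (phi(lam[j]) - phi(lam[k])) / (lam[j] - lam[k])

# H maps a -> vec(A), with column-stacking vec
E = np.eye(4)
H = np.column_stack([quat(E[:, k]).flatten(order='F') for k in range(4)])

# J = (1/4) H^T (V^{-T} kron V) Diag(vec R) (V^T kron V^{-1}) H
J = 0.25 * H.T @ np.kron(Vinv.T, V) @ np.diag(R.flatten(order='F')) \
         @ np.kron(V.T, Vinv) @ H

assert np.allclose(J.imag, 0)   # the Real property is easy to confirm
J = J.real

# Cross-check against central finite differences of f(a) = (1/4) H^T vec(phi(A))
def f(a):
    A = quat(a)
    lam, V = np.linalg.eig(A)
    F = (V * phi(lam)) @ np.linalg.inv(V)
    return (0.25 * H.T @ F.flatten(order='F')).real

h = 1e-6
Jfd = np.column_stack([(f(a + h*E[:, k]) - f(a - h*E[:, k])) / (2*h)
                       for k in range(4)])
assert np.allclose(J, Jfd, atol=1e-5)
```

Note the Kronecker factors follow from $\vc{AXB} = \LR{B^\T\k A}\vc{X}$ with column-stacking $\op{vec}$, which is what `order='F'` and `np.kron` implement.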
I assume that analysts who regularly deal with quaternions need this Jacobian, e.g. to invoke the Chain Rule. While $J$ is a $4\times 4$ matrix, it does not represent a quaternion, therefore it cannot be calculated using standard quaternion operations. So how does one calculate it without resorting to matrix methods?
Also, is there a simple explanation for the block-diagonal structure?