Basics of Quantum Computing - Linear Algebra Formulas: Vector and Matrix Derivatives of Scalars

Tetsuro Tabata

2021/06/20 00:30

#QuantumComputer #QuantumComputation #LinearAlgebra #VectorDerivative


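§ Purpose of this article

Linear algebra calculations and expansions come up often, not only in quantum computing but also in the theory of machine learning and deep learning. Of these, this article reviews the formulas for differentiating with respect to vectors and matrices, together with how they are derived.

§ Summary of the derivative formulas

We start with a list of the formulas.

1. Vector derivative formulas

Here \mathbf{x} and \mathbf{a} are column vectors and A is a matrix.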

\frac{\partial \mathbf{x}^T\mathbf{a}}{\partial\mathbf{x}}= \frac{\partial \mathbf{a}^T\mathbf{x}}{\partial\mathbf{x}}=\mathbf{a}\quad\quad\text{(Eq. 1)}\\ \frac{\partial \mathbf{x}^TA\mathbf{x}}{\partial \mathbf{x}}=(A + A^T)\mathbf{x}\quad\quad\text{(Eq. 2)}\\ \frac{\partial tr(\mathbf{xa}^T)}{\partial \mathbf{x}} = \frac{\partial tr(\mathbf{ax}^T)}{\partial \mathbf{x}} = \mathbf{a}\quad\quad\text{(Eq. 3)}\\ \frac{\partial}{\partial \mathbf{x}}(\mathbf{a}-A\mathbf{x})^T(\mathbf{a}-A\mathbf{x})=-2A^T(\mathbf{a}-A\mathbf{x})\quad\quad\text{(Eq. 4)}
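The vector formulas can be sanity-checked numerically. The following is a minimal sketch, not part of the original article, that compares the formulas for x^T a, x^T A x and (a - A x)^T (a - A x) against central finite differences in three dimensions. The helper names (`numerical_gradient`, `matvec`, `close`) and the test values are assumptions chosen only for illustration, and the code uses the Python standard library only.

```python
# Finite-difference check of the vector derivative formulas (3 dimensions).
# All helper names and test values are illustrative assumptions.

def matvec(M, v):
    """Matrix-vector product M v for nested lists."""
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

def transpose(M):
    return [list(row) for row in zip(*M)]

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def numerical_gradient(f, x, h=1e-6):
    """Central-difference approximation of the vector derivative df/dx."""
    grad = []
    for k in range(len(x)):
        xp = list(x)
        xm = list(x)
        xp[k] += h
        xm[k] -= h
        grad.append((f(xp) - f(xm)) / (2 * h))
    return grad

def close(u, v, tol=1e-5):
    return all(abs(ui - vi) < tol for ui, vi in zip(u, v))

# Arbitrary test values.
x = [0.5, -1.0, 2.0]
a = [1.0, 2.0, 3.0]
A = [[1.0, 2.0, 0.0],
     [-1.0, 0.5, 3.0],
     [4.0, 1.0, -2.0]]
At = transpose(A)

# d(x^T a)/dx should equal a.
grad1 = numerical_gradient(lambda v: dot(v, a), x)

# d(x^T A x)/dx should equal (A + A^T) x.
grad2 = numerical_gradient(lambda v: dot(v, matvec(A, v)), x)
expected2 = [p + q for p, q in zip(matvec(A, x), matvec(At, x))]

# d/dx (a - A x)^T (a - A x) should equal -2 A^T (a - A x).
def squared_residual(v):
    r = [ai - yi for ai, yi in zip(a, matvec(A, v))]
    return dot(r, r)

grad4 = numerical_gradient(squared_residual, x)
residual = [ai - yi for ai, yi in zip(a, matvec(A, x))]
expected4 = [-2 * g for g in matvec(At, residual)]

print(close(grad1, a), close(grad2, expected2), close(grad4, expected4))
```

A numerical check like this does not prove the formulas, but it is a quick way to catch a sign or transpose mistake when using them.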

2. Matrix derivative formulas

Here \mathbf{x} and \mathbf{y} are column vectors, and X and A are matrices.
Also, |A| denotes the determinant of the matrix A.

\frac{\partial \mathbf{x}^TX\mathbf{y}}{\partial X}=\mathbf{x}\mathbf{y}^T\quad\quad\text{(Eq. 5)}\\ \frac{\partial \mathbf{x}^TX^{-1}\mathbf{y}}{\partial X}=-(X^{-1})^T\mathbf{x}\mathbf{y}^T(X^{-1})^T\quad\quad\text{(Eq. 6)}\\ \frac{\partial \log |X|}{\partial X}=(X^{-1})^T\quad\quad\text{(Eq. 7)}\\ \frac{\partial tr(X)}{\partial X}=I\quad\quad\text{(Eq. 8)}\\ \frac{\partial tr(XA)}{\partial X}=A^T\quad\quad\text{(Eq. 9)}\\ \frac{\partial tr(X^TA)}{\partial X}=A\quad\quad\text{(Eq. 10)}\\ \frac{\partial tr(XAX^T)}{\partial X}=X(A + A^T)\quad\quad\text{(Eq. 10')}\\ \frac{\partial}{\partial x}\log|X|=tr\bigg(X^{-1}\frac{\partial X}{\partial x}\bigg)\quad\quad\text{(Eq. 11)}
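Some of the matrix formulas can be spot-checked in the same numerical spirit. The sketch below is an illustration rather than the article's own method: it perturbs each entry X_ij of a 2x2 matrix and compares the resulting finite-difference matrix with the closed forms for x^T X y, log|X| and tr(XA). The helper names and test values are assumptions, and the 2x2 determinant and inverse are written out by hand so that only the standard library is needed.

```python
import math

# Finite-difference check of three matrix derivative formulas on 2x2 matrices.
# All helper names and test values are illustrative assumptions.

def numerical_matrix_gradient(f, X, h=1e-6):
    """Central-difference approximation of df/dX; entry (i, j) is df/dX_ij."""
    rows, cols = len(X), len(X[0])
    G = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            Xp = [row[:] for row in X]
            Xm = [row[:] for row in X]
            Xp[i][j] += h
            Xm[i][j] -= h
            G[i][j] = (f(Xp) - f(Xm)) / (2 * h)
    return G

def det2(M):
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

def inv2(M):
    d = det2(M)
    return [[M[1][1] / d, -M[0][1] / d],
            [-M[1][0] / d, M[0][0] / d]]

def matmul(P, Q):
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]

def trace(M):
    return sum(M[i][i] for i in range(len(M)))

def transpose(M):
    return [list(row) for row in zip(*M)]

def close(P, Q, tol=1e-5):
    return all(abs(p - q) < tol for rp, rq in zip(P, Q) for p, q in zip(rp, rq))

# Arbitrary test values; det(X) = 5.5 > 0, so log|X| is defined.
X = [[2.0, 1.0], [0.5, 3.0]]
A = [[1.0, -2.0], [4.0, 0.5]]
x = [1.0, 2.0]
y = [-1.0, 3.0]

# d(x^T X y)/dX should equal x y^T.
def bilinear(M):
    return sum(x[i] * M[i][j] * y[j] for i in range(2) for j in range(2))

grad5 = numerical_matrix_gradient(bilinear, X)
expected5 = [[x[i] * y[j] for j in range(2)] for i in range(2)]

# d log|X| / dX should equal (X^{-1})^T.
grad7 = numerical_matrix_gradient(lambda M: math.log(det2(M)), X)
expected7 = transpose(inv2(X))

# d tr(XA) / dX should equal A^T.
grad9 = numerical_matrix_gradient(lambda M: trace(matmul(M, A)), X)
expected9 = transpose(A)

print(close(grad5, expected5), close(grad7, expected7), close(grad9, expected9))
```

The test matrix X is deliberately non-symmetric, because several of these formulas differ from their symmetric-case shortcuts only by a transpose.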

When a scalar y is given as a function y=f(X) of a matrix X, the derivative of a function g(y) of y with respect to X is

\frac{\partial g(y)}{\partial X} = \frac{\partial g(y)}{\partial y}\frac{\partial f(X)}{\partial X}\quad\quad\text{(Eq. 12)}
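As a short worked illustration of this chain rule, take g(y)=\log y and y=f(X)=|X| with |X|>0. Assuming the standard determinant derivative \frac{\partial |X|}{\partial X}=|X|(X^{-1})^T, which this article does not derive, the chain rule recovers the formula for \frac{\partial \log |X|}{\partial X} listed above:

\frac{\partial \log |X|}{\partial X}=\frac{\partial \log y}{\partial y}\frac{\partial |X|}{\partial X}=\frac{1}{|X|}\,|X|(X^{-1})^T=(X^{-1})^T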

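§ Deriving the formulas

We now go through the derivations of (Eq. 1) through (Eq. 4).
Please note that these are illustrative examples rather than rigorous proofs.
The derivations of (Eq. 5) onward are omitted for reasons of space.

1. Derivation of (Eq. 1)

We work in three dimensions as an example, but the result is the same in any dimension.
Let the column vectors \mathbf{x} and \mathbf{a} be as follows.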

\mathbf{x}=\begin{pmatrix}x_1 \\ x_2 \\ x_3\end{pmatrix}, \mathbf{a}=\begin{pmatrix}a_1 \\ a_2 \\ a_3\end{pmatrix}

The vector derivative \frac{\partial }{\partial \mathbf{x}} is taken to act as follows.

\frac{\partial}{\partial \mathbf{x}} = \begin{pmatrix}\frac{\partial }{\partial x_1} \\ \frac{\partial }{\partial x_2} \\ \frac{\partial }{\partial x_3}\end{pmatrix}

An example derivation follows.

\begin{align} \frac{\partial \mathbf{x}^T\mathbf{a}}{\partial \mathbf{x}} &= \frac{\partial}{\partial \mathbf{x}} \left\{(x_1\;x_2\;x_3)\begin{pmatrix}a_1 \\ a_2 \\ a_3\end{pmatrix} \right\}\\ &= \frac{\partial}{\partial \mathbf{x}}(a_1x_1+a_2x_2+a_3x_3)\\ &=\begin{pmatrix}\frac{\partial }{\partial x_1} \\ \frac{\partial }{\partial x_2} \\ \frac{\partial }{\partial x_3}\end{pmatrix}(a_1x_1+a_2x_2+a_3x_3)\\ &= \begin{pmatrix} \frac{\partial}{\partial x_1}(a_1x_1+a_2x_2+a_3x_3) \\ \frac{\partial}{\partial x_2}(a_1x_1+a_2x_2+a_3x_3) \\ \frac{\partial}{\partial x_3}(a_1x_1+a_2x_2+a_3x_3) \end{pmatrix}\\ &=\begin{pmatrix}a_1 \\ a_2 \\ a_3\end{pmatrix} \\ &=\mathbf{a} \end{align}

Similarly,

\begin{align} \frac{\partial \mathbf{a}^T\mathbf{x}}{\partial \mathbf{x}} &= \frac{\partial}{\partial \mathbf{x}} \left\{(a_1\;a_2\;a_3)\begin{pmatrix}x_1 \\ x_2 \\ x_3\end{pmatrix} \right\}\\ &= \frac{\partial}{\partial \mathbf{x}}(a_1x_1+a_2x_2+a_3x_3)\\ &=\begin{pmatrix}\frac{\partial }{\partial x_1} \\ \frac{\partial }{\partial x_2} \\ \frac{\partial }{\partial x_3}\end{pmatrix}(a_1x_1+a_2x_2+a_3x_3)\\ &=\begin{pmatrix}a_1 \\ a_2 \\ a_3\end{pmatrix} \\ &=\mathbf{a} \end{align}

2. Derivation of (Eq. 2)

Let the matrix A be as follows.

A=\begin{pmatrix}A_{11} & A_{12} & A_{13}\\A_{21} & A_{22} & A_{23}\\A_{31} & A_{32} & A_{33}\end{pmatrix}

An example derivation follows.

\begin{align} \frac{\partial \mathbf{x}^TA\mathbf{x}}{\partial \mathbf{x}} &=\frac{\partial}{\partial \mathbf{x}} \left\{(x_1\;x_2\;x_3)\begin{pmatrix}A_{11} & A_{12} & A_{13}\\A_{21} & A_{22} & A_{23}\\A_{31} & A_{32} & A_{33}\end{pmatrix}\begin{pmatrix}x_1 \\ x_2 \\ x_3\end{pmatrix}\right\}\\ &=\frac{\partial}{\partial \mathbf{x}} \left\{(A_{11}x_1+A_{21}x_2+A_{31}x_3\quad A_{12}x_1+A_{22}x_2+A_{32}x_3\quad A_{13}x_1+A_{23}x_2+A_{33}x_3)\begin{pmatrix}x_1 \\ x_2 \\ x_3\end{pmatrix}\right\}\\ &=\frac{\partial}{\partial \mathbf{x}}(A_{11}x_1x_1+A_{21}x_1x_2+A_{31}x_1x_3+A_{12}x_1x_2+A_{22}x_2x_2+A_{32}x_2x_3+A_{13}x_1x_3+A_{23}x_2x_3+A_{33}x_3x_3)\\ &=\begin{pmatrix}2A_{11}x_1+A_{21}x_2+A_{31}x_3+A_{12}x_2+A_{13}x_3 \\ A_{21}x_1+A_{12}x_1+2A_{22}x_2+A_{32}x_3+A_{23}x_3 \\ A_{31}x_1+A_{32}x_2+A_{13}x_1+A_{23}x_2+2A_{33}x_3 \end{pmatrix}\\ &=\begin{pmatrix}A_{11}x_1+A_{21}x_2+A_{31}x_3 \\ A_{12}x_1+A_{22}x_2+A_{32}x_3 \\ A_{13}x_1+A_{23}x_2+A_{33}x_3\end{pmatrix}+ \begin{pmatrix}A_{11}x_1+A_{12}x_2+A_{13}x_3 \\ A_{21}x_1+A_{22}x_2+A_{23}x_3 \\ A_{31}x_1+A_{32}x_2+A_{33}x_3\end{pmatrix}\\ &=\begin{pmatrix}A_{11} & A_{21} & A_{31} \\ A_{12} & A_{22} & A_{32} \\ A_{13} & A_{23} & A_{33}\end{pmatrix}\begin{pmatrix}x_1 \\ x_2 \\ x_3\end{pmatrix}+ \begin{pmatrix}A_{11} & A_{12} & A_{13} \\ A_{21} & A_{22} & A_{23} \\ A_{31} & A_{32} & A_{33}\end{pmatrix}\begin{pmatrix}x_1 \\ x_2 \\ x_3\end{pmatrix}\\ &=(A^T + A)\mathbf{x}\\ &=(A + A^T)\mathbf{x} \end{align}
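The three-dimensional expansion above can also be written in index notation, which makes it easier to see why the result does not depend on the dimension. This is a supplementary sketch rather than part of the original derivation; \delta_{ik} is the Kronecker delta (1 if i=k, otherwise 0), a symbol introduced here for illustration. For the k-th component,

\begin{align} \frac{\partial}{\partial x_k}\sum_{i,j}A_{ij}x_ix_j &= \sum_{i,j}A_{ij}(\delta_{ik}x_j + x_i\delta_{jk})\\ &= \sum_{j}A_{kj}x_j + \sum_{i}A_{ik}x_i\\ &= (A\mathbf{x})_k + (A^T\mathbf{x})_k \end{align}

Stacking the components for all k gives (A + A^T)\mathbf{x}, in agreement with the result above.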

3. Derivation of (Eq. 3)

\begin{align} \frac{\partial tr(\mathbf{x}\mathbf{a}^T)}{\partial \mathbf{x}} \end{align}

Here,

\begin{align} \mathbf{x}\mathbf{a}^T&=\begin{pmatrix}x_1 \\ x_2 \\ x_3\end{pmatrix}(a_1\;a_2\;a_3)\\ &=\begin{pmatrix} a_1x_1 & a_2x_1 & a_3x_1 \\ a_1x_2 & a_2x_2 & a_3x_2 \\ a_1x_3 & a_2x_3 & a_3x_3 \end{pmatrix} \end{align}

Taking the trace gives

tr(\mathbf{x}\mathbf{a}^T)=a_1x_1 + a_2x_2 + a_3x_3

Therefore,

\begin{align} \frac{\partial tr(\mathbf{x}\mathbf{a}^T)}{\partial \mathbf{x}}&= \frac{\partial}{\partial \mathbf{x}}(a_1x_1 + a_2x_2 + a_3x_3)\\ &=\begin{pmatrix} \frac{\partial}{\partial x_1}(a_1x_1 + a_2x_2 + a_3x_3)\\ \frac{\partial}{\partial x_2}(a_1x_1 + a_2x_2 + a_3x_3)\\ \frac{\partial}{\partial x_3}(a_1x_1 + a_2x_2 + a_3x_3) \end{pmatrix}\\ &=\begin{pmatrix}a_1 \\ a_2 \\ a_3\end{pmatrix}\\ &=\mathbf{a} \end{align}

Similarly,

\begin{align} \frac{\partial tr(\mathbf{a}\mathbf{x}^T)}{\partial \mathbf{x}} \end{align}

Here,

\begin{align} \mathbf{a}\mathbf{x}^T&=\begin{pmatrix}a_1 \\ a_2 \\ a_3\end{pmatrix}(x_1\;x_2\;x_3)\\ &=\begin{pmatrix} a_1x_1 & a_1x_2 & a_1x_3 \\ a_2x_1 & a_2x_2 & a_2x_3 \\ a_3x_1 & a_3x_2 & a_3x_3 \end{pmatrix} \end{align}

Taking the trace gives

tr(\mathbf{a}\mathbf{x}^T)=a_1x_1 + a_2x_2 + a_3x_3

The rest of the derivation is identical to the one above and is omitted.

4. Derivation of (Eq. 4)

\begin{align} \frac{\partial}{\partial \mathbf{x}}(\mathbf{a}-A\mathbf{x})^T(\mathbf{a}-A\mathbf{x})&=\frac{\partial}{\partial \mathbf{x}}(\mathbf{a}^T-\mathbf{x}^TA^T)(\mathbf{a}-A\mathbf{x})\\ &=\frac{\partial}{\partial \mathbf{x}}(\mathbf{a}^T\mathbf{a}-\mathbf{a}^TA\mathbf{x}-\mathbf{x}^TA^T\mathbf{a}+\mathbf{x}^TA^TA\mathbf{x}) \end{align}

The first term \mathbf{a}^T\mathbf{a} does not depend on \mathbf{x}, so it vanishes under differentiation and is dropped here.

\begin{align} \text{(above)}&=\frac{\partial}{\partial \mathbf{x}}(-\mathbf{a}^TA\mathbf{x}-\mathbf{x}^TA^T\mathbf{a}+\mathbf{x}^TA^TA\mathbf{x})\\ &=\frac{\partial}{\partial \mathbf{x}}\left\{-(a_1\;a_2\;a_3)\begin{pmatrix}A_{11} & A_{12} & A_{13}\\A_{21} & A_{22} & A_{23}\\A_{31} & A_{32} & A_{33}\end{pmatrix}\begin{pmatrix}x_1\\x_2\\x_3\end{pmatrix} -(x_1\;x_2\;x_3)\begin{pmatrix}A_{11} & A_{21} & A_{31}\\A_{12} & A_{22} & A_{32}\\A_{13} & A_{23} & A_{33}\end{pmatrix}\begin{pmatrix}a_1\\a_2\\a_3\end{pmatrix} +(x_1\;x_2\;x_3)\begin{pmatrix}A_{11} & A_{21} & A_{31}\\A_{12} & A_{22} & A_{32}\\A_{13} & A_{23} & A_{33}\end{pmatrix}\begin{pmatrix}A_{11} & A_{12} & A_{13}\\A_{21} & A_{22} & A_{23}\\A_{31} & A_{32} & A_{33}\end{pmatrix}\begin{pmatrix}x_1\\x_2\\x_3\end{pmatrix}\right\}\\ &=\frac{\partial}{\partial \mathbf{x}}\left\{ -(a_1A_{11}+a_2A_{21}+a_3A_{31}\quad a_1A_{12}+a_2A_{22}+a_3A_{32}\quad a_1A_{13}+a_2A_{23}+a_3A_{33})\begin{pmatrix}x_1 \\ x_2 \\ x_3\end{pmatrix}-(x_1A_{11}+x_2A_{12}+x_3A_{13}\quad x_1A_{21}+x_2A_{22}+x_3A_{23}\quad x_1A_{31}+x_2A_{32}+x_3A_{33})\begin{pmatrix}a_1 \\ a_2 \\ a_3\end{pmatrix} +(x_1\;x_2\;x_3)\begin{pmatrix}B_{11} & B_{12} & B_{13} \\B_{21} & B_{22} & B_{23} \\B_{31} & B_{32} & B_{33}\end{pmatrix}\begin{pmatrix}x_1 \\ x_2 \\ x_3\end{pmatrix}\right\}\end{align}

Here the matrix B=A^TA has been introduced as a temporary shorthand, as follows.

\begin{align} B&=\begin{pmatrix}B_{11} & B_{12} & B_{13} \\ B_{21} & B_{22} & B_{23} \\ B_{31} & B_{32} & B_{33}\end{pmatrix}\\ &=\begin{pmatrix}A_{11}A_{11}+A_{21}A_{21}+A_{31}A_{31} & A_{11}A_{12}+A_{21}A_{22}+A_{31}A_{32} & A_{11}A_{13}+A_{21}A_{23}+A_{31}A_{33}\\ A_{12}A_{11}+A_{22}A_{21}+A_{32}A_{31} & A_{12}A_{12}+A_{22}A_{22}+A_{32}A_{32} & A_{12}A_{13}+A_{22}A_{23}+A_{32}A_{33}\\ A_{13}A_{11}+A_{23}A_{21}+A_{33}A_{31} & A_{13}A_{12}+A_{23}A_{22}+A_{33}A_{32} & A_{13}A_{13}+A_{23}A_{23}+A_{33}A_{33}\end{pmatrix} \end{align}

From this,

\begin{align} \text{(above)}&=\frac{\partial}{\partial \mathbf{x}}\left\{ -\bigg(x_1(a_1A_{11}+a_2A_{21}+a_3A_{31})+x_2(a_1A_{12}+a_2A_{22}+a_3A_{32})+x_3(a_1A_{13}+a_2A_{23}+a_3A_{33})\bigg) -\bigg(a_1(x_1A_{11}+x_2A_{12}+x_3A_{13})+a_2(x_1A_{21}+x_2A_{22}+x_3A_{23})+a_3(x_1A_{31}+x_2A_{32}+x_3A_{33})\bigg) +(x_1B_{11}+x_2B_{21}+x_3B_{31}\quad x_1B_{12}+x_2B_{22}+x_3B_{32}\quad x_1B_{13}+x_2B_{23}+x_3B_{33})\begin{pmatrix}x_1 \\ x_2 \\ x_3\end{pmatrix}\right\}\\ &=\frac{\partial}{\partial \mathbf{x}}\left\{ -x_1(a_1A_{11}+a_2A_{21}+a_3A_{31})-x_2(a_1A_{12}+a_2A_{22}+a_3A_{32})-x_3(a_1A_{13}+a_2A_{23}+a_3A_{33})-x_1(a_1A_{11}+a_2A_{21}+a_3A_{31})-x_2(a_1A_{12}+a_2A_{22}+a_3A_{32})-x_3(a_1A_{13}+a_2A_{23}+a_3A_{33}) +x_1^2B_{11}+x_1x_2B_{21}+x_1x_3B_{31}+x_1x_2B_{12}+x_2^2B_{22}+x_2x_3B_{32}+x_1x_3B_{13}+x_2x_3B_{23}+x_3^2B_{33}\right\} \end{align}

Applying the vector derivative \frac{\partial}{\partial \mathbf{x}} gives

\begin{align} \text{(above)}&= \begin{pmatrix} -(a_1A_{11} + a_2A_{21}+a_3A_{31})-(a_1A_{11} + a_2A_{21}+a_3A_{31})+2x_1B_{11}+x_2B_{21}+x_3B_{31}+x_2B_{12}+x_3B_{13}\\ -(a_1A_{12} + a_2A_{22}+a_3A_{32})-(a_1A_{12} + a_2A_{22}+a_3A_{32})+x_1B_{21}+x_1B_{12}+2x_2B_{22}+x_3B_{32}+x_3B_{23}\\ -(a_1A_{13} + a_2A_{23}+a_3A_{33})-(a_1A_{13} + a_2A_{23}+a_3A_{33})+x_1B_{31}+x_2B_{32}+x_1B_{13}+x_2B_{23}+2x_3B_{33} \end{pmatrix} \end{align}

Writing B back in terms of A gives

\begin{align} \text{(above)}&=\begin{pmatrix} -2(a_1A_{11}+a_2A_{21}+a_3A_{31}) +2x_1(A_{11}A_{11}+A_{21}A_{21}+A_{31}A_{31}) +x_2(A_{12}A_{11}+A_{22}A_{21}+A_{32}A_{31}) +x_3(A_{13}A_{11}+A_{23}A_{21}+A_{33}A_{31}) +x_2(A_{11}A_{12}+A_{21}A_{22}+A_{31}A_{32}) +x_3(A_{11}A_{13}+A_{21}A_{23}+A_{31}A_{33})\\ \\ -2(a_1A_{12}+a_2A_{22}+a_3A_{32}) +x_1(A_{12}A_{11}+A_{22}A_{21}+A_{32}A_{31}) +x_1(A_{11}A_{12}+A_{21}A_{22}+A_{31}A_{32}) +2x_2(A_{12}A_{12}+A_{22}A_{22}+A_{32}A_{32}) +x_3(A_{13}A_{12}+A_{23}A_{22}+A_{33}A_{32}) +x_3(A_{12}A_{13}+A_{22}A_{23}+A_{32}A_{33})\\ \\ -2(a_1A_{13}+a_2A_{23}+a_3A_{33}) +x_1(A_{13}A_{11}+A_{23}A_{21}+A_{33}A_{31}) +x_2(A_{13}A_{12}+A_{23}A_{22}+A_{33}A_{32}) +x_1(A_{11}A_{13}+A_{21}A_{23}+A_{31}A_{33}) +x_2(A_{12}A_{13}+A_{22}A_{23}+A_{32}A_{33}) +2x_3(A_{13}A_{13}+A_{23}A_{23}+A_{33}A_{33}) \end{pmatrix}\\ &=\begin{pmatrix} -2(a_1A_{11}+a_2A_{21}+a_3A_{31}) +2x_1(A_{11}A_{11}+A_{21}A_{21}+A_{31}A_{31}) +2x_2(A_{11}A_{12}+A_{21}A_{22}+A_{31}A_{32}) +2x_3(A_{11}A_{13}+A_{21}A_{23}+A_{31}A_{33})\\\\ -2(a_1A_{12}+a_2A_{22}+a_3A_{32}) +2x_1(A_{11}A_{12}+A_{21}A_{22}+A_{31}A_{32}) +2x_2(A_{12}A_{12}+A_{22}A_{22}+A_{32}A_{32}) +2x_3(A_{12}A_{13}+A_{22}A_{23}+A_{32}A_{33})\\\\ -2(a_1A_{13}+a_2A_{23}+a_3A_{33}) +2x_1(A_{11}A_{13}+A_{21}A_{23}+A_{31}A_{33}) +2x_2(A_{12}A_{13}+A_{22}A_{23}+A_{32}A_{33}) +2x_3(A_{13}A_{13}+A_{23}A_{23}+A_{33}A_{33}) \end{pmatrix}\\ &=-2\begin{pmatrix} a_1A_{11}+a_2A_{21}+a_3A_{31}\\ a_1A_{12}+a_2A_{22}+a_3A_{32}\\ a_1A_{13}+a_2A_{23}+a_3A_{33} \end{pmatrix} +2\begin{pmatrix} A_{11}A_{11}+A_{21}A_{21}+A_{31}A_{31} & A_{11}A_{12}+A_{21}A_{22}+A_{31}A_{32} & A_{11}A_{13}+A_{21}A_{23}+A_{31}A_{33}\\ A_{11}A_{12}+A_{21}A_{22}+A_{31}A_{32} & A_{12}A_{12}+A_{22}A_{22}+A_{32}A_{32} & A_{12}A_{13}+A_{22}A_{23}+A_{32}A_{33}\\ A_{11}A_{13}+A_{21}A_{23}+A_{31}A_{33} & A_{12}A_{13}+A_{22}A_{23}+A_{32}A_{33} & A_{13}A_{13}+A_{23}A_{23}+A_{33}A_{33} \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix}\\ &=-2\left\{ 
\begin{pmatrix} A_{11} & A_{21} & A_{31} \\ A_{12} & A_{22} & A_{32} \\ A_{13} & A_{23} & A_{33} \end{pmatrix} \begin{pmatrix} a_1 \\ a_2 \\ a_3 \end{pmatrix} -\begin{pmatrix} A_{11} & A_{21} & A_{31} \\ A_{12} & A_{22} & A_{32} \\ A_{13} & A_{23} & A_{33} \end{pmatrix} \begin{pmatrix} A_{11} & A_{12} & A_{13} \\ A_{21} & A_{22} & A_{23} \\ A_{31} & A_{32} & A_{33} \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} \right\}\\ &=-2(A^T\mathbf{a}-A^TA\mathbf{x})\\ &=-2A^T(\mathbf{a}-A\mathbf{x}) \end{align}
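As a cross-check on the component-wise expansion, the same result follows more compactly by applying the first two vector derivative formulas term by term. This is a supplementary sketch, not the article's own route: it uses \mathbf{a}^TA\mathbf{x}=(A^T\mathbf{a})^T\mathbf{x}, so that A^T\mathbf{a} plays the role of the constant vector, and it takes A^TA as the matrix in the quadratic form.

\begin{align} \frac{\partial}{\partial \mathbf{x}}(-\mathbf{a}^TA\mathbf{x}) &= -A^T\mathbf{a}\\ \frac{\partial}{\partial \mathbf{x}}(-\mathbf{x}^TA^T\mathbf{a}) &= -A^T\mathbf{a}\\ \frac{\partial}{\partial \mathbf{x}}(\mathbf{x}^TA^TA\mathbf{x}) &= (A^TA+(A^TA)^T)\mathbf{x}=2A^TA\mathbf{x} \end{align}

Summing the three lines gives -2A^T\mathbf{a}+2A^TA\mathbf{x}=-2A^T(\mathbf{a}-A\mathbf{x}), which matches the result above.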

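5. On the derivations of (Eq. 5) onward

The derivations of (Eq. 5) onward are omitted, but they can be carried out using the matrix derivative, which acts entry by entry as follows.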

\frac{\partial}{\partial X}=\begin{pmatrix} \frac{\partial}{\partial X_{11}} & \frac{\partial}{\partial X_{12}} & \frac{\partial}{\partial X_{13}}\\ \frac{\partial}{\partial X_{21}} & \frac{\partial}{\partial X_{22}} & \frac{\partial}{\partial X_{23}}\\ \frac{\partial}{\partial X_{31}} & \frac{\partial}{\partial X_{32}} & \frac{\partial}{\partial X_{33}} \end{pmatrix}
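As a brief illustration of how this operator is used, here is a sketch of the simplest matrix case, the derivative of \mathbf{x}^TX\mathbf{y} with respect to X. It is a supplementary example, not taken from the original article. Writing the scalar in components and differentiating with respect to a single entry X_{ij} gives

\mathbf{x}^TX\mathbf{y}=\sum_{i,j}x_iX_{ij}y_j,\quad\quad \frac{\partial}{\partial X_{ij}}(\mathbf{x}^TX\mathbf{y})=x_iy_j=(\mathbf{x}\mathbf{y}^T)_{ij}

Arranging these entries in the layout of the operator above yields \mathbf{x}\mathbf{y}^T.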

© 2025, blueqat Inc. All rights reserved