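\section{Non-ergodicity}
\cite{kaplan1979sufficient}

We assume that the state process is ergodic, i.e., all states are reachable under any policy from the current state after sufficiently many steps~\cite{majeed2018q}.

% State transition matrix of the random walk over states A, B, C, D, E
For the random walk over the states $A, B, C, D, E$, in which $A$ and $E$ are absorbing, the transition matrix is
\[
P = \begin{pmatrix}
1 & 0 & 0 & 0 & 0 \\
\frac{1}{2} & 0 & \frac{1}{2} & 0 & 0 \\
0 & \frac{1}{2} & 0 & \frac{1}{2} & 0 \\
0 & 0 & \frac{1}{2} & 0 & \frac{1}{2} \\
0 & 0 & 0 & 0 & 1
\end{pmatrix}
\]

% Transition matrix of the random walk with restarts
If the walk restarts from $C$ whenever it reaches $A$ or $E$, the transition matrix becomes
\[
P = \begin{pmatrix}
0 & 0 & 1 & 0 & 0 \\
\frac{1}{2} & 0 & \frac{1}{2} & 0 & 0 \\
0 & \frac{1}{2} & 0 & \frac{1}{2} & 0 \\
0 & 0 & \frac{1}{2} & 0 & \frac{1}{2} \\
0 & 0 & 1 & 0 & 0
\end{pmatrix}
\]

% Computing the stationary distribution
The stationary distribution $\pi = (\pi_1, \dots, \pi_5)$ of the restarting walk satisfies $\pi P = \pi$ together with the normalization constraint:
\[
\begin{cases}
\frac{1}{2}\pi_2 = \pi_1 \\
\frac{1}{2}\pi_3 = \pi_2 \\
\pi_1 + \frac{1}{2}\pi_2 + \frac{1}{2}\pi_4 + \pi_5 = \pi_3 \\
\frac{1}{2}\pi_3 = \pi_4 \\
\frac{1}{2}\pi_4 = \pi_5 \\
\pi_1 + \pi_2 + \pi_3 + \pi_4 + \pi_5 = 1
\end{cases}
\]

% Random walk figure
\input{pic/randomWalk}

% The product of two upper triangular matrices is upper triangular
Let $A$ and $B$ be two upper triangular matrices of the form
\[
A = \begin{pmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
0 & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & a_{nn}
\end{pmatrix},
\quad
B = \begin{pmatrix}
b_{11} & b_{12} & \cdots & b_{1n} \\
0 & b_{22} & \cdots & b_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & b_{nn}
\end{pmatrix}
\]
The entries of the product $C = AB$ are
\[
c_{ij} = \sum_{k=1}^{n} a_{ik} b_{kj}
\]
When $i > j$, we have $c_{ij} = 0$: for every $k$, either $k < i$, so that $a_{ik} = 0$, or $k \ge i > j$, so that $b_{kj} = 0$. Every term of the sum therefore contains a zero factor. Hence $C$ is also upper triangular, which proves that the product of two upper triangular matrices is again upper triangular.

% The matrix N
$N = I + Q + Q^2 + \cdots$

% Figure of the random walk with restarts
\input{pic/randomWalkRestart}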
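
As a worked illustration of the two chains above, the following is a sketch under stated assumptions rather than a definitive derivation. For the walk with restarts (rows $A$ and $E$ jump to $C$), take $\pi$ to be a row vector over the states ordered $A, \dots, E$, and solve $\pi P = \pi$ with $\sum_i \pi_i = 1$. The balance equations for the outer states give $\pi_2 = \pi_4 = \tfrac{1}{2}\pi_3$ and $\pi_1 = \pi_5 = \tfrac{1}{4}\pi_3$, so the normalization reads $\tfrac{5}{2}\pi_3 = 1$ and
\[
\pi = \left( \tfrac{1}{10},\ \tfrac{1}{5},\ \tfrac{2}{5},\ \tfrac{1}{5},\ \tfrac{1}{10} \right)
\]

For the walk with absorbing states $A$ and $E$, assume that $Q$ denotes the block of $P$ restricted to the transient states $B, C, D$ (this reading of $Q$ is an assumption, since $Q$ is not defined explicitly here). The series $I + Q + Q^2 + \cdots$ then converges to $(I - Q)^{-1}$, and
\[
Q = \begin{pmatrix}
0 & \frac{1}{2} & 0 \\
\frac{1}{2} & 0 & \frac{1}{2} \\
0 & \frac{1}{2} & 0
\end{pmatrix},
\qquad
N = (I - Q)^{-1} = \begin{pmatrix}
\frac{3}{2} & 1 & \frac{1}{2} \\
1 & 2 & 1 \\
\frac{1}{2} & 1 & \frac{3}{2}
\end{pmatrix}
\]
Under this reading, the row sums $3$, $4$, $3$ of $N$ are the expected numbers of steps before absorption when the walk starts from $B$, $C$, and $D$, respectively.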