\section{Ergodicity and nonergodicity of a Markov chain}

\begin{assumption}
\label{assumption1}
In the sequel $\{X_n\}$ is a Markov chain with state space
$S=\{0,1,2,\ldots\}$,
$\{X_n\}$
is aperiodic and irreducible,
 and stationary transition probabilities
$\forall i,j\in S$, $P_{ij}\geq 0$.
\end{assumption}




\begin{theorem}(A sufficient condition for ergodicity \cite{pakes1969some,kaplan1979sufficient})
Assume Assumption \ref{assumption1},
 and there exist constants
$N> 0$, $B> 0$, such that
\begin{equation}
\forall i\geq 0, \sum_{j\in S}(j-i)P_{ij}<\infty,
\end{equation}
\begin{equation}
\forall i\geq N, \sum_{j\in S}(j-i)P_{ij}<-B,
\end{equation}
$\{X_n\}$ is ergodic.
\end{theorem}

请昕闻基于第一个定理完成 sutton 1998年书上 random walk 例子（书中图6.5）的遍历性证明。

\begin{theorem}(A sufficient condition for nonergodicity \cite{kaplan1979sufficient})
Assume Assumption \ref{assumption1}, if for some integer $N\geq 0$ and constants $B\geq 0$,
$c\in[0,1]$ the following two conditions hold, then
$\{X_n\}$ is not ergodic:
\begin{equation}
 \forall i\geq N, \sum_{j\in S} (j-i)P_{ij}>0, 
\end{equation}
\begin{equation}
 \forall i\geq N, \forall z\in[c,1], z^i-\sum_{j\in S}P_{ij}z^j\geq -B(1-z).
  \end{equation}
\end{theorem}

请昕闻基于第二个定理完成 sutton 1998年书上 cliff-walking task 例子（书中图6.13）的非遍历性证明。
以及圣彼得堡悖论的非遍历性证明。


\textcolor{red}{注意：证明过程应该是把Markov Chain写成N个状态（状态到底是第几个也需要明确定义），状态之间的转移概率是
一个矩阵，需要把矩阵元素明确定义出来，然后基于两个定理，明确推导出两个公式是否满足}


\section{2024年5月4日晚10:49与李昕闻讨论}

遍历性指的是任意状态两两之间都可以达，即两两之间的若干次转移概率大于0，并且具有稳定的分布。

具有吸收态的马尔科夫链是不满足遍历性的，因为它的稳定分布最终吸收态是1，非吸收态是0.

我们的强化学习例子，包括迷宫、随机游走、2048等等，都包含吸收态，所以都不满足遍历性。
但是并不影响强化学习，因为我们都有游戏结束后的restart设置。所以它从吸收态又重新开始了。
因此，满足遍历性。

但是，从需求角度出发，我们真正想看到的是 除去吸收态，那些非吸收态相互之间是否能走通，
即去除吸收态，剩下的状态是否具有遍历性。因为，显然迷宫、随机游走是有遍历性的。
圣彼得堡悖论、2048是没有遍历性的。


根据遍历性定义，
P可以分解为Q R I 0，那么$N=(I-Q)^{-1}$，即描述了非吸收态之间的遍历关系，
但凡有一个是0，就说明这两两之间不可达。只要都大于0，就是可达的。

\textcolor{red}{如果这事可行，那么请李昕闻仔细对比概念，是否叫拟遍历性，还是有其它的概念？
一定要分得清！}

这样的话，就可以计算随机游走、圣彼得堡悖论的N矩阵，看它们是否具有遍历性？
按照设想，随机游走应该每个值都大于0，而圣彼得堡悖论应该是上三角矩阵，甚至对角线都是0.

基于这样的观察，2048，如何证明具有``非遍历性''？

是否定义i，j，以及ij的转移概率即可？用构造性证明方法
最终也是上三角，并且对角线为0？

这样的话，相当于我们提出了一种满足非遍历性的充分条件吧？
似乎论文可以从这方面下手！

% 2048游戏局面编码
\begin{equation}
$p=2^{64} \cdot \sum_{m=0}^{15} I(B_m \neq 0) \cdot 2^{B_m} + \sum_{m=0}^{15} (1 \ll 4m) \cdot B_m$
\end{equation}

% 马尔可夫链标准形式
\begin{equation}
 P = \begin{bmatrix}
                     Q & R \\
                     0 & I
\end{bmatrix}
\end{equation}



% 带策略的马尔可夫链标准形式
\begin{equation}
 P_\pi = \begin{pmatrix} Q_\pi & R_\pi \\ 0 & I \end{pmatrix}
\end{equation}