\section{Acyclicity between non-absorbing states} \begin{definition}[Acyclicity between non-absorbing states] Assume that $N$ exists for any policy $\pi$ and is independent of initial states. $\forall i,j \in S\setminus\{\text{T}\}$, if $N_{ij}>0$, then $N_{ji}=0$, MDP is acyclic between non-absorbing states. \label{definition3} \end{definition} \subsection{Boyan chain} \input{pic/boyanchain} Figure \ref{boyanchain} shows Boyan chain. The transition probabilities between non-absorbing states are as follows: \[ Q_{\text{bo}}\dot{=}\begin{tiny}\left[ \begin{array}{cccccccccccc} 0 & 0.5 & 0.5 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0.5 & 0.5 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0.5 & 0.5 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0.5 & 0.5 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0.5 & 0.5 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0.5 & 0.5 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0.5 & 0.5 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0.5 & 0.5 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0.5 & 0.5 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0.5 & 0.5 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0.5 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \end{array}\right] \end{tiny} \] Then, \begin{strip} \begin{equation} \begin{split} N_{\text{bo}}=&(I_{12}-Q_{\text{bo}})^{-1}\\ =&\begin{tiny}\left[\begin{array}{cccccccccccc} 1 & 0.5 & 0.75 & 0.625 & 0.6875 & 0.65625 & 0.671875 & 0.6640625 & 0.66796875 & 0.666015625 & 0.6669921875 & 0.66650390625 \\ 0 & 1 & 0.5 & 0.75 & 0.625 & 0.6875 & 0.65625 & 0.671875 & 0.6640625 & 0.66796875 & 0.666015625 & 0.6669921875 \\ 0 & 0 & 1 & 0.5 & 0.75 & 0.625 & 0.6875 & 0.65625 & 0.671875 & 0.6640625 & 0.66796875 & 0.666015625 \\ 0 & 0 & 0 & 1 & 0.5 & 0.75 & 0.625 & 0.6875 & 0.65625 & 0.671875 & 0.6640625 & 0.66796875 \\ 0 & 0 & 0 & 0 & 1 & 0.5 & 0.75 & 0.625 & 0.6875 & 0.65625 & 0.671875 & 0.6640625 \\ 0 & 0 & 0 & 0 & 0 & 1 & 0.5 & 0.75 & 0.625 & 0.6875 & 0.65625 & 0.671875 \\ 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0.5 & 0.75 & 0.625 & 0.6875 & 0.65625 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0.5 & 0.75 & 0.625 & 0.6875 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0.5 & 0.75 & 0.625 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0.5 & 0.75 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0.5 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 \end{array}\right] \end{tiny} \end{split} \end{equation} \end{strip} Bases on Definition \ref{definition3}, Boyan chain is acyclic between non-absorbing states. \subsection{A sufficient condition for acyclicity between non-absorbing states} By observing Boyan chain, it is easy to provide a sufficient condition for acyclicity between non-absorbing states. \begin{theorem}[A sufficient condition for acyclicity between non-absorbing states] \label{judgmentTheorem} Given a Markov chain with absorbing states, suppose the size of the non-absorbing states $|S\setminus\{\text{T}\}|\geq 2$. If the transition matrix $Q$ between non-absorbing states satifies, \begin{equation} \forall i,j \in S\setminus\{\text{T}\}, Q_{i,j}=\begin{cases} \geq 0, & \text{if } i\leq j; \\ 0, & \text{otherwise.} \end{cases} \label{condition} \end{equation} Then, the Markov chain is acyclic between non-absorbing states. \end{theorem} \begin{IEEEproof} The $Q$ matrix (\ref{condition}) is an upper triangular matrix. The product of two upper triangular matrices is still an upper triangular matrix. Furthermore, the sum of two upper triangular matrices is still an upper triangular matrix. Based on Definition \ref{definitionN}, the $N$ matrix is product and sum of upper triangular matrices. Then, the $N$ matrix is an upper triangular matrix. The claim now follows based on Definition \ref{definition3}. \end{IEEEproof}