\section{Acyclicity of the 2048 game} The purpose of this section is to prove the acyclicity of the 2048 game and give some discussions. \subsection{Acyclicity between non-absorbing states} \begin{definition}[Acyclicity between non-absorbing states] Assume that $N$ exists for any policy $\pi$ and is independent of initial states. $\forall i,j \in S\setminus\{\text{T}\}$, if $N_{ij}>0$, then $N_{ji}=0$, MDP is acyclic between non-absorbing states. \label{definition3} \end{definition} It is easy to see that if a Markov chain is ergodic, then it is cyclic. %\subsection{Boyan chain} \input{pic/boyanchain} Figure \ref{boyanchain} shows Boyan chain \cite{boyan2002technical}. The transition probabilities between non-absorbing states are as follows: \[ Q_{\text{bo}}\dot{=}\begin{tiny}\left[ \begin{array}{cccccccccccc} 0 & 0.5 & 0.5 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0.5 & 0.5 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0.5 & 0.5 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0.5 & 0.5 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0.5 & 0.5 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0.5 & 0.5 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0.5 & 0.5 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0.5 & 0.5 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0.5 & 0.5 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0.5 & 0.5 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0.5 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \end{array}\right] \end{tiny} \] Then, $N_{\text{bo}}$ is as (\ref{nbo}). \begin{figure*} \begin{equation} \begin{split} N_{\text{bo}}=&(I_{12}-Q_{\text{bo}})^{-1}\\ =&\begin{tiny}\left[\begin{array}{cccccccccccc} 1 & 0.5 & 0.75 & 0.625 & 0.6875 & 0.65625 & 0.671875 & 0.6640625 & 0.66796875 & 0.666015625 & 0.6669921875 & 0.66650390625 \\ 0 & 1 & 0.5 & 0.75 & 0.625 & 0.6875 & 0.65625 & 0.671875 & 0.6640625 & 0.66796875 & 0.666015625 & 0.6669921875 \\ 0 & 0 & 1 & 0.5 & 0.75 & 0.625 & 0.6875 & 0.65625 & 0.671875 & 0.6640625 & 0.66796875 & 0.666015625 \\ 0 & 0 & 0 & 1 & 0.5 & 0.75 & 0.625 & 0.6875 & 0.65625 & 0.671875 & 0.6640625 & 0.66796875 \\ 0 & 0 & 0 & 0 & 1 & 0.5 & 0.75 & 0.625 & 0.6875 & 0.65625 & 0.671875 & 0.6640625 \\ 0 & 0 & 0 & 0 & 0 & 1 & 0.5 & 0.75 & 0.625 & 0.6875 & 0.65625 & 0.671875 \\ 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0.5 & 0.75 & 0.625 & 0.6875 & 0.65625 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0.5 & 0.75 & 0.625 & 0.6875 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0.5 & 0.75 & 0.625 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0.5 & 0.75 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0.5 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 \end{array}\right] \end{tiny} \end{split} \label{nbo} \end{equation} \end{figure*} Bases on Definition \ref{definition3}, Boyan chain is acyclic between non-absorbing states. %\subsection{A sufficient condition for acyclicity between non-absorbing states} By observing Boyan chain, it is easy to provide a sufficient condition for acyclicity between non-absorbing states. \begin{theorem}[A sufficient condition for acyclicity between non-absorbing states] \label{judgmentTheorem} Given a Markov chain with absorbing states, suppose the size of the non-absorbing states $|S\setminus\{\text{T}\}|\geq 2$. If the transition matrix $Q$ between non-absorbing states satisfies, \begin{equation} \forall i,j \in S\setminus\{\text{T}\}, Q_{i,j}=\begin{cases} \geq 0, & \text{if } i\leq j; \\ 0, & \text{otherwise.} \end{cases} \label{condition} \end{equation} Then, the Markov chain is acyclic between non-absorbing states. \end{theorem} \begin{IEEEproof} The $Q$ matrix (\ref{condition}) is an upper triangular matrix. The product of two upper triangular matrices is still an upper triangular matrix. Furthermore, the sum of two upper triangular matrices is still an upper triangular matrix. Based on Definition \ref{definitionN}, $N\dot{=} \sum_{i=0}^{\infty}Q^i$, the $N$ matrix is the product and sum of upper triangular matrices. Then, the $N$ matrix is an upper triangular matrix. The claim now follows based on Definition \ref{definition3}. \end{IEEEproof}