\section{Non-ergodicity}

We assume that the state process is ergodic, i.e., every state is reachable from the current state under any policy after sufficiently many steps \cite{kaplan1979sufficient,majeed2018q}.
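
One way to state this assumption formally is sketched below. The notation is ours and is not taken from the cited works: $\mathcal{S}$ denotes the state space, $\pi$ a policy, $s_t$ the state at time $t$, and $P^{\pi}$ the law of the state process under $\pi$. Reading ``after sufficiently many steps'' as reachability in some finite number of steps, the assumption for the current state $s \in \mathcal{S}$ becomes
\[
  \forall \pi,\ \forall s' \in \mathcal{S},\ \exists n \geq 1 : \quad
  P^{\pi}\left(s_{t+n} = s' \mid s_t = s\right) > 0 .
\]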