Project: XingguoChen/20240414IEEETG
Commit 76a6eacc, authored May 26, 2024 by Lenovo
Commit message: the non-ergodicity proof for St. Petersburg is finished.
Parent: e0dead58
Showing 9 changed files with 159 additions and 42 deletions.
document.tex                +1   -0
main/background.tex         +30  -41
main/introduction.tex       +1   -1
main/nonergodic.tex         +96  -0
material/2048prove.tex      +0   -0
material/nonergodicity.tex  +0   -0
material/paradox.tex        +0   -0
material/theorem.tex        +0   -0
pic/paradox.tex             +31  -0
document.tex
@@ -74,6 +74,7 @@ wangwenhao11@nudt.edu.cn).
\input{main/introduction}
\input{main/background}
\input{main/nonergodic}
%\input{main/nonergodicity}
%\input{main/paradox}
main/background.tex
@@ -31,11 +31,10 @@ That is $\forall s\in \mathcal{S}$, we have
\sum_{s'\in\mathcal{S}} P_{\pi}(s',s)d_{\pi}(s')=d_{\pi}(s).
\end{equation}
\begin{definition}[Ergodicity]
The ergodicity assumption on the MDP states that $d_{\pi}(s)$ exists for any policy $\pi$ and is independent of the initial state \cite{Sutton2018book}.
\end{definition}
This means that all states are reachable under any policy from the current state after sufficiently many steps \cite{majeed2018q}.
@@ -67,6 +66,7 @@ P_{\text{absorbing}}\dot{=}\begin{array}{c|ccccccc}
\text{E} & \frac{1}{2} & 0 & 0 & 0 & \frac{1}{2} & 0
\end{array}
\]
Note that absorbing states can be combined into one.
According to (\ref{invariance}), the distribution $d_{\text{absorbing}}=\{1, 0, 0, 0, 0, 0\}$.
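As a quick numerical check of the invariance equation (\ref{invariance}), the sketch below (added for this discussion, not part of the repository) verifies that $d_{\text{absorbing}}=\{1,0,0,0,0,0\}$ satisfies $\sum_{s'}P(s',s)d(s')=d(s)$. Only the E row of $P_{\text{absorbing}}$ appears in this hunk, so the full matrix below is an assumption reconstructed from the random-walk description (states T, A, B, C, D, E, with the two absorbing ends merged into T).

import numpy as np

# Transition matrix of the random walk with the absorbing ends merged into T
# (reconstructed from the example's description; only the E row is shown above).
P = np.array([
    [1.0, 0.0, 0.0, 0.0, 0.0, 0.0],   # T is absorbing
    [0.5, 0.0, 0.5, 0.0, 0.0, 0.0],   # A -> T or B
    [0.0, 0.5, 0.0, 0.5, 0.0, 0.0],   # B -> A or C
    [0.0, 0.0, 0.5, 0.0, 0.5, 0.0],   # C -> B or D
    [0.0, 0.0, 0.0, 0.5, 0.0, 0.5],   # D -> C or E
    [0.5, 0.0, 0.0, 0.0, 0.5, 0.0],   # E -> T or D (the row shown above)
])
d = np.array([1.0, 0.0, 0.0, 0.0, 0.0, 0.0])
print(np.allclose(d @ P, d))          # True: d_absorbing is invariant under P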
@@ -99,7 +99,7 @@ the distribution $d_{\text{restart}}=\{0.1$,
Since the probabilities of T, A, B, C, D, and E are non-zero, the random walk with restarts is ergodic.
-\subsection{Ergodicity and Non-ergodicity between non-absorbing states}
+\subsection{Ergodicity between non-absorbing states}
For Markov chains with absorbing states, we usually decompose the transition matrix $P$ into the following form:
\[
@@ -124,23 +124,18 @@ where $Q$ is the matrix of transition probabilities between
N \dot{=} \sum_{i=0}^{\infty} Q^i = (I_{n-1}-Q)^{-1},
\end{equation}
where $I_{n-1}$ is the $(n-1)\times(n-1)$ identity matrix.
Note that absorbing states can be combined into one.
It is now easy to define whether the non-absorbing states are ergodic.
\begin{definition}[Ergodicity between non-absorbing states]
Assume that $N$ exists for any policy $\pi$ and is independent of initial states.
If $\forall i,j\in S\setminus\{\text{T}\}$, $N_{ij}>0$, the MDP is ergodic between non-absorbing states.
\label{definition2}
\end{definition}
\begin{definition}[Non-ergodicity between non-absorbing states]
Assume that $N$ exists for any policy $\pi$ and is independent of initial states.
If $\exists i,j\in S\setminus\{\text{T}\}$, $N_{ij}=0$, the MDP is non-ergodic between non-absorbing states.
\end{definition}
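Once $Q$ is known, these two definitions amount to a finite check. The following sketch is an illustration added here (the helper name is hypothetical, not repository code): it computes the fundamental matrix both as $(I-Q)^{-1}$ and as a truncated Neumann series $\sum_i Q^i$, then tests whether every entry is strictly positive.

import numpy as np

def ergodic_between_nonabsorbing(Q, terms=200, tol=1e-12):
    """True iff every entry of N = (I - Q)^{-1} is strictly positive (Definition 2);
    False means some N_ij is (numerically) zero, i.e. Definition 3 applies."""
    n = Q.shape[0]
    N = np.linalg.inv(np.eye(n) - Q)
    # Cross-check against the truncated series sum_{i=0}^{terms-1} Q^i.
    N_series = sum(np.linalg.matrix_power(Q, i) for i in range(terms))
    assert np.allclose(N, N_series, atol=1e-6)
    return bool(np.all(N > tol))

With the random-walk $Q_{\text{absorbing}}$ below this returns True; with the $Q_{\text{truncated}}$ of the truncated St. Petersburg example later in the commit it returns False.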
For the random walk with absorbing states,
\[
@@ -161,26 +156,26 @@ Q_{\text{absorbing}}\dot{=}\begin{array}{c|ccccc}
\text{E} & 0 & 0 & 0 & \frac{1}{2} & 0
\end{array}
\]
%\[
% R_{\text{absorbing}}\dot{=}\begin{array}{c|c}
% &\text{T} \\\hline
% \text{A} & \frac{1}{2} \\
% \text{B} & 0 \\
% \text{C} & 0 \\
% \text{D} & 0 \\
% \text{E} & \frac{1}{2}
% \end{array}
% \]
% \[
% I_{\text{absorbing}}\dot{=}\begin{array}{c|c}
% &\text{T} \\\hline
% \text{T} & 1
% \end{array}
% \]
Then,
\[
R_{\text{absorbing}}\dot{=}\begin{array}{c|c}
 & \text{T} \\\hline
\text{A} & \frac{1}{2} \\
\text{B} & 0 \\
\text{C} & 0 \\
\text{D} & 0 \\
\text{E} & \frac{1}{2}
\end{array}
\]
\[
I_{\text{absorbing}}\dot{=}\begin{array}{c|c}
 & \text{T} \\\hline
\text{T} & 1
\end{array}
\]
Then,
\[
N_{\text{absorbing}}=(I_5-Q_{\text{absorbing}})^{-1}=\begin{array}{c|ccccc}
 & \text{A} & \text{B} & \text{C} & \text{D} & \text{E} \\\hline
\text{A} & \frac{5}{3} & \frac{4}{3} & 1 & \frac{2}{3} & \frac{1}{3} \\
\text{B} & \frac{4}{3} & \frac{8}{3} & 2 & \frac{4}{3} & \frac{2}{3} \\
@@ -188,15 +183,9 @@ N_{\text{absorbing}}\dot{=}\begin{array}{c|ccccc}
\text{D} & \frac{2}{3} & \frac{4}{3} & 2 & \frac{8}{3} & \frac{4}{3} \\
\text{E} & \frac{1}{3} & \frac{2}{3} & 1 & \frac{4}{3} & \frac{5}{3} \\
\end{array}
\]
\highlight{Xinwen, please help me compute this matrix.}
Use the St. Petersburg example to show that St. Petersburg does not satisfy ergodicity between non-absorbing states.
State a theorem and likewise prove that the 2048 game does not satisfy ergodicity between non-absorbing states.
Based on Definition \ref{definition2}, the random walk with absorbing states is ergodic between non-absorbing states.
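The entries of $N_{\text{absorbing}}$ can be confirmed numerically, which also addresses the \highlight note above asking for this matrix to be computed; the short sketch below is added for this page and is not taken from the repository. Multiplying by 3 makes the entries integers, matching the fractions shown, and the inversion also produces the middle (C) row that is elided in this hunk.

import numpy as np

# Q_absorbing: transitions among the non-absorbing states A..E (indices 0..4);
# each state steps left or right with probability 1/2.
Q = np.zeros((5, 5))
for i in range(5):
    if i > 0:
        Q[i, i - 1] = 0.5
    if i < 4:
        Q[i, i + 1] = 0.5
N = np.linalg.inv(np.eye(5) - Q)
print(np.round(3 * N).astype(int))   # first row: 5 4 3 2 1, i.e. 5/3, 4/3, 1, 2/3, 1/3
print(bool(np.all(N > 1e-12)))       # True: all entries positive, so Definition 2 applies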
main/introduction.tex
@@ -108,7 +108,7 @@ The comparison in this set of experiments indicates that
while in the 2048 game, when the agent deviates from the optimal state, it may never have the chance to return to the previous state.
-This relates to the game's property of traversability.
+This relates to the game's property of ergodicity.
In this paper, we prove that the game 2048 is non-ergodic.
main/nonergodic.tex (new file)
\section{Non-ergodicity between non-absorbing states}
\begin{definition}[Non-ergodicity between non-absorbing states]
Assume that $N$ exists for any policy $\pi$ and is independent of initial states.
If $\exists i,j\in S\setminus\{\text{T}\}$, $N_{ij}=0$, the MDP is non-ergodic between non-absorbing states.
\label{definition3}
\end{definition}
\subsection{St. Petersburg paradox}
The St. Petersburg paradox is a paradox associated with gambling and decision theory. It is named after the city of St. Petersburg in Russia and was introduced by the mathematician Daniel Bernoulli in 1738.
The paradox involves a gambling game with the following rules:
\begin{itemize}
\item Participants must pay a fixed entry fee to join the game.
\item A fair coin is tossed repeatedly, and the game continues until the coin lands heads up. The toss on which the first heads appears determines the prize: if the first heads appears on the $t$-th toss, the prize is $2^t$.
\end{itemize}
%\input{pic/FigureParadox}
The expected return over all possibilities is
\begin{equation}
\begin{split}
\mathbb{E}(x) &= \lim_{n\rightarrow\infty}\sum_{t=1}^{n} p(x_t)\times V(x_t) \\
&= \lim_{n\rightarrow\infty}\sum_{t=1}^{n} \frac{1}{2^t}\, 2^t \\
&= \infty.
\end{split}
\end{equation}
Although each large prize is increasingly unlikely, every term of the sum contributes $\frac{1}{2^t}\cdot 2^t = 1$, so the expected prize is infinite. By pure expected-value reasoning, a participant should therefore be willing to pay any finite entry fee to play. The paradox is that this prescription clashes with intuition and actual decision-making about gambling: despite the allure of a potentially enormous prize, almost all of the expected value comes from extremely rare, extremely large payouts, and in practice people are only willing to pay a small fee to enter the game.
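To make the divergence concrete, the following Monte Carlo sketch (an illustration written for this page, not repository code) plays the game repeatedly and prints the running sample mean of the prize. The mean keeps creeping upward as rare long runs of tails contribute enormous payouts, rather than settling near any finite value.

import random

def st_petersburg_prize(rng):
    """Toss a fair coin until it lands heads; first heads on toss t pays 2**t."""
    t = 1
    while rng.random() < 0.5:   # tails: keep tossing
        t += 1
    return 2 ** t

rng = random.Random(0)
total = 0
for n in range(1, 1_000_001):
    total += st_petersburg_prize(rng)
    if n in (100, 10_000, 1_000_000):
        print(n, total / n)     # sample mean grows with n instead of converging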
\input{pic/paradox}
Figure \ref{TruncatedPetersburg} is a truncated version of the St. Petersburg paradox. The transition probabilities between non-absorbing states are as follows:
\[
Q_{\text{truncated}}\dot{=}\begin{array}{c|ccccc}
 & \text{S}_1 & \text{S}_2 & \text{S}_3 & \text{S}_4 & \text{S}_5 \\\hline
\text{S}_1 & 0 & \frac{1}{2} & 0 & 0 & 0 \\
\text{S}_2 & 0 & 0 & \frac{1}{2} & 0 & 0 \\
\text{S}_3 & 0 & 0 & 0 & \frac{1}{2} & 0 \\
\text{S}_4 & 0 & 0 & 0 & 0 & \frac{1}{2} \\
\text{S}_5 & 0 & 0 & 0 & 0 & 0
\end{array}
\]
Then,
\[
N_{\text{truncated}}=(I_5-Q_{\text{truncated}})^{-1}=\begin{array}{c|ccccc}
 & \text{S}_1 & \text{S}_2 & \text{S}_3 & \text{S}_4 & \text{S}_5 \\\hline
\text{S}_1 & 1 & \frac{1}{2} & \frac{1}{4} & \frac{1}{8} & \frac{1}{16} \\
\text{S}_2 & 0 & 1 & \frac{1}{2} & \frac{1}{4} & \frac{1}{8} \\
\text{S}_3 & 0 & 0 & 1 & \frac{1}{2} & \frac{1}{4} \\
\text{S}_4 & 0 & 0 & 0 & 1 & \frac{1}{2} \\
\text{S}_5 & 0 & 0 & 0 & 0 & 1 \\
\end{array}
\]
Based on Definition \ref{definition3}, the truncated St. Petersburg paradox is non-ergodic between non-absorbing states.
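As with the random-walk example, the inverse above is easy to confirm; the sketch below is an added illustration (not part of the repository) that rebuilds $Q_{\text{truncated}}$, inverts $I_5-Q_{\text{truncated}}$, and checks that every entry below the diagonal of $N_{\text{truncated}}$ vanishes, which is exactly the condition of Definition \ref{definition3}.

import numpy as np

# Q_truncated: from S_t the chain moves to S_{t+1} with probability 1/2 and is
# otherwise absorbed; S_5 is always absorbed on the next step.
Q = np.zeros((5, 5))
for i in range(4):
    Q[i, i + 1] = 0.5
N = np.linalg.inv(np.eye(5) - Q)
print(N)   # rows 1, 1/2, 1/4, ... above the diagonal, matching the matrix shown
below = N[np.tril_indices(5, k=-1)]
print(bool(np.all(np.abs(below) < 1e-12)))   # True: N_ij = 0 below the diagonal, so non-ergodic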
main/2048prove.tex → material/2048prove.tex (file moved)
main/nonergodicity.tex → material/nonergodicity.tex (file moved)
main/paradox.tex → material/paradox.tex (file moved)
main/theorem.tex → material/theorem.tex (file moved)
pic/paradox.tex (new file)
\begin{figure}[!t]
\centering
\scalebox{0.9}{
\begin{tikzpicture}
\node[draw, rectangle, fill=gray!50] (DEAD1) at (0,1.5) {T};
\node[draw, rectangle, fill=gray!50] (DEAD2) at (1.5,1.5) {T};
\node[draw, rectangle, fill=gray!50] (DEAD3) at (3,1.5) {T};
\node[draw, rectangle, fill=gray!50] (DEAD4) at (4.5,1.5) {T};
\node[draw, rectangle, fill=gray!50] (DEAD5) at (6,1.5) {T};
\node[draw, circle] (A) at (0,0) {S$_1$};
\node[draw, circle] (B) at (1.5,0) {S$_2$};
\node[draw, circle] (C) at (3,0) {S$_3$};
\node[draw, circle] (D) at (4.5,0) {S$_4$};
\node[draw, circle] (E) at (6,0) {S$_5$};
\draw[->] (A) -- node {0.5} (DEAD1);
\draw[->] (A) -- node {0.5} (B);
\draw[->] (B) -- node {0.5} (DEAD2);
\draw[->] (B) -- node {0.5} (C);
\draw[->] (C) -- node {0.5} (DEAD3);
\draw[->] (C) -- node {0.5} (D);
\draw[->] (D) -- node {0.5} (DEAD4);
\draw[->] (D) -- node {0.5} (E);
\draw[->] (E) -- node {1.0} (DEAD5);
\draw[->] ([xshift=-4ex]A.west) -- ([xshift=-5.2ex]A.east);
\end{tikzpicture}
}
\caption{Truncated St. Petersburg.}
\label{TruncatedPetersburg}
\end{figure}