%File: anonymous-submission-latex-2025.tex
\documentclass[letterpaper]{article} % DO NOT CHANGE THIS
\usepackage[submission]{aaai25} % DO NOT CHANGE THIS
\usepackage{times} % DO NOT CHANGE THIS
\usepackage{helvet} % DO NOT CHANGE THIS
\usepackage{courier} % DO NOT CHANGE THIS
\usepackage[hyphens]{url} % DO NOT CHANGE THIS
\usepackage{graphicx} % DO NOT CHANGE THIS
\urlstyle{rm} % DO NOT CHANGE THIS
\def\UrlFont{\rm} % DO NOT CHANGE THIS
\usepackage{natbib} % DO NOT CHANGE THIS AND DO NOT ADD ANY OPTIONS TO IT
\usepackage{caption} % DO NOT CHANGE THIS AND DO NOT ADD ANY OPTIONS TO IT
\frenchspacing % DO NOT CHANGE THIS
\setlength{\pdfpagewidth}{8.5in} % DO NOT CHANGE THIS
\setlength{\pdfpageheight}{11in} % DO NOT CHANGE THIS
%
% These are recommended to typeset algorithms but not required. See the subsubsection on algorithms. Remove them if you don't have algorithms in your paper.
\usepackage{algorithm}
\usepackage{algorithmic}
\usepackage{subfigure}
\usepackage{diagbox}
\usepackage{booktabs}
\usepackage{amsmath}
\usepackage{amssymb}
\usepackage{mathtools}
\usepackage{amsthm}
\usepackage{tikz}
\usepackage{bm}
\usepackage{esvect}
\usepackage{multirow}
\theoremstyle{plain}
% \newtheorem{theorem}{Theorem}[section]
\newtheorem{theorem}{Theorem}
\newtheorem{proposition}[theorem]{Proposition}
\newtheorem{lemma}[theorem]{Lemma}
\newtheorem{corollary}[theorem]{Corollary}
\theoremstyle{definition}
\newtheorem{definition}[theorem]{Definition}
\newtheorem{assumption}[theorem]{Assumption}
\theoremstyle{remark}
\newtheorem{remark}[theorem]{Remark}
%
% These are recommended to typeset listings but not required. See the subsubsection on listing. Remove this block if you don't have listings in your paper.
\usepackage{newfloat}
\usepackage{listings}
\DeclareCaptionStyle{ruled}{labelfont=normalfont,labelsep=colon,strut=off} % DO NOT CHANGE THIS
\lstset{%
	basicstyle={\footnotesize\ttfamily},% footnotesize acceptable for monospace
	numbers=left,numberstyle=\footnotesize,xleftmargin=2em,% show line numbers, remove this entire line if you don't want the numbers.
	aboveskip=0pt,belowskip=0pt,%
	showstringspaces=false,tabsize=2,breaklines=true}
\floatstyle{ruled}
\newfloat{listing}{tb}{lst}{}
\floatname{listing}{Listing}
%
% Keep the \pdfinfo as shown here. There's no need
% for you to add the /Title and /Author tags.
\pdfinfo{
/TemplateVersion (2025.1)
}
% DISALLOWED PACKAGES
% \usepackage{authblk} -- This package is specifically forbidden
% \usepackage{balance} -- This package is specifically forbidden
% \usepackage{color (if used in text)
% \usepackage{CJK} -- This package is specifically forbidden
% \usepackage{float} -- This package is specifically forbidden
% \usepackage{flushend} -- This package is specifically forbidden
% \usepackage{fontenc} -- This package is specifically forbidden
% \usepackage{fullpage} -- This package is specifically forbidden
% \usepackage{geometry} -- This package is specifically forbidden
% \usepackage{grffile} -- This package is specifically forbidden
% \usepackage{hyperref} -- This package is specifically forbidden
% \usepackage{navigator} -- This package is specifically forbidden
% (or any other package that embeds links such as navigator or hyperref)
% \indentfirst} -- This package is specifically forbidden
% \layout} -- This package is specifically forbidden
% \multicol} -- This package is specifically forbidden
% \nameref} -- This package is specifically forbidden
% \usepackage{savetrees} -- This package is specifically forbidden
% \usepackage{setspace} -- This package is specifically forbidden
% \usepackage{stfloats} -- This package is specifically forbidden
% \usepackage{tabu} -- This package is specifically forbidden
% \usepackage{titlesec} -- This package is specifically forbidden
% \usepackage{tocbibind} -- This package is specifically forbidden
% \usepackage{ulem} -- This package is specifically forbidden
% \usepackage{wrapfig} -- This package is specifically forbidden
% DISALLOWED COMMANDS
% \nocopyright -- Your paper will not be published if you use this command
% \addtolength -- This command may not be used
% \balance -- This command may not be used
% \baselinestretch -- Your paper will not be published if you use this command
% \clearpage -- No page breaks of any kind may be used for the final version of your paper
% \columnsep -- This command may not be used
% \newpage -- No page breaks of any kind may be used for the final version of your paper
% \pagebreak -- No page breaks of any kind may be used for the final version of your paper
% \pagestyle -- This command may not be used
% \tiny -- This is not an acceptable font size.
% \vspace{- -- No negative value may be used in proximity of a caption, figure, table, section, subsection, subsubsection, or reference
% \vskip{- -- No negative value may be used to alter spacing above or below a caption, figure, table, section, subsection, subsubsection, or reference
\setcounter{secnumdepth}{0} %May be changed to 1 or 2 if section numbers are desired.
% The file aaai25.sty is the style file for AAAI Press
% proceedings, working notes, and technical reports.
%
% Title
% Your title must be in mixed case, not sentence case.
% That means all verbs (including short verbs like be, is, using, and go),
% nouns, adverbs, adjectives should be capitalized, including both words in hyphenated terms, while
% articles, conjunctions, and prepositions are lower case unless they
% directly follow a colon or long dash
\title{A Variance Minimization Approach to Off-Policy Temporal-Difference Learning}
\author{
    %Authors
    % All authors must be in the same font size and format.
    Written by AAAI Press Staff\textsuperscript{\rm 1}\thanks{With help from the AAAI Publications Committee.}\\
    AAAI Style Contributions by Pater Patel Schneider,
    Sunil Issar,\\
    J. Scott Penberthy,
    George Ferguson,
    Hans Guesgen,
    Francisco Cruz\equalcontrib,
    Marc Pujol-Gonzalez\equalcontrib
}
\affiliations{
    %Affiliations
    \textsuperscript{\rm 1}Association for the Advancement of Artificial Intelligence\\
    % If you have multiple authors and multiple affiliations
    % use superscripts in text and roman font to identify them.
    % For example,
    % Sunil Issar\textsuperscript{\rm 2},
    % J. Scott Penberthy\textsuperscript{\rm 3},
    % George Ferguson\textsuperscript{\rm 4},
    % Hans Guesgen\textsuperscript{\rm 5}
    % Note that the comma should be placed after the superscript
    1101 Pennsylvania Ave, NW Suite 300\\
    Washington, DC 20004 USA\\
    % email address must be in roman text type, not monospace or sans serif
    proceedings-questions@aaai.org
%
% See more examples next
}
%Example, Single Author, ->> remove \iffalse,\fi and place them surrounding AAAI title to use it
\iffalse
\title{My Publication Title --- Single Author}
\author {
    Author Name
}
\affiliations{
    Affiliation\\
    Affiliation Line 2\\
    name@example.com
}
\fi
\iffalse
%Example, Multiple Authors, ->> remove \iffalse,\fi and place them surrounding AAAI title to use it
\title{My Publication Title --- Multiple Authors}
\author {
    % Authors
    First Author Name\textsuperscript{\rm 1},
    Second Author Name\textsuperscript{\rm 2},
    Third Author Name\textsuperscript{\rm 1}
}
\affiliations {
    % Affiliations
    \textsuperscript{\rm 1}Affiliation 1\\
    \textsuperscript{\rm 2}Affiliation 2\\
    firstAuthor@affiliation1.com, secondAuthor@affilation2.com, thirdAuthor@affiliation1.com
}
\fi
% REMOVE THIS: bibentry
% This is only needed to show inline citations in the guidelines document. You should not need it and can safely delete it.
\usepackage{bibentry}
% END REMOVE bibentry
\begin{document}
\setcounter{theorem}{0}
\maketitle
% \setcounter{theorem}{0}
\begin{abstract}
In this paper, we propose improving the performance of parametric Temporal-Difference (TD) learning algorithms by introducing a Variance Minimization (VM) parameter, $\omega$, which is dynamically updated at each time step. Specifically, we incorporate the VM parameter into off-policy linear algorithms such as TDC and ETD, yielding the Variance Minimization TDC (VMTDC) algorithm and the Variance Minimization ETD (VMETD) algorithm. On the two-state counterexample, we analyze the convergence speed of these algorithms by computing the minimum eigenvalue of their key matrices and find that VMTDC converges faster than TDC, while experiments show that VMETD converges more stably than ETD. In controlled experiments, the VM algorithms demonstrate superior performance.
\end{abstract}
% Uncomment the following to link to your code, datasets, an extended version or similar.
%
% \begin{links}
%     \link{Code}{https://aaai.org/example/code}
%     \link{Datasets}{https://aaai.org/example/datasets}
%     \link{Extended version}{https://aaai.org/example/extended-version}
% \end{links}
\input{main/introduction.tex}
\input{main/preliminaries.tex}
\input{main/motivation.tex}
\input{main/theory.tex}
\input{main/experiment.tex}
% \input{main/relatedwork.tex}
\input{main/conclusion.tex}
\bibliography{aaai25}
\end{document}