Strong Uniform Value in Gambling Houses and Partially Observable Markov Decision Processes
Xavier Venel and
Bruno Ziliotto
Additional contact information
Bruno Ziliotto: CEREMADE - CEntre de REcherches en MAthématiques de la DEcision - Université Paris Dauphine-PSL - PSL - Université Paris Sciences et Lettres - CNRS - Centre National de la Recherche Scientifique, Université Paris Dauphine-PSL - PSL - Université Paris Sciences et Lettres
Université Paris1 Panthéon-Sorbonne (Post-Print and Working Papers) from HAL
Abstract:
In several standard models of dynamic programming (gambling houses, MDPs, POMDPs), we prove the existence of a robust notion of value for the infinitely repeated problem, namely the strong uniform value. This solves two open problems. First, this shows that for any ε > 0, the decision-maker has a pure strategy σ which is ε-optimal in any n-stage problem, provided that n is big enough (this result was only known for behavior strategies, that is, strategies which use randomization). Second, for any ε > 0, the decision-maker can guarantee the limit of the n-stage value minus ε in the infinite problem where the payoff is the expectation of the inferior limit of the time average payoff.
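The two statements in the abstract can be written out as inequalities. The sketch below is one possible formalization, not the paper's own statement: the symbols x (initial state), g_m (stage-m payoff), v_n (n-stage value), and the expectation notation are assumptions introduced here for illustration; only σ, ε, and n come from the abstract.

```latex
% Assumed notation (not from the abstract): x is the initial state,
% g_m the payoff at stage m, and E^x_sigma the expectation induced by
% strategy sigma starting from x.
\[
  v_n(x) \;=\; \sup_{\sigma} \, \mathbb{E}^{x}_{\sigma}\!\left[ \frac{1}{n} \sum_{m=1}^{n} g_m \right]
  \qquad \text{($n$-stage value).}
\]
% First statement: a single pure strategy is epsilon-optimal in every
% sufficiently long finite-horizon problem.
\[
  \forall \varepsilon > 0 \;\; \exists\, \sigma \text{ pure} \;\; \exists\, N \;\; \forall n \ge N : \quad
  \mathbb{E}^{x}_{\sigma}\!\left[ \frac{1}{n} \sum_{m=1}^{n} g_m \right] \;\ge\; v_n(x) - \varepsilon .
\]
% Second statement: the limit of the n-stage values can be guaranteed up to
% epsilon when the payoff is the expected liminf of the time-average payoff.
\[
  \forall \varepsilon > 0 \;\; \exists\, \sigma : \quad
  \mathbb{E}^{x}_{\sigma}\!\left[ \liminf_{n \to \infty} \frac{1}{n} \sum_{m=1}^{n} g_m \right]
  \;\ge\; \lim_{n \to \infty} v_n(x) - \varepsilon .
\]
```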
Keywords: Partial Observation; Markov decision processes; Dynamic programming; Long-run average payoff; Uniform value
Date: 2016
Note: View the original document on HAL open archive server: https://hal.science/hal-01395429
Published in SIAM Journal on Control and Optimization, 2016, 54 (4), pp.1983-2008. ⟨10.1137/15M1043340⟩
Downloads: (external link)
https://hal.science/hal-01395429/document (application/pdf)
Persistent link: https://EconPapers.repec.org/RePEc:hal:cesptp:hal-01395429
DOI: 10.1137/15M1043340