Strong Uniform Value in Gambling Houses and Partially Observable Markov Decision Processes

Xavier Venel and Bruno Ziliotto
Additional contact information
Bruno Ziliotto: CEREMADE - CEntre de REcherches en MAthématiques de la DEcision - Université Paris Dauphine-PSL - PSL - Université Paris Sciences et Lettres - CNRS - Centre National de la Recherche Scientifique, Université Paris Dauphine-PSL - PSL - Université Paris Sciences et Lettres

Université Paris1 Panthéon-Sorbonne (Post-Print and Working Papers) from HAL

Abstract: In several standard models of dynamic programming (gambling houses, MDPs, POMDPs), we prove the existence of a robust notion of value for the infinitely repeated problem, namely the strong uniform value. This solves two open problems. First, this shows that for any ε > 0, the decision-maker has a pure strategy σ which is ε-optimal in any n-stage problem, provided that n is big enough (this result was only known for behavior strategies, that is, strategies which use randomization). Second, for any ε > 0, the decision-maker can guarantee the limit of the n-stage value minus ε in the infinite problem where the payoff is the expectation of the inferior limit of the time average payoff.

Keywords: Partial Observation; Markov decision processes; Dynamic programming; Long-run average payoff; Uniform value
Date: 2016
Note: View the original document on HAL open archive server: https://hal.science/hal-01395429

Published in SIAM Journal on Control and Optimization, 2016, 54 (4), pp.1983-2008. ⟨10.1137/15M1043340⟩

Downloads: (external link)
https://hal.science/hal-01395429/document (application/pdf)

Persistent link: https://EconPapers.repec.org/RePEc:hal:cesptp:hal-01395429

DOI: 10.1137/15M1043340

Page updated 2024-04-23
Handle: RePEc:hal:cesptp:hal-01395429