Mohammad Gheshlaghi Azar

2020 – today

2024
[j3]
Mark Rowland, Rémi Munos, Mohammad Gheshlaghi Azar, Yunhao Tang, Georg Ostrovski, Anna Harutyunyan, Karl Tuyls, Marc G. Bellemare, Will Dabney:
An Analysis of Quantile Temporal-Difference Learning. J. Mach. Learn. Res. 25: 163:1-163:47 (2024)
[c24]
Mohammad Gheshlaghi Azar, Zhaohan Daniel Guo, Bilal Piot, Rémi Munos, Mark Rowland, Michal Valko, Daniele Calandriello:
A General Theoretical Paradigm to Understand Learning from Human Preferences. AISTATS 2024: 4447-4455
[c23]
Yannis Flet-Berliac, Nathan Grinsztajn, Florian Strub, Eugene Choi, Bill Wu, Chris Cremer, Arash Ahmadian, Yash Chandak, Mohammad Gheshlaghi Azar, Olivier Pietquin, Matthieu Geist:
Contrastive Policy Gradient: Aligning LLMs on sequence-level scores in a supervised-friendly fashion. EMNLP 2024: 21353-21370
[c22]
Rémi Munos, Michal Valko, Daniele Calandriello, Mohammad Gheshlaghi Azar, Mark Rowland, Zhaohan Daniel Guo, Yunhao Tang, Matthieu Geist, Thomas Mesnard, Côme Fiegel, Andrea Michi, Marco Selvi, Sertan Girgin, Nikola Momchev, Olivier Bachem, Daniel J. Mankowitz, Doina Precup, Bilal Piot:
Nash Learning from Human Feedback. ICML 2024
[i32]
Pierre Harvey Richemond, Yunhao Tang, Daniel Guo, Daniele Calandriello, Mohammad Gheshlaghi Azar, Rafael Rafailov, Bernardo Ávila Pires, Eugene Tarassov, Lucas Spangher, Will Ellsworth, Aliaksei Severyn, Jonathan Mallinson, Lior Shani, Gil Shamir, Rishabh Joshi, Tianqi Liu, Rémi Munos, Bilal Piot:
Offline Regularised Reinforcement Learning for Large Language Models Alignment. CoRR abs/2405.19107 (2024)
[i31]
Eugene Choi, Arash Ahmadian, Matthieu Geist, Olivier Pietquin, Mohammad Gheshlaghi Azar:
Self-Improving Robust Preference Optimization. CoRR abs/2406.01660 (2024)
[i30]
Yannis Flet-Berliac, Nathan Grinsztajn, Florian Strub, Eugene Choi, Chris Cremer, Arash Ahmadian, Yash Chandak, Mohammad Gheshlaghi Azar, Olivier Pietquin, Matthieu Geist:
Contrastive Policy Gradient: Aligning LLMs on sequence-level scores in a supervised-friendly fashion. CoRR abs/2406.19185 (2024)
[i29]
Nathan Grinsztajn, Yannis Flet-Berliac, Mohammad Gheshlaghi Azar, Florian Strub, Bill Wu, Eugene Choi, Chris Cremer, Arash Ahmadian, Yash Chandak, Olivier Pietquin, Matthieu Geist:
Averaging log-likelihoods in direct alignment. CoRR abs/2406.19188 (2024)
2023
[c21]
Toshinori Kitamura, Tadashi Kozuno, Yunhao Tang, Nino Vieillard, Michal Valko, Wenhao Yang, Jincheng Mei, Pierre Ménard, Mohammad Gheshlaghi Azar, Rémi Munos, Olivier Pietquin, Matthieu Geist, Csaba Szepesvári, Wataru Kumagai, Yutaka Matsuo:
Regularization and Variance-Weighted Regression Achieves Minimax Optimality in Linear MDPs: Theory and Practice. ICML 2023: 17135-17175
[c20]
Yunhao Tang, Zhaohan Daniel Guo, Pierre Harvey Richemond, Bernardo Ávila Pires, Yash Chandak, Rémi Munos, Mark Rowland, Mohammad Gheshlaghi Azar, Charline Le Lan, Clare Lyle, András György, Shantanu Thakoor, Will Dabney, Bilal Piot, Daniele Calandriello, Michal Valko:
Understanding Self-Predictive Learning for Reinforcement Learning. ICML 2023: 33632-33656
[i28]
Mark Rowland, Rémi Munos, Mohammad Gheshlaghi Azar, Yunhao Tang, Georg Ostrovski, Anna Harutyunyan, Karl Tuyls, Marc G. Bellemare, Will Dabney:
An Analysis of Quantile Temporal-Difference Learning. CoRR abs/2301.04462 (2023)
[i27]
Toshinori Kitamura, Tadashi Kozuno, Yunhao Tang, Nino Vieillard, Michal Valko, Wenhao Yang, Jincheng Mei, Pierre Ménard, Mohammad Gheshlaghi Azar, Rémi Munos, Olivier Pietquin, Matthieu Geist, Csaba Szepesvári, Wataru Kumagai, Yutaka Matsuo:
Regularization and Variance-Weighted Regression Achieves Minimax Optimality in Linear MDPs: Theory and Practice. CoRR abs/2305.13185 (2023)
[i26]
Mohammad Gheshlaghi Azar, Mark Rowland, Bilal Piot, Daniel Guo, Daniele Calandriello, Michal Valko, Rémi Munos:
A General Theoretical Paradigm to Understand Learning from Human Preferences. CoRR abs/2310.12036 (2023)
[i25]
Rémi Munos, Michal Valko, Daniele Calandriello, Mohammad Gheshlaghi Azar, Mark Rowland, Zhaohan Daniel Guo, Yunhao Tang, Matthieu Geist, Thomas Mesnard, Andrea Michi, Marco Selvi, Sertan Girgin, Nikola Momchev, Olivier Bachem, Daniel J. Mankowitz, Doina Precup, Bilal Piot:
Nash Learning from Human Feedback. CoRR abs/2312.00886 (2023)
2022
[c19]
Shantanu Thakoor, Corentin Tallec, Mohammad Gheshlaghi Azar, Mehdi Azabou, Eva L. Dyer, Rémi Munos, Petar Velickovic, Michal Valko:
Large-Scale Representation Learning on Graphs via Bootstrapping. ICLR 2022
[c18]
Zhaohan Guo, Shantanu Thakoor, Miruna Pislar, Bernardo Ávila Pires, Florent Altché, Corentin Tallec, Alaa Saade, Daniele Calandriello, Jean-Bastien Grill, Yunhao Tang, Michal Valko, Rémi Munos, Mohammad Gheshlaghi Azar, Bilal Piot:
BYOL-Explore: Exploration by Bootstrapped Prediction. NeurIPS 2022
[i24]
Tadashi Kozuno, Wenhao Yang, Nino Vieillard, Toshinori Kitamura, Yunhao Tang, Jincheng Mei, Pierre Ménard, Mohammad Gheshlaghi Azar, Michal Valko, Rémi Munos, Olivier Pietquin, Matthieu Geist, Csaba Szepesvári:
KL-Entropy-Regularized RL with a Generative Model is Minimax Optimal. CoRR abs/2205.14211 (2022)
[i23]
Zhaohan Daniel Guo, Shantanu Thakoor, Miruna Pislar, Bernardo Ávila Pires, Florent Altché, Corentin Tallec, Alaa Saade, Daniele Calandriello, Jean-Bastien Grill, Yunhao Tang, Michal Valko, Rémi Munos, Mohammad Gheshlaghi Azar, Bilal Piot:
BYOL-Explore: Exploration by Bootstrapped Prediction. CoRR abs/2206.08332 (2022)
[i22]
Yunhao Tang, Zhaohan Daniel Guo, Pierre Harvey Richemond, Bernardo Ávila Pires, Yash Chandak, Rémi Munos, Mark Rowland, Mohammad Gheshlaghi Azar, Charline Le Lan, Clare Lyle, András György, Shantanu Thakoor, Will Dabney, Bilal Piot, Daniele Calandriello, Michal Valko:
Understanding Self-Predictive Learning for Reinforcement Learning. CoRR abs/2212.03319 (2022)
2021
[c17]
Ran Liu, Mehdi Azabou, Max Dabagia, Chi-Heng Lin, Mohammad Gheshlaghi Azar, Keith B. Hengen, Michal Valko, Eva L. Dyer:
Drop, Swap, and Generate: A Self-Supervised Approach for Generating Neural Activity. NeurIPS 2021: 10587-10599
[i21]
Zhaohan Daniel Guo, Mohammad Gheshlaghi Azar, Alaa Saade, Shantanu Thakoor, Bilal Piot, Bernardo Ávila Pires, Michal Valko, Thomas Mesnard, Tor Lattimore, Rémi Munos:
Geometric Entropic Exploration. CoRR abs/2101.02055 (2021)
[i20]
Shantanu Thakoor, Corentin Tallec, Mohammad Gheshlaghi Azar, Rémi Munos, Petar Velickovic, Michal Valko:
Bootstrapped Representation Learning on Graphs. CoRR abs/2102.06514 (2021)
[i19]
Mehdi Azabou, Mohammad Gheshlaghi Azar, Ran Liu, Chi-Heng Lin, Erik C. Johnson, Kiran Bhaskaran-Nair, Max Dabagia, Keith B. Hengen, William R. Gray Roncal, Michal Valko, Eva L. Dyer:
Mine Your Own vieW: Self-Supervised Learning Through Across-Sample Prediction. CoRR abs/2102.10106 (2021)
[i18]
Ran Liu, Mehdi Azabou, Max Dabagia, Chi-Heng Lin, Mohammad Gheshlaghi Azar, Keith B. Hengen, Michal Valko, Eva L. Dyer:
Drop, Swap, and Generate: A Self-Supervised Approach for Generating Neural Activity. CoRR abs/2111.02338 (2021)
2020
[c16]
Zhaohan Daniel Guo, Bernardo Ávila Pires, Bilal Piot, Jean-Bastien Grill, Florent Altché, Rémi Munos, Mohammad Gheshlaghi Azar:
Bootstrap Latent-Predictive Representations for Multitask Reinforcement Learning. ICML 2020: 3875-3886
[c15]
Rémi Munos, Julien Pérolat, Jean-Baptiste Lespiau, Mark Rowland, Bart De Vylder, Marc Lanctot, Finbarr Timbers, Daniel Hennes, Shayegan Omidshafiei, Audrunas Gruslys, Mohammad Gheshlaghi Azar, Edward Lockhart, Karl Tuyls:
Fast computation of Nash Equilibria in Imperfect Information Games. ICML 2020: 7119-7129
[c14]
Jean-Bastien Grill, Florian Strub, Florent Altché, Corentin Tallec, Pierre H. Richemond, Elena Buchatskaya, Carl Doersch, Bernardo Ávila Pires, Zhaohan Guo, Mohammad Gheshlaghi Azar, Bilal Piot, Koray Kavukcuoglu, Rémi Munos, Michal Valko:
Bootstrap Your Own Latent - A New Approach to Self-Supervised Learning. NeurIPS 2020
[i17]
Zhaohan Daniel Guo, Bernardo Ávila Pires, Bilal Piot, Jean-Bastien Grill, Florent Altché, Rémi Munos, Mohammad Gheshlaghi Azar:
Bootstrap Latent-Predictive Representations for Multitask Reinforcement Learning. CoRR abs/2004.14646 (2020)
[i16]
Jean-Bastien Grill, Florian Strub, Florent Altché, Corentin Tallec, Pierre H. Richemond, Elena Buchatskaya, Carl Doersch, Bernardo Ávila Pires, Zhaohan Daniel Guo, Mohammad Gheshlaghi Azar, Bilal Piot, Koray Kavukcuoglu, Rémi Munos, Michal Valko:
Bootstrap Your Own Latent: A New Approach to Self-Supervised Learning. CoRR abs/2006.07733 (2020)
[i15]
Audrunas Gruslys, Marc Lanctot, Rémi Munos, Finbarr Timbers, Martin Schmid, Julien Pérolat, Dustin Morrill, Vinícius Flores Zambaldi, Jean-Baptiste Lespiau, John Schultz, Mohammad Gheshlaghi Azar, Michael Bowling, Karl Tuyls:
The Advantage Regret-Matching Actor-Critic. CoRR abs/2008.12234 (2020)

2010 – 2019

2019
[c13]
Anna Harutyunyan, Will Dabney, Thomas Mesnard, Mohammad Gheshlaghi Azar, Bilal Piot, Nicolas Heess, Hado van Hasselt, Gregory Wayne, Satinder Singh, Doina Precup, Rémi Munos:
Hindsight Credit Assignment. NeurIPS 2019: 12467-12476
[i14]
Mohammad Gheshlaghi Azar, Bilal Piot, Bernardo A. Pires, Jean-Bastien Grill, Florent Altché, Rémi Munos:
World Discovery Models. CoRR abs/1902.07685 (2019)
[i13]
Pedro A. Ortega, Jane X. Wang, Mark Rowland, Tim Genewein, Zeb Kurth-Nelson, Razvan Pascanu, Nicolas Heess, Joel Veness, Alexander Pritzel, Pablo Sprechmann, Siddhant M. Jayakumar, Tom McGrath, Kevin J. Miller, Mohammad Gheshlaghi Azar, Ian Osband, Neil C. Rabinowitz, András György, Silvia Chiappa, Simon Osindero, Yee Whye Teh, Hado van Hasselt, Nando de Freitas, Matthew M. Botvinick, Shane Legg:
Meta-learning of Sequential Strategies. CoRR abs/1905.03030 (2019)
[i12]
Anna Harutyunyan, Will Dabney, Thomas Mesnard, Mohammad Gheshlaghi Azar, Bilal Piot, Nicolas Heess, Hado van Hasselt, Greg Wayne, Satinder Singh, Doina Precup, Rémi Munos:
Hindsight Credit Assignment. CoRR abs/1912.02503 (2019)
2018
[c12]
Matteo Hessel, Joseph Modayil, Hado van Hasselt, Tom Schaul, Georg Ostrovski, Will Dabney, Dan Horgan, Bilal Piot, Mohammad Gheshlaghi Azar, David Silver:
Rainbow: Combining Improvements in Deep Reinforcement Learning. AAAI 2018: 3215-3222
[c11]
Meire Fortunato, Mohammad Gheshlaghi Azar, Bilal Piot, Jacob Menick, Matteo Hessel, Ian Osband, Alex Graves, Volodymyr Mnih, Rémi Munos, Demis Hassabis, Olivier Pietquin, Charles Blundell, Shane Legg:
Noisy Networks For Exploration. ICLR (Poster) 2018
[c10]
Audrunas Gruslys, Will Dabney, Mohammad Gheshlaghi Azar, Bilal Piot, Marc G. Bellemare, Rémi Munos:
The Reactor: A fast and sample-efficient Actor-Critic agent for Reinforcement Learning. ICLR (Poster) 2018
[i11]
Tobias Pohlen, Bilal Piot, Todd Hester, Mohammad Gheshlaghi Azar, Dan Horgan, David Budden, Gabriel Barth-Maron, Hado van Hasselt, John Quan, Mel Vecerík, Matteo Hessel, Rémi Munos, Olivier Pietquin:
Observe and Look Further: Achieving Consistent Performance on Atari. CoRR abs/1805.11593 (2018)
[i10]
Zhaohan Daniel Guo, Mohammad Gheshlaghi Azar, Bilal Piot, Bernardo A. Pires, Toby Pohlen, Rémi Munos:
Neural Predictive Belief Representations. CoRR abs/1811.06407 (2018)
2017
[c9]
Mohammad Gheshlaghi Azar, Ian Osband, Rémi Munos:
Minimax Regret Bounds for Reinforcement Learning. ICML 2017: 263-272
[i9]
Mohammad Gheshlaghi Azar, Ian Osband, Rémi Munos:
Minimax Regret Bounds for Reinforcement Learning. CoRR abs/1703.05449 (2017)
[i8]
Audrunas Gruslys, Mohammad Gheshlaghi Azar, Marc G. Bellemare, Rémi Munos:
The Reactor: A Sample-Efficient Actor-Critic Architecture. CoRR abs/1704.04651 (2017)
[i7]
Meire Fortunato, Mohammad Gheshlaghi Azar, Bilal Piot, Jacob Menick, Ian Osband, Alex Graves, Vlad Mnih, Rémi Munos, Demis Hassabis, Olivier Pietquin, Charles Blundell, Shane Legg:
Noisy Networks for Exploration. CoRR abs/1706.10295 (2017)
[i6]
Matteo Hessel, Joseph Modayil, Hado van Hasselt, Tom Schaul, Georg Ostrovski, Will Dabney, Daniel Horgan, Bilal Piot, Mohammad Gheshlaghi Azar, David Silver:
Rainbow: Combining Improvements in Deep Reinforcement Learning. CoRR abs/1710.02298 (2017)
2016
[c8]
Vicenç Gómez, Mohammad Gheshlaghi Azar, Hilbert J. Kappen:
Correcting Multivariate Auto-Regressive Models for the Influence of Unobserved Common Input. CCIA 2016: 177-186
[c7]
Mohammad Gheshlaghi Azar, Eva L. Dyer, Konrad P. Körding:
Convex Relaxation Regression: Black-Box Optimization of Smooth Functions by Learning Their Convex Envelopes. UAI 2016
[i5]
Mohammad Gheshlaghi Azar, Eva L. Dyer, Konrad P. Körding:
Convex Relaxation Regression: Black-Box Optimization of Smooth Functions by Learning Their Convex Envelopes. CoRR abs/1602.02191 (2016)
2014
[c6]
Mohammad Gheshlaghi Azar, Alessandro Lazaric, Emma Brunskill:
Online Stochastic Optimization under Correlated Bandit Feedback. ICML 2014: 1557-1565
[i4]
Mohammad Gheshlaghi Azar, Alessandro Lazaric, Emma Brunskill:
Stochastic Optimization of a Locally Smooth Function under Correlated Bandit Feedback. CoRR abs/1402.0562 (2014)
2013
[j2]
Mohammad Gheshlaghi Azar, Rémi Munos, Hilbert J. Kappen:
Minimax PAC bounds on the sample complexity of reinforcement learning with a generative model. Mach. Learn. 91(3): 325-349 (2013)
[c5]
Mohammad Gheshlaghi Azar, Alessandro Lazaric, Emma Brunskill:
Sequential Transfer in Multi-armed Bandit with Finite Set of Models. NIPS 2013: 2220-2228
[c4]
Mohammad Gheshlaghi Azar, Alessandro Lazaric, Emma Brunskill:
Regret Bounds for Reinforcement Learning with Policy Advice. ECML/PKDD (1) 2013: 97-112
[i3]
Mohammad Gheshlaghi Azar, Alessandro Lazaric, Emma Brunskill:
Regret Bounds for Reinforcement Learning with Policy Advice. CoRR abs/1305.1027 (2013)
[i2]
Mohammad Gheshlaghi Azar, Alessandro Lazaric, Emma Brunskill:
Sequential Transfer in Multi-armed Bandit with Finite Set of Models. CoRR abs/1307.6887 (2013)
2012
[j1]
Mohammad Gheshlaghi Azar, Vicenç Gómez, Hilbert J. Kappen:
Dynamic policy programming. J. Mach. Learn. Res. 13: 3207-3245 (2012)
[c3]
Mohammad Gheshlaghi Azar, Rémi Munos, Bert Kappen:
On the Sample Complexity of Reinforcement Learning with a Generative Model. ICML 2012
2011
[c2]
Mohammad Gheshlaghi Azar, Rémi Munos, Mohammad Ghavamzadeh, Hilbert J. Kappen:
Speedy Q-Learning. NIPS 2011: 2411-2419
[c1]
Mohammad Gheshlaghi Azar, Vicenç Gómez, Bert Kappen:
Dynamic Policy Programming with Function Approximation. AISTATS 2011: 119-127
2010
[i1]
Mohammad Gheshlaghi Azar, Hilbert J. Kappen:
Dynamic Policy Programming. CoRR abs/1004.2027 (2010)
