User profiles matching "Marlos C. Machado"
Marlos C. Machado, University of Alberta; Amii. Verified email at ualberta.ca. Cited by 3268.
Autonomous navigation of stratospheric balloons using reinforcement learning
Efficiently navigating a superpressure balloon in the stratosphere requires the integration
of a multitude of cues, such as wind speed and solar elevation, and the process is …
Revisiting the Arcade Learning Environment: Evaluation protocols and open problems for general agents
The Arcade Learning Environment (ALE) is an evaluation platform that poses the challenge
of building AI agents with general competency across dozens of Atari 2600 games. It …
A Laplacian framework for option discovery in reinforcement learning
MC Machado, MG Bellemare… - … on Machine Learning, 2017 - proceedings.mlr.press
Representation learning and option discovery are two of the biggest challenges in
reinforcement learning (RL). Proto-value functions (PVFs) are a well-known approach for …
Contrastive behavioral similarity embeddings for generalization in reinforcement learning
Reinforcement learning methods trained on few environments rarely learn policies that
generalize to unseen environments. To improve generalization, we incorporate the inherent …
Count-based exploration with the successor representation
In this paper we introduce a simple approach for exploration in reinforcement learning (RL)
that allows us to develop theoretically justified algorithms in the tabular case but that is also …
Generalization and regularization in DQN
… Introducing flavours to the ALE is not one of our contributions, this was done by Machado et
al. (2018). … Marlos C. Machado performed part of this work while at the University of Alberta. …
Eigenoption discovery through the deep successor representation
Options in reinforcement learning allow agents to hierarchically decompose a task into
subtasks, having the potential to speed up learning and planning. However, autonomously …
Loss of plasticity in continual deep reinforcement learning
In this paper, we characterize the behavior of canonical value-based deep reinforcement
learning (RL) approaches under varying degrees of non-stationarity. In particular, we …
State of the art control of Atari games using shallow reinforcement learning
The recently introduced Deep Q-Networks (DQN) algorithm has gained attention as one of
the first successful combinations of deep neural networks and reinforcement learning. Its …
True online temporal-difference learning
The temporal-difference methods TD(λ) and Sarsa(λ) form a core part of modern reinforcement
learning. Their appeal comes from their good performance, low computational cost, and …