Piotr Milos
2020 – today
- 2024
- [c21] Yu Zhao, Yuanbin Qu, Konrad Staniszewski, Szymon Tworkowski, Wei Liu, Piotr Milos, Yuxiang Wu, Pasquale Minervini: Analysing The Impact of Sequence Composition on Language Model Pre-Training. ACL (1) 2024: 7897-7912
- [c20] Maciej Mikula, Szymon Tworkowski, Szymon Antoniak, Bartosz Piotrowski, Albert Q. Jiang, Jin Peng Zhou, Christian Szegedy, Lukasz Kucinski, Piotr Milos, Yuhuai Wu: Magnushammer: A Transformer-Based Approach to Premise Selection. ICLR 2024
- [c19] Michal Nauman, Michal Bortkiewicz, Piotr Milos, Tomasz Trzcinski, Mateusz Ostaszewski, Marek Cygan: Overestimation, Overfitting, and Plasticity in Actor-Critic: the Bitter Lesson of Reinforcement Learning. ICML 2024
- [c18] Maciej Wolczyk, Bartlomiej Cupial, Mateusz Ostaszewski, Michal Bortkiewicz, Michal Zajac, Razvan Pascanu, Lukasz Kucinski, Piotr Milos: Fine-tuning Reinforcement Learning Models is Secretly a Forgetting Mitigation Problem. ICML 2024
- [i30] Maciej Wolczyk, Bartlomiej Cupial, Mateusz Ostaszewski, Michal Bortkiewicz, Michal Zajac, Razvan Pascanu, Lukasz Kucinski, Piotr Milos: Fine-tuning Reinforcement Learning Models is Secretly a Forgetting Mitigation Problem. CoRR abs/2402.02868 (2024)
- [i29] Yu Zhao, Yuanbin Qu, Konrad Staniszewski, Szymon Tworkowski, Wei Liu, Piotr Milos, Yuxiang Wu, Pasquale Minervini: Analysing The Impact of Sequence Composition on Language Model Pre-Training. CoRR abs/2402.13991 (2024)
- [i28] Michal Nauman, Michal Bortkiewicz, Mateusz Ostaszewski, Piotr Milos, Tomasz Trzcinski, Marek Cygan: Overestimation, Overfitting, and Plasticity in Actor-Critic: the Bitter Lesson of Reinforcement Learning. CoRR abs/2403.00514 (2024)
- [i27] Lukasz Kucinski, Witold Drzewakowski, Mateusz Olko, Piotr Kozakowski, Lukasz Maziarka, Marta Emilia Nowakowska, Lukasz Kaiser, Piotr Milos: tsGT: Stochastic Time Series Modeling With Transformer. CoRR abs/2403.05713 (2024)
- [i26] Michal Nauman, Mateusz Ostaszewski, Krzysztof Jankowski, Piotr Milos, Marek Cygan: Bigger, Regularized, Optimistic: scaling for compute and sample-efficient continuous control. CoRR abs/2405.16158 (2024)
- [i25] Michal Zawalski, Gracjan Góral, Michal Tyrolski, Emilia Wisnios, Franciszek Budrowski, Lukasz Kucinski, Piotr Milos: What Matters in Hierarchical Search for Combinatorial Reasoning Problems? CoRR abs/2406.03361 (2024)
- [i24] Alicja Ziarko, Albert Q. Jiang, Bartosz Piotrowski, Wenda Li, Mateja Jamnik, Piotr Milos: Repurposing Language Models into Embedding Models: Finding the Compute-Optimal Recipe. CoRR abs/2406.04165 (2024)
- 2023
- [c17] Samuel Kessler, Mateusz Ostaszewski, Michal Pawel Bortkiewicz, Mateusz Zarski, Maciej Wolczyk, Jack Parker-Holder, Stephen J. Roberts, Piotr Milos: The Effectiveness of World Models for Continual Reinforcement Learning. CoLLAs 2023: 184-204
- [c16] Michal Zawalski, Michal Tyrolski, Konrad Czechowski, Tomasz Odrzygózdz, Damian Stachura, Piotr Piekos, Yuhuai Wu, Lukasz Kucinski, Piotr Milos: Fast and Precise: Adjusting Planning Horizon with Adaptive Subgoal Search. ICLR 2023
- [c15] Wojciech Masarczyk, Mateusz Ostaszewski, Ehsan Imani, Razvan Pascanu, Piotr Milos, Tomasz Trzcinski: The Tunnel Effect: Building Data Representations in Deep Neural Networks. NeurIPS 2023
- [c14] Mateusz Olko, Michal Zajac, Aleksandra Nowak, Nino Scherrer, Yashas Annadani, Stefan Bauer, Lukasz Kucinski, Piotr Milos: Trust Your 𝛁: Gradient-based Intervention Targeting for Causal Discovery. NeurIPS 2023
- [c13] Szymon Tworkowski, Konrad Staniszewski, Mikolaj Pacek, Yuhuai Wu, Henryk Michalewski, Piotr Milos: Focused Transformer: Contrastive Training for Context Scaling. NeurIPS 2023
- [i23] Maciej Mikula, Szymon Antoniak, Szymon Tworkowski, Albert Qiaochu Jiang, Jin Peng Zhou, Christian Szegedy, Lukasz Kucinski, Piotr Milos, Yuhuai Wu: Magnushammer: A Transformer-based Approach to Premise Selection. CoRR abs/2303.04488 (2023)
- [i22] Michal Zajac, Kamil Deja, Anna Kuzina, Jakub M. Tomczak, Tomasz Trzcinski, Florian Shkurti, Piotr Milos: Exploring Continual Learning of Diffusion Models. CoRR abs/2303.15342 (2023)
- [i21] Wojciech Masarczyk, Mateusz Ostaszewski, Ehsan Imani, Razvan Pascanu, Piotr Milos, Tomasz Trzcinski: The Tunnel Effect: Building Data Representations in Deep Neural Networks. CoRR abs/2305.19753 (2023)
- [i20] Szymon Tworkowski, Konrad Staniszewski, Mikolaj Pacek, Yuhuai Wu, Henryk Michalewski, Piotr Milos: Focused Transformer: Contrastive Training for Context Scaling. CoRR abs/2307.03170 (2023)
- [i19] Konrad Staniszewski, Szymon Tworkowski, Sebastian Jaszczur, Henryk Michalewski, Lukasz Kucinski, Piotr Milos: Structured Packing in LLM Training Improves Long Context Utilization. CoRR abs/2312.17296 (2023)
- 2022
- [c12] Michal Zawalski, Blazej Osinski, Henryk Michalewski, Piotr Milos: Off-Policy Correction For Multi-Agent Reinforcement Learning. AAMAS 2022: 1774-1776
- [c11] Piotr Kozakowski, Mikolaj Pacek, Piotr Milos: Planning and Learning using Adaptive Entropy Tree Search. IJCNN 2022: 1-8
- [c10] Albert Qiaochu Jiang, Wenda Li, Szymon Tworkowski, Konrad Czechowski, Tomasz Odrzygózdz, Piotr Milos, Yuhuai Wu, Mateja Jamnik: Thor: Wielding Hammers to Integrate Language Models and Automated Theorem Provers. NeurIPS 2022
- [c9] Maciej Wolczyk, Michal Zajac, Razvan Pascanu, Lukasz Kucinski, Piotr Milos: Disentangling Transfer in Continual Reinforcement Learning. NeurIPS 2022
- [i18] Albert Q. Jiang, Wenda Li, Szymon Tworkowski, Konrad Czechowski, Tomasz Odrzygózdz, Piotr Milos, Yuhuai Wu, Mateja Jamnik: Thor: Wielding Hammers to Integrate Language Models and Automated Theorem Provers. CoRR abs/2205.10893 (2022)
- [i17] Michal Zawalski, Michal Tyrolski, Konrad Czechowski, Damian Stachura, Piotr Piekos, Tomasz Odrzygózdz, Yuhuai Wu, Lukasz Kucinski, Piotr Milos: Fast and Precise: Adjusting Planning Horizon with Adaptive Subgoal Search. CoRR abs/2206.00702 (2022)
- [i16] Maciej Wolczyk, Michal Zajac, Razvan Pascanu, Lukasz Kucinski, Piotr Milos: Disentangling Transfer in Continual Reinforcement Learning. CoRR abs/2209.13900 (2022)
- [i15] Mateusz Olko, Michal Zajac, Aleksandra Nowak, Nino Scherrer, Yashas Annadani, Stefan Bauer, Lukasz Kucinski, Piotr Milos: Trust Your ∇: Gradient-based Intervention Targeting for Causal Discovery. CoRR abs/2211.13715 (2022)
- [i14] Samuel Kessler, Piotr Milos, Jack Parker-Holder, Stephen J. Roberts: The Surprising Effectiveness of Latent World Models for Continual Reinforcement Learning. CoRR abs/2211.15944 (2022)
- 2021
- [c8] Konrad Czechowski, Piotr Januszewski, Piotr Kozakowski, Lukasz Kucinski, Piotr Milos: Structure and Randomness in Planning and Reinforcement Learning. IJCNN 2021: 1-8
- [c7] Konrad Czechowski, Tomasz Odrzygózdz, Michal Izworski, Marek Zbysinski, Lukasz Kucinski, Piotr Milos: Trust, but Verify: Alleviating Pessimistic Errors in Model-Based Exploration. IJCNN 2021: 1-8
- [c6] Konrad Czechowski, Tomasz Odrzygózdz, Marek Zbysinski, Michal Zawalski, Krzysztof Olejnik, Yuhuai Wu, Lukasz Kucinski, Piotr Milos: Subgoal Search For Complex Reasoning Tasks. NeurIPS 2021: 624-638
- [c5] Lukasz Kucinski, Tomasz Korbak, Pawel Kolodziej, Piotr Milos: Catalytic Role Of Noise And Necessity Of Inductive Biases In The Emergence Of Compositional Communication. NeurIPS 2021: 23075-23088
- [c4] Maciej Wolczyk, Michal Zajac, Razvan Pascanu, Lukasz Kucinski, Piotr Milos: Continual World: A Robotic Benchmark For Continual Reinforcement Learning. NeurIPS 2021: 28496-28510
- [i13] Piotr Kozakowski, Mikolaj Pacek, Piotr Milos: Robust and Efficient Planning using Adaptive Entropy Tree Search. CoRR abs/2102.06808 (2021)
- [i12] Maciej Wolczyk, Michal Zajac, Razvan Pascanu, Lukasz Kucinski, Piotr Milos: Continual World: A Robotic Benchmark For Continual Reinforcement Learning. CoRR abs/2105.10919 (2021)
- [i11] Konrad Czechowski, Tomasz Odrzygózdz, Marek Zbysinski, Michal Zawalski, Krzysztof Olejnik, Yuhuai Wu, Lukasz Kucinski, Piotr Milos: Subgoal Search For Complex Reasoning Tasks. CoRR abs/2108.11204 (2021)
- [i10] Lukasz Kucinski, Tomasz Korbak, Pawel Kolodziej, Piotr Milos: Catalytic Role Of Noise And Necessity Of Inductive Biases In The Emergence Of Compositional Communication. CoRR abs/2111.06464 (2021)
- [i9] Michal Zawalski, Blazej Osinski, Henryk Michalewski, Piotr Milos: Off-Policy Correction For Multi-Agent Reinforcement Learning. CoRR abs/2111.11229 (2021)
- [i8] Piotr Januszewski, Mateusz Olko, Michal Królikowski, Jakub Swiatkowski, Marcin Andrychowicz, Lukasz Kucinski, Piotr Milos: Continuous Control With Ensemble Deep Deterministic Policy Gradients. CoRR abs/2111.15382 (2021)
- 2020
- [c3] Lukasz Kaiser, Mohammad Babaeizadeh, Piotr Milos, Blazej Osinski, Roy H. Campbell, Konrad Czechowski, Dumitru Erhan, Chelsea Finn, Piotr Kozakowski, Sergey Levine, Afroz Mohiuddin, Ryan Sepassi, George Tucker, Henryk Michalewski: Model Based Reinforcement Learning for Atari. ICLR 2020
- [c2] Blazej Osinski, Adam Jakubowski, Pawel Ziecina, Piotr Milos, Christopher Galias, Silviu Homoceanu, Henryk Michalewski: Simulation-Based Reinforcement Learning for Real-World Autonomous Driving. ICRA 2020: 6411-6418
- [i7] Blazej Osinski, Piotr Milos, Adam Jakubowski, Pawel Ziecina, Michal Martyniak, Christopher Galias, Antonia Breuer, Silviu Homoceanu, Henryk Michalewski: CARLA Real Traffic Scenarios - novel training ground and benchmark for autonomous driving. CoRR abs/2012.11329 (2020)
2010 – 2019
- 2019
- [i6] Lukasz Kaiser, Mohammad Babaeizadeh, Piotr Milos, Blazej Osinski, Roy H. Campbell, Konrad Czechowski, Dumitru Erhan, Chelsea Finn, Piotr Kozakowski, Sergey Levine, Ryan Sepassi, George Tucker, Henryk Michalewski: Model-Based Reinforcement Learning for Atari. CoRR abs/1903.00374 (2019)
- [i5] Tomasz Korbak, Julian Zubek, Lukasz Kucinski, Piotr Milos, Joanna Raczaszek-Leonardi: Developmentally motivated emergence of compositional communication via template transfer. CoRR abs/1910.06079 (2019)
- [i4] Blazej Osinski, Adam Jakubowski, Piotr Milos, Pawel Ziecina, Christopher Galias, Silviu Homoceanu, Henryk Michalewski: Simulation-based reinforcement learning for real-world autonomous driving. CoRR abs/1911.12905 (2019)
- [i3] Piotr Milos, Lukasz Kucinski, Konrad Czechowski, Piotr Kozakowski, Maciej Klimek: Uncertainty-sensitive Learning and Planning with Ensembles. CoRR abs/1912.09996 (2019)
- 2018
- [i2] Lukasz Kidzinski, Sharada Prasanna Mohanty, Carmichael F. Ong, Zhewei Huang, Shuchang Zhou, Anton Pechenko, Adam Stelmaszczyk, Piotr Jarosik, Mikhail Pavlov, Sergey Kolesnikov, Sergey M. Plis, Zhibo Chen, Zhizheng Zhang, Jiale Chen, Jun Shi, Zhuobin Zheng, Chun Yuan, Zhihui Lin, Henryk Michalewski, Piotr Milos, Blazej Osinski, Andrew Melnik, Malte Schilling, Helge J. Ritter, Sean F. Carroll, Jennifer L. Hicks, Sergey Levine, Marcel Salathé, Scott L. Delp: Learning to Run challenge solutions: Adapting reinforcement learning methods for neuromusculoskeletal environments. CoRR abs/1804.00361 (2018)
- [i1] Michal Garmulewicz, Henryk Michalewski, Piotr Milos: Expert-augmented actor-critic for ViZDoom and Montezumas Revenge. CoRR abs/1809.03447 (2018)
- 2017
- [c1] Maciej Klimek, Henryk Michalewski, Piotr Milos: Hierarchical Reinforcement Learning with Parameters. CoRL 2017: 301-313
last updated on 2024-10-30 20:33 CET by the dblp team
all metadata released as open data under CC0 1.0 license