Michel Ma - Google Scholar


Cited by

	All	Since 2019
Citations	14	14
h-index	2	2
i10-index	0	0

Citations per year (bar chart): 2023: 7; 2024: 7

Co-authors

Pierre-Luc Bacon, University of Montreal. Verified email at mila.quebec
Tianwei Ni, Mila, University of Montreal. Verified email at mila.quebec
Pierluca D'Oro, Mila & Meta. Verified email at mila.quebec

Michel Ma


PhD candidate, University of Montreal, Mila

Verified email at mila.quebec

Reinforcement Learning, Deep Learning


Title	Cited by	Year
When do transformers shine in RL? Decoupling memory from credit assignment. T Ni, M Ma, B Eysenbach, PL Bacon. Advances in Neural Information Processing Systems 36, 2024	8	2024
Long-term credit assignment via model-based temporal shortcuts. M Ma, P D'Oro, Y Bengio, PL Bacon. Deep RL Workshop NeurIPS 2021, 2021	5	2021
Counterfactual Policy Evaluation and the Conditional Monte Carlo Method. M Ma, B Pierre-Luc. Offline Reinforcement Learning Workshop, NeurIPS, 2020	1	2020
Do Transformer World Models Give Better Policy Gradients? M Ma, T Ni, C Gehring, P D'Oro, PL Bacon. arXiv preprint arXiv:2402.05290, 2024		2024
Bridging State and History Representations: Understanding Self-Predictive RL. T Ni, B Eysenbach, E Seyedsalehi, M Ma, C Gehring, A Mahajan, ... arXiv preprint arXiv:2401.08898, 2024		2024
A Differentiable Sequence Model Perspective on Policy Gradients. M Ma, P D'Oro, T Ni, C Gehring, PL Bacon		2023
Parsimonious reasoning in reinforcement learning for better credit assignment. M Ma		2022


Articles 1–7