Machine Learning and Data Science PhD Student Forum Series (Session 100): Distributional Temporal Difference Learning with Linear Function Approximation
Speaker: Kaicheng Jin (Peking University)
Time: 2026-04-16 16:00-17:00
Location: Tencent Meeting 928-6293-8217
Abstract:
In this talk, we study the finite-sample statistical efficiency of distributional temporal difference (TD) learning with linear function approximation. Distributional TD learning aims to estimate the full return distribution of a given policy in a discounted Markov decision process. Building on our algorithms, we show that with linear function approximation, learning the entire return distribution from streaming data is no more difficult than learning its expectation (the value function). Furthermore, variance reduction techniques can be used to achieve tighter sample complexity bounds that are independent of the support size. The talk offers new theoretical insights into when and why distributional reinforcement learning can be statistically efficient, bridging the gap between distributional and classical TD methods in the linear function approximation regime. We will also present empirical results that illustrate the strengths and limitations of our methods.
About the Forum:
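This online forum is held every two weeks (except on public holidays). Each session invites one PhD student to give a systematic, in-depth introduction to a frontier topic. Topics include, but are not limited to, machine learning, high-dimensional statistics, operations research and optimization, and theoretical computer science.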
Your participation is warmly welcomed!

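Scan the QR code to follow the official WeChat account of the Center for Statistical Science, Peking University, for more information on upcoming talks!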