Machine Learning and Data Science PhD Student Forum Series (Session 76): Follow-the-Perturbed-Leader Achieves Best-of-Both-Worlds for Bandit Problems
Speaker: Jingxin Zhan (詹景昕), Peking University
Time: 2024-09-19 16:00-17:00
Venue: Tencent Meeting 627-5441-1672
Abstract:
Best-of-Both-Worlds (BOBW) bandit algorithms, which have regret guarantees in both the stochastic and the adversarial settings, have been studied for many years. Tsallis-INF and other Follow-the-Regularized-Leader (FTRL) policies form one of the most promising frameworks for BOBW policies. However, a limitation of FTRL policies is that the arm-selection probabilities must be computed explicitly. The Follow-the-Perturbed-Leader (FTPL) policy has been studied as a promising candidate for circumventing this limitation. In this talk, we will introduce an FTPL algorithm with Fréchet perturbation that also achieves the BOBW bound, based on recent work by Lee, Honda, Ito and Oh (COLT 2024).
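Each session of the forum invites one PhD student to give a systematic, in-depth introduction to a frontier topic. Topics include, but are not limited to, machine learning, high-dimensional statistics, operations research and optimization, and theoretical computer science.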
Your participation is warmly welcomed!
