Center for Statistical Science, Peking University


Statistics and Data Science

Heavy-tailed Information-Theoretic Generalization Bounds with Applications to LLM Safety Alignment

Speaker: Huiming Zhang (Beihang University)

Time: 2026-03-19 15:10-17:00

Location: Room 217, Guanghua Building 2

Abstract: Classical information-theoretic generalization bounds, which link generalization error to the mutual information between an algorithm's input and output, typically rely on sub-Gaussian assumptions or finite moment generating functions (MGFs). However, these assumptions are often violated in heavy-tailed scenarios, such as adversarial training, reinforcement learning with rare high-reward events, and financial modeling. In this work, we bridge this gap by establishing a comprehensive framework for generalization under heavy-tailed sub-Weibull regimes. We demonstrate that standard KL divergence bounds are vacuous in these settings due to the unboundedness of extreme events. To overcome this, we introduce a novel decorrelation lemma based on Rényi divergence and a generalized, MGF-free Young-type inequality. By combining these tools with a refined chaining technique on the space of measures, we derive Dudley-type generalization bounds that explicitly depend on the tail parameter and the Rényi information. Additionally, we establish new maximal inequalities and information-theoretic generalization bounds under the assumption that the loss functions exhibit heavy-tailed sub-Weibull behavior. We apply our theory to large language models (LLMs) in the context of reward hacking within Reinforcement Learning from Human Feedback (RLHF). We show that Rényi-regularized alignment provides finite reward guarantees and ensures that best-of-N policies remain well-controlled, thereby mitigating the catastrophic Goodhart effects where standard KL-regularization fails.
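For orientation, the three objects named in the abstract can be written out in their usual textbook forms. This is a sketch under stated assumptions, not the speaker's own statement: the symbols S (training sample of size n), W (algorithm output), σ, K, θ and α are notation introduced here, and the talk may use different conventions or constants (some authors, for example, parameterize the sub-Weibull tail by 1/θ instead of θ).

```latex
% Standard definitions, given for orientation only; the speaker's exact
% conventions and constants may differ.

% Classical mutual-information bound (Xu and Raginsky, 2017), valid when the
% loss is sigma-sub-Gaussian: S is the n-sample training set, W the output.
\[
  \bigl|\mathbb{E}[\operatorname{gen}(S, W)]\bigr|
  \le \sqrt{\frac{2\sigma^{2}}{n}\, I(S; W)}
\]

% Sub-Weibull tail with tail parameter theta > 0 and scale K > 0
% (one common convention).
\[
  \mathbb{P}\bigl(|X| \ge t\bigr)
  \le 2\exp\!\bigl(-(t/K)^{\theta}\bigr), \qquad t \ge 0
\]

% Renyi divergence of order alpha > 1 between P and Q, with P << Q.
\[
  D_{\alpha}(P \,\|\, Q)
  = \frac{1}{\alpha - 1}
    \log \int \Bigl(\frac{dP}{dQ}\Bigr)^{\alpha} \, dQ
\]
```

In this convention θ = 2 recovers sub-Gaussian tails and θ = 1 sub-exponential tails, while for θ < 1 the moment generating function can be infinite, which is the regime the abstract calls heavy-tailed. The Rényi divergence is non-decreasing in α and tends to the KL divergence as α → 1, so for α > 1 it is at least as large as KL and penalizes extreme likelihood ratios more strongly.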

About the Speaker:

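Huiming Zhang (张慧铭) is a tenure-track Associate Professor and master's supervisor at the Institute of Artificial Intelligence, Beihang University, and an adjunct doctoral supervisor at Beihang's School of Mathematical Sciences. Zhang was a Haojiang Scholar (濠江学者) postdoctoral researcher at the University of Macau (2020-2022) and received a PhD in statistics from Peking University (2016-2020). Research interests include robust machine learning, the mathematical theory of AI (generalization error, non-asymptotic and small-sample theory), high-dimensional probability and statistics, functional data, subsampling estimation, and Lévy processes. Zhang has published 30 SCI-indexed papers, including in JMLR and IEEE-TAC (leading journals in AI and automation), JASA and Biometrika (leading statistics journals), IME (a leading actuarial journal), and Scientific Reports (a Nature Portfolio journal), with nearly 1,000 Google Scholar citations; five of these papers are, or have been, designated highly cited. Zhang has served as a reviewer for Mathematical Reviews (USA) and as a referee for leading journals in probability, statistics, AI, and machine learning (AOS, AOAP, JASA, JMLR, IEEE-TSP).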

Your participation is warmly welcomed!
