LLMs for optimization modeling: benchmark, data synthesis and reinforcement learning with verifiable reward
Speaker: Yitian Chen (Cardinal Operations)
Time: 2025-12-25, 15:10-17:00
Location: Room 217, Guanghua Building 2
Abstract:
This presentation provides a comprehensive overview of the current landscape of Large Language Models (LLMs) for Operations Research (OR) and optimization modeling, focusing on three areas essential for advancing the field: benchmark development, high-quality synthetic data generation (including data distillation), and open-source training methodologies. We will introduce our latest work, Solver-Informed Reinforcement Learning (SIRL), which has been accepted to NeurIPS 2025. This novel framework proposes a robust reasoning paradigm for optimization tasks built on Reinforcement Learning with Verifiable Reward (RLVR), employing an external optimization solver as the primary verification mechanism. Leveraging our enhanced synthetic data pipeline together with the SIRL framework, our method achieves state-of-the-art (SOTA) results on established LLM-for-OR benchmarks and surpasses DeepSeek-R1 and OpenAI o3. Finally, we will discuss current limitations in community benchmarking and outline directions for future work.
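To illustrate the RLVR idea described above, here is a minimal, hypothetical sketch of a solver-verified reward: an LLM-generated optimization model is passed to an external solver, and the returned status and objective value are checked against a known reference. The names `SolverResult`, `verifiable_reward`, and the tolerance scheme are illustrative assumptions, not the actual SIRL implementation.

```python
# Hypothetical sketch of a solver-verified reward (assumption: the actual SIRL
# reward design may differ). The RLVR idea: run the LLM-generated optimization
# model through an external solver, then score the result against a reference.

from dataclasses import dataclass
from typing import Optional


@dataclass
class SolverResult:
    status: str                  # e.g. "optimal", "infeasible", "error"
    objective: Optional[float]   # objective value reported by the external solver


def verifiable_reward(result: SolverResult,
                      reference_objective: float,
                      tol: float = 1e-4) -> float:
    """Binary reward: 1.0 if the generated model solves to the reference optimum."""
    if result.status != "optimal" or result.objective is None:
        return 0.0  # infeasible / unbounded / execution error -> no reward
    # Relative-error check against the ground-truth objective value
    rel_err = abs(result.objective - reference_objective) / max(1.0, abs(reference_objective))
    return 1.0 if rel_err <= tol else 0.0


# Usage example: a solver run matching the reference earns full reward
print(verifiable_reward(SolverResult("optimal", 41.9999), reference_objective=42.0))   # 1.0
print(verifiable_reward(SolverResult("infeasible", None), reference_objective=42.0))   # 0.0
```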
About the Speaker:
Yitian Chen is an Algorithmic Scientist and LLM Team Leader at Cardinal Operations. His expertise lies in the research and practical application of AI and Large Language Models (LLMs) in advanced domains such as optimization modeling and enterprise forecasting. He has a distinguished record in international data mining competitions, having secured multiple championship and runner-up positions in prestigious challenges, including the KDD Cup, CVPR Cup, and PCIC Cup.

Your participation is warmly welcomed!

Scan the QR code to follow the official WeChat account of the Center for Statistical Science, Peking University, for more seminar information!