Center for Statistical Science, Peking University


Statistics and Data Science

Integral-Operator-Based Spectral Algorithms for Goodness-of-Fit Tests/Reveal the Value of Data: A Data Trading Platform with Bidirectional Evaluation Mechanism

Host: Hongqi Chen (Hunan University)

Time: 2026-06-11 15:10-17:00

Location: Room 217, Guanghua Building 2

Abstract 1: The widespread adoption of the maximum mean discrepancy (MMD) in goodness-of-fit testing has spurred extensive research on its statistical performance. However, recent studies indicate that the inherent structure of MMD may constrain its ability to distinguish between distributions, leaving room for improvement. Regularization techniques have the potential to overcome this limitation by refining the discrepancy measure. In this talk, we introduce a family of regularized kernel-based discrepancy measures constructed via spectral filtering. Our framework can be regarded as a natural generalization of prior studies: it removes restrictive assumptions on both the kernel functions and the filter functions, thereby broadening the methodological scope and the theoretical coverage. We establish non-asymptotic guarantees showing that the resulting tests achieve valid Type I error control and improved power. Numerical experiments demonstrate the broader generality and competitive performance of the proposed tests compared with existing methods.

Abstract 2: Data trading platforms currently face three major challenges: inadequate pre-purchase utility assessment, opaque pricing mechanisms, and imperfect after-sales support. These issues result in severe information asymmetry between buyers and sellers, thereby undermining market efficiency. To address these problems, we propose a data trading platform that incorporates a Kernel-based Bidirectional Evaluation Assessment Mechanism (KBEAM). At its core lies the Multilevel Regularization Kernel Method (MRKM), which enables accurate value assessment using only a small number of free samples (2%-5% for general predictive tasks and approximately 10% for extreme event detection). This significantly reduces the cost of information disclosure for sellers. The platform offers oversight at each transaction stage. Before the sale, it monitors versioning strategies to ensure sample representativeness. During the sale, it provides value-based search and purchase recommendations. After the sale, it offers a data enhancement compensation mechanism. In extensive experiments on datasets from multiple fields, the dynamic pricing of KBEAM outperforms traditional strategies in various market scenarios, maximizing both transaction rates and revenues. Moreover, the after-sales mechanism incentivizes both parties to participate honestly. This research provides practical guidelines for determining the quantity of free samples, selecting pricing methods, and optimizing after-sales services on data platforms. It bridges the functional gap between data trading platforms and physical goods trading platforms, offering a blueprint for constructing reliable data trading platforms and promoting the development of the data economy.

About the Speaker:

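Xuehu Zhu (朱学虎) is a professor and doctoral supervisor at Xi'an Jiaotong University. Professor Zhu's research focuses on statistical learning, high-dimensional data analysis, and applied statistics, with more than 40 papers published in venues including the Journal of the American Statistical Association, Journal of Business & Economic Statistics, Science China Mathematics, IEEE Transactions on Geoscience and Remote Sensing, and NeurIPS. Professor Zhu has led a sub-project of the Ministry of Science and Technology's National Key R&D Program, a General Program project of the National Natural Science Foundation of China, and a sub-topic of a Key Program project of the National Natural Science Foundation of China, among others. Honors include selection for the Shaanxi Province University Young Outstanding Talent Support Program and the Zhongying Young Scholar award.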

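Shaobo Lin (林绍波) is a professor and doctoral supervisor at the School of Management, Xi'an Jiaotong University. Professor Lin's research covers distributed learning theory, deep learning theory, and reinforcement learning theory, with more than 70 papers published in journals including JMLR, TPAMI, TIT, and IJOC. Professor Lin has led, or participated as a core member in, 11 national-level research projects.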

Your participation is warmly welcomed!
