Preprints, Working Papers, ...

MMD Aggregated Two-Sample Test

Antonin Schrab 1, 2, 3 Ilmun Kim 4 Mélisande Albert 5, 6 Béatrice Laurent 6, 5 Benjamin Guedj 7, 1, 2, 3 Arthur Gretton 8 
1 MODAL - MOdel for Data Analysis and Learning
LPP - Laboratoire Paul Painlevé - UMR 8524, Université de Lille, Sciences et Technologies, Inria Lille - Nord Europe, METRICS - Evaluation des technologies de santé et des pratiques médicales - ULR 2694, Polytech Lille - École polytechnique universitaire de Lille
Abstract : We propose a novel nonparametric two-sample test based on the Maximum Mean Discrepancy (MMD), which is constructed by aggregating tests with different kernel bandwidths. This aggregation procedure, called MMDAgg, ensures that test power is maximised over the collection of kernels used, without requiring held-out data for kernel selection (which results in a loss of test power), or arbitrary kernel choices such as the median heuristic. We work in the non-asymptotic framework, and prove that our aggregated test is minimax adaptive over Sobolev balls. Our guarantees are not restricted to a specific kernel, but hold for any product of one-dimensional translation invariant characteristic kernels which are absolutely and square integrable. Moreover, our results apply for popular numerical procedures to determine the test threshold, namely permutations and the wild bootstrap. Through numerical experiments on both synthetic and real-world datasets, we demonstrate that MMDAgg outperforms alternative state-of-the-art approaches to MMD kernel adaptation for two-sample testing.
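The idea of aggregating MMD tests over several kernel bandwidths can be illustrated with a minimal sketch. This is not the MMDAgg procedure itself. It assumes Gaussian kernels, a biased (V-statistic) MMD estimate, permutation p-values, and a plain Bonferroni correction across bandwidths, which is a cruder stand-in for the level correction used in the paper. All names and parameters here (`gaussian_gram`, `mmd2_biased`, `bonferroni_mmd_test`, `n_perm`, `seed`) are hypothetical and chosen for illustration only.

```python
import numpy as np


def gaussian_gram(Z, bandwidth):
    """Gaussian kernel Gram matrix of the pooled sample Z (rows are points)."""
    sq = np.sum(Z**2, axis=1)
    # Squared Euclidean distances, clipped at 0 to absorb rounding error.
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * Z @ Z.T, 0.0)
    return np.exp(-d2 / (2.0 * bandwidth**2))


def mmd2_biased(K, is_x):
    """Biased (V-statistic) estimate of MMD^2 from a pooled Gram matrix.

    is_x is a boolean mask marking which pooled points belong to the first sample.
    """
    w = np.where(is_x, 1.0 / is_x.sum(), -1.0 / (~is_x).sum())
    # w @ K @ w = mean(K_xx) + mean(K_yy) - 2 * mean(K_xy)
    return float(w @ K @ w)


def bonferroni_mmd_test(X, Y, bandwidths, alpha=0.05, n_perm=200, seed=0):
    """Reject H0 (same distribution) if any single-bandwidth permutation test
    rejects at level alpha / len(bandwidths).

    Assumption: Bonferroni replaces the paper's aggregation correction.
    """
    rng = np.random.default_rng(seed)
    Z = np.vstack([X, Y])
    labels = np.zeros(len(Z), dtype=bool)
    labels[: len(X)] = True
    # The same label permutations are reused for every bandwidth.
    perms = [rng.permutation(labels) for _ in range(n_perm)]
    level = alpha / len(bandwidths)
    for bw in bandwidths:
        K = gaussian_gram(Z, bw)
        observed = mmd2_biased(K, labels)
        null = np.array([mmd2_biased(K, p) for p in perms])
        # Permutation p-value with the usual +1 correction.
        p_value = (1 + np.sum(null >= observed)) / (1 + n_perm)
        if p_value <= level:
            return True
    return False


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X = rng.normal(0.0, 1.0, size=(60, 2))
    Y = rng.normal(2.0, 1.0, size=(60, 2))  # mean-shifted alternative
    print(bonferroni_mmd_test(X, Y, bandwidths=[0.5, 1.0, 2.0]))
```

Because every bandwidth is tested on the full sample, no data needs to be held out for kernel selection. The price in this sketch is the conservative Bonferroni level, which is one reason a sharper correction is of interest.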

https://hal.inria.fr/hal-03408976
Contributor : Antonin Schrab
Submitted on : Wednesday, June 29, 2022 - 6:15:32 PM
Last modification on : Monday, July 4, 2022 - 9:16:03 AM

File

2110.15073.pdf
Files produced by the author(s)

Licence


Distributed under a Creative Commons Attribution 4.0 International License

Identifiers

  • HAL Id : hal-03408976, version 2
  • ARXIV : 2110.15073

Citation

Antonin Schrab, Ilmun Kim, Mélisande Albert, Béatrice Laurent, Benjamin Guedj, et al. MMD Aggregated Two-Sample Test. 2022. ⟨hal-03408976v2⟩
