
Equivalence of distance-based and RKHS-based statistics in hypothesis testing


Speaker

Dino Sejdinovic

Affiliation

UCL

Date

Friday, 26 October 2012

Time

12:30-14:00

Location

Zoom

Link

Cruciform B404 - LT2

Event series

Jump Trading/ELLIS CSML Seminar Series

Abstract

We provide a unifying framework linking two classes of statistics used in two-sample and independence testing: on the one hand, the energy distances and distance covariances from the statistics literature; on the other, Maximum Mean Discrepancies (MMD), i.e., distances between embeddings of distributions into reproducing kernel Hilbert spaces (RKHS), as established in machine learning. When the energy distance is computed with a semimetric of negative type, a positive definite kernel, termed a distance kernel, may be defined such that the MMD corresponds exactly to the energy distance. Conversely, for any positive definite kernel, we can interpret the MMD as an energy distance with respect to some negative-type semimetric. This equivalence readily extends to independence testing using kernels on the product space. We determine the class of probability distributions for which the test statistics are consistent against all alternatives. Finally, we investigate the performance of the family of distance kernels in two-sample and independence tests.
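
The correspondence described in the abstract can be checked numerically on a small sample. The sketch below is illustrative only and is not taken from the talk: it assumes the Euclidean distance as the negative-type semimetric rho, an arbitrarily chosen base point z0, the distance-induced kernel k(x, y) = 0.5 * (rho(x, z0) + rho(y, z0) - rho(x, y)), and biased (V-statistic) estimators. Under that convention the empirical energy distance should equal twice the empirical squared MMD; the factor of 2 comes from the 1/2 in the kernel definition. All function names are hypothetical.

```python
import math
import random


def euclid(x, y):
    """Euclidean distance, used here as the semimetric rho (an assumption)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def distance_kernel(x, y, z0, rho=euclid):
    """Distance-induced kernel centred at the (arbitrary) base point z0."""
    return 0.5 * (rho(x, z0) + rho(y, z0) - rho(x, y))


def mean_pairwise(f, xs, ys):
    """Average of f over all pairs (x, y), including i == j (V-statistic)."""
    return sum(f(x, y) for x in xs for y in ys) / (len(xs) * len(ys))


def energy_distance(xs, ys, rho=euclid):
    """Biased estimate of 2 E rho(X, Y) - E rho(X, X') - E rho(Y, Y')."""
    return (2.0 * mean_pairwise(rho, xs, ys)
            - mean_pairwise(rho, xs, xs)
            - mean_pairwise(rho, ys, ys))


def mmd_squared(xs, ys, kernel):
    """Biased estimate of E k(X, X') + E k(Y, Y') - 2 E k(X, Y)."""
    return (mean_pairwise(kernel, xs, xs)
            + mean_pairwise(kernel, ys, ys)
            - 2.0 * mean_pairwise(kernel, xs, ys))


# Two small 2-D samples whose means differ in the first coordinate.
rng = random.Random(0)
xs = [(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) for _ in range(50)]
ys = [(rng.gauss(0.5, 1.0), rng.gauss(0.0, 1.0)) for _ in range(60)]

z0 = (0.0, 0.0)  # base point; the identity holds for any choice


def k(x, y):
    return distance_kernel(x, y, z0)


energy = energy_distance(xs, ys)
mmd2 = mmd_squared(xs, ys, k)

# The two printed values should agree up to floating-point error.
print(energy, 2.0 * mmd2)
```

Changing z0 alters the individual kernel values but not the MMD, because the terms involving z0 cancel when the three pairwise averages are combined.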

Slides for the talk: PDF

Biography