A multivariate extension of Azadkia-Chatterjee’s rank coefficient
Prof. Yuhao Wang
Assistant Professor
Institute for Interdisciplinary Information Sciences
Tsinghua University
The Azadkia-Chatterjee coefficient is a rank-based measure of dependence between a random variable and a random vector. In this paper, we extend it to a measure of dependence between two random vectors Y and Z, estimated from n i.i.d. samples. The proposed coefficient converges almost surely to a limit with the following properties: i) it lies in [0, 1]; ii) it equals zero if and only if Y and Z are independent; and iii) it equals one if and only if Y is almost surely a function of Z. Remarkably, the only assumption required for this convergence is that Y is not almost surely a constant vector. We further prove that, under the same mild condition and after proper scaling, the coefficient converges in distribution to a standard normal random variable under the null hypothesis of independence. This asymptotic normality result allows us to construct a Wald-type test of independence based on the coefficient. To compute the coefficient, we propose a merge-sort-based algorithm that runs in O(n (\log n)^{\dim(Y)}) time. Finally, we show that the coefficient can be used to measure the dependence between Y and Z conditional on a third random vector X, and prove that the measure is monotonic with respect to the deviation from an independence distribution under certain model restrictions.
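For readers unfamiliar with the starting point, the following is a minimal sketch of the original (unconditional) Azadkia-Chatterjee coefficient for a scalar Y and a vector Z, not of the multivariate extension proposed in the talk. The function name `azadkia_chatterjee`, the brute-force nearest-neighbour search, and the deterministic tie-breaking are illustrative assumptions; the sketch costs O(n^2) time and is unrelated to the merge-sort-based algorithm mentioned above.

```python
import random


def azadkia_chatterjee(y, z):
    """Sketch of the original coefficient T_n(Y, Z): scalar y, vector-valued z.

    Hypothetical helper for illustration only; it is NOT the multivariate
    extension described in the abstract.
    """
    n = len(y)
    # R_i = #{j : y_j <= y_i} and L_i = #{j : y_j >= y_i}
    r = [sum(yj <= yi for yj in y) for yi in y]
    l = [sum(yj >= yi for yj in y) for yi in y]

    def sq_dist(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b))

    # M(i): index of the nearest neighbour of z_i among the other points.
    # Assumption: ties go to the lowest index (the original definition
    # breaks ties at random).
    m = [
        min((j for j in range(n) if j != i), key=lambda j: sq_dist(z[i], z[j]))
        for i in range(n)
    ]

    num = sum(n * min(r[i], r[m[i]]) - l[i] ** 2 for i in range(n))
    den = sum(l[i] * (n - l[i]) for i in range(n))
    return num / den


# Usage: Y a deterministic function of Z versus Y independent of Z.
random.seed(0)
z = [(random.random(), random.random()) for _ in range(200)]
y_dependent = [a + b for a, b in z]
y_independent = [random.random() for _ in range(200)]

t_dependent = azadkia_chatterjee(y_dependent, z)      # expected to be near 1
t_independent = azadkia_chatterjee(y_independent, z)  # expected to be near 0
```

In finite samples the estimate only approaches the limiting values 1 and 0; it does not attain them exactly.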
Yuhao Wang is an assistant professor in the Institute for Interdisciplinary Information Sciences (IIIS), Tsinghua University. Before joining Tsinghua, Yuhao was a postdoctoral research associate at the University of Cambridge. Yuhao received his Ph.D. from the Massachusetts Institute of Technology and his bachelor's degree from Tsinghua University. Yuhao's main research interests include causal inference and distribution-free testing. Yuhao received the Forbes China 30 Under 30 award in 2021 and currently serves as an Associate Editor for the Electronic Journal of Statistics.
















