Abstract
This work is motivated by the study of local protein structure, which is defined by two variable dihedral angles that take values from probability distributions on the flat torus. Our goal is to provide the space $\mathcal{P}(\mathbb{R}^2/\mathbb{Z}^2)$ with a metric that quantifies local structural modifications due to changes in the protein sequence, and to define associated two-sample goodness-of-fit testing approaches. Due to its adaptability to the space geometry, we focus on the Wasserstein distance as a metric between distributions.
We extend existing results of the theory of Optimal Transport to the $d$-dimensional flat torus $\mathbb{T}^d=\mathbb{R}^d/\mathbb{Z}^d$, in particular a Central Limit Theorem. Moreover, we assess different techniques for two-sample goodness-of-fit testing for the two-dimensional case, based on the Wasserstein distance. We provide an implementation of these approaches in \textsf{R}. Their performance is illustrated by numerical experiments on synthetic data and protein structure data.
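As a rough illustration of the central quantity, the following minimal Python sketch computes the 2-Wasserstein distance between two equal-size empirical measures on $\mathbb{T}^d$, using the wrapped (geodesic) distance as ground cost. It is not the authors' \textsf{R} implementation: the function names `torus_sq_dist` and `wasserstein2_torus` are hypothetical, coordinates are assumed to be rescaled to $[0,1)$ (for instance, a dihedral angle in degrees mapped by `(angle + 180) / 360`), and the optimal coupling is found by brute force over permutations, which is only feasible for very small samples.

```python
import itertools
import math


def torus_sq_dist(x, y):
    """Squared geodesic distance on the flat torus R^d/Z^d.

    Assumption: each coordinate is a real number read modulo 1.
    """
    total = 0.0
    for a, b in zip(x, y):
        diff = abs(a - b) % 1.0
        wrapped = min(diff, 1.0 - diff)  # shortest way around each circle
        total += wrapped ** 2
    return total


def wasserstein2_torus(sample_x, sample_y):
    """2-Wasserstein distance between two equal-size empirical measures.

    With uniform weights and equal sample sizes, an optimal coupling can be
    taken to be a permutation, so we minimise over all permutations.
    Brute force: O(n!) cost, intended for illustration only.
    """
    n = len(sample_x)
    if n == 0 or n != len(sample_y):
        raise ValueError("samples must be non-empty and of equal size")
    best_cost = min(
        sum(torus_sq_dist(sample_x[i], sample_y[perm[i]]) for i in range(n))
        for perm in itertools.permutations(range(n))
    )
    return math.sqrt(best_cost / n)


if __name__ == "__main__":
    # Two points close to each other across the periodic boundary.
    print(wasserstein2_torus([(0.05, 0.5)], [(0.95, 0.5)]))
```

The example in the `__main__` block shows why the torus geometry matters: the two points are 0.9 apart in the Euclidean sense but only about 0.1 apart on the torus. For realistic sample sizes, the permutation search would be replaced by a dedicated assignment or optimal transport solver.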
Origin: Files produced by the author(s)
Contributor: Javier González-Delgado
https://hal.science/hal-03369795
Submitted on: Thursday, October 7, 2021, 15:38:40
Last modified on: Monday, November 20, 2023, 11:44:19
Long-term archiving on: Saturday, January 8, 2022, 19:20:13
Dates and versions
- HAL Id: hal-03369795, version 1
Cite
Javier González-Delgado, Alberto González-Sanz, Juan Cortés, Pierre Neuvial. Two-sample goodness-of-fit tests on the flat torus based on Wasserstein distance and their relevance to structural biology. 2021. ⟨hal-03369795v1⟩