Using Pairwise Link Prediction and Graph Attention Networks for Music Structure Analysis - Archive ouverte HAL

Conference paper, year: 2024
Using Pairwise Link Prediction and Graph Attention Networks for Music Structure Analysis
1 S2A - Signal, Statistique et Apprentissage (Télécom Paris 19 Place Marguerite Perey 91120 Palaiseau - France)
2 IDS - Département Images, Données, Signal (46, rue Barrault 75013 Paris; 15 Place Marguerite Perey 91120 Palaiseau (since Oct. 2019) - France)
3 NYU, MARL - New York University, Music and Audio Research Laboratory (United States)
4 CDS - Center for Data Science [NYU] (New York University, CDS, 7th floor, 60 5th Ave, New York, NY, 1001 - United States)

Abstract

The task of music structure analysis has mostly been addressed as a sequential problem, by relying on the internal homogeneity of musical sections or their repetitions. In this work, we instead regard it as a pairwise link prediction task. If, for any pair of time instants in a track, one can successfully predict whether they belong to the same structural entity or not, then the underlying structure can be easily recovered. Building upon this assumption, we propose a method that first learns to classify pairwise links between time frames as belonging to the same section (or segment) or not. The resulting link features, along with node-specific information, are combined through a graph attention network. The latter is regularized with a graph partitioning training objective and outputs boundary locations between musical segments and section labels. The overall system is lightweight and performs competitively with previous methods. The evaluation is done on two standard datasets for music structure analysis, and an ablation study is conducted to gain insight into the role played by the system's different components.
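The premise stated in the abstract, that correct pairwise "same section" predictions are enough to recover the structure, can be illustrated with a toy example. The sketch below is not the authors' method: it assumes a perfect, noise-free link matrix and groups frames by plain connected components, whereas the paper combines learned link features through a graph attention network trained with a graph partitioning objective. The function names `labels_from_links` and `boundaries_from_labels` are hypothetical and introduced only for illustration.

```python
def labels_from_links(links):
    """Assign a section label to each frame from a symmetric boolean link matrix.

    links[i][j] is True when frames i and j are predicted to belong to the
    same section. Labels are the connected components of that graph
    (an illustrative assumption, not the paper's procedure).
    """
    n = len(links)
    labels = [-1] * n
    next_label = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        # Flood-fill every frame reachable from `start` through positive links.
        labels[start] = next_label
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(n):
                if links[i][j] and labels[j] == -1:
                    labels[j] = next_label
                    stack.append(j)
        next_label += 1
    return labels


def boundaries_from_labels(labels):
    """Return the frame indices where the label differs from the previous frame."""
    return [t for t in range(1, len(labels)) if labels[t] != labels[t - 1]]


# Toy ground truth: an A-B-A form over 8 frames.
truth = ["A", "A", "A", "B", "B", "A", "A", "A"]
# Perfect pairwise predictions: two frames are linked iff they share a section.
links = [[a == b for b in truth] for a in truth]

labels = labels_from_links(links)
boundaries = boundaries_from_labels(labels)
print(labels)      # [0, 0, 0, 1, 1, 0, 0, 0]
print(boundaries)  # [3, 5]
```

With noisy predictions, a single spurious link would merge two sections under this simple grouping, which is one reason a learned, regularized partitioning stage is a more robust choice than connected components.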
Main file
ISMIR_24_camera_ready.pdf (704.7 KB)
Origin: Publisher files allowed on an open archive

Dates and versions

hal-04665063, version 1 (01-08-2024)
hal-04665063, version 2 (25-10-2024)

Identifiers
  • HAL Id: hal-04665063, version 2

Cite

Morgan Buisson, Brian McFee, Slim Essid. Using Pairwise Link Prediction and Graph Attention Networks for Music Structure Analysis. 25th International Society for Music Information Retrieval (ISMIR) (2024), Nov 2024, San Francisco (CA), United States. ⟨hal-04665063v2⟩
