UGOSA: user-guided one-shot deep model adaptation for music source separation

Presentation by Giorgia Cantisani at WASPAA 2021 – IEEE Workshop on Applications of Signal Processing to Audio and Acoustics.

Abstract:
Music source separation is the task of isolating the individual instruments that are mixed in a musical piece. The task is particularly challenging, and even state-of-the-art models generalize poorly to unseen test data.
Nevertheless, prior knowledge about individual sources can be used to better adapt a generic source separation model to the observed signal.
In this work, we propose to exploit a temporal segmentation provided by the user, which indicates when each instrument is active, to fine-tune a pre-trained deep source separation model and adapt it to one specific mixture. This paradigm can be referred to as user-guided one-shot deep model adaptation for music source separation, since the adaptation acts on the target song instance only. Our results are promising and show that state-of-the-art source separation models have considerable room for improvement, especially for instruments that are underrepresented in the training data.
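
To make the idea of a user-provided temporal segmentation more concrete, here is a minimal Python sketch of how such annotations might be turned into a training signal. It is an illustration under stated assumptions, not the method implemented in the UGOSA code: the function names, the frame duration, and the "energy in frames marked inactive" penalty are all hypothetical. The sketch converts the time segments in which the user marks an instrument as active into a per-frame mask, then measures how much energy a separated estimate still has in frames where the instrument should be silent. A fine-tuning loop could try to minimize a quantity of this kind on the target mixture.

```python
import math

# Hypothetical helpers for illustration only; none of these names come
# from the UGOSA repository.

def segments_to_frame_mask(segments, n_frames, frame_dur):
    """Convert user-marked (start_s, end_s) activity segments into a 0/1 mask per frame."""
    mask = [0] * n_frames
    for start, end in segments:
        first = max(0, int(start / frame_dur))          # first frame touched by the segment
        last = min(n_frames, math.ceil(end / frame_dur))  # one past the last touched frame
        for t in range(first, last):
            mask[t] = 1
    return mask

def inactive_energy(frame_energies, mask):
    """Mean energy of a source estimate over frames the user marked as inactive.

    Assumption: a well-adapted model should output (near) silence there,
    so a lower value is better.
    """
    inactive = [e for e, m in zip(frame_energies, mask) if m == 0]
    return sum(inactive) / len(inactive) if inactive else 0.0

# Toy example: a 4-second excerpt cut into 0.5 s frames, with the
# instrument marked as active between 1.0 s and 2.0 s.
mask = segments_to_frame_mask([(1.0, 2.0)], n_frames=8, frame_dur=0.5)
energies = [0.5, 0.25, 1.0, 1.0, 0.25, 0.0, 0.0, 0.5]  # made-up per-frame energies
penalty = inactive_energy(energies, mask)
print(mask, penalty)
```

In this toy example only frames 2 and 3 are marked active, so the penalty averages the estimate's energy over the six remaining frames. The energies are made-up numbers, not results from the paper.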

Paper + slides: https://hal.telecom-paris.fr/hal-03219350
Code: https://github.com/giorgiacantisani/ugosa
Demo: https://adasp.telecom-paris.fr/resources/2021-06-01-ugosa-paper/

Authors: Giorgia Cantisani, Alexey Ozerov, Slim Essid, Gaël Richard

Acknowledgement:
This work has received funding from the European Union’s Horizon 2020 research and innovation program under the Marie Skłodowska-Curie grant agreement No. 765068 (MIP-frontiers) and the European Union’s Horizon 2020 research and innovation program under the grant agreement No. 951911 (AI4Media).

Music source separation, user-guided, one-shot adaptation
