Domain-Invariant Representation Learning of Bird Sounds - Better Representations for Artificial Intelligence Team
Preprint, Working Paper. Year: 2024

Abstract

Passive acoustic monitoring (PAM) is crucial for bioacoustic research, enabling non-invasive species tracking and biodiversity monitoring. Citizen science platforms like Xeno-Canto provide large annotated datasets from focal recordings, where the target species is intentionally recorded. However, PAM requires monitoring in passive soundscapes, creating a domain shift between focal and passive recordings that challenges deep learning models trained on focal recordings. To address this, we leverage supervised contrastive learning to improve domain generalization in bird sound classification, enforcing domain invariance across same-class examples from different domains. We also propose ProtoCLR (Prototypical Contrastive Learning of Representations), which reduces the computational complexity of the supervised contrastive (SupCon) loss by comparing each example to class prototypes rather than to every other example pairwise. Additionally, we present a new few-shot classification benchmark based on BirdSet, a large-scale bird sound dataset, and demonstrate the effectiveness of our approach in achieving strong transfer performance.
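The prototype idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the record does not give the exact ProtoCLR formulation, so the function names, the use of per-batch class means as prototypes, the cosine similarity, and the temperature value are all assumptions made for illustration. The sketch only shows why comparing N examples to C class prototypes needs N x C similarities, whereas a pairwise loss over the same batch needs on the order of N x N.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length (guarding against a zero norm)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / max(norm, 1e-12) for x in v]


def class_prototypes(embeddings, labels):
    """Assumed prototype definition: normalized mean of each class's
    normalized embeddings within the batch."""
    sums, counts = {}, {}
    for z, y in zip(embeddings, labels):
        z = l2_normalize(z)
        if y not in sums:
            sums[y] = [0.0] * len(z)
            counts[y] = 0
        sums[y] = [s + x for s, x in zip(sums[y], z)]
        counts[y] += 1
    return {y: l2_normalize([s / counts[y] for s in sums[y]]) for y in sums}


def prototype_contrastive_loss(embeddings, labels, temperature=0.1):
    """Cross-entropy over example-to-prototype cosine similarities.

    Each example is pulled toward its own class prototype and pushed
    away from the other prototypes: N x C comparisons in total.
    The temperature of 0.1 is an illustrative default, not a value
    taken from the paper.
    """
    protos = class_prototypes(embeddings, labels)
    classes = list(protos)
    total = 0.0
    for z, y in zip(embeddings, labels):
        z = l2_normalize(z)
        logits = [
            sum(a * b for a, b in zip(z, protos[c])) / temperature
            for c in classes
        ]
        # Numerically stable log-sum-exp over all prototypes.
        m = max(logits)
        log_norm = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_norm - logits[classes.index(y)]
    return total / len(embeddings)


if __name__ == "__main__":
    # Toy 2-D embeddings: two tight clusters, one per class.
    batch = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
    print(prototype_contrastive_loss(batch, [0, 0, 1, 1]))
```

In this toy setting the loss is close to zero when the labels match the two clusters and rises when the labels are mixed across clusters, which is the behavior one would expect from a loss that rewards same-class examples for sharing a representation regardless of where they were recorded.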

Domains

Sound [cs.SD]
Main file
BIRT___icassp25.pdf (213.74 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-04696391, version 1 (13-09-2024)

Identifiers

  • HAL Id: hal-04696391, version 1

Cite

Ilyass Moummad, Romain Serizel, Emmanouil Benetos, Nicolas Farrugia. Domain-Invariant Representation Learning of Bird Sounds. 2024. ⟨hal-04696391⟩