
Few-shot bioacoustic event detection using a prototypical network ensemble with adaptive embedding functions

A log Mel spectrogram of part of a sound recording (top) and examples of predictions (bottom) from an ensemble prototypical network (solid blue line) and a prototypical network (dashed blue line) as well as the given few-shot examples (purple line) and remaining ground truth events (green line). The decision threshold τ is 0.5 (red line).

In this report we present our method for the DCASE 2022 challenge on few-shot bioacoustic event detection. We use an ensemble of prototypical neural networks with adaptive embedding functions, and show that both the ensemble and the adaptive embedding functions can be used to improve results, from an average F-score of 41.3% to 60.0% on the validation dataset.
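To make the prediction step concrete, here is a minimal sketch of how an ensemble of prototypical networks could turn embeddings into event decisions with the threshold τ = 0.5 from the figure. It is an illustration under stated assumptions, not the implementation from the report: the function names are hypothetical, the embedding networks are left out (plain arrays stand in for their outputs), and both the squared Euclidean distance and the simple averaging of member predictions are assumed choices.

```python
import numpy as np

def prototype(support_embeddings):
    """Class prototype: the mean of that class's support embeddings."""
    return np.mean(support_embeddings, axis=0)

def positive_probability(queries, pos_proto, neg_proto):
    """Probability of the event class for each query embedding.

    Assumption: a softmax over negative squared Euclidean distances to
    the two prototypes, which for two classes reduces to a sigmoid.
    """
    d_pos = np.sum((queries - pos_proto) ** 2, axis=1)
    d_neg = np.sum((queries - neg_proto) ** 2, axis=1)
    return 1.0 / (1.0 + np.exp(d_pos - d_neg))

def ensemble_predict(member_probabilities, tau=0.5):
    """Average the members' probabilities (assumed rule), then threshold."""
    mean_prob = np.mean(member_probabilities, axis=0)
    return mean_prob, mean_prob > tau

# Toy 2-D embeddings standing in for the output of an embedding function.
pos_proto = prototype(np.array([[1.0, 0.0], [1.0, 0.0]]))  # few-shot events
neg_proto = prototype(np.array([[0.0, 1.0]]))              # background
queries = np.array([[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]])

# Two identical "members" here; in practice each would use its own embedding.
member = positive_probability(queries, pos_proto, neg_proto)
mean_prob, is_event = ensemble_predict(np.stack([member, member]))
print(mean_prob, is_event)
```

A query that is equally far from both prototypes gets probability exactly 0.5 and is not marked as an event, because the comparison against τ is strict in this sketch.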

John Martinsson, Martin Willbo, Aleksis Pirinen, Olof Mogren, Maria Sandsten

Detection and Classification of Acoustic Scenes and Events

Olof Mogren, PhD, RISE Research Institutes of Sweden.