Real-Time Open-Set Sound Event Recognition

Created by: Johannes Traa, 2023






Welcome

What is this?

This website is the result of a learning project to build a full-stack, open-set sound recognition (OSSR) system. "Open-set" means that the system can distinguish between human-annotated classes of sounds while also recognizing when it encounters a sound it has not heard before. This is an important property for any recognition system that is deployed "in the wild", where unfamiliar sounds are inevitable.
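To make the open-set idea concrete, here is a minimal sketch in Python. It is not this system's actual model. It assumes each sound has already been reduced to a small feature vector, that each annotated class is summarized by a single centroid, and that a fixed distance threshold separates "known" from "unknown". The function name, labels, centroids, and threshold are all hypothetical.

```python
import math

def classify_open_set(embedding, centroids, max_distance):
    """Return the nearest known label, or "unknown" if no class is close enough.

    embedding: feature vector for one sound (hypothetical representation)
    centroids: mapping of class label -> centroid vector (assumed, one per class)
    max_distance: rejection threshold (assumed; would be tuned on held-out data)
    """
    best_label, best_dist = None, math.inf
    for label, centroid in centroids.items():
        dist = math.dist(embedding, centroid)  # Euclidean distance
        if dist < best_dist:
            best_label, best_dist = label, dist
    # Open-set step: reject the match when even the nearest class is too far away.
    if best_dist > max_distance:
        return "unknown"
    return best_label

# Illustrative values only; real features would have many more dimensions.
centroids = {"dog_bark": (0.9, 0.1), "door_slam": (0.1, 0.8)}
print(classify_open_set((0.85, 0.15), centroids, max_distance=0.3))  # dog_bark
print(classify_open_set((0.5, 3.0), centroids, max_distance=0.3))    # unknown
```

A closed-set classifier would always return one of the annotated labels. The rejection step is what lets the sketch report that a sound does not belong to any class it knows.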

The system consists of various components: audio data collection on an edge device, search and annotation of sounds, training and testing of models, and streaming detection of events.
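As a rough illustration of the last component, streaming detection, the sketch below turns a stream of per-frame labels into discrete events. It is an assumption about how such a step could look, not a description of this system's detector. The function name, the "silence" background label, and the minimum run length are hypothetical.

```python
from itertools import groupby

def frames_to_events(frame_labels, min_frames=2, background="silence"):
    """Yield (label, first_frame, last_frame) for each sufficiently long run.

    frame_labels: iterable of per-frame labels; may be a live stream, since
                  groupby consumes it lazily
    min_frames:   runs shorter than this are dropped as spurious (assumed value)
    background:   label that never produces an event (assumed name)
    """
    index = 0
    for label, run in groupby(frame_labels):
        length = sum(1 for _ in run)
        if label != background and length >= min_frames:
            yield (label, index, index + length - 1)
        index += length

labels = ["silence", "dog_bark", "dog_bark", "dog_bark",
          "silence", "unknown", "door_slam", "door_slam"]
print(list(frames_to_events(labels)))
# [('dog_bark', 1, 3), ('door_slam', 6, 7)]
```

Because the function is a generator, an event is emitted as soon as its run ends, that is, when the first frame with a different label arrives. The single "unknown" frame is dropped here only because it is shorter than the minimum run length.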

How does it work?

The web application you are currently interacting with is served by a Node.js app running on an Ubuntu server that is reachable over the internet. The same server hosts a data lake consisting of a PostgreSQL database, a feature store, and a model store. A separate Flask app controls an edge device (a Raspberry Pi) that is set up to capture audio in real time. All offline and real-time functionality of the system is controlled through the pages of this web app.

What does the name mean?

This system was originally set up to capture data in a domestic setting. As a result, the most easily recognizable event in the database turned out to be dog barks. Translating this into Doggolingo (the internet slang for describing dog behavior), we get "buddybork". Despite the name, the system implements general-purpose sound recognition and is not limited to dog barks.