Data Simulation for AI

Developing AI models for audio tasks often relies on the availability of strongly labeled audio data. Recording and annotating such data manually is time-consuming and expensive, and as a result existing datasets for specific tasks are scarce and limited in size. To alleviate this problem, machine learning practitioners rely on data simulation to “augment” the amount of data available. The augmentation process consists of applying transformations to the original data to simulate a recording in another environment and obtain a new set of recordings.
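As one concrete illustration of such a transformation, the sketch below mixes a noise recording into a clean signal at a chosen signal-to-noise ratio, simulating the same recording captured in a noisier environment. This is a minimal example of my own; the function name and parameters are illustrative and not taken from any of the talks:

```python
import numpy as np

def mix_at_snr(clean, noise, snr_db, rng=None):
    """Add a random segment of `noise` to `clean` at the requested SNR (in dB)."""
    if rng is None:
        rng = np.random.default_rng()
    # Pick a random noise segment the same length as the clean signal.
    start = rng.integers(0, len(noise) - len(clean) + 1)
    segment = noise[start:start + len(clean)]
    clean_power = np.mean(clean ** 2)
    noise_power = np.mean(segment ** 2)
    # Gain that scales the noise so the mixture reaches the target SNR.
    gain = np.sqrt(clean_power / (noise_power * 10 ** (snr_db / 10)))
    return clean + gain * segment
```

Sweeping `snr_db` over a range of values turns one clean recording into many training examples at different noise levels.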

Form: webinar

Date: 7 June 2022

Time: 15.00 – 16.30

Place: Online via Zoom

Price: free

Language: English

If you have questions about signing up, please contact Murielle De Smedt.

Participant profile:

  • DSP engineers
  • ML engineers
  • Acoustic engineers
  • Researchers in AI/ML
You will meet:



Simulating audio data for AI training – Jabra Research

I will motivate and introduce the considerations in play when simulating training data for AI audio processing schemes in communication devices. Based on a few product use cases, I will show how acoustical models can be used to generate useful training data. Finally, I will show a few examples of AI algorithms trained on generated data.  

Rasmus Kongsgaard Olsson has worked in Jabra’s (GN Audio) Research Department for the last 14 years, contributing to research and product development within machine learning/AI, signal processing, and related acoustics. Before that, he completed his PhD degree (2007) at the Technical University of Denmark, specializing in machine learning for audio applications.

Sound Event Detection with Synthetic Soundscapes

As humans, we constantly rely on the sounds around us to get information about our environment (birds singing, a car passing, the constant hum from a nearby highway…) and to get feedback about our actions (the noise of a door closing, the beeps from an ATM keyboard…). Ambient sound analysis aims at designing algorithms that can analyze and interpret sounds as a human would. In particular, in sound event detection the goal is to detect not only the class of the sound events but also their time boundaries. Ideally this would rely on strongly labeled audio clips (with onset and offset timestamps) at training time, but these are time-consuming to obtain and prone to annotation errors. One alternative is to use weakly labeled audio clips (which are cheap to annotate but do not provide timestamps), synthetically generated soundscapes with strong labels (which are cheap to obtain but can introduce a domain mismatch), or a combination of both. Since 2018, we have been organizing a task on sound event detection with systems trained on a heterogeneous dataset composed of both recorded and synthetic soundscapes with varying levels of annotation. During this talk I will discuss the data generation process and the added value of using synthetic soundscapes both at training time and at test time.
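The appeal of synthetic soundscapes is that strong labels come for free from the generation process itself. The toy generator below (a hypothetical sketch of my own, in the spirit of soundscape-synthesis tools, not the actual DCASE pipeline) mixes event clips into a background at random onsets and records the exact onset/offset timestamps as strong labels:

```python
import numpy as np

SR = 16000  # assumed sample rate

def make_soundscape(background, events, rng):
    """Mix (class, clip) pairs into a background at random onsets.

    Returns the mixture and a list of strong labels
    (class, onset in seconds, offset in seconds).
    """
    mixture = background.copy()
    labels = []
    for cls, clip in events:
        onset = int(rng.integers(0, len(background) - len(clip) + 1))
        mixture[onset:onset + len(clip)] += clip
        # The strong label is known exactly, by construction.
        labels.append((cls, onset / SR, (onset + len(clip)) / SR))
    return mixture, labels
```

Because the labels are exact by construction, the only remaining concern is the domain mismatch between such mixtures and real recordings, which is precisely what the heterogeneous training set is meant to address.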

Romain Serizel, Université de Lorraine

Romain Serizel received his Ph.D. degree in Engineering Sciences from KU Leuven (Leuven, Belgium) in June 2011, working on multichannel noise reduction algorithms targeting applications for hearing-impaired persons. He was then a postdoctoral researcher at KU Leuven (2011–2012), at FBK (Trento, Italy, 2013–2014) and at Télécom ParisTech (Paris, France, 2014–2016), where he worked on machine learning applied to speech processing and sound analysis. He is now an Associate Professor with Université de Lorraine (Nancy, France) doing research on machine listening and robust speech communications. He has co-authored 14 journal papers, about 40 articles in peer-reviewed conferences and 2 book chapters. Since 2018, he has been the coordinator of DCASE challenge task 4 on “Sound event detection in domestic environments”. Since 2019, he has been coordinating the DCASE challenge series together with Annamaria Mesaros.

Simulation of Room Acoustics

I intend to talk about how room acoustics can be simulated in two particular cases. First, I will focus on the extremely simplified case of an ideal shoebox room with uniform absorption properties across the walls, which can be simulated using the image method or the scattered delay network. Second, I will elaborate on the more interesting case of simulating the acoustic impulse responses of a real room in which only a few impulse response measurements have been made. I will further explain how both cases can be elegantly cast into a common room acoustics modeling framework using the so-called method of fundamental solutions.
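For the ideal shoebox case, the image method can be sketched in a few dozen lines of numpy. The example below is my own simplified illustration (not the speaker's code), assuming a uniform, frequency-independent reflection coefficient `beta` on all six walls and an omnidirectional source and microphone: image sources are enumerated per axis, and each contributes an attenuated, delayed impulse to the room impulse response.

```python
import numpy as np
from itertools import product

def shoebox_rir(room, src, mic, fs=16000, c=343.0, beta=0.9, order=4, length=4096):
    """Image-method RIR for a shoebox room with a uniform wall reflection coefficient.

    room, src, mic: (x, y, z) dimensions and positions in meters.
    beta: reflection coefficient applied once per wall bounce.
    order: number of image repetitions kept along each axis.
    """
    h = np.zeros(length)
    axes = []
    for d in range(3):
        imgs = []
        for n in range(-order, order + 1):
            # Mirror images across the wall pair on this axis, with their
            # reflection counts (Allen & Berkley style enumeration).
            imgs.append((2 * n * room[d] + src[d], abs(2 * n)))
            imgs.append((2 * n * room[d] - src[d], abs(2 * n - 1)))
        axes.append(imgs)
    for (x, rx), (y, ry), (z, rz) in product(*axes):
        dist = np.sqrt((x - mic[0]) ** 2 + (y - mic[1]) ** 2 + (z - mic[2]) ** 2)
        k = int(round(dist / c * fs))  # nearest-sample arrival time
        if k < length:
            # Spherical spreading loss times one beta per wall reflection.
            h[k] += beta ** (rx + ry + rz) / (4 * np.pi * max(dist, 1e-3))
    return h
```

Convolving a dry signal with such an RIR is itself a data-augmentation transformation in the sense of the introduction above; production simulators additionally use fractional delays and per-wall, frequency-dependent absorption.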

Toon van Waterschoot, KU Leuven

Toon van Waterschoot received MSc (2001) and PhD (2009) degrees in Electrical Engineering, both from KU Leuven, Belgium, where he is currently an Associate Professor, Head of the STADIUS Center for Dynamical Systems, Signal Processing, and Data Analytics, and Consolidator Grantee of the European Research Council (ERC). He has previously also held teaching and research positions at Delft University of Technology in The Netherlands and the University of Lugano in Switzerland. His research interests are in signal processing, machine learning, and numerical optimization, applied to acoustic signal enhancement, acoustic modeling, audio analysis, and audio reproduction.

When you participate in this event, your time will be used as co-financing for the Innovation Power project, which is funded by the Danish Business Promotion Board and the Danish Agency for Education and Research at a standard rate. Read more about Innovationskraft here.

Danish Sound Cluster
