The CoRal Project - Danish Speech Dataset

Speech technology is a growing industry, with an estimated compound annual growth rate of 16-19% over the next five years. However, most speech technology development targets high-resource languages such as English. Danish, a small, low-resource language, risks falling behind and missing out on the revenue this technology generates. Modern speech technology is based on machine learning algorithms, which require large amounts of data, and the lack of Danish data is the main obstacle to moving the industry forward.

Date: 7 December 2023

Time: 15.00-16.30

Format: Online (Zoom)

Price: Free

Language: English

If you have questions about signing up, please contact Murielle De Smedt: mds@danishsoundcluster.dk

You will meet:

Participant profile

  • Engineers
  • Audio engineers
  • AI & ML specialists, anthropologists
  • Speech technology specialists
  • Healthcare technology experts

Program

During this webinar, Dan Saattrup Nielsen will outline the purpose of the Danish Conversational and Read-aloud Speech Dataset and the main problems it solves. He will also give insight into the data collection process and some of the challenges that arise when working with language and different dialects.

Martin Carsten Nielsen will introduce the machine learning side of the project, explain how the models are trained, and share some of the results achieved. Finally, Lars Maaløe will share his perspective on the use of the Danish speech dataset within healthcare and possibly other sectors, and offer his prediction of the dataset's impact and what practice will look like in 10 years.

You will meet:

Dan Saattrup Nielsen

Dan Saattrup Nielsen has a PhD in Mathematics and has been working actively with natural language processing since 2019, within academia, startups, governmental institutions, as well as consulting. He is currently working as a Senior AI Specialist at the Alexandra Institute, where he leads the machine learning model development on the CoRal project, which aims to build open-source speech datasets and models for the Danish language.

Martin Carsten Nielsen

Martin is the co-founder of the speech technology company Alvenir and has contributed to several open-source projects with a focus on developing Danish language models. He has a Master’s degree in mathematical modeling and computing from the Technical University of Denmark (DTU), and has, until now, dedicated his career to building and implementing solutions based on machine learning. Martin strives to expand the use of Danish language technology through innovation and collaboration, of which CoRal is a great example.

Lars Maaløe

Co-Founder & CTO at Corti, Adj. Assoc. Professor of Machine Learning

This event was created in collaboration with IDA – The Danish Association of Engineers. The participant list of this event will be shared with IDA for statistical use only. 

Innovationskraft
When you participate in this event, your time will be counted, at a standard rate, as co-financing for the Innovation Power Project (Innovationskraft), which is funded by the Danish Business Promotion Board and the Danish Ministry of Higher Education and Science.

By signing up for this event, you will automatically receive the Danish Sound Cluster newsletter.
