Danish Speech Dataset (CoRal 2.0)

Follow-up and current status of the Danish Conversational and Read-aloud Speech Dataset project

Speech technology is a growing industry, with an estimated compound annual growth rate of 16-19% over the next five years. However, most speech technology development targets high-resource languages such as English. Danish, a small and low-resource language, risks falling behind and missing out on revenue from this technology. Modern speech technology is based on machine learning (ML) algorithms, which require large amounts of data, and the lack of Danish data is the main obstacle to moving the industry forward.

Date: 28 May

Time: 14.00 – 15.30

Format: Online (Zoom)

Price: Free

Language: English

If you have questions about signing up, please contact Murielle De Smedt: mds@danishsoundcluster.dk

Participant profile: 

Engineers, audio engineers, AI & ML specialists, anthropologists, speech technology specialists, healthcare technology experts.

Program

During this webinar, Torben Blach will outline the purpose of the Danish Conversational and Read-aloud Speech Dataset (CoRal) and give an update on data and results, following up on our previous webinar in December 2023.

Martin Carsten Nielsen will introduce the machine learning side of the project, explain how the models are being trained, and share some of the results achieved. Lars Maaløe will share his perspective on the purpose and use of a Danish speech dataset within healthcare and other sectors. Finally, predictions and prospects for the impact on industry and society will be discussed in the panel.

You will meet:

Torben Blach, Senior Project Manager at Alexandra Instituttet 

Torben has over 20 years of experience in the technology field. His expertise spans software development, project management and business development across different domains. Torben’s current primary focus is on the potential of AI, data science and ML. Many of his projects are about supporting the Danish language in generative AI and large language models (LLMs).

As project manager for the Danish project ‘CoRal – Conversation and Read-aloud Dataset’, Torben is leading the collection of speech data and the development of solutions to improve Danish language technology. He also plays a crucial role in the Danish part of ‘TrustLLM’, an EU project that aims to create a reliable and unbiased LLM for Germanic languages. Torben is also a member of the steering committee of the Danish Language Model Consortium, which is currently collecting a comprehensive and compatible language dataset from Danish companies and organizations.

Martin Carsten Nielsen, Founder of Alvenir

Martin is the co-founder of the speech technology company Alvenir and has contributed to several open-source projects with a focus on developing Danish language models. He has a master’s degree in mathematical modeling and computing from the Technical University of Denmark (DTU) and has, until now, dedicated his career to building and implementing solutions based on machine learning. Martin strives to expand the use of Danish language technology through innovation and collaboration, of which CoRal is a great example.

Lars Maaløe, Co-Founder & CTO at Corti

Bio to come…

This webinar is facilitated by Danish Sound Cluster and will be moderated by

Jeppe Lindegaard, Event & Project Manager

Other information

This event was created in collaboration with IDA – The Danish Association of Engineers. The participant list of this event will be shared with IDA for statistical use only. By signing up for this event, you will automatically receive the Danish Sound Cluster Newsletter.

Innovationskraft
When you participate in this event, your time will be counted, at a standard rate, as co-financing for the Innovation Power project (Innovationskraft), which is funded by the Danish Business Promotion Board and the Danish Ministry of Higher Education and Science.
