Abstract
The first Natural Office Talkers in Settings of Far-field Audio Recordings (NOTSOFAR-1) Challenge is a pivotal initiative that sets new benchmarks by offering datasets more representative of the needs of real-world business applications than those previously available. The challenge provides a unique combination of 315 recorded meetings across 30 diverse environments, capturing real-world acoustic conditions and conversational dynamics, and a 1000-hour simulated training dataset, synthesized with enhanced authenticity for real-world generalization, incorporating 15,000 real acoustic transfer functions. In this paper, we provide an overview of the systems submitted to the challenge and analyze the top-performing approaches, hypothesizing the factors behind their success. Additionally, we highlight promising directions left unexplored by participants. By presenting key findings and actionable insights, this work aims to drive further innovation and progress in distant automatic speech recognition (DASR) research and applications.
| Original language | English |
|---|---|
| Article number | 101796 |
| Journal | Computer Speech and Language |
| Volume | 93 |
| State | Published - Aug 2025 |
| Externally published | Yes |
Bibliographical note
Publisher Copyright: © 2025
Keywords
- Multi-channel speech processing
- Speaker diarization
- Speech recognition
- Speech separation
Fingerprint
Dive into the research topics of 'Summary of the NOTSOFAR-1 challenge: Highlights and learnings'. Together they form a unique fingerprint.