Overall, it is going pretty well. There are many very good papers, diarization is being joined with decoding, and everything is moving in the right direction.

RadioTalk: a large-scale corpus of talk radio transcripts Doug Beeferman (MIT Media Lab), William Brannon (MIT Media Lab), Deb Roy (MIT Media Lab). A 284,000-hour dataset.

Automatic lyric transcription from Karaoke vocal tracks: Resources and a Baseline System Gerardo Roa (University of Sheffield), Jon Barker (University of Sheffield)

Speaker Diarization with Lexical Information Tae Jin Park, Kyu J. Han, Jing Huang, Xiaodong He, Bowen Zhou, Panayiotis Georgiou, Shrikanth Narayanan

Full-Sentence Correlation: a Method to Handle Unpredictable Noise for Robust Speech Recognition Ming Ji (Queen’s University Belfast), Danny Crookes (Queen’s University Belfast)

Untranscribed Web Audio for Low Resource Speech Recognition Andrea Carmantini, Peter Bell, Steve Renals

Building Large-Vocabulary ASR Systems for Languages Without Any Audio Training Data Manasa Prasad, Daan van Esch, Sandy Ritchie, Jonas Fromseier Mortensen

How to annotate 100 hours in 45 minutes Per Fallgren (KTH Royal Institute of Technology), Zofia Malisz (KTH, Stockholm), Jens Edlund (KTH Speech, Music and Hearing)

Exploiting semi-supervised training through a dropout regularization in end-to-end speech recognition

High quality, lightweight and adaptable TTS using LPCNet Zvi Kons (IBM Haifa research lab), Slava Shechtman (Speech Technologies, IBM Research AI), Alexander Sorin (IBM Research - Haifa), Carmel Rabinovitz (IBM Research - Haifa), Ron Hoory (IBM Haifa Research Lab)

The synthesized speech quality is very nice.

Attention-Enhanced Connectionist Temporal Classification for Discrete Speech Emotion Recognition Ziping Zhao, Zhongtian Bao, Zixing Zhang, Nicholas Cummins, Haishuai Wang, Björn W. Schuller

Large-Scale Mixed-Bandwidth Deep Neural Network Acoustic Modeling for Automatic Speech Recognition Khoi-Nguyen Mac (University of Illinois at Urbana-Champaign), Xiaodong Cui (IBM T. J. Watson Research Center), Wei Zhang (IBM T. J. Watson Research Center), Michael Picheny (IBM T. J. Watson Research Center)

An Investigation into On-Device Personalization of End-to-End Automatic Speech Recognition Models Khe Chai Sim, Petr Zadrazil, Françoise Beaufays