Selected Papers Interspeech 2019 Wednesday

A Highly Efficient Distributed Deep Learning System for Automatic Speech Recognition Wei Zhang, Xiaodong Cui, Ulrich Finkler, George Saon, Abdullah Kayi, Alper Buyuktosunoglu, Brian Kingsbury, David Kung, Michael Picheny

Cool merge graphs

Detection and Recovery of OOVs for Improved English Broadcast News Captioning Samuel Thomas (IBM Research AI), Kartik Audhkhasi (IBM Research AI), Zoltan Tuske (IBM Research AI), Yinghui Huang (IBM Research AI), Michael Picheny (IBM Research AI)

Nothing new but still important

Disfluencies and Human Speech Transcription Errors Vicky Zayats (University of Washington), Trang Tran (University of Washington), Courtney Mansfield (University of Washington), Richard Wright (University of Washington), Mari Ostendorf (University of Washington)

Robust Sound Recognition: A Neuromorphic Approach Jibin Wu (National University of Singapore), Zihan Pan, Malu Zhang, Rohan Kumar Das, Yansong Chua, Haizhou Li

Spiking neural networks

Neural Named Entity Recognition from Subword Units Abdalghani Abujabal (Max Planck Institute for Informatics), Judith Gaspers (Amazon)

Name recognition is still important

Unsupervised Acoustic Segmentation and Clustering using Siamese Network Embeddings Saurabhchand Bhati (The Johns Hopkins University), Shekhar Nayak (Indian Institute of Technology Hyderabad), Sri Rama Murty Kodukula (IIT Hyderabad), Najim Dehak (Johns Hopkins University)

Acoustic Model Bootstrapping Using Semi-Supervised Learning Langzhou Chen (Amazon Cambridge office), Volker Leutnant (Amazon Aachen office)

Bandwidth Embeddings for Mixed-bandwidth Speech Recognition Gautam Mantena (Apple Inc.), Ozlem Kalinli (Apple Inc.), Ossama Abdel-Hamid (Apple Inc.), Don McAllaster (Apple Inc.)

Towards Debugging Deep Neural Networks by Generating Speech Utterances Bilal Soomro (University of Eastern Finland), Anssi Kanervisto (University of Eastern Finland), Trung Ngo Trong (University of Eastern Finland), Ville Hautamaki (University of Eastern Finland)

Debugging is a very nice idea

A Study for Improving Device-Directed Speech Detection toward Frictionless Human-Machine Interaction Che-Wei Huang (Amazon), Roland Maas, Sri Harish Mallidi (Amazon, USA), Bjorn Hoffmeister

Nice idea; we covered it before

Deep Learning for Orca Call Type Identification — A Fully Unsupervised Approach Christian Bergler, Manuel Schmitt, Rachael Xi Cheng, Andreas Maier, Volker Barth, Elmar Nöth

Kinda cool

The STC ASR System for the VOiCES from a Distance Challenge 2019 Ivan Medennikov (STC-innovations Ltd), Yuri Khokhlov (STC-innovations Ltd), Aleksei Romanenko (ITMO University), Ivan Sorokin (STC), Anton Mitrofanov (STC-innovations Ltd), Vladimir Bataev (Speech Technology Center Ltd), Andrei Andrusenko (STC-innovations Ltd), Tatiana Prisyach (STC-innovations Ltd), Mariya Korenevskaya (STC-innovations Ltd), Oleg Petrov (ITMO University), Alexander Zatvornitskiy (Speech Technology Center)

Kaggle-style system with cool tricks (character-based LM); congrats to STC

Continuous Emotion Recognition in Speech – Do We Need Recurrence? Maximilian Schmitt (ZD.B Chair of Embedded Intelligence for Health Care and Wellbeing, University of Augsburg), Nicholas Cummins (University of Augsburg), Björn Schuller (University of Augsburg / Imperial College London)

Self-supervised speaker embeddings Themos Stafylakis (Omilia - Conversational Intelligence), Johan Rohdin (Brno University of Technology), Oldrich Plchot (Brno University of Technology), Petr Mizera (Czech Technical University in Prague), Lukas Burget (Brno University of Technology)

"Self-supervised" is the word of the year

Better morphology prediction for better speech systems Dravyansh Sharma (Carnegie Mellon University), Melissa Wilson (Google LLC), Antoine Bruguier (Google LLC)

Connecting and Comparing Language Model Interpolation Techniques Ernest Pusateri, Christophe Van Gysel, Rami Botros, Sameer Badaskar, Mirko Hannemann, Youssef Oualil, Ilya Oparin

Worth a reminder

Articulation rate as a metric in spoken language assessment Calbert Graham (University of Cambridge), Francis Nolan (University of Cambridge)