From 93ed6cfa93ffdce04ae0d125aa4d645cc6b9ea77 Mon Sep 17 00:00:00 2001
From: Max Bain <36994049+m-bain@users.noreply.github.com>
Date: Thu, 1 Jun 2023 16:54:16 +0100
Subject: [PATCH] interspeech

---
 README.md | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/README.md b/README.md
index a660d2d..d6c2e3d 100644
--- a/README.md
+++ b/README.md
@@ -54,6 +54,7 @@ This repository provides fast automatic speech recognition (70x realtime with la

New🚨

+- _WhisperX_ accepted at INTERSPEECH 2023
 - v3 transcript segment-per-sentence: using nltk sent_tokenize for better subtitlting & better diarization
 - v3 released, 70x speed-up open-sourced. Using batched whisper with [faster-whisper](https://github.com/guillaumekln/faster-whisper) backend!
 - v2 released, code cleanup, imports whisper library VAD filtering is now turned on by default, as in the paper.
@@ -276,7 +277,7 @@ If you use this in your research, please cite the paper:
 @article{bain2022whisperx,
   title={WhisperX: Time-Accurate Speech Transcription of Long-Form Audio},
   author={Bain, Max and Huh, Jaesung and Han, Tengda and Zisserman, Andrew},
-  journal={arXiv preprint, arXiv:2303.00747},
+  journal={INTERSPEECH 2023},
   year={2023}
 }
 ```