mirror of https://github.com/m-bain/whisperX.git (synced 2025-07-01 18:17:27 -04:00)
interspeech
@@ -54,6 +54,7 @@ This repository provides fast automatic speech recognition (70x realtime with la
 
 <h2 align="left", id="highlights">New🚨</h2>
 
+- _WhisperX_ accepted at INTERSPEECH 2023
 - v3 transcript segment-per-sentence: using nltk sent_tokenize for better subtitlting & better diarization
 - v3 released, 70x speed-up open-sourced. Using batched whisper with [faster-whisper](https://github.com/guillaumekln/faster-whisper) backend!
 - v2 released, code cleanup, imports whisper library VAD filtering is now turned on by default, as in the paper.
@@ -276,7 +277,7 @@ If you use this in your research, please cite the paper:
 @article{bain2022whisperx,
 title={WhisperX: Time-Accurate Speech Transcription of Long-Form Audio},
 author={Bain, Max and Huh, Jaesung and Han, Tengda and Zisserman, Andrew},
-journal={arXiv preprint, arXiv:2303.00747},
+journal={INTERSPEECH 2023},
 year={2023}
 }
 ```