Mirror of https://github.com/m-bain/whisperX.git, synced 2025-07-01 18:17:27 -04:00
Merge branch 'main' into cuda-11.8
README.md (10 changed lines)
@@ -54,6 +54,7 @@ This repository provides fast automatic speech recognition (70x realtime with la
<h2 align="left", id="highlights">New🚨</h2>
- _WhisperX_ accepted at INTERSPEECH 2023
- v3 transcript segment-per-sentence: using nltk sent_tokenize for better subtitling & better diarization
- v3 released, 70x speed-up open-sourced. Using batched whisper with [faster-whisper](https://github.com/guillaumekln/faster-whisper) backend!
- v2 released, code cleanup, imports whisper library VAD filtering is now turned on by default, as in the paper.
@@ -74,7 +75,7 @@ GPU execution requires the NVIDIA libraries cuBLAS 11.x and cuDNN 8.x to be inst
### 2. Install PyTorch2.0, e.g. for Linux and Windows CUDA11.8:
-`conda install pytorch==2.0.0 torchvision==0.15.0 torchaudio==2.0.0 pytorch-cuda=11.8 -c pytorch -c nvidia`
+`conda install pytorch==2.0.0 torchaudio==2.0.0 pytorch-cuda=11.8 -c pytorch -c nvidia`
See other methods [here.](https://pytorch.org/get-started/previous-versions/#v200)
@@ -184,6 +185,11 @@ print(diarize_segments)
print(result["segments"]) # segments are now assigned speaker IDs
```
## Demos 🚀
[](https://replicate.com/daanelson/whisperx)
If you don't have access to your own GPUs, use the link above to try out WhisperX.
<h2 align="left" id="whisper-mod">Technical Details 👷‍♂️</h2>
@@ -276,7 +282,7 @@ If you use this in your research, please cite the paper:
@article{bain2022whisperx,
title={WhisperX: Time-Accurate Speech Transcription of Long-Form Audio},
author={Bain, Max and Huh, Jaesung and Han, Tengda and Zisserman, Andrew},
-journal={arXiv preprint, arXiv:2303.00747},
+journal={INTERSPEECH 2023},
year={2023}
}
```