mirror of
https://github.com/m-bain/whisperX.git
synced 2025-07-01 18:17:27 -04:00
docs: update installation instructions
49
README.md
@@ -62,54 +62,41 @@ This repository provides fast automatic speech recognition (70x realtime with la
 - Paper drop🎓👨🏫! Please see our [ArxiV preprint](https://arxiv.org/abs/2303.00747) for benchmarking and details of WhisperX. We also introduce more efficient batch inference resulting in large-v2 with *60-70x REAL TIME speed.

 <h2 align="left" id="setup">Setup ⚙️</h2>
-Tested for PyTorch 2.0, Python 3.10 (use other versions at your own risk!)
-
-GPU execution requires the NVIDIA libraries cuBLAS 11.x and cuDNN 8.x to be installed on the system. Please refer to the [CTranslate2 documentation](https://opennmt.net/CTranslate2/installation.html).
-
-### 1. Create Python3.10 environment
-
-`conda create --name whisperx python=3.10`
-
-`conda activate whisperx`
-
-
-### 2. Install PyTorch, e.g. for Linux and Windows CUDA11.8:
-
-`conda install pytorch==2.0.0 torchaudio==2.0.0 pytorch-cuda=11.8 -c pytorch -c nvidia`
-
-See other methods [here.](https://pytorch.org/get-started/previous-versions/#v200)
-
-### 3. Install WhisperX
-
-You have several installation options:
-
-#### Option A: Stable Release (recommended)
-Install the latest stable version from PyPI:
+
+### 1. Simple Installation (Recommended)
+
+The easiest way to install WhisperX is through PyPi:

 ```bash
 pip install whisperx
 ```

-#### Option B: Development Version
-Install the latest development version directly from GitHub (may be unstable):
+Or if using [uvx](https://docs.astral.sh/uv/guides/tools/#running-tools):

 ```bash
-pip install git+https://github.com/m-bain/whisperx.git
+uvx whisperx
 ```

-If already installed, update to the most recent commit:
+### 2. Advanced Installation Options
+
+These installation methods are for developers or users with specific needs. If you're not sure, stick with the simple installation above.
+
+#### Option A: Install from GitHub
+
+To install directly from the GitHub repository:

 ```bash
-pip install git+https://github.com/m-bain/whisperx.git --upgrade
+uvx git+https://github.com/m-bain/whisperX.git
 ```

-#### Option C: Development Mode
-If you wish to modify the package, clone and install in editable mode:
+#### Option B: Developer Installation
+
+If you want to modify the code or contribute to the project:

 ```bash
 git clone https://github.com/m-bain/whisperX.git
 cd whisperX
-pip install -e .
+uv sync --all-extras --dev
 ```

 > **Note**: The development version may contain experimental features and bugs. Use the stable PyPI release for production environments.
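Reviewer sketch (not part of the commit): the new "Simple Installation" text offers two commands, `pip install whisperx` and `uvx whisperx`. The following is a minimal sketch, assuming a POSIX-compatible shell, of picking one of them by checking which tool is on `PATH`. The helper names `have` and `suggest_command` are hypothetical and do not come from whisperX or uv.

```bash
# Hypothetical helpers (not part of whisperX): choose one of the two README
# commands depending on which tool is available on PATH.

# have NAME: succeed if NAME resolves to a command in this shell.
have() {
  command -v "$1" >/dev/null 2>&1
}

# suggest_command: print the README command matching the first tool found.
# pip is checked first because the README calls it the easiest route.
suggest_command() {
  if have pip; then
    echo "pip install whisperx"
  elif have uvx; then
    echo "uvx whisperx"
  else
    echo "neither pip nor uvx found on PATH" >&2
    return 1
  fi
}
```

Calling `suggest_command` only prints a command and never runs it, so it is safe to try on a machine where neither tool is installed.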
@@ -117,12 +104,12 @@ pip install -e .
 You may also need to install ffmpeg, rust etc. Follow openAI instructions here https://github.com/openai/whisper#setup.

 ### Speaker Diarization

 To **enable Speaker Diarization**, include your Hugging Face access token (read) that you can generate from [Here](https://huggingface.co/settings/tokens) after the `--hf_token` argument and accept the user agreement for the following models: [Segmentation](https://huggingface.co/pyannote/segmentation-3.0) and [Speaker-Diarization-3.1](https://huggingface.co/pyannote/speaker-diarization-3.1) (if you choose to use Speaker-Diarization 2.x, follow requirements [here](https://huggingface.co/pyannote/speaker-diarization) instead.)

 > **Note**<br>
 > As of Oct 11, 2023, there is a known issue regarding slow performance with pyannote/Speaker-Diarization-3.0 in whisperX. It is due to dependency conflicts between faster-whisper and pyannote-audio 3.0.0. Please see [this issue](https://github.com/m-bain/whisperX/issues/499) for more details and potential workarounds.


 <h2 align="left" id="example">Usage 💬 (command line)</h2>

 ### English
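Reviewer sketch (not part of the commit): the unchanged context in the second hunk says that ffmpeg may be needed and that diarization takes a Hugging Face token via `--hf_token`. The following is a minimal sketch of checking both before invoking the CLI. It assumes the token is kept in an environment variable named `HF_TOKEN` (my naming, not the README's) and that diarization is switched on with the whisperX CLI's `--diarize` flag, which does not appear in this diff. `run_diarize` is a hypothetical wrapper.

```bash
# Hypothetical wrapper (not part of whisperX): check the prerequisites named
# in the README before handing the audio file to the CLI.
run_diarize() {
  audio=$1
  # The README requires a Hugging Face read token for the pyannote models.
  if [ -z "${HF_TOKEN:-}" ]; then
    echo "HF_TOKEN is not set; generate a read token on Hugging Face first" >&2
    return 2
  fi
  # The README notes that ffmpeg may need to be installed separately.
  if ! command -v ffmpeg >/dev/null 2>&1; then
    echo "ffmpeg not found on PATH; see the openai/whisper setup notes" >&2
    return 3
  fi
  # --hf_token is the argument named in the README; --diarize is assumed here.
  whisperx "$audio" --diarize --hf_token "$HF_TOKEN"
}
```

With `HF_TOKEN` exported, the call would be `run_diarize path/to/audio.wav`. The distinct return codes (2 for a missing token, 3 for missing ffmpeg) are an arbitrary choice that lets a calling script tell the two failures apart.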