2 Commits

Author SHA1 Message Date
3dfe6c6ea0 docs: document Docker image usage for WhisperX in README
- Add a new section to the README describing how to use pre-built Docker images for WhisperX with example commands.
- Provide a link to the Docker image repository for available tags.

Signed-off-by: CHEN, CHUN <jim60105@gmail.com>

# Conflicts:
#	README.md
2025-06-14 00:23:19 +08:00
d700b56c9c docs: add missing torch import to Python usage example in README 2025-06-08 03:34:49 -06:00


@@ -97,6 +97,18 @@ uv sync --all-extras --dev
You may also need to install ffmpeg, rust etc. Follow openAI instructions here https://github.com/openai/whisper#setup.
### 3. Docker Images
Execute pre-built WhisperX container images:
```bash
docker run --gpus all -it -v ".:/app" ghcr.io/jim60105/whisperx:base-en -- --output_format srt audio.mp3
docker run --gpus all -it -v ".:/app" ghcr.io/jim60105/whisperx:large-v3-ja -- --output_format srt audio.mp3
docker run --gpus all -it -v ".:/app" ghcr.io/jim60105/whisperx:no_model -- --model tiny --language en --output_format srt audio.mp3
```
Review the tag lists in this repository: [jim60105/docker-whisperX](https://github.com/jim60105/docker-whisperX)
### Common Issues & Troubleshooting 🔧
#### libcudnn Dependencies (GPU Users)
@@ -189,7 +201,7 @@ result = model.transcribe(audio, batch_size=batch_size)
print(result["segments"]) # before alignment
# delete model if low on GPU resources
-# import gc; gc.collect(); torch.cuda.empty_cache(); del model
+# import gc; import torch; gc.collect(); torch.cuda.empty_cache(); del model
# 2. Align whisper output
model_a, metadata = whisperx.load_align_model(language_code=result["language"], device=device)
@@ -198,7 +210,7 @@ result = whisperx.align(result["segments"], model_a, metadata, audio, device, re
print(result["segments"]) # after alignment
# delete model if low on GPU resources
-# import gc; gc.collect(); torch.cuda.empty_cache(); del model_a
+# import gc; import torch; gc.collect(); torch.cuda.empty_cache(); del model_a
# 3. Assign speaker labels
diarize_model = whisperx.diarize.DiarizationPipeline(use_auth_token=YOUR_HF_TOKEN, device=device)
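
The cleanup comments in the two hunks above put `del model` last, after `gc.collect()` and `torch.cuda.empty_cache()`. Neither call can release tensors that a live variable still references, so the reference generally has to be dropped first. A minimal sketch of that ordering, assuming a hypothetical helper named `release_gpu_memory` that is not part of WhisperX:

```python
import gc


def release_gpu_memory() -> bool:
    """Collect garbage, then return cached CUDA blocks to the driver.

    Hypothetical helper, not part of WhisperX. Call it only after the
    last reference to the model has been dropped with `del`.
    Returns True if the CUDA cache was emptied, False otherwise.
    """
    gc.collect()  # reclaim unreachable objects, including reference cycles
    try:
        import torch  # imported lazily so the helper also works without torch
    except ImportError:
        return False
    if not torch.cuda.is_available():
        return False
    torch.cuda.empty_cache()  # release unused cached memory held by the allocator
    return True
```

In the README example this would likely read `del model` followed by `release_gpu_memory()`, and the same for `model_a` after alignment.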