Add PT (pt-br) align support

2025-07-01 18:17:27 -04:00 · 2023-01-11 12:11:41 -03:00
parent d51353a4b6
commit 7459bf8ad0
2 changed files with 2 additions and 1 deletion
--- a/README.md
+++ b/README.md
@@ -93,7 +93,7 @@ https://user-images.githubusercontent.com/36994049/207743923-b4f0d537-29ae-4be2-
 The phoneme ASR alignment model is *language-specific*, for tested languages these models are [automatically picked from torchaudio pipelines or huggingface](https://github.com/m-bain/whisperX/blob/e909f2f766b23b2000f2d95df41f9b844ac53e49/whisperx/transcribe.py#L22).
 Just pass in the `--language` code, and use the whisper `--model large`.
-Currently default models provided for `{en, fr, de, es, it, ja, zh, nl, uk}`. If the detected language is not in this list, you need to find a phoneme-based ASR model from [huggingface model hub](https://huggingface.co/models) and test it on your data.
+Currently default models provided for `{en, fr, de, es, it, ja, zh, nl, uk, pt}`. If the detected language is not in this list, you need to find a phoneme-based ASR model from [huggingface model hub](https://huggingface.co/models) and test it on your data.
 #### E.g. German
--- a/whisperx/transcribe.py
+++ b/whisperx/transcribe.py
@@ -32,6 +32,7 @@ DEFAULT_ALIGN_MODELS_HF = {
    "zh": "jonatasgrosman/wav2vec2-large-xlsr-53-chinese-zh-cn",
    "nl": "jonatasgrosman/wav2vec2-large-xlsr-53-dutch",
    "uk": "Yehor/wav2vec2-xls-r-300m-uk-with-small-lm",
+   "pt": "joaoalvarenga/wav2vec2-large-100k-voxpopuli-pt"
 }
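
The README text in this change says the alignment model is chosen per language from a table of defaults, and that languages outside the table need a model supplied by the user. A minimal sketch of that lookup follows. It copies only the four entries visible in the hunk above, and `resolve_align_model` and its `override` parameter are hypothetical names used for illustration, not whisperX's actual API.

```python
from typing import Optional

# Subset of the mapping, limited to the entries visible in the hunk above.
DEFAULT_ALIGN_MODELS_HF = {
    "zh": "jonatasgrosman/wav2vec2-large-xlsr-53-chinese-zh-cn",
    "nl": "jonatasgrosman/wav2vec2-large-xlsr-53-dutch",
    "uk": "Yehor/wav2vec2-xls-r-300m-uk-with-small-lm",
    "pt": "joaoalvarenga/wav2vec2-large-100k-voxpopuli-pt",
}


def resolve_align_model(language_code: str, override: Optional[str] = None) -> str:
    """Return the Hugging Face model name to use for alignment.

    Hypothetical helper: an explicit override wins, otherwise the
    per-language default is used, otherwise the caller is told to
    supply a model themselves.
    """
    if override is not None:
        return override
    try:
        return DEFAULT_ALIGN_MODELS_HF[language_code]
    except KeyError:
        raise ValueError(
            f"No default align model for language {language_code!r}; "
            "pass a phoneme-based ASR model from the Hugging Face hub instead."
        ) from None


if __name__ == "__main__":
    # Portuguese resolves to the entry added in this commit.
    print(resolve_align_model("pt"))
```

Raising an error for unknown languages, rather than silently falling back to another language's model, mirrors the README's advice to find and test a model on your own data.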