Commit Graph

206 Commits

Author SHA1 Message Date
74a00eecd7 suppress numerals fix 2023-06-05 15:33:04 +01:00
b026407fd9 Merge branch 'v3' of https://github.com/m-bain/whisperX into v3
Conflicts:
	whisperx/asr.py
2023-06-05 15:30:02 +01:00
a323cff654 --suppress_numerals option, ensures non-numerical words, for wav2vec2 alignment 2023-06-05 15:27:42 +01:00
ec6a110cdf Merge pull request #290 from m-bain/main
push contributions from main
2023-05-29 12:55:24 +01:00
8d8c027a92 Merge pull request #278 from Mr-Turtleeeee/add_align_for_vi
Add wav2vec model for Vietnamese
2023-05-29 12:54:37 +01:00
4cbd3030cc no sentence split on mr. mrs. dr... 2023-05-29 12:48:14 +01:00
1c528d1a3c Merge pull request #284 from prameshbajra/main 2023-05-27 11:19:13 +01:00
c65e7ba9b4 Merge pull request #280 from Thebys/patch-1 2023-05-27 11:18:27 +01:00
5a47f458ac Added download path parameter. 2023-05-27 11:38:54 +02:00
f1032bb40a VAD unequal stack size, remove debug change 2023-05-26 20:39:19 +01:00
bc8a03881a Merge pull request #281 from m-bain/v3
fix Unequal Stack Size VAD error
2023-05-26 20:37:57 +01:00
42b4909bc0 fix Unequal Stack Size VAD error 2023-05-26 20:36:03 +01:00
bb15d6b68e Add Czech alignment model
This PR adds the following Czech alignment model: https://huggingface.co/comodoro/wav2vec2-xls-r-300m-cs-250.

I have successfully tested this with several Czech audio recordings up to 3 hours long, and the results are satisfactory.

However, I received the following warnings, and I am not sure how relevant they are:
```
Lightning automatically upgraded your loaded checkpoint from v1.5.4 to v2.0.2. To apply the upgrade to your files permanently, run `python -m pytorch_lightning.utilities.upgrade_checkpoint --file C:\Users\Thebys\.cache\torch\whisperx-vad-segmentation.bin`
Model was trained with pyannote.audio 0.0.1, yours is 2.1.1. Bad things might happen unless you revert pyannote.audio to 0.x.
Model was trained with torch 1.10.0+cu102, yours is 2.0.0. Bad things might happen unless you revert torch to 1.x.
```
2023-05-26 21:17:01 +02:00
23d405e1cf Merge branch 'main' into add_align_for_vi 2023-05-26 17:14:09 +01:00
17e2f7f859 Merge pull request #277 from Boulaouaney/add-Korean-alignment-model
added Korean wav2vec2 model
2023-05-26 17:12:47 +01:00
1d9d630fb9 added Korean wav2vec2 model 2023-05-26 20:33:16 +09:00
9c042c2d28 Add wav2vec model for Vietnamese 2023-05-26 16:46:55 +07:00
a23f2aa3f7 Merge pull request #269 from sorgfresser/transcribe_keywords
Add transcribe keywords
2023-05-21 12:08:44 +01:00
7c5468116f Merge branch 'm-bain:main' into transcribe_keywords 2023-05-20 16:03:40 +02:00
a1c705b3a7 fix tokenizer is None 2023-05-20 15:52:45 +02:00
29a5e0b236 Merge pull request #266 from sorgfresser/main
Add device_index option
2023-05-20 14:45:34 +01:00
715435db42 add tokenizer is None case 2023-05-20 15:42:21 +02:00
1fc965bc1a add task, language keyword to transcribe 2023-05-20 15:30:25 +02:00
74b98ebfaa ensure device_index not None 2023-05-20 13:11:30 +02:00
53396adb21 add device_index 2023-05-20 13:02:46 +02:00
d8a2b4ffc9 Merge pull request #246 from m-bain/v3
V3
v3.1.1
2023-05-13 12:18:09 +01:00
9ffb7e7a23 Merge branch 'v3' of https://github.com/m-bain/whisperX into v3
Conflicts:
	setup.py
2023-05-13 12:16:33 +01:00
fd8f1003cf add translate, fix word_timestamp error 2023-05-13 12:14:06 +01:00
46b416296f Merge pull request #123 from koldbrandt/danish_alignment
Danish alignment model
2023-05-09 23:10:24 +01:00
7642390d0a Merge branch 'main' into danish_alignment 2023-05-09 23:10:13 +01:00
8b05ad4dae Merge pull request #235 from sorgfresser/main
Add custom typing for results
2023-05-09 23:05:02 +01:00
5421f1d7ca remove v3 tag on pip install 2023-05-09 13:42:50 +01:00
91e959ec4f Merge branch 'm-bain:main' into main 2023-05-08 20:46:25 +02:00
eabf35dff0 Custom result types 2023-05-08 20:45:34 +02:00
4919ad21fc Merge pull request #233 from sorgfresser/main
Fix tuple unpacking
2023-05-08 19:05:47 +01:00
b50aafb17b Fix tuple unpacking 2023-05-08 20:03:42 +02:00
2efa136114 update python usage example 2023-05-08 17:20:38 +01:00
0b839f3f01 Update README.md 2023-05-07 20:36:08 +01:00
1caddfb564 Merge pull request #225 from m-bain/v3
V3
v3.1.0
2023-05-07 20:31:16 +01:00
7ad554c64f Merge branch 'main' into v3 2023-05-07 20:30:57 +01:00
4603f010a5 update readme, setup, add option to return char_timestamps 2023-05-07 20:28:33 +01:00
24008aa1ed fix long segments, break into sentences using nltk, improve align logic, improve diarize (sentence-based) 2023-05-07 15:32:58 +01:00
07361ba1d7 add device to dia pipeline @sorgfresser 2023-05-05 11:53:51 +01:00
4e2ac4e4e9 torch2.0, remove compile for now, round to times to 3 decimal v3.0.2 2023-05-04 20:38:13 +01:00
d2116b98ca Merge pull request #210 from sorgfresser/v3
Update pyannote and torch version
2023-05-04 20:32:06 +01:00
d8f0ef4a19 Set diarization device manually 2023-05-04 16:25:34 +02:00
1b62c61c71 Merge pull request #216 from aramlang/blank_id-fix
Enable Hebrew support
2023-05-04 01:13:23 +01:00
2d59eb9726 Add torch compile to log mel spectrogram 2023-05-03 23:17:44 +02:00
cb53661070 Enable Hebrew support 2023-05-03 11:26:12 -05:00
2a6830492c Fix pyannote to specific commit 2023-05-02 20:25:56 +02:00