whisperX

mirror of https://github.com/m-bain/whisperX.git synced 2025-07-01 18:17:27 -04:00

Author	SHA1	Message	Date
Max Bain	74a00eecd7	suppress numerals fix	2023-06-05 15:33:04 +01:00
Max Bain	b026407fd9	Merge branch 'v3' of https://github.com/m-bain/whisperX into v3 Conflicts: whisperx/asr.py	2023-06-05 15:30:02 +01:00
Max Bain	a323cff654	--suppress_numerals option, ensures non-numerical words, for wav2vec2 alignment	2023-06-05 15:27:42 +01:00
Max Bain	ec6a110cdf	Merge pull request #290 from m-bain/main push contributions from main	2023-05-29 12:55:24 +01:00
Max Bain	8d8c027a92	Merge pull request #278 from Mr-Turtleeeee/add_align_for_vi Add war2vec model for Vietnamese	2023-05-29 12:54:37 +01:00
Max Bain	4cbd3030cc	no sentence split on mr. mrs. dr...	2023-05-29 12:48:14 +01:00
Max Bain	1c528d1a3c	Merge pull request #284 from prameshbajra/main	2023-05-27 11:19:13 +01:00
Max Bain	c65e7ba9b4	Merge pull request #280 from Thebys/patch-1	2023-05-27 11:18:27 +01:00
prameshbajra	5a47f458ac	Added download path parameter.	2023-05-27 11:38:54 +02:00
Max Bain	f1032bb40a	VAD unequal stack size, remove debug change	2023-05-26 20:39:19 +01:00
Max Bain	bc8a03881a	Merge pull request #281 from m-bain/v3 fix Unequal Stack Size VAD error	2023-05-26 20:37:57 +01:00
Max Bain	42b4909bc0	fix Unequal Stack Size VAD error	2023-05-26 20:36:03 +01:00
Thebys	bb15d6b68e	Add Czech alignment model This PR adds the following Czech alignment model: https://huggingface.co/comodoro/wav2vec2-xls-r-300m-cs-250. I have successfully tested this with several Czech audio recordings with length of up to 3 hours, and the results are satisfactory. However, I have received the following warnings and I am not sure how relevant it is: ``` Lightning automatically upgraded your loaded checkpoint from v1.5.4 to v2.0.2. To apply the upgrade to your files permanently, run `python -m pytorch_lightning.utilities.upgrade_checkpoint --file C:\Users\Thebys\.cache\torch\whisperx-vad-segmentation.bin` Model was trained with pyannote.audio 0.0.1, yours is 2.1.1. Bad things might happen unless you revert pyannote.audio to 0.x. Model was trained with torch 1.10.0+cu102, yours is 2.0.0. Bad things might happen unless you revert torch to 1.x. ```	2023-05-26 21:17:01 +02:00
Max Bain	23d405e1cf	Merge branch 'main' into add_align_for_vi	2023-05-26 17:14:09 +01:00
Max Bain	17e2f7f859	Merge pull request #277 from Boulaouaney/add-Korean-alignment-model added Korean wav2vec2 model	2023-05-26 17:12:47 +01:00
Youssef Boulaoaune	1d9d630fb9	added Korean wav2vec2 model	2023-05-26 20:33:16 +09:00
iambestfeeddddd	9c042c2d28	Add war2vec model for Vietnamese	2023-05-26 16:46:55 +07:00
Max Bain	a23f2aa3f7	Merge pull request #269 from sorgfresser/transcribe_keywords Add transcribe keywords	2023-05-21 12:08:44 +01:00
Simon	7c5468116f	Merge branch 'm-bain:main' into transcribe_keywords	2023-05-20 16:03:40 +02:00
Simon	a1c705b3a7	fix tokenizer is None	2023-05-20 15:52:45 +02:00
Max Bain	29a5e0b236	Merge pull request #266 from sorgfresser/main Add device_index option	2023-05-20 14:45:34 +01:00
Simon	715435db42	add tokenizer is None case	2023-05-20 15:42:21 +02:00
Simon	1fc965bc1a	add task, language keyword to transcribe	2023-05-20 15:30:25 +02:00
Simon	74b98ebfaa	ensure device_index not None	2023-05-20 13:11:30 +02:00
Simon	53396adb21	add device_index	2023-05-20 13:02:46 +02:00
Max Bain	d8a2b4ffc9	Merge pull request #246 from m-bain/v3 V3 v3.1.1	2023-05-13 12:18:09 +01:00
Max Bain	9ffb7e7a23	Merge branch 'v3' of https://github.com/m-bain/whisperX into v3 Conflicts: setup.py	2023-05-13 12:16:33 +01:00
Max Bain	fd8f1003cf	add translate, fix word_timestamp error	2023-05-13 12:14:06 +01:00
Max Bain	46b416296f	Merge pull request #123 from koldbrandt/danish_alignment Danish alignment model	2023-05-09 23:10:24 +01:00
Max Bain	7642390d0a	Merge branch 'main' into danish_alignment	2023-05-09 23:10:13 +01:00
Max Bain	8b05ad4dae	Merge pull request #235 from sorgfresser/main Add custom typing for results	2023-05-09 23:05:02 +01:00
Max Bain	5421f1d7ca	remove v3 tag on pip install	2023-05-09 13:42:50 +01:00
Simon	91e959ec4f	Merge branch 'm-bain:main' into main	2023-05-08 20:46:25 +02:00
Simon	eabf35dff0	Custom result types	2023-05-08 20:45:34 +02:00
Max Bain	4919ad21fc	Merge pull request #233 from sorgfresser/main Fix tuple unpacking	2023-05-08 19:05:47 +01:00
Simon	b50aafb17b	Fix tuple unpacking	2023-05-08 20:03:42 +02:00
Max Bain	2efa136114	update python usage example	2023-05-08 17:20:38 +01:00
Max Bain	0b839f3f01	Update README.md	2023-05-07 20:36:08 +01:00
Max Bain	1caddfb564	Merge pull request #225 from m-bain/v3 V3 v3.1.0	2023-05-07 20:31:16 +01:00
Max Bain	7ad554c64f	Merge branch 'main' into v3	2023-05-07 20:30:57 +01:00
Max Bain	4603f010a5	update readme, setup, add option to return char_timestamps	2023-05-07 20:28:33 +01:00
Max Bain	24008aa1ed	fix long segments, break into sentences using nltk, improve align logic, improve diarize (sentence-based)	2023-05-07 15:32:58 +01:00
Max Bain	07361ba1d7	add device to dia pipeline @sorgfresser	2023-05-05 11:53:51 +01:00
Max Bain	4e2ac4e4e9	torch2.0, remove compile for now, round to times to 3 decimal v3.0.2	2023-05-04 20:38:13 +01:00
Max Bain	d2116b98ca	Merge pull request #210 from sorgfresser/v3 Update pyannote and torch version	2023-05-04 20:32:06 +01:00
Simon	d8f0ef4a19	Set diarization device manually	2023-05-04 16:25:34 +02:00
Max Bain	1b62c61c71	Merge pull request #216 from aramlang/blank_id-fix Enable Hebrew support	2023-05-04 01:13:23 +01:00
Simon	2d59eb9726	Add torch compile to log mel spectrogram	2023-05-03 23:17:44 +02:00
aramlang	cb53661070	Enable Hebrew support	2023-05-03 11:26:12 -05:00
Simon	2a6830492c	Fix pyannote to specific commit	2023-05-02 20:25:56 +02:00

206 Commits