whisperX

mirror of https://github.com/m-bain/whisperX.git synced 2025-07-01 18:17:27 -04:00

Author	SHA1	Message	Date
Ahmad Bilal	734ecc2844	Add Urdu model support for alignment	2023-07-17 19:29:41 +05:00
Max Bain	8d8c027a92	Merge pull request #278 from Mr-Turtleeeee/add_align_for_vi: Add war2vec model for Vietnamese	2023-05-29 12:54:37 +01:00
Max Bain	4cbd3030cc	no sentence split on mr. mrs. dr...	2023-05-29 12:48:14 +01:00
Max Bain	c65e7ba9b4	Merge pull request #280 from Thebys/patch-1	2023-05-27 11:18:27 +01:00
Max Bain	bc8a03881a	Merge pull request #281 from m-bain/v3: fix Unequal Stack Size VAD error	2023-05-26 20:37:57 +01:00
Max Bain	42b4909bc0	fix Unequal Stack Size VAD error	2023-05-26 20:36:03 +01:00
Thebys	bb15d6b68e	Add Czech alignment model. This PR adds the following Czech alignment model: https://huggingface.co/comodoro/wav2vec2-xls-r-300m-cs-250. I have successfully tested this with several Czech audio recordings with length of up to 3 hours, and the results are satisfactory. However, I have received the following warnings and I am not sure how relevant it is: ``` Lightning automatically upgraded your loaded checkpoint from v1.5.4 to v2.0.2. To apply the upgrade to your files permanently, run `python -m pytorch_lightning.utilities.upgrade_checkpoint --file C:\Users\Thebys\.cache\torch\whisperx-vad-segmentation.bin` Model was trained with pyannote.audio 0.0.1, yours is 2.1.1. Bad things might happen unless you revert pyannote.audio to 0.x. Model was trained with torch 1.10.0+cu102, yours is 2.0.0. Bad things might happen unless you revert torch to 1.x. ```	2023-05-26 21:17:01 +02:00
Max Bain	23d405e1cf	Merge branch 'main' into add_align_for_vi	2023-05-26 17:14:09 +01:00
Youssef Boulaoaune	1d9d630fb9	added Korean wav2vec2 model	2023-05-26 20:33:16 +09:00
iambestfeeddddd	9c042c2d28	Add war2vec model for Vietnamese	2023-05-26 16:46:55 +07:00
Max Bain	d8a2b4ffc9	Merge pull request #246 from m-bain/v3: V3	2023-05-13 12:18:09 +01:00
Max Bain	fd8f1003cf	add translate, fix word_timestamp error	2023-05-13 12:14:06 +01:00
Max Bain	7642390d0a	Merge branch 'main' into danish_alignment	2023-05-09 23:10:13 +01:00
Simon	eabf35dff0	Custom result types	2023-05-08 20:45:34 +02:00
Max Bain	4603f010a5	update readme, setup, add option to return char_timestamps	2023-05-07 20:28:33 +01:00
Max Bain	24008aa1ed	fix long segments, break into sentences using nltk, improve align logic, improve diarize (sentence-based)	2023-05-07 15:32:58 +01:00
Max Bain	4e2ac4e4e9	torch2.0, remove compile for now, round to times to 3 decimal	2023-05-04 20:38:13 +01:00
aramlang	cb53661070	Enable Hebrew support	2023-05-03 11:26:12 -05:00
Arnav Mehta	64ca208cc8	Fixed the word_start variable not initialized bug.	2023-05-02 13:13:02 +05:30
Prashanth Ellina	601c91140f	references #202 , attempt to fix speaker diarization failing in v3	2023-04-30 17:33:24 +00:00
Max Bain	558d980535	v3 init	2023-04-24 21:08:43 +01:00
Max Bain	da458863d7	allow custom model_dir for torchaudio models	2023-04-14 21:40:36 +01:00
Max Bain	189aeac83e	v2 lets goo	2023-04-01 00:10:45 +01:00
Max Bain	18b63d46e2	skeleton v2	2023-03-30 05:31:57 +01:00
Marcus Brandt	c8404d9805	added a danish alignment model	2023-03-04 13:20:40 +01:00
JCGoran	cfcede41f6	Added Python 3.7 compatibility - removed use of walrus operator in favor of `np.cumsum`	2023-03-02 15:46:07 +01:00
arnavmehta7	2e307814dd	added if clause for checking	2023-02-10 14:48:51 +05:30
Mahmoud Ashraf	9f26112d5c	added turkish wav2vec2 model	2023-02-01 21:38:50 +02:00
m-bain	fd2a093754	Merge pull request #55 from jonatasgrosman/main: FIX: Error when loading Hugging Face's models with embedded LM	2023-02-01 10:27:45 +00:00
Jonatas Grosman	d294e29ad9	fix: error when loading huggingface model with embedded language model	2023-01-31 23:24:26 -03:00
Mahmoud Ashraf	0eae9e1f50	added several wav2vec2 models by jonatasgrosman since his models were used in other languages before and I tested the arabic model myself, I assumed it's safe to include all the available models	2023-02-01 03:02:10 +02:00
Mahmoud Ashraf	1b08661e42	change arabic model to jonatasgrosman	2023-01-31 19:32:31 +02:00
Mahmoud Ashraf	a49799294b	add arabic wav2vec2 model form elgeish	2023-01-31 19:07:48 +02:00
Max Bain	76f79f600a	fix short seg timestamps bug	2023-01-28 19:04:19 +00:00
Max Bain	c19cf407d8	handle non-alignable whole segments	2023-01-28 13:53:03 +00:00
Max Bain	5b8c8a7bd3	pandas fix	2023-01-27 15:05:08 +00:00
Max Bain	16d24b1c96	only pad timestamps if not using VAD	2023-01-26 10:46:13 +00:00
Max Bain	286a2f2c14	clean up logic, use pandas where possibl	2023-01-25 18:42:52 +00:00
Max Bain	5a668a7d80	fallback on whisper alignment failures, update readme	2023-01-05 11:15:19 +00:00
Max Bain	9f6fa61160	init commit	2022-12-14 18:59:12 +00:00

40 Commits