whisperX

mirror of https://github.com/m-bain/whisperX.git synced 2025-07-01 18:17:27 -04:00

Author	SHA1	Message	Date
Ayushi-Desynova	423667f00b	Update alignment.py	2023-08-09 17:08:56 +05:30
Max Bain	1b092de19a	Merge pull request #395 from Joemgu7/main Fix repeat transcription on different languages and proper suppress_numerals use	2023-08-02 13:44:27 +01:00
Max Bain	69a52b00c7	Merge pull request #400 from davidas1/fast-diarize make diarization faster	2023-08-02 13:43:20 +01:00
Dudu Asulin	9e3145cead	more	2023-08-02 10:36:56 +03:00
Dudu Asulin	577db33430	more	2023-08-02 10:35:20 +03:00
Dudu Asulin	da6ed83dc9	more	2023-08-02 10:34:42 +03:00
Dudu Asulin	7eb9692cb9	more	2023-08-02 10:32:02 +03:00
Dudu Asulin	8de0e2af51	make diarization faster	2023-08-02 10:11:43 +03:00
briguetjo	225f6b4d69	fix suppress_numerals	2023-07-29 19:34:51 +02:00
briguetjo	864976af23	fix issue by resetting tokenizer	2023-07-29 18:56:33 +02:00
briguetjo	9d736dca1c	add some warning if languages do not match	2023-07-29 18:20:59 +02:00
briguetjo	d87f6268d0	fix preset language	2023-07-29 18:13:36 +02:00
Max Bain	d80b98601b	Merge pull request #255 from tijszwinkels/cuda-11.8 Suggest using pytorch-cuda 11.8 instead of 11.7	2023-07-25 00:29:08 +01:00
Max Bain	aa37509362	Merge branch 'main' into cuda-11.8	2023-07-25 00:28:53 +01:00
Max Bain	15b4c558c2	Merge pull request #352 from daanelson/replicate-demo adding link to Replicate demo	2023-07-24 10:48:24 +01:00
Max Bain	54504a2be8	Merge pull request #374 from abCods/main Add Urdu model support for alignment	2023-07-24 10:47:52 +01:00
Max Bain	8c0fee90d3	Update alignment.py	2023-07-24 10:47:41 +01:00
Max Bain	016f0293cd	Merge pull request #378 from baer/patch-1 Remove torchvision from README	2023-07-24 10:47:14 +01:00
Max Bain	44daf50501	Merge pull request #382 from mabergerx/patch-1 Update transcribe.py -> small change in `batch_size` description	2023-07-24 10:46:55 +01:00
Mark Berger	48e7caad77	Update transcribe.py -> small change in `batch_size` description Changed the description of the `batch_size` parameter.	2023-07-24 11:45:38 +02:00
Eric Baer	8673064658	Remove torchvision from README	2023-07-20 17:02:34 -07:00
Ahmad Bilal	e6ecbaa68f	Remove spacing	2023-07-20 03:20:47 +05:00
Ahmad Bilal	e92325b7eb	Remove the fix	2023-07-20 03:19:37 +05:00
Ahmad Bilal	eb712f3999	Rectify reference to the word	2023-07-20 02:54:06 +05:00
Ahmad Bilal	30eff5a01f	Replace double quotes to single for JSON parsing	2023-07-20 02:32:37 +05:00
Ahmad Bilal	734ecc2844	Add Urdu model support for alignment	2023-07-17 19:29:41 +05:00
dan nelson	512ab1acf9	adding Replicate demo	2023-06-30 18:22:10 -07:00
Max Bain	befe2b242e	torch 2+	2023-06-07 22:43:29 +01:00
Max Bain	f9c5ff9f08	Merge pull request #309 from Ca-ressemble-a-du-fake/patch-1 Add Audacity export	2023-06-07 11:50:05 +01:00
Max Bain	d39c1b2319	add "aud" to output_format	2023-06-07 11:48:49 +01:00
Max Bain	b13778fefd	make aud optional	2023-06-07 11:47:49 +01:00
CaraDuf	076ff96eb2	Add Audacity export This exports the transcript to a text file that can be directly imported in Audacity as label file. This is useful to quickly check the transcript-audio alignment.	2023-06-07 05:49:49 +02:00
Max Bain	0c84c26d92	Merge pull request #303 from m-bain/v3 Suppress numerals	2023-06-05 15:46:26 +01:00
Max Bain	d7f1d16f19	suppress numerals change logic	2023-06-05 15:44:17 +01:00
Max Bain	74a00eecd7	suppress numerals fix	2023-06-05 15:33:04 +01:00
Max Bain	b026407fd9	Merge branch 'v3' of https://github.com/m-bain/whisperX into v3 Conflicts: whisperx/asr.py	2023-06-05 15:30:02 +01:00
Max Bain	a323cff654	--suppress_numerals option, ensures non-numerical words, for wav2vec2 alignment	2023-06-05 15:27:42 +01:00
Max Bain	93ed6cfa93	interspeech	2023-06-01 16:54:16 +01:00
Max Bain	9797a67391	Merge pull request #294 from SohaibAnwaar/fix/typehint-bug-fix fix: Bug in type hinting	2023-05-30 11:13:22 +01:00
Master X	5a4382ae4d	fix: Bug in type hinting	2023-05-30 15:11:07 +05:00
Max Bain	ec6a110cdf	Merge pull request #290 from m-bain/main push contributions from main	2023-05-29 12:55:24 +01:00
Max Bain	8d8c027a92	Merge pull request #278 from Mr-Turtleeeee/add_align_for_vi Add wav2vec model for Vietnamese	2023-05-29 12:54:37 +01:00
Max Bain	4cbd3030cc	no sentence split on mr. mrs. dr...	2023-05-29 12:48:14 +01:00
Max Bain	1c528d1a3c	Merge pull request #284 from prameshbajra/main	2023-05-27 11:19:13 +01:00
Max Bain	c65e7ba9b4	Merge pull request #280 from Thebys/patch-1	2023-05-27 11:18:27 +01:00
prameshbajra	5a47f458ac	Added download path parameter.	2023-05-27 11:38:54 +02:00
Max Bain	f1032bb40a	VAD unequal stack size, remove debug change	2023-05-26 20:39:19 +01:00
Max Bain	bc8a03881a	Merge pull request #281 from m-bain/v3 fix Unequal Stack Size VAD error	2023-05-26 20:37:57 +01:00
Max Bain	42b4909bc0	fix Unequal Stack Size VAD error	2023-05-26 20:36:03 +01:00
Thebys	bb15d6b68e	Add Czech alignment model This PR adds the following Czech alignment model: https://huggingface.co/comodoro/wav2vec2-xls-r-300m-cs-250. I have successfully tested this with several Czech audio recordings with length of up to 3 hours, and the results are satisfactory. However, I have received the following warnings and I am not sure how relevant it is: ``` Lightning automatically upgraded your loaded checkpoint from v1.5.4 to v2.0.2. To apply the upgrade to your files permanently, run `python -m pytorch_lightning.utilities.upgrade_checkpoint --file C:\Users\Thebys\.cache\torch\whisperx-vad-segmentation.bin` Model was trained with pyannote.audio 0.0.1, yours is 2.1.1. Bad things might happen unless you revert pyannote.audio to 0.x. Model was trained with torch 1.10.0+cu102, yours is 2.0.0. Bad things might happen unless you revert torch to 1.x. ```	2023-05-26 21:17:01 +02:00

444 Commits