Commit Graph

22 Commits

SHA1 Message Date
b026407fd9 Merge branch 'v3' of https://github.com/m-bain/whisperX into v3 (Conflicts: whisperx/asr.py) 2023-06-05 15:30:02 +01:00
a323cff654 --suppress_numerals option, ensures non-numerical words, for wav2vec2 alignment 2023-06-05 15:27:42 +01:00
5a47f458ac Added download path parameter. 2023-05-27 11:38:54 +02:00
7c5468116f Merge branch 'm-bain:main' into transcribe_keywords 2023-05-20 16:03:40 +02:00
a1c705b3a7 fix tokenizer is None 2023-05-20 15:52:45 +02:00
715435db42 add tokenizer is None case 2023-05-20 15:42:21 +02:00
1fc965bc1a add task, language keyword to transcribe 2023-05-20 15:30:25 +02:00
53396adb21 add device_index 2023-05-20 13:02:46 +02:00
d8a2b4ffc9 Merge pull request #246 from m-bain/v3 (V3) 2023-05-13 12:18:09 +01:00
fd8f1003cf add translate, fix word_timestamp error 2023-05-13 12:14:06 +01:00
eabf35dff0 Custom result types 2023-05-08 20:45:34 +02:00
b50aafb17b Fix tuple unpacking 2023-05-08 20:03:42 +02:00
24008aa1ed fix long segments, break into sentences using nltk, improve align logic, improve diarize (sentence-based) 2023-05-07 15:32:58 +01:00
4e2ac4e4e9 torch2.0, remove compile for now, round to times to 3 decimal 2023-05-04 20:38:13 +01:00
2d59eb9726 Add torch compile to log mel spectrogram 2023-05-03 23:17:44 +02:00
b9c8c5072b Pad language detection if audio is too short 2023-04-30 18:34:18 +02:00
cb176a186e added num_workers to fix pickling error 2023-04-29 19:51:05 +02:00
558d980535 v3 init 2023-04-24 21:08:43 +01:00
6a72b61564 clamp end_timestamp to prevent infinite loop 2023-04-11 20:15:37 +01:00
b9ca701d69 .wav conversion, handle audio with no detected speech 2023-03-31 23:02:38 +01:00
ae4a9de307 add vad model external dl 2023-03-30 18:57:55 +01:00
18b63d46e2 skeleton v2 2023-03-30 05:31:57 +01:00