Commit Graph

40 Commits

SHA1 Message Date
734ecc2844 Add Urdu model support for alignment 2023-07-17 19:29:41 +05:00
8d8c027a92 Merge pull request #278 from Mr-Turtleeeee/add_align_for_vi
Add wav2vec model for Vietnamese
2023-05-29 12:54:37 +01:00
4cbd3030cc no sentence split on mr. mrs. dr... 2023-05-29 12:48:14 +01:00
c65e7ba9b4 Merge pull request #280 from Thebys/patch-1 2023-05-27 11:18:27 +01:00
bc8a03881a Merge pull request #281 from m-bain/v3
fix Unequal Stack Size VAD error
2023-05-26 20:37:57 +01:00
42b4909bc0 fix Unequal Stack Size VAD error 2023-05-26 20:36:03 +01:00
bb15d6b68e Add Czech alignment model
This PR adds the following Czech alignment model: https://huggingface.co/comodoro/wav2vec2-xls-r-300m-cs-250.

I have successfully tested this with several Czech audio recordings with length of up to 3 hours, and the results are satisfactory.

However, I received the following warnings, and I am not sure how relevant they are:
```
Lightning automatically upgraded your loaded checkpoint from v1.5.4 to v2.0.2. To apply the upgrade to your files permanently, run `python -m pytorch_lightning.utilities.upgrade_checkpoint --file C:\Users\Thebys\.cache\torch\whisperx-vad-segmentation.bin`
Model was trained with pyannote.audio 0.0.1, yours is 2.1.1. Bad things might happen unless you revert pyannote.audio to 0.x.
Model was trained with torch 1.10.0+cu102, yours is 2.0.0. Bad things might happen unless you revert torch to 1.x.
```
2023-05-26 21:17:01 +02:00
23d405e1cf Merge branch 'main' into add_align_for_vi 2023-05-26 17:14:09 +01:00
1d9d630fb9 added Korean wav2vec2 model 2023-05-26 20:33:16 +09:00
9c042c2d28 Add wav2vec model for Vietnamese 2023-05-26 16:46:55 +07:00
d8a2b4ffc9 Merge pull request #246 from m-bain/v3
V3
2023-05-13 12:18:09 +01:00
fd8f1003cf add translate, fix word_timestamp error 2023-05-13 12:14:06 +01:00
7642390d0a Merge branch 'main' into danish_alignment 2023-05-09 23:10:13 +01:00
eabf35dff0 Custom result types 2023-05-08 20:45:34 +02:00
4603f010a5 update readme, setup, add option to return char_timestamps 2023-05-07 20:28:33 +01:00
24008aa1ed fix long segments, break into sentences using nltk, improve align logic, improve diarize (sentence-based) 2023-05-07 15:32:58 +01:00
4e2ac4e4e9 torch 2.0, remove compile for now, round times to 3 decimals 2023-05-04 20:38:13 +01:00
cb53661070 Enable Hebrew support 2023-05-03 11:26:12 -05:00
64ca208cc8 Fixed the word_start variable not initialized bug. 2023-05-02 13:13:02 +05:30
601c91140f references #202, attempt to fix speaker diarization failing in v3 2023-04-30 17:33:24 +00:00
558d980535 v3 init 2023-04-24 21:08:43 +01:00
da458863d7 allow custom model_dir for torchaudio models 2023-04-14 21:40:36 +01:00
189aeac83e v2 lets goo 2023-04-01 00:10:45 +01:00
18b63d46e2 skeleton v2 2023-03-30 05:31:57 +01:00
c8404d9805 added a danish alignment model 2023-03-04 13:20:40 +01:00
cfcede41f6 Added Python 3.7 compatibility
- removed use of walrus operator in favor of `np.cumsum`
2023-03-02 15:46:07 +01:00
2e307814dd added if clause for checking 2023-02-10 14:48:51 +05:30
9f26112d5c added turkish wav2vec2 model 2023-02-01 21:38:50 +02:00
fd2a093754 Merge pull request #55 from jonatasgrosman/main
FIX: Error when loading Hugging Face's models with embedded LM
2023-02-01 10:27:45 +00:00
d294e29ad9 fix: error when loading huggingface model with embedded language model 2023-01-31 23:24:26 -03:00
0eae9e1f50 added several wav2vec2 models by jonatasgrosman
Since his models were already used for other languages and I tested the Arabic model myself, I assumed it is safe to include all the available models.
2023-02-01 03:02:10 +02:00
1b08661e42 change arabic model to jonatasgrosman 2023-01-31 19:32:31 +02:00
a49799294b add arabic wav2vec2 model from elgeish 2023-01-31 19:07:48 +02:00
76f79f600a fix short seg timestamps bug 2023-01-28 19:04:19 +00:00
c19cf407d8 handle non-alignable whole segments 2023-01-28 13:53:03 +00:00
5b8c8a7bd3 pandas fix 2023-01-27 15:05:08 +00:00
16d24b1c96 only pad timestamps if not using VAD 2023-01-26 10:46:13 +00:00
286a2f2c14 clean up logic, use pandas where possible 2023-01-25 18:42:52 +00:00
5a668a7d80 fallback on whisper alignment failures, update readme 2023-01-05 11:15:19 +00:00
9f6fa61160 init commit 2022-12-14 18:59:12 +00:00