da458863d7
allow custom model_dir for torchaudio models
2023-04-14 21:40:36 +01:00
cf252a8592
allow custom path for vad model
2023-04-14 15:02:58 +01:00
6a72b61564
clamp end_timestamp to prevent infinite loop
2023-04-11 20:15:37 +01:00
bb15c9428f
optimize the inference loop
2023-04-09 15:58:55 +08:00
4146e56d5b
Added vad_filter type
2023-04-05 17:11:29 +05:00
70a4a0a25c
Fix typo
2023-04-05 10:50:49 +09:00
a582a59493
mkdir for torch cache in case it doesn't exist
2023-04-01 13:05:40 -07:00
189aeac83e
v2 lets goo
2023-04-01 00:10:45 +01:00
11a78d7ced
handle tmp wav file better
2023-04-01 00:06:40 +01:00
b9ca701d69
.wav conversion, handle audio with no detected speech
2023-03-31 23:02:38 +01:00
d0fa028045
fix tfile naming
2023-03-30 19:24:42 +01:00
ae4a9de307
add vad model external dl
2023-03-30 18:57:55 +01:00
18b63d46e2
skeleton v2
2023-03-30 05:31:57 +01:00
33dd3b9bcd
Update decoding.py
...
Changes from https://github.com/openai/whisper/pull/914/
2023-03-24 11:56:41 +01:00
cea42ca470
Fix hugging face error
...
Model should be loaded with an id to avoid this error:
huggingface_hub.utils._validators.HFValidationError: Repo id must use alphanumeric chars or '-', '_', '.', '--' and '..' are forbidden, '-' and '.' cannot start or end the name, max length is 96: 'pyannote\segmentation'.
2023-03-04 19:12:13 +01:00
cfcede41f6
Added Python 3.7 compatibility
...
- removed use of walrus operator in favor of `np.cumsum`
2023-03-02 15:46:07 +01:00
847a3cd85b
Merge pull request #96 from smly/fix-batch-processing
...
FIX: Assertion error in batch processing
2023-02-22 12:11:01 +00:00
57f5957e0e
Pass device to pyannote.audio.Inference
2023-02-22 03:48:20 +09:00
27fe502344
Fix assertion error in batch processing
2023-02-22 02:45:13 +09:00
a1d2229416
Improvement to transcription starting point with VAD
2023-02-18 11:12:23 -05:00
2e307814dd
added if clause for checking
2023-02-10 14:48:51 +05:30
d687cf3358
Merge pull request #58 from MahmoudAshraf97/main
...
added turkish wav2vec2 model
2023-02-01 22:11:51 +00:00
0a3fd11562
update readme
2023-02-01 22:09:11 +00:00
039af89a86
support batch processing
2023-02-01 19:41:20 +00:00
9f26112d5c
added turkish wav2vec2 model
2023-02-01 21:38:50 +02:00
fd2a093754
Merge pull request #55 from jonatasgrosman/main
...
FIX: Error when loading Hugging Face's models with embedded LM
2023-02-01 10:27:45 +00:00
d294e29ad9
fix: error when loading huggingface model with embedded language model
2023-01-31 23:24:26 -03:00
0eae9e1f50
added several wav2vec2 models by jonatasgrosman
...
Since his models were already used for other languages and I tested the Arabic model myself, I assumed it is safe to include all the available models.
2023-02-01 03:02:10 +02:00
1b08661e42
change arabic model to jonatasgrosman
2023-01-31 19:32:31 +02:00
a49799294b
add arabic wav2vec2 model from elgeish
2023-01-31 19:07:48 +02:00
76f79f600a
fix short seg timestamps bug
2023-01-28 19:04:19 +00:00
50f3965fdb
fix tsv file ext
2023-01-28 17:39:07 +00:00
df2b1b70cb
increase vad cut default
2023-01-28 14:49:53 +00:00
c19cf407d8
handle non-alignable whole segments
2023-01-28 13:53:03 +00:00
8081ef2dcd
add custom vad binarization for vad cut
2023-01-28 00:22:33 +00:00
c6dbac76c8
cut up vad segments when too long to prevent OOM
2023-01-28 00:01:39 +00:00
5b8c8a7bd3
pandas fix
2023-01-27 15:05:08 +00:00
16d24b1c96
only pad timestamps if not using VAD
2023-01-26 10:46:13 +00:00
e7773358a3
Update transcribe.py
...
added the ability to include HF access token in order to use PyAnnote models
2023-01-26 00:42:35 +02:00
58d7191949
add diarize
2023-01-25 19:40:41 +00:00
286a2f2c14
clean up logic, use pandas where possible
2023-01-25 18:42:52 +00:00
eec6d1f8d8
missing word timestamps
2023-01-24 16:37:19 +00:00
d1600e5b0f
Merge branch 'main' of https://github.com/m-bain/whisperX into main
...
Conflicts:
whisperx/transcribe.py
whisperx/utils.py
2023-01-24 15:38:05 +00:00
d395c21b83
new logic, diarization, vad filtering
2023-01-24 15:02:08 +00:00
ba102feb7f
vad filter
2023-01-20 12:54:20 +00:00
4569cb982a
fix file_ass display bug
...
Sentence start time in .ass files had a bug: if the first word did not have a timestamp, the sentence start_time was set to 0. This needs to be the local 0, not the actual file 0 (i.e. it should be segment['start']).
2023-01-12 12:57:12 +00:00
7adead16e0
Update pt model to wav2vec2-large-xlsr-53-portuguese
2023-01-11 19:50:34 -03:00
7459bf8ad0
Add PT (pt-br) align support
2023-01-11 12:11:41 -03:00
d51353a4b6
uncomment .ass
2023-01-08 18:02:36 +00:00
78c87d3bfd
handle negative / tiny duration segments, final
2023-01-08 14:01:10 +00:00