From 6b64cb079a1ad6aa2370669803a3aafabc71e6e0 Mon Sep 17 00:00:00 2001
From: m-bain <36994049+m-bain@users.noreply.github.com>
Date: Sun, 18 Dec 2022 18:43:33 +0000
Subject: [PATCH] add arch figure, citation
---
README.md | 42 ++++++++++++++++++++++++++++++++++++++++++
1 file changed, 42 insertions(+)
diff --git a/README.md b/README.md
index eb68222..195dac7 100644
--- a/README.md
+++ b/README.md
@@ -17,6 +17,9 @@ This repository refines the timestamps of openAI's Whisper model via forced alig
**Forced Alignment** refers to the process by which orthographic transcriptions are aligned to audio recordings to automatically generate phone level segmentation.
+
+
+
Setup ⚙️
Install this package using
@@ -98,7 +101,46 @@ https://user-images.githubusercontent.com/36994049/208298819-6f462b2c-8cae-4c54-
Contact maxbain[at]robots[dot]ox[dot]ac[dot]uk if using this for commercial purposes.
+
Acknowledgements 🙏
Of course, this is mostly just a modification to [openAI's whisper](https://github.com/openai/whisper).
As well as accreditation to this [PyTorch tutorial on forced alignment](https://pytorch.org/tutorials/intermediate/forced_alignment_with_torchaudio_tutorial.html)
+
+
+Citation
+If you use this in your research, just cite the repo,
+
+```bibtex
+@misc{bain2022whisperx,
+ author = {Bain, Max},
+ title = {WhisperX},
+ year = {2022},
+ publisher = {GitHub},
+ journal = {GitHub repository},
+ howpublished = {\url{https://github.com/m-bain/whisperX}},
+}
+```
+
+as well as the whisper paper,
+
+```bibtex
+@article{radford2022robust,
+ title={Robust speech recognition via large-scale weak supervision},
+ author={Radford, Alec and Kim, Jong Wook and Xu, Tao and Brockman, Greg and McLeavey, Christine and Sutskever, Ilya},
+ journal={arXiv preprint arXiv:2212.04356},
+ year={2022}
+}
+```
+and any alignment model used, e.g. wav2vec 2.0.
+
+```bibtex
+@article{baevski2020wav2vec,
+ title={wav2vec 2.0: A framework for self-supervised learning of speech representations},
+ author={Baevski, Alexei and Zhou, Yuhao and Mohamed, Abdelrahman and Auli, Michael},
+ journal={Advances in Neural Information Processing Systems},
+ volume={33},
+ pages={12449--12460},
+ year={2020}
+}
+```