| speechbrain/speechbrain | 7,166 | | 0 | 0 | about 2 years ago | 0 | | 149 | apache-2.0 | Python |
| A PyTorch-based Speech Toolkit |
| pliang279/awesome-multimodal-ml | 4,999 | | 0 | 0 | over 2 years ago | 0 | | 8 | mit | |
| Reading list for research topics in multimodal machine learning |
| pyannote/pyannote-audio | 4,460 | | 1 | 13 | about 2 years ago | 24 | December 01, 2023 | 95 | mit | Jupyter Notebook |
| Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding |
| microsoft/torchscale | 2,804 | | 0 | 8 | about 2 years ago | 5 | October 20, 2023 | 18 | mit | Python |
| Foundation Architecture for (M)LLMs |
| r9y9/deepvoice3_pytorch | 1,906 | | 0 | 0 | over 2 years ago | 0 | | 43 | other | Python |
| PyTorch implementation of convolutional neural networks-based text-to-speech synthesis models |
| r9y9/wavenet_vocoder | 1,617 | | 0 | 0 | over 5 years ago | 0 | | 14 | other | Python |
| WaveNet vocoder |
| wq2012/awesome-diarization | 1,384 | | 0 | 0 | about 2 years ago | 0 | | 3 | apache-2.0 | |
| A curated list of awesome Speaker Diarization papers, libraries, datasets, and other resources. |
| linto-ai/whisper-timestamped | 1,217 | | 0 | 3 | about 2 years ago | 3 | December 08, 2023 | 15 | agpl-3.0 | Python |
| Multilingual Automatic Speech Recognition with word-level timestamps and confidence |
| midas-research/audino | 988 | | 0 | 0 | over 2 years ago | 0 | | 52 | mit | JavaScript |
| Open source audio annotation tool for humans |
| coqui-ai/open-speech-corpora | 830 | | 0 | 0 | over 3 years ago | 0 | | 166 | mit | |
| 💎 A list of accessible speech corpora for ASR, TTS, and other Speech Technologies |