| pliang279/awesome-multimodal-ml | 4,999 | | 0 | 0 | over 2 years ago | 0 | | 8 | mit | | Reading list for research topics in multimodal machine learning |
| microsoft/torchscale | 2,804 | | 0 | 8 | about 2 years ago | 5 | October 20, 2023 | 18 | mit | Python | Foundation Architecture for (M)LLMs |
| r9y9/deepvoice3_pytorch | 1,906 | | 0 | 0 | over 2 years ago | 0 | | 43 | other | Python | PyTorch implementation of convolutional neural networks-based text-to-speech synthesis models |
| wq2012/awesome-diarization | 1,384 | | 0 | 0 | about 2 years ago | 0 | | 3 | apache-2.0 | | A curated list of awesome Speaker Diarization papers, libraries, datasets, and other resources. |
| linto-ai/whisper-timestamped | 1,217 | | 0 | 3 | about 2 years ago | 3 | December 08, 2023 | 15 | agpl-3.0 | Python | Multilingual Automatic Speech Recognition with word-level timestamps and confidence |
| midas-research/audino | 988 | | 0 | 0 | over 2 years ago | 0 | | 52 | mit | JavaScript | Open source audio annotation tool for humans |
| drethage/speech-denoising-wavenet | 414 | | 0 | 0 | over 6 years ago | 0 | | 29 | mit | Python | A neural network for end-to-end speech denoising |
| r9y9/nnmnkwii | 375 | | 15 | 1 | about 3 years ago | 26 | January 04, 2022 | 6 | other | Python | Library to build speech synthesis systems designed for easy and fast prototyping. |
| novoic/surfboard | 369 | | 0 | 0 | about 4 years ago | 5 | July 17, 2020 | 8 | gpl-3.0 | Python | Novoic's audio feature extraction library |
| pliang279/MultiBench | 356 | | 0 | 0 | over 2 years ago | 0 | | 10 | mit | HTML | [NeurIPS 2021] Multiscale Benchmarks for Multimodal Representation Learning |