| speechbrain/speechbrain | 7,166 | | 0 | 0 | about 2 years ago | 0 | | 149 | apache-2.0 | Python |
| A PyTorch-based Speech Toolkit |
| pliang279/awesome-multimodal-ml | 4,999 | | 0 | 0 | over 2 years ago | 0 | | 8 | mit | |
| Reading list for research topics in multimodal machine learning |
| wq2012/awesome-diarization | 1,384 | | 0 | 0 | about 2 years ago | 0 | | 3 | apache-2.0 | |
| A curated list of awesome Speaker Diarization papers, libraries, datasets, and other resources. |
| linto-ai/whisper-timestamped | 1,217 | | 0 | 3 | over 2 years ago | 3 | December 08, 2023 | 15 | agpl-3.0 | Python |
| Multilingual Automatic Speech Recognition with word-level timestamps and confidence |
| mravanelli/SincNet | 764 | | 0 | 0 | about 5 years ago | 0 | | 22 | mit | Python |
| SincNet is a neural architecture for efficiently processing raw audio samples. |
| breizhn/DTLN | 470 | | 0 | 0 | over 2 years ago | 0 | | 31 | mit | Python |
| Tensorflow 2.x implementation of the DTLN real time speech denoising model. With TF-lite, ONNX and real-time audio processing support. |
| DigitalPhonetics/IMS-Toucan | 426 | | 0 | 0 | about 2 years ago | 0 | | 29 | apache-2.0 | Python |
| Text-to-Speech Toolkit of the Speech and Language Technologies Group at the University of Stuttgart. Objectives of the development are simplicity, modularity, controllability and multilinguality. |
| drethage/speech-denoising-wavenet | 414 | | 0 | 0 | over 6 years ago | 0 | | 29 | mit | Python |
| A neural network for end-to-end speech denoising |
| SforAiDl/Neural-Voice-Cloning-With-Few-Samples | 379 | | 0 | 0 | about 5 years ago | 0 | | 0 | mit | Python |
| This repository has implementation for "Neural Voice Cloning With Few Samples" |
| pliang279/MultiBench | 356 | | 0 | 0 | over 2 years ago | 0 | | 10 | mit | HTML |
| [NeurIPS 2021] Multiscale Benchmarks for Multimodal Representation Learning |