Corpora Alternatives

A collection of small corpuses of interesting data for the creation of bots and similar stuff.
Alternatives To dariusk/corpora
Project Name Stars Downloads Repos Using This Packages Using This Most Recent Commit Total Releases Latest Release Open Issues License Language
nltk/nltk 12,699 10,496 2,261 about 2 years ago 59 July 20, 2023 268 apache-2.0 Python
NLTK Source
brightmart/nlp_chinese_corpus 8,344 0 0 almost 3 years ago 0 20 mit
Large Scale Chinese Corpus for NLP
nl8590687/ASRT_SpeechRecognition 7,253 0 0 about 2 years ago 1 October 23, 2020 101 gpl-3.0 Python
A Deep-Learning-Based Chinese Speech Recognition System
stanfordnlp/GloVe 6,480 0 0 over 2 years ago 0 80 apache-2.0 C
Software in C and data files for the popular GloVe model for distributed word representations, a.k.a. word vectors or embeddings
codertimo/BERT-pytorch 5,605 1 0 over 2 years ago 5 October 23, 2018 63 apache-2.0 Python
Google AI 2018 BERT pytorch implementation
ibab/tensorflow-wavenet 5,362 0 0 almost 3 years ago 0 176 mit Python
A TensorFlow implementation of DeepMind's WaveNet paper
niderhoff/nlp-datasets 5,235 0 0 over 3 years ago 0 7
Alphabetical list of free/public domain datasets with text data for use in Natural Language Processing (NLP)
vespa-engine/vespa 5,115 5 58 about 2 years ago 741 November 30, 2023 175 apache-2.0 Java
AI + Data, online. https://vespa.ai
shibing624/pycorrector 4,928 0 1 about 2 years ago 30 November 07, 2023 27 apache-2.0 Python
pycorrector is a toolkit for text error correction. It applies models such as Kenlm, T5, MacBERT, ChatGLM3, and LLaMA to error-correction scenarios and works out of the box.
dariusk/corpora 4,757 0 2 over 2 years ago 1 May 17, 2018 15 JavaScript
A collection of small corpuses of interesting data for the creation of bots and similar stuff.

Downloads, Dependent Repos, Dependent Packages, Total Releases, Latest Releases data powered by Libraries.io.

Copyright 2018-2026 Awesome Open Source.  All rights reserved.