| dbiir/UER-py | 2,802 | | 0 | 0 | over 2 years ago | 0 | | 132 | apache-2.0 | Python |
| Open Source Pre-training Model Framework in PyTorch & Pre-trained Model Zoo |
| foxbook/atap | 367 | | 0 | 0 | over 3 years ago | 0 | | 13 | apache-2.0 | Python |
| Code for Applied Text Analysis with Python |
| aparrish/gutenberg-dammit | 108 | | 0 | 0 | about 7 years ago | 0 | | 8 | | Python |
| I wanted all of plaintext Project Gutenberg in an easy-to-use format, so I made this |
| aparrish/gutenberg-poetry-corpus | 83 | | 0 | 0 | over 7 years ago | 0 | | 2 | | Jupyter Notebook |
| A corpus of poetry from Project Gutenberg |
| pgcorpus/gutenberg | 74 | | 0 | 0 | over 3 years ago | 0 | | 2 | gpl-3.0 | Python |
| Pipeline to generate the Standardized Project Gutenberg Corpus |
| hackerb9/gwordlist | 68 | | 0 | 0 | over 3 years ago | 0 | | 2 | | Shell |
| All the words from Google Books, sorted by frequency |
| LG-1/video_music_book_datasets | 57 | | 0 | 0 | over 5 years ago | 0 | | 1 | mit | |
| NLP NER datasets video/music/book bio |
| c-w/gutenberg-http | 54 | | 0 | 0 | over 6 years ago | 0 | | 2 | apache-2.0 | Python |
| A HTTP interface to the Project Gutenberg corpus. |
| wainshine/Book-Names-Corpus | 45 | | 0 | 0 | over 5 years ago | 0 | | 0 | apache-2.0 | |
| A corpus of book titles. Also includes some movie and game titles. |
| proiel/proiel-treebank | 31 | | 0 | 0 | almost 3 years ago | 0 | | 2 | | |
| Official releases of the PROIEL treebank of ancient Indo-European languages |