| natasha/corus |
254 |
|
0 |
0 |
over 2 years ago |
10 |
July 24, 2023 |
66 |
mit |
Jupyter Notebook |
| Links to Russian corpora + Python functions for loading and parsing |
| ajinkyakulkarni14/TED-Multilingual-Parallel-Corpus |
152 |
|
0 |
0 |
over 10 years ago |
0 |
|
6 |
|
|
| TED Parallel Corpora is a growing collection of bilingual parallel corpora, multilingual parallel corpora, and monolingual corpora extracted from TED talks (www.ted.com) for 109 world languages. |
| UniversalDependencies/UD_Russian-SynTagRus |
77 |
|
0 |
0 |
over 2 years ago |
0 |
|
16 |
other |
Perl |
| Russian data from the SynTagRus corpus. |
| maxoodf/russian_news_corpus |
76 |
|
0 |
0 |
about 9 years ago |
0 |
|
1 |
apache-2.0 |
|
| Russian mass media stemmed texts corpus / Corpus of lemmatized (morphologically normalized) Russian mass media texts |
| l4rz/gpt-2-training |
65 |
|
0 |
0 |
about 5 years ago |
0 |
|
7 |
|
Python |
| Training GPT-2 on a Russian language corpus |
| TatianaShavrina/taiga_site |
54 |
|
0 |
0 |
almost 6 years ago |
0 |
|
6 |
|
CSS |
| natasha/nerus |
51 |
|
0 |
0 |
over 2 years ago |
7 |
April 09, 2020 |
0 |
mit |
Python |
| Large silver-standard Russian corpus with NER, morphology and syntax markup |
| dialogue-evaluation/morphoRuEval-2017 |
41 |
|
0 |
0 |
over 8 years ago |
0 |
|
13 |
other |
Python |
| noise-field/Russian-ULMFit |
27 |
|
0 |
0 |
about 6 years ago |
0 |
|
0 |
|
Jupyter Notebook |
| AWD-LSTM language model trained on newspaper corpora with fast.ai |
| aatimofeev/spacy_russian_tokenizer |
26 |
|
0 |
0 |
almost 7 years ago |
0 |
|
1 |
|
Python |
| Custom Russian tokenizer for spaCy |