| quanteda/readtext |
112 |
|
5 |
4 |
about 2 years ago |
10 |
June 03, 2023 |
30 |
|
R |
| An R package for reading text files |
| amir-zeldes/gum |
76 |
|
0 |
0 |
over 2 years ago |
0 |
|
6 |
other |
Python |
| Repository for the Georgetown University Multilayer Corpus (GUM) |
| tommasoc80/EventStoryLine |
70 |
|
0 |
0 |
over 2 years ago |
0 |
|
3 |
other |
DM |
| Event StoryLine Corpus - annotated data, baselines and evaluation scripts, evaluation data. |
| proycon/folia |
60 |
|
2 |
2 |
over 2 years ago |
93 |
October 08, 2021 |
21 |
gpl-3.0 |
Python |
| FoLiA: Format for Linguistic Annotation. FoLiA is a rich XML-based annotation format for representing language resources (including corpora) with linguistic annotations. It supports a wide variety of linguistic annotations, which makes it a useful format for NLP tasks and data interchange. Note that the actual Python library for processing FoLiA is implemented as part of PyNLPl; this repository contains higher-level tools that use the library, as well as the full documentation, validation schemas, and set definitions. |
| UCDenver-ccp/CRAFT |
58 |
|
0 |
0 |
over 3 years ago |
0 |
|
1 |
other |
Clojure |
| adobe-research/deft_corpus |
57 |
|
0 |
0 |
about 6 years ago |
0 |
|
6 |
other |
Python |
| The Definition Extraction From Text corpus and relevant formatting scripts |
| dumitrescustefan/ronec |
54 |
|
0 |
0 |
over 3 years ago |
0 |
|
0 |
mit |
Python |
| Romanian Named Entity Corpus (RONEC) version 2.0 |
| GateNLP/broad_twitter_corpus |
52 |
|
0 |
0 |
almost 4 years ago |
0 |
|
9 |
other |
Jupyter Notebook |
| The Broad Twitter Corpus, an NER dataset in English stratified for time, location, social media genre, socioeconomic factors |
| dialogue-evaluation/morphoRuEval-2017 |
41 |
|
0 |
0 |
over 8 years ago |
0 |
|
13 |
other |
Python |
| arne-cl/discoursegraphs |
34 |
|
1 |
1 |
about 5 years ago |
18 |
March 14, 2021 |
46 |
bsd-3-clause |
Python |
| Linguistic converter / merging tool for multi-level annotated corpora. Graph-based (using Python and NetworkX). |