| thammegowda/mtdata | 115 | | 0 | 0 | almost 3 years ago | 21 | November 25, 2022 | 22 | apache-2.0 | Python |
| A tool that locates, downloads, and extracts machine translation corpora |
| veraPDF/veraPDF-corpus | 66 | | 0 | 0 | over 2 years ago | 0 | | 8 | | |
| veraPDF test corpus for ISO 19005 (PDF/A) and ISO 14289 (PDF/UA) |
| pdf-association/pdf-corpora | 60 | | 0 | 0 | almost 3 years ago | 0 | | 0 | cc-by-4.0 | |
| An index of PDF-centric corpora |
| ColingPaper2018/DialogueAct-Tagger | 42 | | 0 | 0 | over 4 years ago | 0 | | 5 | | HTML |
| A resource to create a multi domain Dialog Act Tagger for conversational agents using publicly available data |
| global-asp/asp-source | 18 | | 0 | 0 | about 3 years ago | 0 | | 1 | | |
| Source stories from the African Storybook Project in Markdown format |
| vchahun/fast_umorph | 6 | | 0 | 0 | almost 13 years ago | 0 | | 1 | | C++ |
| Unsupervised morphology induction with OpenFst |