| shentianxiao/language-style-transfer | 491 | | 0 | 0 | about 5 years ago | 0 | | 21 | apache-2.0 | Roff |
| bitextor/bicleaner | 134 | | 0 | 1 | over 2 years ago | 37 | February 09, 2024 | 0 | gpl-3.0 | Python |
| Bicleaner is a parallel corpus classifier/cleaner that aims at detecting noisy sentence pairs in a parallel corpus. |
| christos-c/bible-corpus | 134 | | 0 | 0 | almost 3 years ago | 0 | | 2 | cc0-1.0 | |
| A multilingual parallel corpus created from translations of the Bible. |
| jungyeul/korean-parallel-corpora | 129 | | 0 | 0 | about 3 years ago | 0 | | 1 | | |
| Korean Parallel Corpus |
| fnielsen/awesome-danish | 110 | | 0 | 0 | about 3 years ago | 0 | | 0 | other | |
| A curated list of awesome resources for Danish language technology |
| averkij/lingtrain-aligner | 98 | | 0 | 0 | over 2 years ago | 53 | November 26, 2023 | 3 | gpl-3.0 | Python |
| Lingtrain Aligner: an ML-powered library for accurate text alignment. |
| odashi/small_parallel_enja | 61 | | 0 | 0 | over 6 years ago | 0 | | 0 | | Roff |
| 50k English-Japanese Parallel Corpus for Machine Translation Benchmark. |
| clab/wikipedia-parallel-titles | 53 | | 0 | 0 | about 11 years ago | 0 | | 4 | | Perl |
| Tools for extracting parallel corpora from article titles across languages in Wikipedia |
| FerreroJeremy/Cross-Language-Dataset | 50 | | 0 | 0 | almost 9 years ago | 0 | | 1 | other | |
| A multilingual, multi-style and multi-granularity dataset for cross-language textual similarity detection |
| pywirrarika/naki | 49 | | 0 | 0 | over 5 years ago | 0 | | 0 | gpl-3.0 | |
| A list of NLP research and engineering work for American Native/Indigenous languages. |