| BLKSerene/Wordless | 649 | | 0 | 0 | over 2 years ago | 0 | | 0 | gpl-3.0 | Python | An Integrated Corpus Tool With Multilingual Support for the Study of Language, Literature, and Translation |
| neurosnap/sentences | 391 | | 31 | 127 | over 2 years ago | 7 | May 26, 2021 | 5 | mit | Go | A multilingual command line sentence tokenizer in Golang |
| artitw/text2text | 268 | | 0 | 0 | about 2 years ago | 134 | October 21, 2023 | 27 | other | Python | Text2Text: Crosslingual NLP/G toolkit |
| bitextor/bitextor | 260 | | 0 | 0 | over 2 years ago | 0 | | 4 | gpl-3.0 | Python | Bitextor generates translation memories from multilingual websites |
| winkjs/wink-tokenizer | 47 | | 29 | 15 | about 4 years ago | 19 | January 27, 2022 | 0 | mit | JavaScript | Multilingual tokenizer that automatically tags each token with its type |
| hottolink/hottoSNS-bert | 41 | | 0 | 0 | almost 5 years ago | 0 | | 2 | other | Python | hottoSNS-BERT: a distributed sentence representation model trained on a large-scale SNS corpus |
| yeontaek/BERT-Korean-Model | 34 | | 0 | 0 | over 6 years ago | 0 | | 1 | apache-2.0 | | BERT with SentencePiece for Korean text |
| jonsafari/tok-tok | 26 | | 0 | 0 | almost 9 years ago | 0 | | 1 | apache-2.0 | Python | A fast, simple, multilingual tokenizer |
| jerinphilip/ilmulti | 12 | | 0 | 0 | over 5 years ago | 2 | August 30, 2020 | 4 | mit | Python | Tooling to play around with multilingual machine translation for Indian Languages. |
| liuzl/tokenizer | 11 | | 0 | 0 | over 7 years ago | 1 | November 28, 2018 | 0 | apache-2.0 | Go | Natural Language Tokenizer |