| shangjingbo1226/AutoPhrase | 978 | | 0 | 0 | about 4 years ago | 3 | November 19, 2020 | 6 | apache-2.0 | C++ |
| AutoPhrase: Automated Phrase Mining from Massive Text Corpora |
| cbaziotis/ekphrasis | 583 | | 7 | 0 | over 3 years ago | 54 | May 17, 2022 | 18 | mit | Python |
| Ekphrasis is a text processing tool geared towards text from social networks such as Twitter or Facebook. It performs tokenization, word normalization, word segmentation (for splitting hashtags), and spell correction, using word statistics from two large corpora (English Wikipedia and 330 million English tweets from Twitter). |
| Moonshile/ChineseWordSegmentation | 427 | | 0 | 0 | over 5 years ago | 0 | | 2 | mit | Python |
| Chinese word segmentation algorithm that requires no corpus |
| rkcosmos/deepcut | 319 | | 7 | 3 | over 5 years ago | 30 | November 06, 2019 | 0 | mit | Python |
| A Thai word tokenization library using a deep neural network |
| jacksonllee/pycantonese | 290 | | 0 | 0 | almost 3 years ago | 24 | December 28, 2021 | 5 | mit | Python |
| Cantonese Linguistics and NLP |
| grantjenks/python-wordsegment | 268 | | 0 | 0 | about 6 years ago | 0 | | 8 | other | Python |
| English word segmentation, written in pure Python and based on a trillion-word corpus. |
| hankcs/multi-criteria-cws | 260 | | 0 | 0 | about 7 years ago | 0 | | 6 | gpl-3.0 | Python |
| Simple Solution for Multi-Criteria Chinese Word Segmentation |
| guokr/gkseg | 242 | | 0 | 0 | about 13 years ago | 0 | | 3 | other | C |
| Yet another Chinese word segmentation package, based on character-based tagging heuristics and the CRF algorithm |
| sertiscorp/thai-word-segmentation | 66 | | 0 | 0 | about 6 years ago | 0 | | 6 | mit | Python |
| Thai word segmentation with a bidirectional RNN |
| sunpinyin/open-gram | 59 | | 0 | 0 | about 10 years ago | 0 | | 2 | | Python |
| An open solution for collecting a Chinese n-gram lexicon and n-gram statistics |