| Morizeyao/GPT2-Chinese |
7,249 |
|
0 |
0 |
over 2 years ago |
0 |
|
105 |
mit |
Python |
| Chinese version of GPT2 training code, using BERT tokenizer. |
| ckiplab/ckip-transformers |
439 |
|
0 |
0 |
about 3 years ago |
0 |
|
1 |
gpl-3.0 |
Python |
| CKIP Transformers |
| wangfenjin/simple |
411 |
|
0 |
0 |
over 2 years ago |
0 |
|
10 |
mit |
C++ |
| A SQLite3 FTS5 full-text search extension and tokenizer that supports Chinese and Pinyin |
| linonetwo/segmentit |
208 |
|
1 |
6 |
about 3 years ago |
17 |
December 22, 2019 |
6 |
mit |
JavaScript |
| A Chinese word segmentation package usable in any JS environment, forked from leizongmin/node-segment |
| haifengkao/SqliteSubstringSearch |
76 |
|
0 |
0 |
over 10 years ago |
0 |
|
0 |
|
C |
| An open source tokenizer which supports fast substring search with sqlite FTS (full text search) |
| DCjanus/cang-jie |
65 |
|
0 |
6 |
over 2 years ago |
20 |
November 04, 2023 |
0 |
mit |
Rust |
| Chinese tokenizer for tantivy, based on jieba-rs |
| KoichiYasuoka/UD-Kanbun |
59 |
|
1 |
2 |
over 2 years ago |
249 |
September 25, 2023 |
0 |
mit |
Python |
| Tokenizer, POS-tagger, and dependency parser for Classical Chinese |
| howl-anderson/rasa_chinese |
46 |
|
0 |
0 |
over 4 years ago |
0 |
|
1 |
apache-2.0 |
Python |
| rasa_chinese is a rasa component extension package built specifically for the Chinese language, providing many Chinese-specific components |
| yishn/chinese-tokenizer |
39 |
|
3 |
7 |
over 5 years ago |
11 |
June 05, 2019 |
2 |
mit |
JavaScript |
| Tokenizes Chinese texts into words. |
| hscspring/pnlp |
25 |
|
1 |
1 |
over 2 years ago |
38 |
December 25, 2022 |
0 |
apache-2.0 |
Python |
| NLP pre-/post-processing tools. |