| go-ego/gse | 2,352 | | 14 | 21 | over 2 years ago | 82 | January 16, 2023 | 12 | apache-2.0 | Go |
| Go efficient multilingual NLP and text segmentation; support English, Chinese, Japanese and others. |
| JasonKessler/scattertext | 2,131 | | 8 | 2 | over 2 years ago | 148 | April 18, 2023 | 22 | apache-2.0 | Python |
| Beautiful visualizations of how language differs among document types. |
| google/budou | 1,135 | | 1 | 0 | over 3 years ago | 36 | November 07, 2019 | 6 | apache-2.0 | Python |
| Budou is an automatic organizer tool for beautiful line breaking in CJK (Chinese, Japanese, and Korean). |
| megagonlabs/ginza | 676 | | 0 | 12 | over 2 years ago | 19 | September 25, 2023 | 11 | mit | Python |
| A Japanese NLP Library using spaCy as framework based on Universal Dependencies |
| taishi-i/awesome-japanese-nlp-resources | 522 | | 0 | 0 | about 2 years ago | 0 | | 0 | cc0-1.0 | |
| A curated list of resources dedicated to Python libraries, LLMs, dictionaries, and corpora of NLP for Japanese |
| rinnakk/japanese-pretrained-models | 479 | | 0 | 0 | over 3 years ago | 0 | | 3 | apache-2.0 | Python |
| Code for producing Japanese pretrained models provided by rinna Co., Ltd. |
| taishi-i/nagisa | 365 | | 1 | 7 | about 2 years ago | 22 | July 30, 2023 | 4 | mit | Python |
| A Japanese tokenizer based on recurrent neural networks |
| miurahr/pykakasi | 349 | | 11 | 18 | over 3 years ago | 49 | April 14, 2022 | 1 | gpl-3.0 | Python |
| Lightweight converter from Japanese Kana-kanji sentences into Kana-Roman. |
| polm/fugashi | 339 | | 0 | 39 | over 2 years ago | 67 | August 25, 2023 | 5 | mit | C++ |
| A Cython MeCab wrapper for fast, pythonic Japanese tokenization and morphological analysis. |
| ku-nlp/jumanpp | 334 | | 0 | 0 | about 3 years ago | 0 | | 30 | apache-2.0 | C++ |
| Juman++ (a Morphological Analyzer Toolkit) |