| explosion/sense2vec | 1,486 | | 6 | 7 | almost 3 years ago | 24 | April 19, 2021 | 20 | mit | Python | 🦆 Contextually-keyed word vectors |
| gagolews/stringi | 285 | | 0 | 0 | over 2 years ago | 0 | | 42 | other | C++ | Fast and portable character string processing in R (with the Unicode ICU) |
| OpenNMT/Tokenizer | 224 | | 15 | 5 | over 2 years ago | 68 | January 11, 2023 | 2 | mit | C++ | Fast and customizable text tokenization library with BPE and SentencePiece support |
| kavgan/ROUGE-2.0 | 145 | | 0 | 0 | about 6 years ago | 0 | | 1 | apache-2.0 | Java | ROUGE automatic summarization evaluation toolkit. Support for ROUGE-[N, L, S, SU], stemming and stopwords in different languages, unicode text evaluation, CSV output. |
| Flight-School/Guide-to-Swift-Strings-Sample-Code | 124 | | 0 | 0 | almost 7 years ago | 0 | | 1 | | Swift | Xcode Playground Sample Code for the Flight School Guide to Swift Strings |
| miurahr/unihandecode | 71 | | 0 | 0 | over 3 years ago | 17 | July 23, 2020 | 1 | gpl-3.0 | Python | unihandecode is a transliteration library that converts all Unicode characters/words into the ASCII alphabet and is aware of language preference priorities |
| lifeparticle/Bengali-Alphabet | 51 | | 0 | 0 | over 2 years ago | 0 | | 1 | mit | JavaScript | ✍️ Bengali alphabet (বাংলা বর্ণমালা) |
| clipperhouse/uax29 | 35 | | 0 | 6 | over 2 years ago | 40 | May 26, 2023 | 1 | mit | Go | A tokenizer based on Unicode text segmentation (UAX #29), for Go. Split words, sentences and graphemes. |
| gagolews/stringx | 25 | | 0 | 0 | over 2 years ago | 0 | | 9 | other | HTML | Drop-in replacements for base R string functions powered by stringi |
| urduhack/urdu-characters | 18 | | 0 | 0 | about 5 years ago | 0 | | 0 | mit | Python | 📄 Complete collection of Urdu language characters & unicode code points. |