Awesome Open Source
The Top 10 Tokenizer Open Source Projects
Open source projects categorized as Tokenizer
google/sentencepiece
⭐ 8,851
Unsupervised text tokenizer for Neural Network-based text generation.
Dependent packages: 0
Total releases: 0
Most recent commit: about 2 years ago
huggingface/tokenizers
⭐ 8,056
💥 Fast State-of-the-Art Tokenizers optimized for Research and Production
Dependent packages: 0
Total releases: 0
Most recent commit: about 2 years ago
Morizeyao/GPT2-Chinese
⭐ 7,249
Chinese version of GPT2 training code, using BERT tokenizer.
Dependent packages: 0
Total releases: 0
Most recent commit: over 2 years ago
sebastianbergmann/php-token-stream
⭐ 6,457
Wrapper around PHP's tokenizer extension.
Dependent packages: 0
Total releases: 0
Most recent commit: over 4 years ago
theseer/tokenizer
⭐ 5,084
A small library for converting tokenized PHP source code into XML (and potentially other formats)
Dependent packages: 0
Total releases: 0
Most recent commit: over 2 years ago
sindresorhus/file-type
⭐ 3,366
Detect the file type of a Buffer/Uint8Array/ArrayBuffer
Dependent packages: 0
Total releases: 0
Most recent commit: over 2 years ago
teamtnt/tntsearch
⭐ 3,004
A fully featured full text search engine written in PHP
Dependent packages: 0
Total releases: 0
Most recent commit: over 2 years ago
Chevrotain/chevrotain
⭐ 2,350
Parser Building Toolkit for JavaScript
Dependent packages: 0
Total releases: 0
Most recent commit: about 2 years ago
roshan-research/hazm
⭐ 1,381
Persian NLP Toolkit
Dependent packages: 0
Total releases: 0
Most recent commit: 4 months ago
natasha/natasha
⭐ 1,085
Solves basic Russian NLP tasks, API for lower level Natasha projects
Dependent packages: 0
Total releases: 0
Most recent commit: over 2 years ago
Copyright 2018-2026 Awesome Open Source. All rights reserved.