| onceupon/Bash-Oneliner | 10,710 | | 0 | 0 | 3 months ago | 0 | | 3 | mit | | A collection of handy Bash One-Liners and terminal tricks for data processing and Linux system maintenance. |
| johnkerl/miller | 8,397 | | 0 | 0 | about 2 years ago | 65 | November 26, 2022 | 90 | other | Go | Miller is like awk, sed, cut, join, and sort for name-indexed data such as CSV, TSV, and tabular JSON |
| lorien/awesome-web-scraping | 6,060 | | 0 | 0 | over 2 years ago | 0 | | 1 | other | Makefile | List of libraries, tools and APIs for web scraping and data processing. |
| NVIDIA/DALI | 4,770 | | 0 | 4 | about 2 years ago | 19 | December 06, 2023 | 209 | apache-2.0 | C++ | A GPU-accelerated library containing highly optimized building blocks and an execution engine for data processing to accelerate deep learning training and inference applications. |
| TomWright/dasel | 4,695 | | 0 | 14 | about 2 years ago | 97 | November 28, 2023 | 28 | mit | Go | Select, put and delete data from JSON, TOML, YAML, XML and CSV files with a single tool. Supports conversion between formats and can be used as a Go package. |
| unionai-oss/pandera | 2,807 | | 0 | 97 | about 2 years ago | 79 | December 08, 2023 | 321 | mit | Python | A light-weight, flexible, and expressive statistical data testing library |
| dashbitco/broadway | 2,608 | | 19 | 29 | 3 months ago | 18 | April 22, 2023 | 5 | apache-2.0 | Elixir | Concurrent and multi-stage data ingestion and data processing with Elixir |
| microsoft/DialoGPT | 2,283 | | 0 | 0 | over 3 years ago | 0 | | 59 | mit | Python | Large-scale pretraining for dialogue |
| asyml/texar | 2,008 | | 2 | 0 | over 5 years ago | 5 | November 19, 2019 | 32 | apache-2.0 | Python | Toolkit for Machine Learning, Natural Language Processing, and Text Generation, in TensorFlow |
| python-bonobo/bonobo | 1,604 | | 26 | 7 | almost 3 years ago | 38 | July 20, 2019 | 108 | apache-2.0 | Python | Extract Transform Load for Python 3.5+ |