| OpenRefine/OpenRefine | 10,106 | | 0 | 1 | about 2 years ago | 10 | April 05, 2023 | 639 | bsd-3-clause | Java | OpenRefine is a free, open source power tool for working with messy data and improving it |
| great-expectations/great_expectations | 9,179 | | 0 | 53 | about 2 years ago | 256 | December 08, 2023 | 182 | apache-2.0 | Python | Always know what to expect from your data. |
| johnkerl/miller | 8,397 | | 0 | 0 | about 2 years ago | 65 | November 26, 2022 | 90 | other | Go | Miller is like awk, sed, cut, join, and sort for name-indexed data such as CSV, TSV, and tabular JSON |
| cleanlab/cleanlab | 7,747 | | 0 | 8 | about 2 years ago | 24 | September 11, 2023 | 97 | agpl-3.0 | Python | The standard data-centric AI package for data quality and machine learning with messy, real-world data and labels. |
| unionai-oss/pandera | 2,807 | | 0 | 97 | about 2 years ago | 79 | December 08, 2023 | 321 | mit | Python | A light-weight, flexible, and expressive statistical data testing library |
| justmarkham/pandas-videos | 1,808 | | 0 | 0 | almost 4 years ago | 0 | | 0 | | Jupyter Notebook | Jupyter notebook and datasets from the pandas Q&A video series |
| sfu-db/dataprep | 1,807 | | 0 | 2 | over 2 years ago | 36 | February 03, 2023 | 156 | mit | Python | Open-source low code data preparation library in python. Collect, clean and visualization your data in python with a few lines of code. |
| skrub-data/skrub | 1,591 | | 0 | 0 | 4 days ago | 3 | September 14, 2022 | 56 | bsd-3-clause | Python | Machine learning with dataframes |
| justmarkham/DAT8 | 1,549 | | 0 | 0 | over 3 years ago | 0 | | 0 | | Jupyter Notebook | General Assembly's 2015 Data Science course in Washington, DC |
| hi-primus/optimus | 1,540 | | 0 | 0 | over 1 year ago | 32 | June 19, 2022 | 29 | apache-2.0 | Python | Agile Data Preparation Workflows made easy with Pandas, Dask, cuDF, Dask-cuDF, Vaex and PySpark |