| Eventual-Inc/Daft | 1,012 | | 0 | 3 | about 2 years ago | 53 | December 05, 2023 | 101 | apache-2.0 | Rust |
| Distributed DataFrame for Python designed for the cloud, powered by Rust |
| YotpoLtd/metorikku | 536 | | 0 | 0 | about 3 years ago | 126 | February 27, 2023 | 65 | mit | Scala |
| A simplified, lightweight ETL Framework based on Apache Spark |
| zero-one-group/geni | 268 | | 0 | 0 | over 2 years ago | 33 | October 14, 2020 | 14 | apache-2.0 | Clojure |
| A Clojure dataframe library that runs on Spark |
| tirthajyoti/Spark-with-Python | 98 | | 0 | 0 | almost 6 years ago | 0 | | 0 | mit | Jupyter Notebook |
| Fundamentals of Spark with Python (using PySpark), code examples |
| mahmoudparsian/pyspark-algorithms | 33 | | 0 | 0 | over 6 years ago | 0 | | 2 | other | Python |
| PySpark Algorithms Book: https://www.amazon.com/dp/B07X4B2218/ref=sr_1_2 |
| medo/Pandas-Farm | 11 | | 0 | 0 | almost 9 years ago | 0 | | 0 | mit | Python |
| Parallelize pandas operations easily on your personal small cluster |
| icaropires/pdf2dataset | 8 | | 0 | 0 | over 5 years ago | 15 | September 13, 2020 | 9 | apache-2.0 | Python |
| Converts a whole subdirectory with a big (or small) volume of PDF documents to a dataset (pandas DataFrame) with error tracking and choice of features |