| microsoft/SynapseML | 4,914 | | 0 | 6 | about 2 years ago | 12 | November 27, 2023 | 335 | mit | Scala | Simple and Distributed Machine Learning |
| hi-primus/optimus | 1,540 | | 0 | 0 | over 1 year ago | 32 | June 19, 2022 | 29 | apache-2.0 | Python | :truck: Agile Data Preparation Workflows made easy with Pandas, Dask, cuDF, Dask-cuDF, Vaex and PySpark |
| jadianes/spark-py-notebooks | 1,515 | | 0 | 0 | about 3 years ago | 0 | | 9 | other | Jupyter Notebook | Apache Spark & Python (pySpark) tutorials for Big Data Analysis and Machine Learning as IPython / Jupyter notebooks |
| h2oai/sparkling-water | 957 | | 0 | 6 | over 2 years ago | 195 | October 26, 2023 | 44 | apache-2.0 | Scala | Sparkling Water provides H2O functionality inside Spark cluster |
| ankurchavda/SparkLearning | 451 | | 0 | 0 | almost 4 years ago | 0 | | 0 | | | A comprehensive Spark guide collated from multiple sources that can be referred to learn more about Spark or as an interview refresher. |
| paypal/gimel | 230 | | 0 | 0 | over 3 years ago | 0 | | 9 | apache-2.0 | Scala | Big Data Processing Framework - Unified Data API or SQL on Any Storage |
| mahmoudparsian/data-algorithms-with-spark | 151 | | 0 | 0 | almost 3 years ago | 0 | | 0 | | Python | O'Reilly Book: [Data Algorithms with Spark] by Mahmoud Parsian |
| locationtech-labs/geopyspark | 151 | | 0 | 0 | about 6 years ago | 0 | | 43 | other | Python | GeoTrellis for PySpark |
| cartershanklin/pyspark-cheatsheet | 140 | | 0 | 0 | over 3 years ago | 0 | | 0 | cc0-1.0 | Python | PySpark Cheat Sheet - example code to help you learn PySpark and develop apps faster |
| mahmoudparsian/big-data-mapreduce-course | 135 | | 0 | 0 | over 2 years ago | 0 | | 0 | | HTML | Big Data Modeling, MapReduce, Spark, PySpark @ Santa Clara University |