| lensacom/sparkit-learn |
1,054 |
|
5 |
0 |
over 5 years ago |
13 |
June 24, 2015 |
35 |
apache-2.0 |
Python |
| PySpark + Scikit-learn = Sparkit-learn |
| CamDavidsonPilon/tdigest |
332 |
|
9 |
19 |
over 3 years ago |
14 |
August 27, 2016 |
12 |
mit |
Python |
| t-Digest data structure in Python. Useful for percentiles and quantiles, including in distributed environments like PySpark |
| sb-ai-lab/RePlay |
109 |
|
0 |
1 |
about 2 years ago |
14 |
November 24, 2023 |
13 |
apache-2.0 |
Python |
| A Comprehensive Framework for Building End-to-End Recommendation Systems with State-of-the-Art Models |
| tirthajyoti/Spark-with-Python |
98 |
|
0 |
0 |
almost 6 years ago |
0 |
|
0 |
mit |
Jupyter Notebook |
| Fundamentals of Spark with Python (using PySpark), code examples |
| lbdeoliveira/song-playlist-recommendation |
43 |
|
0 |
0 |
almost 3 years ago |
0 |
|
1 |
|
HTML |
| This project was a joint effort by Lucas De Oliveira, Chandrish Ambati, and Anish Mukherjee to create song and playlist embeddings for recommendations in a distributed fashion using a 1M playlist dataset by Spotify. |
| mahmoudparsian/pyspark-algorithms |
33 |
|
0 |
0 |
over 6 years ago |
0 |
|
2 |
other |
Python |
| PySpark Algorithms Book: https://www.amazon.com/dp/B07X4B2218/ref=sr_1_2 |
| feng-li/dlsa |
33 |
|
0 |
0 |
over 2 years ago |
0 |
|
2 |
gpl-3.0 |
Python |
| Distributed least squares approximation (dlsa) implemented with Apache Spark |
| andersy005/spark-xarray |
8 |
|
0 |
0 |
over 8 years ago |
1 |
December 06, 2023 |
4 |
apache-2.0 |
Jupyter Notebook |
| This is an experimental project that seeks to integrate PySpark and xarray for Climate Data Analysis. |
| BlueGranite/DatabricksTraining |
6 |
|
0 |
0 |
over 6 years ago |
0 |
|
0 |
gpl-3.0 |
Python |
| Repository for Microsoft Databricks Training Events - Hosted by BlueGranite |