Awesome Open Source
The Top 10 PySpark Open Source Projects
Open source projects categorized as PySpark

kailashahirwar/cheatsheets-ai
⭐ 13,281
Essential Cheat Sheets for deep learning and machine learning researchers. https://medium.com/@kailashahirwar/essential-cheat-sheets-for-machine-learning-and-deep-learning-researchers-efb6a8ebd2e5
dependent packages: 0 | total releases: 0 | most recent commit: over 6 years ago

microsoft/SynapseML
⭐ 4,914
Simple and Distributed Machine Learning
dependent packages: 0 | total releases: 0 | most recent commit: about 2 years ago

JohnSnowLabs/spark-nlp
⭐ 3,578
State of the Art Natural Language Processing
dependent packages: 0 | total releases: 0 | most recent commit: about 2 years ago

ibis-project/ibis
⭐ 3,404
The flexibility of Python with the scale and performance of modern SQL.
dependent packages: 0 | total releases: 0 | most recent commit: about 2 years ago

apache/linkis
⭐ 3,200
Apache Linkis builds a computation middleware layer to facilitate connection, governance and orchestration between the upper applications and the underlying data engines.
dependent packages: 0 | total releases: 0 | most recent commit: about 2 years ago

ethen8181/machine-learning
⭐ 2,607
🌎 Machine learning tutorials (mainly in Python 3)
dependent packages: 0 | total releases: 0 | most recent commit: over 2 years ago

uber/petastorm
⭐ 1,693
The Petastorm library enables single-machine or distributed training and evaluation of deep learning models from datasets in Apache Parquet format. It supports ML frameworks such as TensorFlow, PyTorch, and PySpark and can be used from pure Python code.
dependent packages: 0 | total releases: 0 | most recent commit: over 2 years ago

hi-primus/optimus
⭐ 1,540
🚚 Agile Data Preparation Workflows made easy with Pandas, Dask, cuDF, Dask-cuDF, Vaex and PySpark
dependent packages: 0 | total releases: 0 | most recent commit: over 1 year ago

jadianes/spark-py-notebooks
⭐ 1,515
Apache Spark & Python (PySpark) tutorials for Big Data Analysis and Machine Learning as IPython / Jupyter notebooks
dependent packages: 0 | total releases: 0 | most recent commit: about 3 years ago

combust/mleap
⭐ 1,479
MLeap: Deploy ML Pipelines to Production
dependent packages: 0 | total releases: 0 | most recent commit: over 2 years ago

Copyright 2018-2026 Awesome Open Source. All rights reserved.