| microsoft/SynapseML | 4,914 | | 0 | 6 | about 2 years ago | 12 | November 27, 2023 | 335 | mit | Scala |
| Simple and Distributed Machine Learning |
| JohnSnowLabs/spark-nlp | 3,578 | | 0 | 30 | about 2 years ago | 134 | December 08, 2023 | 43 | apache-2.0 | Scala |
| State of the Art Natural Language Processing |
| ibis-project/ibis | 3,404 | | 24 | 29 | about 2 years ago | 68 | December 10, 2023 | 157 | apache-2.0 | Python |
| The flexibility of Python with the scale and performance of modern SQL. |
| apache/linkis | 3,200 | | 0 | 38 | about 2 years ago | 3 | July 29, 2023 | 215 | apache-2.0 | Java |
| Apache Linkis builds a computation middleware layer to facilitate connection, governance and orchestration between the upper applications and the underlying data engines. |
| uber/petastorm | 1,693 | | 0 | 8 | over 2 years ago | 86 | February 03, 2023 | 174 | apache-2.0 | Python |
| Petastorm library enables single machine or distributed training and evaluation of deep learning models from datasets in Apache Parquet format. It supports ML frameworks such as Tensorflow, Pytorch, and PySpark and can be used from pure Python code. |
| hi-primus/optimus | 1,540 | | 0 | 0 | over 1 year ago | 32 | June 19, 2022 | 29 | apache-2.0 | Python |
| :truck: Agile Data Preparation Workflows made easy with Pandas, Dask, cuDF, Dask-cuDF, Vaex and PySpark |
| jadianes/spark-py-notebooks | 1,515 | | 0 | 0 | about 3 years ago | 0 | | 9 | other | Jupyter Notebook |
| Apache Spark & Python (pySpark) tutorials for Big Data Analysis and Machine Learning as IPython / Jupyter notebooks |
| combust/mleap | 1,479 | | 15 | 12 | over 2 years ago | 26 | May 07, 2021 | 109 | apache-2.0 | Scala |
| MLeap: Deploy ML Pipelines to Production |
| awesome-spark/awesome-spark | 1,461 | | 0 | 0 | almost 3 years ago | 0 | | 20 | cc0-1.0 | Shell |
| A curated list of awesome Apache Spark packages and resources. |
| jupyter-incubator/sparkmagic | 1,272 | | 25 | 6 | about 2 years ago | 54 | September 13, 2023 | 156 | other | Python |
| Jupyter magics and kernels for working with remote Spark clusters |