| microsoft/SynapseML | 4,914 | | 0 | 6 | about 2 years ago | 12 | November 27, 2023 | 335 | mit | Scala |
| Simple and Distributed Machine Learning |
| ethen8181/machine-learning | 2,607 | | 0 | 0 | over 2 years ago | 0 | | 6 | mit | HTML |
| :earth_americas: machine learning tutorials (mainly in Python3) |
| hi-primus/optimus | 1,540 | | 0 | 0 | over 1 year ago | 32 | June 19, 2022 | 29 | apache-2.0 | Python |
| :truck: Agile Data Preparation Workflows made easy with Pandas, Dask, cuDF, Dask-cuDF, Vaex and PySpark |
| jadianes/spark-py-notebooks | 1,515 | | 0 | 0 | about 3 years ago | 0 | | 9 | other | Jupyter Notebook |
| Apache Spark & Python (pySpark) tutorials for Big Data Analysis and Machine Learning as IPython / Jupyter notebooks |
| logicalclocks/hopsworks | 1,041 | | 0 | 0 | about 2 years ago | 1 | September 11, 2019 | 12 | agpl-3.0 | Java |
| Hopsworks - Data-Intensive AI platform with a Feature Store |
| AlexIoannides/pyspark-example-project | 1,034 | | 0 | 0 | over 3 years ago | 0 | | 11 | | Python |
| Example project implementing best practices for PySpark ETL jobs and applications. |
| kuwala-io/kuwala | 610 | | 0 | 0 | over 3 years ago | 0 | | 22 | apache-2.0 | JavaScript |
| Kuwala is the no-code data platform for BI analysts and engineers enabling you to build powerful analytics workflows. We are set out to bring state-of-the-art data engineering tools you love, such as Airbyte, dbt, or Great Expectations together in one intuitive interface built with React Flow. In addition we provide third-party data into data science models and products with a focus on geospatial data. Currently, the following data connectors are available worldwide: a) High-resolution demographics data b) Point of Interests from Open Street Map c) Google Popular Times |
| firmai/pandapy | 483 | | 0 | 0 | over 4 years ago | 22 | January 25, 2020 | 2 | | Python |
| PandaPy has the speed of NumPy and the usability of Pandas 10x to 50x faster (by @firmai) |
| capitalone/datacompy | 339 | | 0 | 10 | about 2 years ago | 20 | November 15, 2023 | 16 | apache-2.0 | Python |
| Pandas and Spark DataFrame comparison for humans and more! |
| Ibotta/sk-dist | 283 | | 2 | 0 | about 3 years ago | 12 | May 14, 2020 | 8 | apache-2.0 | Python |
| Distributed scikit-learn meta-estimators in PySpark |