| pachyderm/pachyderm | 6,035 | | 0 | 1 | about 2 years ago | 613 | December 04, 2023 | 897 | apache-2.0 | Go |
| Data-Centric Pipelines and Data Versioning |
| Moataz-Elmesmary/Data-Science-Roadmap | 2,445 | | 0 | 0 | over 2 years ago | 0 | | 3 | mit | |
| Data Science Roadmap from A to Z |
| root-project/root | 2,329 | | 0 | 20 | about 2 years ago | 16 | October 24, 2022 | 848 | other | C++ |
| The official repository for ROOT: analyzing, storing and visualizing big data, scientifically |
| hi-primus/optimus | 1,540 | | 0 | 0 | over 1 year ago | 32 | June 19, 2022 | 29 | apache-2.0 | Python |
| :truck: Agile Data Preparation Workflows made easy with Pandas, Dask, cuDF, Dask-cuDF, Vaex and PySpark |
| jadianes/spark-py-notebooks | 1,515 | | 0 | 0 | about 3 years ago | 0 | | 9 | other | Jupyter Notebook |
| Apache Spark & Python (pySpark) tutorials for Big Data Analysis and Machine Learning as IPython / Jupyter notebooks |
| intel/scikit-learn-intelex | 1,116 | | 0 | 24 | about 2 years ago | 23 | November 28, 2023 | 70 | apache-2.0 | Python |
| Intel(R) Extension for Scikit-learn is a seamless way to speed up your Scikit-learn application |
| man-group/ArcticDB | 920 | | 0 | 3 | about 2 years ago | 35 | December 07, 2023 | 260 | other | C++ |
| ArcticDB is a high performance, serverless DataFrame database built for the Python Data Science ecosystem. |
| GoogleCloudPlatform/DataflowJavaSDK | 853 | | 249 | 14 | over 5 years ago | 38 | June 26, 2018 | 54 | | |
| Google Cloud Dataflow provides a simple, powerful model for building both batch and streaming parallel data processing pipelines. |
| visualpython/visualpython | 748 | | 0 | 0 | over 2 years ago | 88 | November 18, 2023 | 20 | other | JavaScript |
| GUI-based Python code generator for data science, extension to Jupyter Lab, Jupyter Notebook and Google Colab. |
| WeBankFinTech/WeDataSphere | 624 | | 0 | 0 | about 2 years ago | 0 | | 24 | | |
| WeDataSphere is a financial grade, one-stop big data platform suite. |