| apache/spark | 37,661 | | 2,394 | 939 | about 2 years ago | 46 | May 09, 2021 | 186 | apache-2.0 | Scala |
| Apache Spark - A unified analytics engine for large-scale data processing |
| donnemartin/data-science-ipython-notebooks | 25,668 | | 0 | 0 | over 2 years ago | 0 | | 34 | other | Python |
| Data science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras), scikit-learn, Kaggle, big data (Spark, Hadoop MapReduce, HDFS), matplotlib, pandas, NumPy, SciPy, Python essentials, AWS, and various command lines. |
| dmlc/xgboost | 25,253 | | 796 | 972 | about 2 years ago | 79 | November 13, 2023 | 412 | apache-2.0 | C++ |
| Scalable, Portable and Distributed Gradient Boosting (GBDT, GBRT or GBM) Library, for Python, R, Java, Scala, C++ and more. Runs on single machine, Hadoop, Spark, Dask, Flink and DataFlow |
| spotify/luigi | 17,046 | | 338 | 76 | about 2 years ago | 80 | October 05, 2023 | 124 | apache-2.0 | Python |
| Luigi is a Python module that helps you build complex pipelines of batch jobs. It handles dependency resolution, workflow management, visualization etc. It also comes with Hadoop support built in. |
| Tencent/APIJSON | 16,277 | | 0 | 0 | about 2 years ago | 0 | | 251 | other | Java |
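| 🏆 A zero-code, full-featured, highly secure ORM library 🚀 Backend APIs and documentation with zero code; the frontend (client) customizes the data and structure of the returned JSON. 🏆 A JSON Transmission Protocol and an ORM Library 🚀 provides APIs and Docs without writing any code. |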
| heibaiying/BigData-Notes | 14,872 | | 0 | 0 | over 2 years ago | 0 | | 39 | | Java |
| A beginner's guide to big data ⭐ |
| deeplearning4j/deeplearning4j | 13,290 | | 175 | 119 | over 2 years ago | 54 | August 10, 2022 | 624 | apache-2.0 | Java |
| Suite of tools for deploying and training deep learning models using the JVM. Highlights include model import for keras, tensorflow, and onnx/pytorch, a modular and tiny c++ library for running math code and a java based math library on top of the core c++ library. Also includes samediff: a pytorch/tensorflow like library for running deep learning using automatic differentiation. |
| andkret/Cookbook | 12,557 | | 0 | 0 | over 2 years ago | 0 | | 111 | apache-2.0 | |
| The Data Engineering Cookbook |
| apache/doris | 10,666 | | 0 | 0 | about 2 years ago | 8 | September 27, 2023 | 2,332 | apache-2.0 | Java |
| Apache Doris is an easy-to-use, high performance and unified analytics database. |
| trinodb/trino | 9,118 | | 0 | 29 | about 2 years ago | 83 | November 30, 2023 | 2,496 | apache-2.0 | Java |
| Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io) |