| apache/doris | 10,666 | | 0 | 0 | about 2 years ago | 8 | September 27, 2023 | 2,332 | apache-2.0 | Java | Apache Doris is an easy-to-use, high performance and unified analytics database. |
| wgzhao/Addax | 1,034 | | 0 | 67 | over 2 years ago | 10 | July 29, 2023 | 8 | apache-2.0 | Java | Addax is a versatile open-source ETL tool that can seamlessly transfer data between various RDBMS and NoSQL databases, making it an ideal solution for data migration. |
| awslabs/aws-glue-libs | 568 | | 0 | 0 | over 2 years ago | 0 | | 96 | other | Python | AWS Glue Libraries are additions and enhancements to Spark for ETL operations. |
| houshanren/big_data_architect_skills | 353 | | 0 | 0 | over 6 years ago | 0 | | 1 | | | Skills that a big data architect should master |
| Cascading/cascading | 321 | | 0 | 0 | over 7 years ago | 0 | | | other | Java | Cascading is a feature rich API for defining and executing complex and fault tolerant data processing workflows on various cluster computing platforms. Please see https://github.com/cwensel/cascading for access to all WIP branches. |
| jondot/crunch | 196 | | 0 | 0 | over 11 years ago | 0 | December 22, 2023 | 1 | | Go | A fast to develop, fast to run, Go based toolkit for ETL and feature extraction on Hadoop. |
| 51zero/eel-sdk | 140 | | 1 | 17 | over 5 years ago | 103 | February 11, 2019 | 25 | apache-2.0 | Scala | Big Data Toolkit for the JVM |
| sequenceiq/sequenceiq-samples | 119 | | 0 | 0 | over 10 years ago | 0 | | 0 | apache-2.0 | Java | SequenceIQ Hadoop examples |
| pranab/chombo | 102 | | 0 | 0 | almost 5 years ago | 0 | | 5 | | Java | Big Data ETL and Utilities for Hadoop Map Reduce, Spark and Storm |
| dimajix/flowman | 85 | | 0 | 24 | over 2 years ago | 65 | October 16, 2023 | 55 | apache-2.0 | Scala | Flowman is an ETL framework powered by Apache Spark. With its declarative approach, Flowman simplifies the development of complex data pipelines. |