| apache/iceberg | 5,179 | | 0 | 0 | about 2 years ago | 3 | October 29, 2022 | 1,485 | apache-2.0 | Java | Apache Iceberg |
| projectnessie/nessie | 762 | | 0 | 32 | about 2 years ago | 40 | November 21, 2023 | 110 | apache-2.0 | Java | Nessie: Transactional Catalog for Data Lakes with Git-like semantics |
| Netflix/iceberg | 409 | | 0 | 0 | over 4 years ago | 0 | | 27 | apache-2.0 | Java | Iceberg is a table format for large, slow-moving tabular data |
| delta-io/connectors | 383 | | 0 | 0 | almost 3 years ago | 5 | December 06, 2022 | 0 | apache-2.0 | Java | This library allows Scala and Java-based projects (including Apache Flink, Apache Hive, Apache Beam, and PrestoDB) to read from and write to Delta Lake. |
| lightcopy/parquet-index | 113 | | 0 | 0 | almost 5 years ago | 0 | | 16 | apache-2.0 | Scala | Spark SQL index for Parquet tables |
| ssavvides/tpch-spark | 91 | | 0 | 0 | about 2 years ago | 0 | | 1 | mit | C | TPC-H queries in Apache Spark SQL using native DataFrames API |
| traviscrawford/spark-dynamodb | 90 | | 0 | 0 | over 4 years ago | 12 | March 21, 2018 | 17 | apache-2.0 | Scala | DynamoDB data source for Apache Spark |
| dimajix/flowman | 85 | | 0 | 24 | over 2 years ago | 65 | October 16, 2023 | 55 | apache-2.0 | Scala | Flowman is an ETL framework powered by Apache Spark. With its declarative approach, Flowman simplifies the development of complex data pipelines. |
| hortonworks-spark/spark-llap | 82 | | 0 | 0 | about 6 years ago | 0 | | 31 | apache-2.0 | Java | |
| qubole/spark-acid | 79 | | 0 | 0 | almost 5 years ago | 0 | | 19 | apache-2.0 | Scala | ACID Data Source for Apache Spark based on Hive ACID |