| apache/hudi |
4,901 |
|
0 |
58 |
about 2 years ago |
21 |
November 11, 2023 |
886 |
apache-2.0 |
Java |
| Upserts, Deletes And Incremental Processing on Big Data. |
| bigdatagenomics/adam |
966 |
|
20 |
17 |
about 2 years ago |
14 |
December 16, 2020 |
35 |
apache-2.0 |
Scala |
| ADAM is a genomics analysis platform with specialized file formats built using Apache Avro, Apache Spark, and Apache Parquet. Apache 2 licensed. |
| miguno/kafka-storm-starter |
729 |
|
0 |
0 |
about 4 years ago |
0 |
|
0 |
other |
Scala |
| [PROJECT IS NO LONGER MAINTAINED] Code examples that show how to integrate Apache Kafka 0.8+ with Apache Storm 0.9+ and Apache Spark Streaming 1.1+, while using Apache Avro as the data serialization format. |
| HariSekhon/DevOps-Python-tools |
709 |
|
0 |
0 |
over 2 years ago |
0 |
|
37 |
mit |
Python |
| 80+ DevOps & Data CLI Tools - AWS, GCP, GCF Python Cloud Functions, Log Anonymizer, Spark, Hadoop, HBase, Hive, Impala, Linux, Docker, Spark Data Converters & Validators (Avro/Parquet/JSON/CSV/INI/XML/YAML), Travis CI, AWS CloudFormation, Elasticsearch, Solr etc. |
| xubo245/SparkLearning |
573 |
|
0 |
0 |
over 4 years ago |
0 |
|
3 |
apache-2.0 |
Scala |
| Learning Apache Spark, including code and data. Most parts can run locally. |
| OBenner/data-engineering-interview-questions |
554 |
|
0 |
0 |
over 2 years ago |
0 |
|
0 |
|
|
| More than 2,000 data engineer interview questions. |
| databricks/spark-avro |
535 |
|
47 |
39 |
over 7 years ago |
8 |
October 30, 2017 |
77 |
apache-2.0 |
Scala |
| Avro Data Source for Apache Spark |
| hortonworks-spark/shc |
484 |
|
0 |
0 |
almost 6 years ago |
0 |
|
158 |
apache-2.0 |
Scala |
| The Apache Spark - Apache HBase Connector is a library that lets Spark access HBase tables as an external data source or sink. |
| uber/marmaray |
444 |
|
0 |
0 |
about 4 years ago |
0 |
|
14 |
other |
Java |
| Generic Data Ingestion & Dispersal Library for Hadoop |
| gorillalabs/sparkling |
423 |
|
10 |
0 |
about 4 years ago |
23 |
January 16, 2018 |
13 |
epl-1.0 |
Clojure |
| A Clojure library for Apache Spark: fast, fully featured, and developer-friendly |