| TresAmigosSD/SMV |
41 |
|
0 |
0 |
almost 6 years ago |
10 |
September 19, 2019 |
73 |
apache-2.0 |
Python |
| Spark Modularized View |
| Parsely/pyspark-cassandra |
35 |
|
0 |
0 |
about 11 years ago |
0 |
|
2 |
apache-2.0 |
Python |
| Utilities and examples to assist in working with PySpark and Cassandra. |
| aliyun/aliyun-cupid-sdk |
30 |
|
0 |
1 |
about 6 years ago |
5 |
October 26, 2020 |
3 |
apache-2.0 |
Java |
| SDK for open source framework to interact with MaxCompute |
| globocom/MicroDrill |
7 |
|
0 |
0 |
about 10 years ago |
3 |
March 01, 2016 |
1 |
apache-2.0 |
Python |
| Simple Apache Drill alternative using PySpark |
| GabrielAmazonas/airflow-pyspark-emr |
7 |
|
0 |
0 |
about 4 years ago |
0 |
|
7 |
|
Python |
| This project demonstrates how to process data stored in a data lake, transforming it into an OLAP-optimized structure with PySpark. The PySpark job runs on AWS EMR, and the data pipeline is orchestrated by Apache Airflow, including infrastructure creation and EMR cluster termination. |
| sahilbhange/spark-slowly-changing-dimension |
6 |
|
0 |
0 |
over 7 years ago |
0 |
|
1 |
|
Scala |
| Spark implementation of Slowly Changing Dimension type 2 |