PySpark Example Project Alternatives

Example project implementing best practices for PySpark ETL jobs and applications.
Alternatives To AlexIoannides/pyspark-example-project
Project Name Stars Downloads Repos Using This Packages Using This Most Recent Commit Total Releases Latest Release Open Issues License Language
AlexIoannides/pyspark-example-project 1,034 0 0 over 3 years ago 0 11 Python
Example project implementing best practices for PySpark ETL jobs and applications.
quintoandar/butterfree 269 0 1 over 2 years ago 35 November 14, 2023 6 apache-2.0 Python
A tool for building feature stores.
martandsingh/ApacheSpark 59 0 0 over 3 years ago 0 0 Python
This repository helps you learn Databricks concepts through examples. It covers the important topics a data engineer needs in real-world work, using PySpark and Spark SQL for development. The course ends with a few case studies.
vim89/datapipelines-essentials-python 45 0 0 almost 3 years ago 0 1 apache-2.0 Python
Simplified ETL process in Hadoop using Apache Spark. Includes a complete ETL pipeline for a data lake, plus SparkSession extensions, DataFrame validation, Column extensions, SQL functions, and DataFrame transformations.
basin-etl/basin 29 0 0 over 3 years ago 0 42 other TypeScript
Basin is a visual programming editor for building Spark and PySpark pipelines. Easily build, debug, and deploy complex ETL pipelines from your browser.
mozilla/python_mozetl 26 0 0 over 2 years ago 0 23 mit Python
ETL jobs for Firefox Telemetry
guidok91/spark-movies-etl 21 0 0 over 2 years ago 0 2 Python
Spark data pipeline that ingests and transforms movie ratings data.
ksbg/sparklanes 16 1 0 about 6 years ago 5 January 31, 2019 2 mit Python
A lightweight data processing framework for Apache Spark
datayoga-io/lineage 14 0 2 about 4 years ago 11 January 26, 2022 0 apache-2.0 TypeScript
Generate beautiful documentation for your data pipelines in markdown format
telia-oss/birgitta 12 0 0 about 3 years ago 34 September 10, 2020 20 mit Python
Birgitta is a Python ETL test and schema framework, providing automated tests for pyspark notebooks/recipes.

Downloads, Dependent Repos, Dependent Packages, Total Releases, Latest Releases data powered by Libraries.io.

Copyright 2018-2026 Awesome Open Source.  All rights reserved.