| HariSekhon/DevOps-Python-tools |
709 |
|
0 |
0 |
over 2 years ago |
0 |
|
37 |
mit |
Python |
| 80+ DevOps & Data CLI Tools - AWS, GCP, GCF Python Cloud Functions, Log Anonymizer, Spark, Hadoop, HBase, Hive, Impala, Linux, Docker, Spark Data Converters & Validators (Avro/Parquet/JSON/CSV/INI/XML/YAML), Travis CI, AWS CloudFormation, Elasticsearch, Solr etc. |
| adobe-research/spindle |
333 |
|
0 |
0 |
about 11 years ago |
0 |
|
2 |
apache-2.0 |
JavaScript |
| Next-generation web analytics processing with Scala, Spark, and Parquet. |
| Eugene-Mark/bigdata-file-viewer |
269 |
|
0 |
0 |
over 2 years ago |
0 |
|
2 |
gpl-2.0 |
Java |
| A cross-platform (Windows, macOS, Linux) desktop application for viewing common big data binary formats such as Parquet, ORC, and Avro. Supports the local file system, HDFS, AWS S3, Azure Blob Storage, etc. |
| RumbleDB/rumble |
194 |
|
0 |
0 |
almost 3 years ago |
4 |
December 03, 2019 |
134 |
other |
Java |
| ⛈️ RumbleDB 1.21.0 "Hawthorn blossom" 🌳 for Apache Spark | Run queries on your large-scale, messy JSON-like data (JSON, text, CSV, Parquet, ROOT, AVRO, SVM...) | No install required (just a jar to download) | Declarative Machine Learning and more |
| xitongsys/parquet-go-source |
92 |
|
0 |
57 |
about 2 years ago |
10 |
August 06, 2020 |
8 |
apache-2.0 |
Go |
| Source provider for parquet-go. |
| dbiir/rainbow |
61 |
|
0 |
0 |
almost 8 years ago |
0 |
|
2 |
apache-2.0 |
Java |
| A data layout optimization framework for wide tables stored on HDFS. See rainbow's webpage |
| monix/monix-connect |
60 |
|
0 |
1 |
over 3 years ago |
14 |
November 07, 2022 |
49 |
apache-2.0 |
Scala |
| A set of connectors for Monix. 🔛 |
| KeithSSmith/spark-compaction |
52 |
|
0 |
0 |
about 7 years ago |
0 |
|
3 |
apache-2.0 |
Java |
| File compaction tool that runs on top of the Spark framework. |
| adobe-research/spark-parquet-thrift-example |
44 |
|
0 |
0 |
over 11 years ago |
0 |
|
1 |
apache-2.0 |
Scala |
| Example Spark project using Parquet as a columnar store with Thrift objects. |
| SIDN/entrada |
44 |
|
0 |
0 |
over 2 years ago |
0 |
|
10 |
gpl-3.0 |
Java |
| Entrada: a tool for DNS big data analytics. |