| apache/parquet-mr | 2,296 | | 259 | 208 | about 2 years ago | 17 | May 12, 2023 | 133 | apache-2.0 | Java | Apache Parquet |
| bigdatagenomics/adam | 966 | | 20 | 17 | about 2 years ago | 14 | December 16, 2020 | 35 | apache-2.0 | Scala | ADAM is a genomics analysis platform with specialized file formats built using Apache Avro, Apache Spark, and Apache Parquet. Apache 2 licensed. |
| HariSekhon/DevOps-Python-tools | 709 | | 0 | 0 | over 2 years ago | 0 | | 37 | mit | Python | 80+ DevOps & Data CLI Tools - AWS, GCP, GCF Python Cloud Functions, Log Anonymizer, Spark, Hadoop, HBase, Hive, Impala, Linux, Docker, Spark Data Converters & Validators (Avro/Parquet/JSON/CSV/INI/XML/YAML), Travis CI, AWS CloudFormation, Elasticsearch, Solr etc. |
| Cinchoo/ChoETL | 693 | | 1 | 9 | over 2 years ago | 177 | September 21, 2023 | 62 | mit | C# | ETL framework for .NET (Parser / Writer for CSV, Flat, Xml, JSON, Key-Value, Parquet, Yaml, Avro formatted files) |
| Netflix/PigPen | 513 | | 0 | 6 | almost 9 years ago | 34 | May 19, 2016 | 19 | apache-2.0 | Clojure | Map-Reduce for Clojure |
| RandomFractals/vscode-data-preview | 447 | | 0 | 0 | almost 3 years ago | 0 | | 54 | apache-2.0 | TypeScript | Data Preview 🈸 extension for importing 📤 viewing 🔎 slicing 🔪 dicing 🎲 charting 📊 & exporting 📥 large JSON array/config, YAML, Apache Arrow, Avro, Parquet & Excel data files |
| stanford-futuredata/sparser | 411 | | 0 | 0 | over 7 years ago | 0 | | 5 | bsd-3-clause | C | Sparser: Raw Filtering for Faster Analytics over Raw Data |
| Netflix/iceberg | 409 | | 0 | 0 | over 4 years ago | 0 | | 27 | apache-2.0 | Java | Iceberg is a table format for large, slow-moving tabular data |
| spotify/ratatool | 333 | | 0 | 7 | about 2 years ago | 36 | November 22, 2023 | 23 | apache-2.0 | Scala | A tool for data sampling, data generation, and data diffing |
| Eugene-Mark/bigdata-file-viewer | 269 | | 0 | 0 | over 2 years ago | 0 | | 2 | gpl-2.0 | Java | A cross-platform (Windows, MAC, Linux) desktop application to view common bigdata binary format like Parquet, ORC, AVRO, etc. Support local file system, HDFS, AWS S3, Azure Blob Storage, etc. |