| seaweedfs/seaweedfs | 19,155 | | 0 | 2 | about 2 years ago | 296 | April 24, 2021 | 312 | apache-2.0 | Go | SeaweedFS is a fast distributed storage system for blobs, objects, files, and data lake, for billions of files! Blob store has O(1) disk seek, cloud tiering. Filer supports Cloud Drive, cross-DC active-active replication, Kubernetes, POSIX FUSE mount, S3 API, S3 Gateway, Hadoop, WebDAV, encryption, Erasure Coding. |
| ceph/ceph | 16,154 | | 5 | 1 | 3 months ago | 1 | August 26, 2014 | 705 | other | C++ | Ceph is a distributed object, block, and file storage platform |
| juicedata/juicefs | 9,252 | | 0 | 1 | about 2 years ago | 136 | November 28, 2023 | 120 | apache-2.0 | Go | JuiceFS is a distributed POSIX file system built on top of Redis and S3. |
| piskvorky/smart_open | 3,028 | | 0 | 0 | over 2 years ago | 0 | | 94 | mit | Python | Utils for streaming large files (S3, HDFS, gzip, bz2...) |
| TileDB-Inc/TileDB | 1,700 | | 0 | 6 | about 2 years ago | 87 | November 05, 2022 | 133 | mit | C++ | The Universal Storage Engine |
| lensesio/kafka-connect-ui | 483 | | 0 | 0 | over 4 years ago | 0 | | 24 | other | JavaScript | Web tool for Kafka Connect |
| uber/storagetapper | 269 | | 0 | 0 | over 4 years ago | 4 | November 19, 2021 | 21 | mit | Go | StorageTapper is a scalable realtime MySQL change data streaming, logical backup and logical replication service |
| Eugene-Mark/bigdata-file-viewer | 269 | | 0 | 0 | over 2 years ago | 0 | | 2 | gpl-2.0 | Java | A cross-platform (Windows, Mac, Linux) desktop application for viewing common big data binary formats such as Parquet, ORC, and AVRO. Supports the local file system, HDFS, AWS S3, Azure Blob Storage, etc. |
| svenkreiss/pysparkling | 253 | | 7 | 1 | over 3 years ago | 69 | November 13, 2022 | 9 | other | Python | A pure Python implementation of Apache Spark's RDD and DStream interfaces. |
| RumbleDB/rumble | 194 | | 0 | 0 | almost 3 years ago | 4 | December 03, 2019 | 134 | other | Java | ⛈️ RumbleDB 1.21.0 "Hawthorn blossom" 🌳 for Apache Spark \| Run queries on your large-scale, messy JSON-like data (JSON, text, CSV, Parquet, ROOT, AVRO, SVM...) \| No install required (just a jar to download) \| Declarative Machine Learning and more |