| piskvorky/smart_open | 3,028 | | 0 | 0 | over 2 years ago | 0 | | 94 | mit | Python | Utils for streaming large files (S3, HDFS, gzip, bz2...) |
| Stratio/sparta | 526 | | 0 | 0 | over 6 years ago | 0 | | 9 | apache-2.0 | Scala | Real Time Analytics and Data Pipelines based on Spark Streaming |
| confluentinc/kafka-connect-hdfs | 473 | | 0 | 0 | over 2 years ago | 0 | | 153 | other | Java | Kafka Connect HDFS connector |
| uber/storagetapper | 269 | | 0 | 0 | over 4 years ago | 4 | November 19, 2021 | 21 | mit | Go | StorageTapper is a scalable realtime MySQL change data streaming, logical backup and logical replication service |
| megvii-research/megfile | 99 | | 0 | 2 | about 2 years ago | 68 | November 27, 2023 | 5 | apache-2.0 | Python | Megvii FILE Library - Working with Files in Python same as the standard library |
| soundcloud/spdt | 46 | | 0 | 0 | over 8 years ago | 0 | | 1 | mit | Scala | Streaming Parallel Decision Tree |
| kevinweil/stream-to-hdfs | 27 | | 0 | 0 | about 16 years ago | 0 | | 1 | | | A simple utility for streaming stdin to a file in HDFS |
| agile-lab-dev/wasp | 25 | | 0 | 15 | over 2 years ago | 25 | September 14, 2023 | 4 | other | Scala | WASP is a framework to build complex real time big data applications. It relies on a kind of Kappa/Lambda architecture mainly leveraging Kafka and Spark. If you need to ingest a huge amount of heterogeneous data and analyze it through complex pipelines, this is the framework for you. |
| lanjiang/streamingstopgraceful | 23 | | 0 | 0 | over 8 years ago | 0 | | 1 | | Scala | Example to show how to stop the Spark Streaming Application Gracefully |
| looker/spark_log_data | 21 | | 0 | 0 | almost 10 years ago | 0 | | 0 | mit | Scala | Flume-to-Spark-Streaming Log Parser |