| mjakubowski84/parquet4s |
267 |
|
0 |
6 |
about 2 years ago |
57 |
November 12, 2023 |
6 |
mit |
Scala |
| Read and write Parquet in Scala. Use Scala classes as schema. No need to start a cluster. |
| vbay/big-data |
190 |
|
0 |
0 |
over 6 years ago |
0 |
|
1 |
|
Shell |
| An open-source, systematic big data learning tutorial. Spark learning; Hadoop, Hive, HBase, and Flink tutorials; Linux; from beginner to mastery. |
| Hurence/logisland |
106 |
|
2 |
34 |
about 3 years ago |
12 |
January 24, 2023 |
183 |
other |
Java |
| Scalable stream processing platform for advanced real-time analytics on top of Kafka and Spark. LogIsland also supports MQTT and Kafka Streams (Flink is on the roadmap). The platform does complex event processing and is suitable for time series analysis. A large set of ready-to-use processors, data sources, and sinks is available. |
| GoCollaborate/src |
62 |
|
0 |
0 |
almost 8 years ago |
0 |
|
1 |
bsd-3-clause |
Go |
| A light-weight distributed stream computing framework for Golang |
| palantir/hadoop-crypto |
41 |
|
0 |
2 |
about 2 years ago |
32 |
September 15, 2023 |
4 |
apache-2.0 |
Java |
| Library for per-file client-side encryption in Hadoop FileSystems such as HDFS or S3. |
| kevinweil/stream-to-hdfs |
27 |
|
0 |
0 |
about 16 years ago |
0 |
|
1 |
|
|
| A simple utility for streaming stdin to a file in HDFS |
| giorgioinf/twitter-stream-ml |
26 |
|
0 |
0 |
almost 10 years ago |
0 |
|
0 |
gpl-3.0 |
Scala |
| Machine Learning over Twitter's stream. Using Apache Spark, Web Server and Lightning Graph server. |
| authorjapps/hello-kafka-stream-testing |
13 |
|
0 |
0 |
almost 6 years ago |
0 |
|
2 |
mit |
Java |
| The simplest way to test Kafka-based applications or microservices, e.g. reads/writes during HBase/Hadoop or other data ingestion pipelines |
| yennanliu/NYC_Taxi_Pipeline |
12 |
|
0 |
0 |
about 5 years ago |
0 |
|
0 |
|
Scala |
| Design/Implement stream/batch architecture on NYC taxi data | #DE |
| projetoeureka/akka-mapreduce |
11 |
|
0 |
0 |
over 10 years ago |
0 |
|
1 |
other |
Scala |
| A Scala and Akka based map-reduce framework |