| numaproj/numaflow | 866 | | 0 | 1 | about 2 years ago | 37 | November 03, 2023 | 101 | apache-2.0 | Go | Kubernetes-native platform to run massively parallel data/streaming jobs |
| GoogleCloudPlatform/DataflowJavaSDK | 853 | | 249 | 14 | over 5 years ago | 38 | June 26, 2018 | 54 | | | Google Cloud Dataflow provides a simple, powerful model for building both batch and streaming parallel data processing pipelines. |
| ml6team/fondant | 293 | | 0 | 0 | about 2 years ago | 24 | December 12, 2023 | 52 | apache-2.0 | Python | Production-ready data processing made easy and shareable |
| asyml/forte | 215 | | 0 | 12 | about 3 years ago | 14 | June 29, 2022 | 104 | apache-2.0 | Python | Forte is a flexible and powerful ML workflow builder. This is part of the CASL project: http://casl-project.ai/ |
| analysiscenter/batchflow | 195 | | 0 | 0 | about 2 years ago | 15 | August 01, 2023 | 33 | apache-2.0 | Python | BatchFlow helps you conveniently work with random or sequential batches of your data and define data processing and machine learning workflows even for datasets that do not fit into memory. |
| UMMS-Biocore/dolphinnext | 92 | | 0 | 0 | over 3 years ago | 0 | | 34 | gpl-3.0 | PHP | A graphical user interface for distributed data processing of high throughput genomics |
| Jean-njoroge/Breast-cancer-risk-prediction | 83 | | 0 | 0 | about 5 years ago | 0 | | 1 | mit | Jupyter Notebook | Classification of Breast Cancer diagnosis Using Support Vector Machines |
| Sentieon/sentieon-scripts | 53 | | 0 | 0 | over 2 years ago | 0 | | 3 | bsd-2-clause | Shell | Helper scripts for biological data processing from Sentieon |
| gatk-workflows/five-dollar-genome-analysis-pipeline | 40 | | 0 | 0 | about 6 years ago | 0 | | 4 | bsd-3-clause | wdl | Workflows used for WGS data processing -- replaced by https://github.com/gatk-workflows/gatk4-genome-processing-pipeline |
| zazuko/barnard59 | 20 | | 1 | 5 | about 2 years ago | 16 | June 08, 2022 | 63 | | JavaScript | An intuitive and flexible RDF pipeline solution designed to simplify and automate ETL processes for efficient data management. |