| microsoft/DeepSpeed | 31,015 | | 0 | 87 | about 2 years ago | 79 | December 01, 2023 | 920 | apache-2.0 | Python | DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective. |
| devitocodes/devito | 493 | | 0 | 1 | about 2 years ago | 15 | October 16, 2023 | 109 | mit | Python | DSL and compiler framework for automated finite-differences and stencil computation |
| hpcaitech/FastFold | 485 | | 0 | 0 | almost 3 years ago | 0 | | 30 | apache-2.0 | Python | Optimizing AlphaFold Training and Inference on GPU Clusters |
| SciML/DiffEqGPU.jl | 261 | | 0 | 0 | about 2 years ago | 0 | | 26 | mit | Julia | GPU-acceleration routines for DifferentialEquations.jl and the broader SciML scientific machine learning ecosystem |
| msr-fiddle/pipedream | 239 | | 0 | 0 | over 4 years ago | 0 | | 26 | mit | Python | |
| cuMF/cumf_als | 157 | | 0 | 0 | over 7 years ago | 0 | | 3 | apache-2.0 | Cuda | CUDA Matrix Factorization Library with Alternating Least Square (ALS) |
| sandialabs/omega_h | 105 | | 0 | 0 | over 2 years ago | 0 | | 40 | other | C++ | Simplex mesh adaptivity for HPC |
| bat67/Deep-Learning-with-PyTorch-A-60-Minute-Blitz-cn | 95 | | 0 | 0 | about 7 years ago | 0 | | 0 | other | Jupyter Notebook | PyTorch 1.0 deep learning: a 60-minute introduction and hands-on practice (Chinese translation and study notes for Deep Learning with PyTorch: A 60 Minute Blitz) |
| tugrul512bit/Cekirdekler | 81 | | 0 | 0 | almost 4 years ago | 0 | | 20 | gpl-3.0 | C# | Multi-device OpenCL kernel load balancer and pipeliner API for C#. Uses a shared-distributed memory model to keep GPUs updated fast while using the same kernel on all devices (for simplicity). |
| ashwin/gDel3D | 76 | | 0 | 0 | over 7 years ago | 0 | | 5 | | C++ | gDel3D is the fastest 3D Delaunay triangulation algorithm. It uses the GPU for massive parallelism. |