| sfujim/TD3 | 1,383 | | 0 | 0 | almost 3 years ago | 0 | | 2 | mit | Python | Author's PyTorch implementation of TD3 for OpenAI gym tasks |
| openai/multiagent-competition | 614 | | 0 | 0 | over 6 years ago | 0 | | 12 | | Python | Code for the paper "Emergent Complexity via Multi-agent Competition" |
| openai/mlsh | 520 | | 0 | 0 | almost 7 years ago | 0 | | 16 | | Python | Code for the paper "Meta-Learning Shared Hierarchies" |
| giuse/DNE | 122 | | 0 | 0 | over 6 years ago | 0 | | 3 | mit | Ruby | A set of neuroevolution experiments with/towards deep networks |
| openai/safety-starter-agents | 86 | | 0 | 0 | over 5 years ago | 0 | | 5 | mit | Python | Basic constrained RL agents used in experiments for the "Benchmarking Safe Exploration in Deep Reinforcement Learning" paper. |
| nicklashansen/policy-adaptation-during-deployment | 46 | | 0 | 0 | over 5 years ago | 0 | | 0 | | Python | Training code and evaluation benchmarks for the "Self-Supervised Policy Adaptation during Deployment" paper. |
| ferreirafabio/mppi_pendulum | 39 | | 0 | 0 | over 6 years ago | 0 | | 3 | | Python | The implementation of Model Predictive Path Integral (MPPI) from the paper "Information Theoretic MPC for Model-Based Reinforcement Learning" (Williams et al., 2017) for the pendulum OpenAI Gym environment |
| stepjam/TecNets | 26 | | 0 | 0 | over 6 years ago | 0 | | 3 | other | Python | Official code for "Task-Embedded Control Networks for Few-Shot Imitation Learning". |
| addy1997/COVID-19-Resources | 24 | | 0 | 0 | over 5 years ago | 0 | | 0 | gpl-3.0 | | Resources for Covid-19 |
| tshrjn/env-zoo | 22 | | 0 | 0 | about 7 years ago | 0 | | 0 | | | A curated list of reinforcement learning environments and frameworks. |