- Tsinghua University
- Beijing, China
- UTC +08:00
- @UbecWang
Highlights
- Pro
Pinned
- OpenRLHF/OpenRLHF (Public): An Easy-to-use, Scalable and High-performance RLHF Framework based on Ray (PPO & GRPO & REINFORCE++ & LoRA & vLLM & RFT)
- THUDM/SWE-Dev (Public): SWE-Dev is an open-source SWE agent with a scalable test case construction pipeline. This pipeline synthesizes test cases through a two-step process: generating Gherkin descriptions and correspondi…
Python 11
- Generalization-of-Transformers (Public): [ICLR'25] Understanding the Generalization of In-Context Learning in Transformers: An Empirical Study
Python 2
- Shape-Control-of-DLO (Public): Deep Reinforcement Learning, Spring 2024, Tsinghua University
Python 4