Search: grpo - iMakething Open Source Project Library

Search for "grpo" found 2 results

Use PEFT or Full-parameter to CPT/SFT/DPO/GRPO 600+ LLMs (Qwen3.6, DeepSeek-V4, GLM-5.1, InternLM3, Llama4, ...) and 300

3/5 5764

deepseek-r1 embedding grpo

Agent Reinforcement Trainer: train multi-step agents for real-world tasks using GRPO. Give your agents on-the-job traini

3/5 3504

agent agentic-ai grpo