Skillzwave The Package Manager for Enterprise AI Agents

fine-tuning-with-trl

62 stars 2 forks

"Fine-tune LLMs using reinforcement learning with TRL: SFT for instruction tuning, DPO for preference alignment, PPO/GRPO for reward optimization, and reward model training. Use when you need RLHF, want to align a model with preferences, or want to train from human feedback. Works with HuggingFace Transformers."

Third-Party Agent Skill: Review the code before installing. Agent skills execute in your AI assistant's environment and can access your files.

Installation for Agentic Skill

Claude Code (CLI) Fast

skilz install zechenzhangAGI/AI-research-SKILLs/fine-tuning-with-trl

OpenCode (CLI) Fast

skilz install zechenzhangAGI/AI-research-SKILLs/fine-tuning-with-trl --agent opencode

OpenAI Codex (CLI) Native

skilz install zechenzhangAGI/AI-research-SKILLs/fine-tuning-with-trl --agent codex

Gemini CLI (Project)

skilz install zechenzhangAGI/AI-research-SKILLs/fine-tuning-with-trl --agent gemini

First time? Install Skilz: pip install skilz

Works with 22+ AI coding assistants

Cursor, Aider, Copilot, Windsurf, Qwen, Kimi, and more...

For Claude Desktop Easy

Download the agent skill ZIP, extract it, and copy the extracted folder to ~/.claude/skills/, then restart Claude Desktop.

Manual Installation

1. Clone the repository:

git clone https://github.com/zechenzhangAGI/AI-research-SKILLs

2. Copy the agent skill directory:

cp -r AI-research-SKILLs/06-post-training/trl-fine-tuning ~/.claude/skills/
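The copy in step 2 assumes that ~/.claude/skills/ already exists. As a minimal sketch, the helper below creates the destination first and then copies the skill directory into it. The function name `install_skill` and its optional second argument are hypothetical, used here only for illustration; they are not part of Skilz or Claude.

```shell
# Hypothetical helper (not part of Skilz): copy a skill directory into a
# skills folder, creating the folder first so the copy lands inside it.
install_skill() {
  src="$1"                            # path to the skill directory
  dest="${2:-$HOME/.claude/skills}"   # defaults to the location used above
  if [ ! -d "$src" ]; then
    echo "skill directory not found: $src" >&2
    return 1
  fi
  mkdir -p "$dest" && cp -r "$src" "$dest/"
}
```

After cloning the repository, it could be called as `install_skill AI-research-SKILLs/06-post-training/trl-fine-tuning`.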

Need detailed installation help? Check our platform-specific guides:

Claude Desktop Guide, Claude Code Guide, Troubleshooting

Related Agentic Skills

flow-nexus-neural

by ruvnet

Train and deploy neural networks in distributed E2B sandboxes with Flow Nexus

TECH · machine learning

hooks-automation

by ruvnet

Automated coordination, formatting, and learning from Claude Code operations using intelligent hooks with MCP integration. Includes pre/post task h...

TECH · machine learning

ml-pipeline-workflow

by wshobson

Build end-to-end MLOps pipelines from data preparation through model training, validation, and production deployment. Use when creating ML pipeline...

TECH · machine learning

book-sft-pipeline

by muratcankoylan

End-to-end system for creating supervised fine-tuning datasets from books and training style-transfer models. Covers text extraction, intelligent s...

TECH · machine learning

Agentic Skill Details

Owner: zechenzhangAGI (GitHub)
Repository: AI-research-SKILLs
Stars: 62
Forks: 2
Type: Technical
Meta-Domain: data ai
Primary Domain: machine learning
Market Score: 26
