llm-evaluation

Name: llm-evaluation
Rating: 1.1 (1 review)
Author: Microck

Implement comprehensive evaluation strategies for LLM applications using automated metrics, human feedback, and benchmarking. Use when testing LLM performance, measuring AI application quality, or establishing evaluation frameworks.

Third-Party Agent Skill: Review the code before installing. Agent skills execute in your AI assistant's environment and can access your files.

Installation for Agentic Skill

Claude Code (CLI) Fast

skilz install Microck/ordinary-claude-skills/llm-evaluation

OpenCode (CLI) Fast

skilz install Microck/ordinary-claude-skills/llm-evaluation --agent opencode

OpenAI Codex (CLI) Native

skilz install Microck/ordinary-claude-skills/llm-evaluation --agent codex

Gemini CLI (Project) Project

skilz install Microck/ordinary-claude-skills/llm-evaluation --agent gemini

First time? Install Skilz: pip install skilz

Works with 14 AI coding assistants

Cursor, Aider, Copilot, Windsurf, Qwen, Kimi, and more...

For Claude Desktop Easy

Download Agent Skill ZIP

Extract it and copy it to ~/.claude/skills/, then restart Claude Desktop.

Manual Installation

1. Clone the repository:

git clone https://github.com/Microck/ordinary-claude-skills

2. Copy the agent skill directory:

cp -r ordinary-claude-skills/skills_categorized/machine-learning/llm-evaluation ~/.claude/skills/

Agentic Skill Details

Owner: Microck (GitHub)
Repository: ordinary-claude-skills
Type: Non-Technical
Meta-Domain: general
Primary Domain: general
Sub-Domain: evaluation llm
Market Score: 22.1
