Skillzwave

llm-evaluation

Market Score: 22.1

Implement comprehensive evaluation strategies for LLM applications using automated metrics, human feedback, and benchmarking. Use when testing LLM performance, measuring AI application quality, or establishing evaluation frameworks.

Third-Party Agent Skill: Review the code before installing. Agent skills execute in your AI assistant's environment and can access your files.

Installation for Agentic Skill

skilz install Microck/ordinary-claude-skills/llm-evaluation
skilz install Microck/ordinary-claude-skills/llm-evaluation --agent opencode
skilz install Microck/ordinary-claude-skills/llm-evaluation --agent codex
skilz install Microck/ordinary-claude-skills/llm-evaluation --agent gemini

First time? Install Skilz: pip install skilz

Works with 14 AI coding assistants

Cursor, Aider, Copilot, Windsurf, Qwen, Kimi, and more...

Download Agent Skill ZIP

Extract the archive and copy the skill folder to ~/.claude/skills/, then restart Claude Desktop.

1. Clone the repository:
git clone https://github.com/Microck/ordinary-claude-skills
2. Copy the agent skill directory:
cp -r ordinary-claude-skills/skills_categorized/machine-learning/llm-evaluation ~/.claude/skills/


Related Agentic Skills

llm-evaluation

by varunisrani

Implement comprehensive evaluation strategies for LLM applications using automated metrics, human feedback, and benchmarking. Use when testing LLM per...

22
general, evaluation llm

architect-agent

by SpillwaveSolutions

"Use this skill ONLY when user explicitly requests: (1) 'write instructions for code agent' or 'create instructions', (2) 'this is a new architect age...

100
general, schema query
Agents

confluence

by SpillwaveSolutions

This skill should be used when working with Confluence documentation - downloading pages to Markdown, converting between Wiki Markup and Markdown, cre...

100
general, documentation skill

design-doc-mermaid

by SpillwaveSolutions

Create Mermaid diagrams for any purpose - activity diagrams, deployment diagrams, architecture diagrams, or complete design documents. This skill uses...

100
general, react code

Agentic Skill Details

Type: Non-Technical
Meta-Domain: general
Primary Domain: general
Sub-Domain: evaluation llm
Market Score: 22.1
