prompt-benchmark


Systematic prompt evaluation framework with MATH, GSM8K, and Game of 24 benchmarks. Use when evaluating prompt effectiveness on standard benchmarks, comparing meta-prompting strategies quantitatively, measuring prompt quality improvements, or validating categorical prompt optimizations against ground truth datasets.

Also in: github, data analysis

Third-Party Agent Skill: Review the code before installing. Agent skills execute in your AI assistant's environment and can access your files.

Installing the Agent Skill

skilz install manutej/categorical-meta-prompting/prompt-benchmark
skilz install manutej/categorical-meta-prompting/prompt-benchmark --agent opencode
skilz install manutej/categorical-meta-prompting/prompt-benchmark --agent codex
skilz install manutej/categorical-meta-prompting/prompt-benchmark --agent gemini

First time? Install Skilz: pip install skilz

Works with 14 AI coding assistants

Cursor, Aider, Copilot, Windsurf, Qwen, Kimi, and more...

Download Agent Skill ZIP

Extract the archive, copy it to ~/.claude/skills/, then restart Claude Desktop.
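
A minimal sketch of that step, assuming the download is saved as prompt-benchmark.zip and unpacks to a prompt-benchmark/ directory (both names are assumptions, not confirmed by this page):

# Assumes the archive is prompt-benchmark.zip and contains a prompt-benchmark/ folder
mkdir -p ~/.claude/skills
unzip prompt-benchmark.zip -d ~/.claude/skills/
# Restart Claude Desktop so the new skill is picked up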

1. Clone the repository:
git clone https://github.com/manutej/categorical-meta-prompting
2. Copy the agent skill directory:
cp -r categorical-meta-prompting/.claude/skills/prompt-benchmark ~/.claude/skills/
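
To confirm the copy landed where Claude Desktop looks for skills, a quick check with plain POSIX commands (the directory name is taken from the cp command above):

# The skill's files should appear here before you restart Claude Desktop
ls ~/.claude/skills/prompt-benchmark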

Need detailed installation help? Check the platform-specific guides.


Agentic Skill Details

Type: Non-Technical
Meta-Domain: development
Primary Domain: javascript
Market Score: 12
