speculative-decoding

Accelerate LLM inference with speculative decoding, Medusa (multiple decoding heads), and lookahead decoding. Use this skill when optimizing inference speed (reported speedups of 1.5-3.6×), reducing latency for real-time applications, or deploying models with limited compute. Covers draft models, tree-based attention, Jacobi iteration, parallel token generation, and production deployment strategies.

Third-Party Agent Skill: Review the code before installing. Agent skills execute in your AI assistant's environment and can access your files.

Installation for Agentic Skill

skilz install zechenzhangAGI/AI-research-SKILLs/speculative-decoding
skilz install zechenzhangAGI/AI-research-SKILLs/speculative-decoding --agent opencode
skilz install zechenzhangAGI/AI-research-SKILLs/speculative-decoding --agent codex
skilz install zechenzhangAGI/AI-research-SKILLs/speculative-decoding --agent gemini

First time? Install Skilz: pip install skilz

Works with 22+ AI coding assistants

Cursor, Aider, Copilot, Windsurf, Qwen, Kimi, and more...

Download Agent Skill ZIP

Extract the ZIP and copy the skill directory to ~/.claude/skills/, then restart Claude Desktop.

1. Clone the repository:
git clone https://github.com/zechenzhangAGI/AI-research-SKILLs
2. Copy the agent skill directory:
cp -r AI-research-SKILLs/19-emerging-techniques/speculative-decoding ~/.claude/skills/


Agentic Skill Details

Type: Non-Technical
Meta-Domain: general
Primary Domain: general
Sub-Domain: machine learning models model
Market Score: 25
