ai-multimodal
Process and generate multimedia content using the Google Gemini API. Capabilities: analyzing audio files (transcription with timestamps, summarization, speech understanding, music/sound analysis, up to 9.5 hours); understanding images (captioning, object detection, OCR, visual Q&A, segmentation); processing videos (scene detection, Q&A, temporal analysis, YouTube URLs, up to 6 hours); extracting from documents (PDF tables, forms, charts, diagrams, multi-page); and generating images (text-to-image, editing, composition, refinement). Use when working with audio/video files, analyzing images or screenshots, processing PDF documents, extracting structured data from media, creating images from text prompts, or implementing multimodal AI features. Supports multiple models (Gemini 2.5/2.0) with context windows up to 2M tokens.
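As a rough illustration of the image-understanding case, the sketch below builds the JSON body for a Gemini REST `generateContent` request carrying a text prompt and one inline image. It is not taken from the skill itself: the helper names `build_image_request` and `request_url`, the endpoint template, and the model name are assumptions for illustration, and no network call is made.

```python
import base64
import json

# Assumed endpoint template for the Gemini REST API (not taken from the skill).
API_URL = "https://generativelanguage.googleapis.com/v1beta/models/{model}:generateContent"


def build_image_request(image_bytes: bytes, mime_type: str, prompt: str) -> dict:
    """Build a request body with one text part and one base64-encoded inline image part."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {
        "contents": [
            {
                "parts": [
                    {"text": prompt},
                    {"inline_data": {"mime_type": mime_type, "data": encoded}},
                ]
            }
        ]
    }


def request_url(model: str) -> str:
    """Fill the model name into the assumed endpoint template."""
    return API_URL.format(model=model)


if __name__ == "__main__":
    # Placeholder bytes stand in for a real image file read from disk.
    body = build_image_request(b"abc", "image/png", "Describe this image.")
    print(request_url("gemini-2.5-flash"))  # model name is illustrative
    print(json.dumps(body, indent=2))
```

Sending the body would additionally require an API key and an HTTP client; larger audio, video, or PDF inputs would likely be uploaded separately rather than inlined.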
Third-Party Agent Skill: Review the code before installing. Agent skills execute in your AI assistant's environment and can access your files.
Installation for Agentic Skill
skilz install mrgoonie/claudekit-skills/ai-multimodal
skilz install mrgoonie/claudekit-skills/ai-multimodal --agent opencode
skilz install mrgoonie/claudekit-skills/ai-multimodal --agent codex
skilz install mrgoonie/claudekit-skills/ai-multimodal --agent gemini
First time? Install Skilz: pip install skilz
Works with 14 AI coding assistants
Cursor, Aider, Copilot, Windsurf, Qwen, Kimi, and more...
Extract and copy to ~/.claude/skills/ then restart Claude Desktop
git clone https://github.com/mrgoonie/claudekit-skills
cp -r claudekit-skills/.claude/skills/ai-multimodal ~/.claude/skills/
Agentic Skill Details
Repository: claudekit-skills
Type: Technical
Meta-Domain: web api
Primary Domain: api
Market Score: 47.6