ai-multimodal
Process and generate multimedia content using the Google Gemini API. Capabilities include analyzing audio files (transcription with timestamps, summarization, speech understanding, music/sound analysis, up to 9.5 hours), understanding images (captioning, object detection, OCR, visual Q&A, segmentation), processing videos (scene detection, Q&A, temporal analysis, YouTube URLs, up to 6 hours), extracting data from documents (PDF tables, forms, charts, diagrams, multi-page), and generating images (text-to-image, editing, composition, refinement). Use when working with audio/video files, analyzing images or screenshots, processing PDF documents, extracting structured data from media, creating images from text prompts, or implementing multimodal AI features. Supports multiple models (Gemini 2.5/2.0) with context windows up to 2M tokens.
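The duration ceilings quoted in the description can be turned into a simple pre-flight check before sending media for processing. The following is a minimal sketch only: the names MAX_HOURS and within_limit are hypothetical, and the limits are copied from the description above rather than read from the Gemini API.

```python
# Duration ceilings as quoted in the skill description (in hours).
# These are assumptions taken from the listing text, not values
# queried from the Gemini API.
MAX_HOURS = {"audio": 9.5, "video": 6.0}


def within_limit(kind: str, duration_seconds: float) -> bool:
    """Return True if media of the given kind fits the quoted limit."""
    if kind not in MAX_HOURS:
        raise ValueError(f"no duration limit known for {kind!r}")
    if duration_seconds < 0:
        raise ValueError("duration must be non-negative")
    return duration_seconds <= MAX_HOURS[kind] * 3600


if __name__ == "__main__":
    # A 9-hour recording fits the audio limit; a 7-hour video does not.
    print(within_limit("audio", 9 * 3600))
    print(within_limit("video", 7 * 3600))
```

A check like this runs locally and needs no API key, so it can reject oversized files before any upload is attempted.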
Third-Party Agent Skill: Review the code before installing. Agent skills execute in your AI assistant's environment and can access your files.
Installation for Agentic Skill
skilz install mrgoonie/claudekit-skills/ai-multimodal
skilz install mrgoonie/claudekit-skills/ai-multimodal --agent opencode
skilz install mrgoonie/claudekit-skills/ai-multimodal --agent codex
skilz install mrgoonie/claudekit-skills/ai-multimodal --agent gemini
First time? Install Skilz: pip install skilz
Works with 22+ AI coding agents
Cursor, Aider, Copilot, Windsurf, Qwen, Kimi, and more...
Extract and copy to ~/.claude/skills/ then restart Claude Desktop
git clone https://github.com/mrgoonie/claudekit-skills
cp -r claudekit-skills/.claude/skills/ai-multimodal ~/.claude/skills/
Related Agentic Skills
api-error-handling
by aj-geddes: Implement comprehensive API error handling with standardized error responses, logging, monitoring, and user-friendly messages. Use when building resil...
markdownlint-integration
by TheBushidoCollective: Integrate markdownlint into development workflows including CLI usage, programmatic API, CI/CD pipelines, and editor integration.
route-tester
by diet103: Test authenticated routes in your project using cookie-based authentication. Use this skill when testing API endpoints, validating route functiona...
hook-development
by anthropics: This skill should be used when the user asks to "create a hook", "add a PreToolUse/PostToolUse/Stop hook", "validate tool use", "implement prompt-base...
Agentic Skill Details
- Repository: claudekit-skills
- Type: Technical
- Meta-Domain: web api
- Primary Domain: api
- Market Score: 47.6