Skillzwave

ai-multimodal

Market Score: 47.6

Process and generate multimedia content using the Google Gemini API. Capabilities include analyzing audio files (transcription with timestamps, summarization, speech understanding, music/sound analysis, up to 9.5 hours), understanding images (captioning, object detection, OCR, visual Q&A, segmentation), processing videos (scene detection, Q&A, temporal analysis, YouTube URLs, up to 6 hours), extracting from documents (PDF tables, forms, charts, diagrams, multi-page), and generating images (text-to-image, editing, composition, refinement). Use when working with audio/video files, analyzing images or screenshots, processing PDF documents, extracting structured data from media, creating images from text prompts, or implementing multimodal AI features. Supports multiple models (Gemini 2.5/2.0) with context windows of up to 2M tokens.
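
As a rough illustration of what "understanding images" or "analyzing audio" means at the API level, here is a minimal sketch that builds (but does not send) a Gemini `generateContent` request carrying a prompt plus one inline file. It assumes the public REST endpoint and the model name `gemini-2.5-flash`; the helper `build_request` is hypothetical and is not taken from the skill itself, which may use an SDK or different models.

```python
import base64
import json

# Assumption: the public Gemini REST endpoint; the skill's own scripts
# may call the API differently (for example through an SDK).
API_ROOT = "https://generativelanguage.googleapis.com/v1beta"


def build_request(prompt: str, media: bytes, mime_type: str,
                  model: str = "gemini-2.5-flash") -> tuple[str, str]:
    """Return (url, json_body) for a generateContent call with one inline file.

    Hypothetical helper for illustration only; it performs no network I/O.
    """
    url = f"{API_ROOT}/models/{model}:generateContent"
    body = {
        "contents": [{
            "parts": [
                # The text instruction, e.g. "Transcribe this audio."
                {"text": prompt},
                # The media bytes, base64-encoded and tagged with a MIME type.
                {"inline_data": {
                    "mime_type": mime_type,
                    "data": base64.b64encode(media).decode("ascii"),
                }},
            ]
        }]
    }
    return url, json.dumps(body)


if __name__ == "__main__":
    url, body = build_request("Describe this image.", b"\x89PNG", "image/png")
    print(url)
    print(body[:80])
```

The resulting body would be sent as a POST with an API key (for instance in an `x-goog-api-key` header). Inline data suits small files; long audio or video of the sizes mentioned above would normally go through a separate upload step rather than being embedded in the request.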

Also in: video data analysis pdf

Third-Party Agent Skill: Review the code before installing. Agent skills execute in your AI assistant's environment and can access your files.

Installation for Agentic Skill

skilz install mrgoonie/claudekit-skills/ai-multimodal
skilz install mrgoonie/claudekit-skills/ai-multimodal --agent opencode
skilz install mrgoonie/claudekit-skills/ai-multimodal --agent codex
skilz install mrgoonie/claudekit-skills/ai-multimodal --agent gemini

First time? Install Skilz: pip install skilz

Works with 22+ AI coding agents

Cursor, Aider, Copilot, Windsurf, Qwen, Kimi, and more...

Download Agent Skill ZIP

Extract the ZIP and copy its contents to ~/.claude/skills/, then restart Claude Desktop.

1. Clone the repository:
git clone https://github.com/mrgoonie/claudekit-skills
2. Copy the agent skill directory:
cp -r claudekit-skills/.claude/skills/ai-multimodal ~/.claude/skills/
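
The copy step above can also be scripted. The following is a minimal sketch, assuming the repository has already been cloned and keeps the skill under `.claude/skills/ai-multimodal` as the `cp` command implies; `install_skill` is a hypothetical helper, not part of Skilz or the repository.

```python
import shutil
from pathlib import Path


def install_skill(repo_dir: Path, skills_dir: Path,
                  name: str = "ai-multimodal") -> Path:
    """Copy <repo_dir>/.claude/skills/<name> into <skills_dir>/<name>.

    Hypothetical helper mirroring the manual `cp -r` step.
    """
    source = repo_dir / ".claude" / "skills" / name
    if not source.is_dir():
        raise FileNotFoundError(f"skill directory not found: {source}")
    # Create the skills directory if this is the first skill installed.
    skills_dir.mkdir(parents=True, exist_ok=True)
    target = skills_dir / name
    # dirs_exist_ok=True lets a re-run refresh an existing copy (Python 3.8+).
    shutil.copytree(source, target, dirs_exist_ok=True)
    return target
```

After the clone, calling `install_skill(Path("claudekit-skills"), Path.home() / ".claude" / "skills")` would mirror step 2; a missing source directory raises an error instead of silently copying nothing.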


Related Agentic Skills

api-error-handling

by aj-geddes

Implement comprehensive API error handling with standardized error responses, logging, monitoring, and user-friendly messages. Use when building resil...

79
C
TECHapi
+monitoring

markdownlint-integration

by TheBushidoCollective

Integrate markdownlint into development workflows including CLI usage, programmatic API, CI/CD pipelines, and editor integration.

58
F
TECHapi
Marketplace

route-tester

by diet103

Test authenticated routes in your project using cookie-based authentication. Use this skill when testing API endpoints, validating route functiona...

57
F
TECHapi
+testing+security

hook-development

by anthropics

This skill should be used when the user asks to "create a hook", "add a PreToolUse/PostToolUse/Stop hook", "validate tool use", "implement prompt-base...

54
TECHapi
Marketplace

Agentic Skill Details

Type: Technical
Meta-Domain: web api
Primary Domain: api
Market Score: 47.6
