Skillzwave

ai-multimodal

Market Score: 47.6

Process and generate multimedia content using the Google Gemini API. Capabilities include analyzing audio files (transcription with timestamps, summarization, speech understanding, and music/sound analysis, up to 9.5 hours), understanding images (captioning, object detection, OCR, visual Q&A, segmentation), processing videos (scene detection, Q&A, temporal analysis, YouTube URLs, up to 6 hours), extracting content from documents (PDF tables, forms, charts, diagrams, multi-page files), and generating images (text-to-image, editing, composition, refinement). Use it when working with audio or video files, analyzing images or screenshots, processing PDF documents, extracting structured data from media, creating images from text prompts, or implementing multimodal AI features. Supports multiple models (Gemini 2.5/2.0) with context windows of up to 2M tokens.

Also in: video, data analysis, pdf

Third-Party Agent Skill: Review the code before installing. Agent skills execute in your AI assistant's environment and can access your files.

Installation for Agentic Skill

skilz install mrgoonie/claudekit-skills/ai-multimodal
skilz install mrgoonie/claudekit-skills/ai-multimodal --agent opencode
skilz install mrgoonie/claudekit-skills/ai-multimodal --agent codex
skilz install mrgoonie/claudekit-skills/ai-multimodal --agent gemini

First time? Install Skilz: pip install skilz

Works with 14 AI coding assistants

Cursor, Aider, Copilot, Windsurf, Qwen, Kimi, and more...

Download Agent Skill ZIP

Extract the ZIP, copy the skill folder to ~/.claude/skills/, then restart Claude Desktop.

1. Clone the repository:
git clone https://github.com/mrgoonie/claudekit-skills
2. Copy the agent skill directory:
cp -r claudekit-skills/.claude/skills/ai-multimodal ~/.claude/skills/
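
The manual copy step can be wrapped in a small shell function that first makes sure the target skills folder exists. This is only a sketch: install_skill is a hypothetical helper name, not part of skilz or Claude, and the default target path is taken from the steps above.

```shell
#!/bin/sh
# Sketch only: install_skill is a hypothetical helper, not a skilz command.
# It copies one skill directory into a skills folder, creating the folder first.
install_skill() {
    src="$1"                              # skill directory inside the cloned repository
    dest="${2:-$HOME/.claude/skills}"     # target folder; default taken from the manual steps

    if [ ! -d "$src" ]; then
        echo "skill directory not found: $src" >&2
        return 1
    fi

    mkdir -p "$dest" || return 1          # make sure the target folder exists before copying
    cp -r "$src" "$dest/" || return 1
    echo "installed $(basename "$src") into $dest"
}

# Example usage after cloning (left commented out so the sketch has no side effects):
# git clone https://github.com/mrgoonie/claudekit-skills
# install_skill claudekit-skills/.claude/skills/ai-multimodal
```

Passing a second argument overrides the target folder, which may be useful for trying the copy in a scratch directory before touching ~/.claude/skills/.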

Agentic Skill Details

Type: Technical
Meta-Domain: web api
Primary Domain: api
Market Score: 47.6
