ai-multimodal
Process and generate multimedia content using the Google Gemini API. Capabilities include analyzing audio files (transcription with timestamps, summarization, speech understanding, music/sound analysis, up to 9.5 hours), understanding images (captioning, object detection, OCR, visual Q&A, segmentation), processing videos (scene detection, Q&A, temporal analysis, YouTube URLs, up to 6 hours), extracting from documents (PDF tables, forms, charts, diagrams, multi-page), generating images (text-to-image, editing, com...
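As a rough illustration of the image-understanding capability, the sketch below builds the JSON body for a Gemini REST `generateContent` call with one text part and one inline image. It does not reproduce this skill's own code, which is not shown on this page. The helper name `build_image_request` is hypothetical, and no request is sent.

```python
import base64

# Assumption: the public Gemini REST endpoint. The model name is left as a
# placeholder because this page does not say which model the skill uses.
GEMINI_ENDPOINT = (
    "https://generativelanguage.googleapis.com/v1beta/models/{model}:generateContent"
)


def build_image_request(image_bytes: bytes, mime_type: str, prompt: str) -> dict:
    """Build a generateContent body with a text prompt and one inline image.

    Hypothetical helper for illustration; it only assembles the payload.
    """
    return {
        "contents": [
            {
                "parts": [
                    {"text": prompt},
                    {
                        "inline_data": {
                            "mime_type": mime_type,
                            # Inline media is sent as base64-encoded text.
                            "data": base64.b64encode(image_bytes).decode("ascii"),
                        }
                    },
                ]
            }
        ]
    }
```

The resulting dictionary could be posted as JSON to the endpoint with an API key. Large audio or video files would more likely go through a file upload step than inline data.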
Third-Party Agent Skill: Review the code before installing. Agent skills execute in your AI assistant's environment and can access your files.
Installation for Agentic Skill
skilz install Linhv14/claude-skill/ai-multimodal
skilz install Linhv14/claude-skill/ai-multimodal --agent opencode
skilz install Linhv14/claude-skill/ai-multimodal --agent codex
skilz install Linhv14/claude-skill/ai-multimodal --agent gemini
First time? Install Skilz: pip install skilz
Works with 14 AI coding assistants
Cursor, Aider, Copilot, Windsurf, Qwen, Kimi, and more...
Clone the repository and copy the skill to ~/.claude/skills/, then restart Claude Desktop:
git clone https://github.com/Linhv14/claude-skill
cp -r claude-skill/.claude/skills/ai-multimodal ~/.claude/skills/
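After a manual copy, a quick check that the skill landed in the expected place can save a restart. This is a minimal sketch, assuming the skill directory contains a `SKILL.md` manifest (the usual layout for Claude skills, but not confirmed on this page). The function name `skill_installed` is hypothetical.

```python
from pathlib import Path


def skill_installed(skills_root: Path, name: str = "ai-multimodal") -> bool:
    """Return True if the skill directory exists and holds a SKILL.md file.

    Assumption: skills ship a SKILL.md manifest; adjust if the layout differs.
    """
    skill_dir = skills_root / name
    return skill_dir.is_dir() and (skill_dir / "SKILL.md").is_file()


if __name__ == "__main__":
    root = Path.home() / ".claude" / "skills"
    print("installed" if skill_installed(root) else "not found")
```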
Related Agentic Skills
treatment-plans
by davila7 | Generate concise (3-4 page), focused medical treatment plans in LaTeX/PDF format for all clinical specialties. Supports general medical treatment, re...
google-gemini-api
by jezweb | Integrate Gemini API with the correct current SDK (@google/genai v1.27+, NOT deprecated @google/generative-ai). Supports text generation, multimodal (im...
elevenlabs-agents
by jezweb | Build conversational AI voice agents with ElevenLabs Platform using React, JavaScript, React Native, or Swift SDKs. Configure agents, tools (client/...
cloudflare-mcp-server
by jezweb | Build Model Context Protocol (MCP) servers on Cloudflare Workers - the only platform with official remote MCP support. TypeScript-based with OAuth, ...
Agentic Skill Details
- Repository: claude-skill
- Type: Non-Technical
- Meta-Domain: general
- Primary Domain: general
- Sub-Domain: patterns skill
- Market Score: 35.1