gemini-audio

109 stars 19 forks

Guide for implementing Google Gemini API audio capabilities - analyze audio with transcription, summarization, and understanding (up to 9.5 hours), plus generate speech with controllable TTS. Use when processing audio files, creating transcripts, analyzing speech/music/sounds, or generating natural speech from text.

Third-Party Agent Skill: Review the code before installing. Agent skills execute in your AI assistant's environment and can access your files.

Installation for Agentic Skill

skilz install einverne/dotfiles/gemini-audio
skilz install einverne/dotfiles/gemini-audio --agent opencode
skilz install einverne/dotfiles/gemini-audio --agent codex
skilz install einverne/dotfiles/gemini-audio --agent gemini

First time? Install Skilz: pip install skilz

Works with 14 AI coding assistants

Cursor, Aider, Copilot, Windsurf, Qwen, Kimi, and more...

Download Agent Skill ZIP

Extract the ZIP and copy the skill folder to ~/.claude/skills/, then restart Claude Desktop.

1. Clone the repository:
git clone https://github.com/einverne/dotfiles
2. Copy the agent skill directory:
cp -r dotfiles/claude/skills/gemini-audio ~/.claude/skills/
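After copying, it can help to confirm that the skill directory actually landed where the steps above put it. The following is a minimal sketch, not part of the skill itself: `check_skill` is a hypothetical helper name, and the default path simply mirrors the `cp` destination above.

```shell
# Hypothetical helper (not shipped with the skill): report whether the
# skill directory exists. Defaults to the destination used by the cp step.
check_skill() {
  skill_dir="${1:-$HOME/.claude/skills/gemini-audio}"
  if [ -d "$skill_dir" ]; then
    echo "installed: $skill_dir"
  else
    echo "missing: $skill_dir"
    return 1
  fi
}
```

Run `check_skill` with no arguments to test the default location, or pass another directory to check a different install path.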

Agentic Skill Details

Repository: dotfiles
Stars: 109
Forks: 19
Type: Technical
Meta-Domain: web api
Primary Domain: api
Market Score: 34