markitdown

304 stars 61 forks

Convert various file formats (PDF, Office documents, images, audio, web content, structured data) to Markdown optimized for LLM processing. Use when converting documents to markdown, extracting text from PDFs/Office files, transcribing audio, performing OCR on images, extracting YouTube transcripts, or processing batches of files. Supports 20+ formats including DOCX, XLSX, PPTX, PDF, HTML, EPUB, CSV, JSON, images with OCR, and audio with transcription.
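
As a rough illustration of the batch use case, the sketch below converts a list of files and collects the Markdown text for each one. It assumes the upstream `markitdown` Python package is installed and exposes `MarkItDown().convert(path).text_content`; the `batch_convert` helper and its `convert` parameter are hypothetical names introduced here and are not part of the skill itself.

```python
from typing import Callable, Dict, Iterable, Optional


def _default_convert(path: str) -> str:
    # Assumption: the upstream `markitdown` package is installed.
    # Imported lazily so this module still loads when it is missing.
    from markitdown import MarkItDown
    return MarkItDown().convert(path).text_content


def batch_convert(
    paths: Iterable[str],
    convert: Optional[Callable[[str], str]] = None,
) -> Dict[str, str]:
    """Map each input path to its Markdown text, or to an error note on failure."""
    convert = convert or _default_convert
    results: Dict[str, str] = {}
    for path in paths:
        try:
            results[path] = convert(path)
        except Exception as exc:
            # Record the failure and continue so one bad file does not stop the batch.
            results[path] = f"<!-- conversion failed: {exc} -->"
    return results
```

Passing the converter in as a parameter keeps the loop testable without real documents; in normal use, `batch_convert(["report.docx", "data.xlsx"])` would fall back to the `markitdown` converter.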

Third-Party Agent Skill: Review the code before installing. Agent skills execute in your AI assistant's environment and can access your files.

Installation for Agentic Skill

Claude Code (CLI) Fast

skilz install jimmc414/Kosmos/markitdown

OpenCode (CLI) Fast

skilz install jimmc414/Kosmos/markitdown --agent opencode

OpenAI Codex (CLI) Native

skilz install jimmc414/Kosmos/markitdown --agent codex

Gemini CLI (Project)

skilz install jimmc414/Kosmos/markitdown --agent gemini

First time? Install Skilz: pip install skilz

Works with 22+ AI coding assistants

Cursor, Aider, Copilot, Windsurf, Qwen, Kimi, and more...

For Claude Desktop Easy

Download the Agent Skill ZIP, extract it, and copy the extracted folder to ~/.claude/skills/, then restart Claude Desktop.

Manual Installation

1. Clone the repository:

git clone https://github.com/jimmc414/Kosmos

2. Copy the agent skill directory:

cp -r Kosmos/kosmos-claude-scientific-skills/scientific-skills/markitdown ~/.claude/skills/

Owner: jimmc414 (GitHub)
Repository: Kosmos
Stars: 304
Forks: 61
Type: Technical
Meta-Domain: productivity
Primary Domain: pdf
Market Score: 80

Agent Skill Grade

Score: 80/100

Score Breakdown

Spec Compliance: 12/15
PDA Architecture: 26/30
Ease of Use: 20/25
Writing Style: 7/10
Utility: 17/20

Modifiers: -2
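
The overall grade appears to be the sum of the five pillar scores plus the modifier. The listing does not state the formula, so the quick check below assumes plain addition.

```python
# Pillar scores as (earned, maximum), copied from the score breakdown.
pillars = {
    "Spec Compliance": (12, 15),
    "PDA Architecture": (26, 30),
    "Ease of Use": (20, 25),
    "Writing Style": (7, 10),
    "Utility": (17, 20),
}
modifiers = -2

# Assumption: the final score is the sum of pillar points plus the modifiers.
subtotal = sum(earned for earned, _ in pillars.values())
max_total = sum(maximum for _, maximum in pillars.values())
total = subtotal + modifiers

print(f"{subtotal} + ({modifiers}) = {total} out of {max_total}")
# prints "82 + (-2) = 80 out of 100"
```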

Areas to Improve

Second-person voice in references
Verbose code comments
Missing TOC in long references

Recommendations

Add trigger phrases to description for discoverability
Add table of contents for files over 100 lines
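
One way to act on the table-of-contents recommendation is to flag any reference file that runs past 100 lines without a contents heading. This sketch assumes the references are Markdown and that a heading reading "Table of Contents" or "Contents" counts as a TOC; the `needs_toc` helper is a hypothetical name, and the grader's actual detection rule is not documented here.

```python
import re

# Assumption: a TOC is marked by a Markdown heading such as "## Table of Contents".
TOC_HEADING = re.compile(
    r"^#{1,6}\s+(table of contents|contents)\s*$", re.IGNORECASE
)


def needs_toc(markdown_text: str, max_lines: int = 100) -> bool:
    """Return True when the text exceeds max_lines and has no TOC heading."""
    lines = markdown_text.splitlines()
    if len(lines) <= max_lines:
        return False
    return not any(TOC_HEADING.match(line.strip()) for line in lines)
```

To apply it, read each reference file's text and pass it to `needs_toc`; files that return `True` are the ones the recommendation targets.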

Graded: 2026-01-05

Developer Feedback

I took a look at your markitdown skill and wanted to share some thoughts.

The TL;DR

You're at 80/100, solidly in B territory. This evaluation is based on Anthropic's Claude Skills best practices across five pillars. Your strongest area is Progressive Disclosure Architecture (26/30): you've nailed the layered structure with a clean SKILL.md overview and five focused reference files. The weakest areas are Spec Compliance (12/15) and Writing Style (7/10), where a few smaller refinements would push you higher.

What's Working Well

Progressive disclosure is chef's kiss — Your five reference files (structured_data, web_content, document_conversion, media_processing, advanced_integrations) sit exactly one level deep from SKILL.md. That's the sweet spot for token economy and discoverability.
Practical utility — You're solving a real problem: converting 20+ file formats to Markdown for LLM processing. The input/output examples and batch processing templates show you understand actual workflows.
Modular design — Trigger phrases cover the common cases (convert, extract, transcribe, OCR, batch). The "When to Use" section helps developers understand scope without reading everything.
Rich examples — Both CLI and Python code examples; good error handling patterns scattered through the references.

The Big One

Your writing voice is inconsistent, and it's costing you points. The spec wants imperative/instructional voice throughout, but your references slip into second-person statements like "Use high-resolution images for better accuracy" in media_processing.md. This violates the voice requirements and pulls down y...
