operating-production-services
SRE patterns for production service reliability: SLOs, error budgets, postmortems, and incident response. Use when defining reliability targets, writing postmortems, implementing SLO alerting, or establishing on-call practices. NOT for initial service development (use scaffolding skills instead).
Third-Party Agent Skill: Review the code before installing. Agent skills execute in your AI assistant's environment and can access your files.
Installation for Agentic Skill
skilz install mjunaidca/mjs-agent-skills/operating-production-services
skilz install mjunaidca/mjs-agent-skills/operating-production-services --agent opencode
skilz install mjunaidca/mjs-agent-skills/operating-production-services --agent codex
skilz install mjunaidca/mjs-agent-skills/operating-production-services --agent gemini
First time? Install Skilz: pip install skilz
Works with 22+ AI coding assistants
Cursor, Aider, Copilot, Windsurf, Qwen, Kimi, and more...
Extract and copy to ~/.claude/skills/ then restart Claude Desktop
git clone https://github.com/mjunaidca/mjs-agent-skills
cp -r mjs-agent-skills/.claude/skills/operating-production-services ~/.claude/skills/
Agentic Skill Details
- Repository: mjs-agent-skills
- Stars: 1
- Forks: 2
- Type: Technical
- Meta-Domain: cloud infrastructure
- Primary Domain: kubernetes
- Market Score: 17
Agent Skill Grade
Grade: A. Score: 97/100
Score Breakdown
Areas to Improve
- Missing TOC in slo-alerting.md
Recommendations
- Add trigger phrases to description for discoverability
- Add table of contents for files over 100 lines
Graded: 2026-01-24
Developer Feedback
Found your operating-production-services skill while browsing the registry—the way you've structured the progressive disclosure for such a dense topic (97/100 for a reason) makes me curious how you'd handle even more edge cases around observability and incident response.
The TL;DR
You're at 97/100, solidly in A-grade territory. This is based on Anthropic's skill best practices rubric. Your strongest area is Writing Style (10/10)—the skill reads like documentation written by someone who actually runs production systems, not a marketing pamphlet. Weakest spot is Spec Compliance (12/15), mostly because you're leaving discoverability points on the table with trigger phrases.
What's Working Well
- Blameless postmortem framework - The 5 Whys template and postmortem meeting checklist give Claude concrete structure for handling incidents. That's the kind of thing teams actually need.
- Token economy is chef's kiss - slo-alerting.md delegates heavy technical details while SKILL.md stays lean. You're not dumping a 200-line reference file on someone; you're layering it thoughtfully.
- Practical burn rate guidance - The multi-window alerting patterns with specific Prometheus queries and Grafana dashboard structure mean Claude can actually implement this, not just read philosophy.
- Clear scope boundaries - Your description explicitly calls out SLO alerting and postmortems while noting what you don't cover (deployment strategies, team structure). That's rare and helpful.
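For readers unfamiliar with the multi-window burn-rate alerting mentioned above, the idea can be sketched in a few lines of Python. This is a minimal illustration, not the contents of slo-alerting.md: the names `Window`, `burn_rate`, and `should_page` are hypothetical, and the 14.4 threshold with 1-hour and 5-minute windows is the commonly cited pairing for a 30-day SLO period, assumed here rather than taken from the skill.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Window:
    """Error ratios observed over a long and a short lookback window."""
    long_error_ratio: float   # e.g. failed / total requests over 1 hour (assumed)
    short_error_ratio: float  # e.g. failed / total requests over 5 minutes (assumed)


def burn_rate(error_ratio: float, slo_target: float) -> float:
    """Return how many times faster than 'exactly on budget' errors are accruing."""
    error_budget = 1.0 - slo_target  # allowed error ratio, e.g. 0.001 for 99.9%
    return error_ratio / error_budget


def should_page(window: Window, slo_target: float, threshold: float) -> bool:
    """Fire only when BOTH windows exceed the burn-rate threshold.

    The long window shows the burn is significant; the short window shows
    it is still ongoing, so the alert clears quickly after recovery.
    """
    return (
        burn_rate(window.long_error_ratio, slo_target) > threshold
        and burn_rate(window.short_error_ratio, slo_target) > threshold
    )


if __name__ == "__main__":
    slo = 0.999        # 99.9% availability target (illustrative)
    threshold = 14.4   # spends about 2% of a 30-day budget in one hour

    ongoing = Window(long_error_ratio=0.02, short_error_ratio=0.03)
    recovered = Window(long_error_ratio=0.02, short_error_ratio=0.0005)

    print(should_page(ongoing, slo, threshold))
    print(should_page(recovered, slo, threshold))
```

Requiring both windows is the design choice that matters: the long window alone would keep paging well after an incident ends, and the short window alone would page on brief blips.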
The Big One
slo-alerting.md (189 lines) is missing a table of contents. This hurts your navigation score because at 100+ lines, readers need an anchor point. Right now someone has t...