Infrastructure Knowledge, Packaged for AI Agents

AI coding agents have gotten remarkably good at writing application code. But ask one to deploy a Kubernetes cluster on bare metal, configure Dell server BIOS settings, or troubleshoot a degraded Ceph storage pool, and you’ll watch it hallucinate commands, invent config fields, and suggest procedures that would take down your production environment.

The problem isn’t intelligence — it’s knowledge. Infrastructure operations require the kind of deep, specific, version-aware expertise that takes years to build. Which firmware update order prevents bricked servers. Which RAID controller uses perccli64 vs mvcli. Why you never run talosctl reset without checking etcd quorum first.

This knowledge exists in documentation, runbooks, and the heads of senior engineers. It’s scattered, version-specific, and hard to keep current. And it’s exactly the kind of knowledge AI agents need but don’t have.

Introducing Agentic Stacks

Agentic Stacks packages domain expertise into installable skill packs for AI agents. A “stack” is a git repo containing structured markdown that teaches an agent how to operate in a specific domain.

agentic-stacks init my-deployment
cd my-deployment
agentic-stacks pull kubernetes-talos
agentic-stacks pull hardware-dell

After pulling those two stacks, your AI agent knows:

How to bootstrap a Kubernetes cluster on Talos Linux, including machine configs, networking options (Cilium, Flannel), and storage setup
How to configure Dell PowerEdge servers — BIOS, iDRAC, RAID, firmware updates — using a containerized toolkit
Safety rules like “never run racadm set -t xml -f without operator approval” and “always export a backup SCP before importing”
Version-specific known issues and workarounds
Decision guides for choosing between components

The agent combines expertise from all pulled stacks. “Deploy Kubernetes on these Dell servers” becomes a conversation, not a project.
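
In a stack, safety rules like these are written as markdown instructions for the agent. To make the idea concrete, here is a minimal sketch of the same rule as a programmatic check. It is an illustration only, not part of Agentic Stacks: the `APPROVAL_REQUIRED` table and the `needs_approval` helper are hypothetical names, and a real guardrail would need to handle reordered flags.

```python
import shlex

# Hypothetical rule table (not part of Agentic Stacks): each entry holds the
# leading tokens of a command that must not run without operator approval.
APPROVAL_REQUIRED = [
    ("racadm", "set", "-t", "xml", "-f"),
]


def needs_approval(command: str) -> bool:
    """Return True if the command starts with a guarded token sequence."""
    tokens = tuple(shlex.split(command))
    return any(tokens[: len(rule)] == rule for rule in APPROVAL_REQUIRED)


if __name__ == "__main__":
    for cmd in ("racadm set -t xml -f profile.xml", "racadm getsysinfo"):
        status = "ask the operator first" if needs_approval(cmd) else "ok to run"
        print(f"{cmd!r}: {status}")
```

The check is a plain prefix match on tokens, which keeps it easy to audit. It would miss the same command written with its flags in a different order, which is one reason the written rule, interpreted by the agent, carries the real weight.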

Not Just for Agents — For Humans Too

Every project includes common-skills, a shared stack that provides four cross-cutting capabilities:

Training mode. Say “train me on this stack” and the agent switches from task execution to teaching. It assesses what you already know, builds a curriculum from the stack’s skills, and walks you through it with exercises and quizzes. This makes stacks useful for onboarding: a new team member can learn Ceph storage or Dell iDRAC management interactively before touching production.

Guided walkthroughs. Say “guide me through deploying Kubernetes” and the agent asks about your environment (how many hosts, what hardware, what network topology), then builds a tailored step-by-step plan and walks you through it, checking at each step that things worked.

Project orientation. Say “what can you help me with?” and the agent reads all your pulled stacks and gives a unified overview of capabilities, suggests how stacks compose together, and recommends where to start.

Feedback capture. Hit an issue? Say “capture that NTP fix” and the agent writes the learning to the right place in the domain stack — formatted consistently, ready to PR upstream. Every operator encounter is a chance to improve the stack for the next person.

How It Works Under the Hood

Stacks are plain git repos with three key parts:

CLAUDE.md — the agent’s brain. Sets identity (“you are a Dell PowerEdge hardware management expert”), safety rules (10 hard guardrails), and a routing table mapping operator needs to skills.
skills/ — directories of markdown files organized by operational phase: foundation, deploy, operations, diagnose, reference.
stack.yaml — machine-readable manifest with metadata.

When your AI agent enters the project directory, it reads .stacks/*/CLAUDE.md and gains the combined expertise of every pulled stack. The routing table tells it which skill to read for which task. The safety rules prevent it from doing damage.
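
To picture what that discovery step amounts to, here is a rough sketch in Python. It assumes each directory under `.stacks/` is one pulled stack, which the glob pattern above suggests but the tool may organize differently. The function names and the `## Stack:` separator are hypothetical, chosen only for this illustration. An agent does this by reading files, not by running this code.

```python
from pathlib import Path


def load_stack_instructions(project_dir: str) -> dict[str, str]:
    """Map each pulled stack's directory name to the text of its CLAUDE.md."""
    instructions = {}
    for claude_md in sorted(Path(project_dir).glob(".stacks/*/CLAUDE.md")):
        instructions[claude_md.parent.name] = claude_md.read_text(encoding="utf-8")
    return instructions


def missing_parts(stack_dir: str) -> list[str]:
    """List which of the three expected parts are absent from a stack."""
    root = Path(stack_dir)
    checks = {
        "CLAUDE.md": (root / "CLAUDE.md").is_file(),
        "skills/": (root / "skills").is_dir(),
        "stack.yaml": (root / "stack.yaml").is_file(),
    }
    return [name for name, present in checks.items() if not present]


def combined_context(project_dir: str) -> str:
    """Join every stack's instructions into one block of text.

    The "## Stack:" header is an assumption made for readability here.
    """
    sections = [
        f"## Stack: {name}\n\n{text.strip()}"
        for name, text in load_stack_instructions(project_dir).items()
    ]
    return "\n\n".join(sections)
```

Calling `combined_context(".")` from a project directory would return one string containing the instructions of every pulled stack, in alphabetical order of stack name.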

This works with any AI coding agent that reads markdown — Claude Code, Codex CLI, Gemini, Cursor, VS Code with Copilot. No model-specific magic. The knowledge is structured text.

The Feedback Loop

Stacks get smarter over time. When you hit a version-specific bug, a config gotcha, or a better procedure, ask your agent to document it. The fix goes into the stack’s known issues or skill content. The next person who pulls the stack gets the benefit of your experience.
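
As a rough illustration of what “formatted consistently” could mean, the sketch below appends a known-issue entry to a markdown file inside a stack. Everything specific here is an assumption: the `known-issues.md` file name, the entry fields, and the `record_known_issue` helper are hypothetical, and the format a real stack uses may differ.

```python
from datetime import date
from pathlib import Path


def record_known_issue(stack_dir, title, symptom, fix, versions, today=None):
    """Append one consistently formatted entry to a stack's known-issues file.

    The file name and field layout are hypothetical, not the project's format.
    """
    today = today or date.today()
    path = Path(stack_dir) / "known-issues.md"  # assumed file name
    path.parent.mkdir(parents=True, exist_ok=True)
    entry = (
        f"\n## {title}\n\n"
        f"- Recorded: {today.isoformat()}\n"
        f"- Affected versions: {versions}\n"
        f"- Symptom: {symptom}\n"
        f"- Fix: {fix}\n"
    )
    # Append so earlier entries are preserved and the git diff stays small.
    with path.open("a", encoding="utf-8") as handle:
        handle.write(entry)
    return path
```

Because every entry has the same fields in the same order, the resulting change is a small, reviewable diff that can be proposed upstream like any other edit to the repo.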

This is how operational knowledge should work — not locked in someone’s head or buried in a wiki, but versioned, composable, and available to every agent and every operator.

Available Now

15 stacks and growing, including OpenStack, Kubernetes (Talos), Ceph, Docker, Dell/HPE/Supermicro hardware, Ansible, Terraform, FRR routing, iPXE, Prometheus/Grafana, and Rails.

pipx install agentic-stacks

Browse stacks at agentic-stacks.com/stacks. Author your own with the authoring guide. Everything is MIT licensed.

GitHub: github.com/agentic-stacks