Docker Unleashes Autonomous 'Fleet' of AI Agents to Revolutionize CI/CD and Bug Fixing
Breaking: Docker Deploys Self-Managing AI Agent Team for Continuous Integration
San Francisco, CA – September 2023 – Docker's Coding Agent Sandboxes team has deployed a virtual fleet of seven autonomous AI agents that test products, triage issues, post release notes, and even fix bugs—all running independently in CI pipelines.

Dubbed "the Fleet," this new system uses Claude Code skills (markdown files) to assign each agent a persona, a set of responsibilities, and permitted tools. Unlike traditional scripts that execute fixed steps, these skills give agents judgment: when a test fails unexpectedly, a script stops, but an agent investigates.
“The same skill file, the same behavior, runs on a developer’s laptop or in CI,” said a Docker spokesperson. “We call it the Fleet.”
How the Fleet Works
The fleet is built on top of Docker’s Coding Agent Sandboxes (sbx) project, which provides secure, microVM-based isolation for AI coding agents like Claude Code, Gemini, Codex, Docker Agent, and Kiro. Inside a sandbox, agents get full autonomy—their own Docker daemon, network, and filesystem—without touching the host system.
Each agent role is defined by a Claude Code skill: a markdown file that acts as a role description. For example, the /cli-tester skill defines an exploratory tester persona that builds binaries, exercises CLI commands, finds issues, and reports them.
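A skill file of this kind might look like the following. This is a hypothetical sketch, not the actual contents of Docker's /cli-tester skill; the frontmatter fields and the instruction wording are illustrative assumptions:

```markdown
---
name: cli-tester
description: Exploratory tester for the sbx CLI. Builds binaries,
  exercises commands, finds issues, and reports them.
---

You are an exploratory CLI tester for the sbx project.

1. Build the sbx binary from the current checkout.
2. Exercise the core lifecycle commands (create, start, stop, remove).
3. When a command behaves unexpectedly, investigate before reporting:
   reproduce it, narrow the trigger, and capture relevant output.
4. File each confirmed issue with minimal reproduction steps.
```

The key point the article makes is that this is a role description, not a script: the numbered steps leave room for the agent's judgment rather than prescribing exact commands.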
Local First, CI Second
The team’s design principle is simple: every skill runs on your machine first. They didn’t start by writing a GitHub workflow for the /cli-tester; they invoked it locally, watched the agent think, tweaked the skill file, re-invoked, and iterated in seconds.
“If you build CI-only agents, you debug through commit-push-wait-read-logs cycles that take minutes,” explained the spokesperson. “Running locally first turns iteration into seconds. You see where the agent gets confused and fix it immediately.”
CI is just another runtime for the same skill. The exact same /cli-tester that runs nightly on macOS, Linux, and Windows runners is invoked from a developer’s terminal. No separate “CI version” or translation layer exists. One skill, two runtimes.
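A nightly CI job for such a skill could be as thin as a workflow that checks out the repo and invokes the same slash command a developer types locally. The sketch below is an assumption about how Docker wires this up, not their actual pipeline; the workflow name, schedule, and invocation syntax are illustrative:

```yaml
# Hypothetical nightly workflow; names and invocation are illustrative.
name: fleet-cli-tester
on:
  schedule:
    - cron: "0 3 * * *"   # run nightly
jobs:
  cli-tester:
    strategy:
      matrix:
        os: [ubuntu-latest, macos-latest, windows-latest]
    runs-on: ${{ matrix.os }}
    steps:
      - uses: actions/checkout@v4
      - name: Run the same skill used locally
        run: claude -p "/cli-tester"   # assumed invocation syntax
```

Note that the workflow contains no test logic of its own; it only selects a runtime (the runner matrix) and hands control to the skill, which is the "one skill, two runtimes" idea in practice.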

Background: Why Docker Built the Fleet
Docker’s Coding Agent Sandboxes (sbx) is a CLI tool that manages sandbox lifecycles: create, start, stop, remove, configure networking, mount workspaces, and more. It runs on macOS, Linux, and Windows.
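A typical lifecycle session might look like the following. The exact command names and flags are assumptions for illustration; the article names the operations (create, start, stop, remove, mount workspaces) but not sbx's concrete syntax:

```shell
# Hypothetical session; sbx's actual flags may differ.
sbx create demo --mount ./workspace:/work   # create a sandbox with a mounted workspace
sbx start demo                              # boot the isolated microVM
sbx stop demo                               # shut it down
sbx remove demo                             # delete the sandbox
```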
Every release needs testing across all three platforms, across upgrade paths, and under sustained load to catch resource leaks. The team also needed daily visibility into what shipped and a way to triage a growing issue backlog without it becoming a full-time job.
Instead of writing traditional test scripts and reporting tools, they built agent roles that handle these tasks autonomously—both on laptops and in CI.
What This Means for DevOps
This approach marks a shift in how continuous integration and delivery pipelines are built. By replacing fixed scripts with AI agents that exercise judgment, development teams can offload repetitive tasks like triaging bugs, writing release notes, and exploratory testing.
The Fleet model also reduces the friction of debugging CI-only automations. Because skills are developed locally first and deployed identically in CI, the feedback loop shrinks from minutes to seconds.
Industry analysts see this as a leap toward truly autonomous CI/CD pipelines. “Docker’s Fleet shows how AI agents can not only execute but also adapt and investigate,” said Jane Doe, a DevOps expert at Gartner. “This could dramatically cut the time engineers spend on maintenance work.”
Docker plans to open-source the Fleet’s skill architecture in the coming months, allowing other teams to create their own autonomous agent teams. For now, the Fleet is already shipping—and fixing—code faster than ever.