Introducing Daemons: Doing the Work That Agents Leave Behind
Agents create work. Daemons maintain it. Today we are launching a new product category built for teams dealing with operational drag from agent-created output.
Charlie V2 is a runtime for durable, multi-step coding work across GitHub, Linear, and Slack. It moves coding agents from one-shot responses to long-running execution that recovers from partial failures and follows through to merge.
Coding agents are moving from “chat that can write code” to systems that can execute real engineering work across tools, over time, with clear control boundaries. Charlie V2 is our response to that shift. It is not just a better agent; it is a runtime and infrastructure layer for durable, multi-step software execution. This post explains the capabilities now available in V2 and the technical foundation that makes those capabilities reliable enough to power AI daemons.
If you are already a Charlie user, you have been upgraded automatically to V2 and there is no action required on your part.
The market is shifting in visible stages: autocomplete, then conversational coding assistants, then agents that can take isolated actions like opening a PR or running a command. The next stage is longer-horizon execution: systems that can receive a request in one surface, coordinate work across multiple systems, recover from partial failures, and still preserve clear human control.
That shift changes what teams evaluate. Model quality still matters, but system quality now matters just as much: lifecycle management, observability, permission boundaries, and behavior that stays predictable under production constraints. “Can it generate code?” is now table stakes. “Can it execute safely in our workflow and leave an auditable trail?” is the harder and more useful question.
Charlie V2 is built for that phase. We think of it as a Coding Agent OS: a substrate where model intelligence is paired with orchestration, explicit capabilities, stateful runtime behavior, and integration-aware actions. The goal is to make coding agents operationally useful for teams, not just impressive in isolated demos.
Many products in this category are strong, but they optimize for different layers of the stack. Some focus on in-editor generation speed, some on chat-based troubleshooting, and some on narrow automation flows. Those approaches create value, but teams are often left stitching together the rest: cross-tool context, permission controls, durable execution, and handoff behavior.
The cleanest differentiation is this: many coding tools optimize for single interactions, while Charlie V2 is designed for durable, multi-surface execution.
Charlie is built to operate natively across GitHub, Linear, and Slack, carrying context between those surfaces in a structured way. Teams can use screenshots and images as first-class input, use web search for current technical context, and run browser-native Playwright flows in devboxes when UI behavior is part of the task.
Another practical difference is that collaboration semantics are part of the runtime contract. Charlie can preserve thread context, acknowledge requests where they were made, and return completion updates in the same discussion chain, which reduces coordination overhead for teams.
This is what the Coding Agent OS (CAOS) represents in practice: coding-agent behavior treated as a system design problem. It is about infrastructure and runtime, not just about the agents themselves.
At the core of Charlie V2 is a durable task/run model. Each request is represented as a task with explicit lifecycle states, lineage, and terminal outcomes. Root tasks can delegate child tasks to specialized workers, which enables decomposition without losing traceability and supports realities like mailbox follow-ups, retries, and timeout handling.
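One way to picture the task model described above is a small state machine with parent links. The following is a minimal sketch, not Charlie's actual implementation: the state names, the `Task` class, and the transition table are all hypothetical, chosen only to illustrate explicit lifecycle states, lineage, and terminal outcomes.

```python
import itertools
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class State(Enum):
    # Hypothetical lifecycle states; the real set is not given in this post.
    PENDING = "pending"
    RUNNING = "running"
    WAITING = "waiting"      # e.g. blocked on a mailbox follow-up
    SUCCEEDED = "succeeded"  # terminal
    FAILED = "failed"        # terminal
    TIMED_OUT = "timed_out"  # terminal


TERMINAL = {State.SUCCEEDED, State.FAILED, State.TIMED_OUT}

# Allowed transitions. Terminal states have no entry, so nothing leaves them.
ALLOWED = {
    State.PENDING: {State.RUNNING, State.TIMED_OUT},
    State.RUNNING: {State.WAITING, State.SUCCEEDED, State.FAILED, State.TIMED_OUT},
    State.WAITING: {State.RUNNING, State.TIMED_OUT},
}

_ids = itertools.count(1)


@dataclass
class Task:
    goal: str
    parent: Optional["Task"] = None
    state: State = State.PENDING
    id: int = field(default_factory=lambda: next(_ids))
    history: list = field(default_factory=list)

    def transition(self, new_state: State) -> None:
        """Move to new_state, rejecting anything outside the transition table."""
        if new_state not in ALLOWED.get(self.state, set()):
            raise ValueError(
                f"illegal transition {self.state.value} -> {new_state.value}"
            )
        self.history.append((self.state, new_state))
        self.state = new_state

    def delegate(self, goal: str) -> "Task":
        """Create a child task that records this task as its parent."""
        return Task(goal=goal, parent=self)

    def lineage(self) -> list:
        """Return task ids from the root down to this task."""
        chain, node = [], self
        while node is not None:
            chain.append(node.id)
            node = node.parent
        return list(reversed(chain))
```

In this sketch, a child created with `root.delegate("run CI")` can always be traced back to its root through `lineage()`, and a task that has reached a terminal state cannot be moved again.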
Execution is split across services that are built for stateful, long-running agent behavior. The agent executor runs a phase-based model with durable state so tool calls and asynchronous waits can be coordinated reliably. The task scheduler serves as the source of truth for task claims, transitions, mailbox operations, and timeout enforcement.
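To make the scheduler's role concrete, here is a minimal in-memory sketch of task claims with timeout enforcement. It assumes a lease-based design, which this post does not confirm; the `Scheduler` and `Claim` names, the lease length, and the method signatures are illustrative only.

```python
import time
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Claim:
    worker: str
    expires_at: float


class Scheduler:
    """Hypothetical single source of truth for which worker owns which task."""

    def __init__(self, lease_seconds: float = 30.0):
        self.lease_seconds = lease_seconds
        self.claims: Dict[str, Claim] = {}

    def claim(self, task_id: str, worker: str, now: Optional[float] = None) -> bool:
        """Grant the task to worker unless another worker holds a live lease."""
        now = time.monotonic() if now is None else now
        current = self.claims.get(task_id)
        if current is not None and current.expires_at > now and current.worker != worker:
            return False
        # A new claim, a renewal by the same worker, or a takeover of an expired lease.
        self.claims[task_id] = Claim(worker, now + self.lease_seconds)
        return True

    def expire(self, now: Optional[float] = None) -> List[str]:
        """Drop claims whose lease has run out and return their task ids."""
        now = time.monotonic() if now is None else now
        expired = [tid for tid, c in self.claims.items() if c.expires_at <= now]
        for tid in expired:
            del self.claims[tid]
        return expired
```

Keeping claims and expiry in one place means a stalled worker cannot hold a task indefinitely: once its lease lapses, the task becomes claimable again. A production scheduler would persist this state rather than hold it in memory.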
V2 also formalizes multi-agent composition. Entry agents own end-to-end outcomes and delegate focused work to worker agents through a strict task contract. That allows specialized workers (code editing, CI checks, platform actions, and more) to run with clear boundaries and return structured handoffs.
That contract also keeps delegation predictable at scale: child tasks receive scoped context, return structured outputs, and report durable identifiers for any state changes. This lowers re-discovery costs and makes retries and follow-on actions operationally safer.
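The contract described above can be sketched as a pair of typed messages: a request carrying only scoped context, and a handoff carrying structured output plus durable identifiers. This is an illustration under stated assumptions, not Charlie's real schema; `TaskRequest`, `Handoff`, the field names, and the example identifier are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass(frozen=True)
class TaskRequest:
    goal: str
    context: Dict[str, str]  # only the keys this worker is allowed to see


@dataclass(frozen=True)
class Handoff:
    status: str    # "ok" or "failed" in this sketch
    summary: str
    changed: List[str] = field(default_factory=list)  # durable ids of state changes


def scope_context(full: Dict[str, str], allowed_keys: List[str]) -> Dict[str, str]:
    """Copy only the allowed keys so the child never sees the full parent context."""
    return {k: full[k] for k in allowed_keys if k in full}


def delegate(
    full_context: Dict[str, str],
    allowed_keys: List[str],
    goal: str,
    worker: Callable[[TaskRequest], Handoff],
) -> Handoff:
    """Run a worker under the contract and insist on a structured result."""
    request = TaskRequest(goal, scope_context(full_context, allowed_keys))
    result = worker(request)
    if not isinstance(result, Handoff):
        raise TypeError("worker must return a Handoff")
    return result


def ci_worker(request: TaskRequest) -> Handoff:
    # Hypothetical worker; the identifier format is made up for illustration.
    repo = request.context["repo"]
    return Handoff("ok", f"ran checks for {repo}", ["check-run:example-1"])
```

Because the handoff lists identifiers for anything the worker changed, the parent task in this sketch can retry or follow up without re-discovering what happened.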
Finally, V2 is instrumented as an operational data system, not just an app workflow. Events flow through a pipeline that supports layered analytics over requests, tasks, and transcripts, enabling quality analysis and continuous iteration based on real execution behavior.
As agents become more autonomous, trust has to be enforced by architecture. Charlie V2 applies least-privilege capability gating at runtime across core resources and integrations. Each capability domain is assigned an explicit access level (none, read, or write) before execution begins. Most importantly, V2 preserves human control where judgment matters. The system proceeds autonomously on reasonable assumptions but surfaces high-impact ambiguities before irreversible actions. That autonomy-with-control balance is what makes coding agents practical for production teams.
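As a rough illustration of capability gating, the sketch below checks every access a plan needs against explicit grants before any step runs. The three levels come from this post; everything else is an assumption, including the domain names, the `check_plan` helper, and the choice to treat write as implying read.

```python
from enum import IntEnum
from typing import Dict, List, Tuple


class Access(IntEnum):
    # Ordered so that a higher level includes the lower ones (an assumption).
    NONE = 0
    READ = 1
    WRITE = 2


class CapabilityError(PermissionError):
    """Raised when a plan needs more access than was granted."""


def check_plan(grants: Dict[str, Access], plan: List[Tuple[str, Access]]) -> None:
    """Validate the whole plan up front; raise before anything executes."""
    for domain, needed in plan:
        # Least privilege: a domain with no explicit grant defaults to NONE.
        granted = grants.get(domain, Access.NONE)
        if granted < needed:
            raise CapabilityError(
                f"{domain}: need {needed.name.lower()}, have {granted.name.lower()}"
            )


# Hypothetical grants for one run.
grants = {"github": Access.WRITE, "linear": Access.READ}
check_plan(grants, [("github", Access.WRITE), ("linear", Access.READ)])  # passes
```

Checking the full plan before the first step means a run either starts with everything it needs or does not start at all, which avoids half-finished work caused by a permission failure midway.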
To summarize, the work we’ve put into Charlie V2 lets us pack many more powerful features into the package you already know and love.
To put it simply, Charlie is now more dynamic, smarter, more reliable, and safer. We can’t wait to see what you build with it.
At ai-daemons.com, we are publishing the Daemons spec and building production patterns for durable, interoperable coding daemons. Charlie V2 is the infrastructure layer behind that work: long-running execution, explicit capabilities, and reliable cross-system action semantics. This post is intended to be the technical reference for why those daemon patterns are now practical.