From Chatbots to Digital Workers
Building Autonomous Infrastructure with Claude Managed Agents
From chat to work
The shift is small in words, large in consequence:
yesterday → "answer my question"
today → "finish this job"
A chatbot returns text. A digital worker returns a completed artifact — a draft, a ticket, a report, a closed loop.
A paradigm shift
| Messages API | Managed Agents | |
|---|---|---|
| Infrastructure | You build the loop, manage the sandbox, handle tools by hand. | Pre-built agent harness running in managed cloud infrastructure. |
| State & Memory | Stateless. You resend the whole story every time. | Stateful sessions. Filesystem and history survive sleep. |
| Capability | Answers and fine-grained control. | Long-running async work with built-in tools (Bash, files, web). |
From hand-rolled loops over stateless prompts to managed, stateful agents that finish.
Eight tools on the table
Agent
The job description. Who they are, what they're allowed to touch.
Environment
The private office. Clean desk, locked doors, pre-installed software.
Session
The workday. Starts, takes breaks, comes back with the papers still on the desk.
Skills
A table of contents — not a textbook. Read only the chapters you need.
Vaults
A safe deposit box. Agent knows the lock; the session brings the key.
Outcomes
The grader. Checks the work against the rubric until it's right.
Sandboxes
Your kitchen, their chef. Tool calls run on infrastructure you control.
MCP Connector
The universal adapter. One agent, the entire MCP ecosystem.
The next eight slides go one tool at a time. Analogy first. Code second.
The Agent — a job description
Think: hiring paperwork
An Agent is who the worker is and what tools the role is allowed to touch. Same Agent can be hired into many jobs — the description doesn't change between shifts.
Label: create an agent
curl -X POST https://api.anthropic.com/v1/agents \
-H "x-api-key: $ANTHROPIC_API_KEY" \
-H "anthropic-version: 2026-04-01" \
-H "content-type: application/json" \
-d '{
"name": "social-asset-generator",
"model": {"id": "claude-opus-4-7"},
"system": "You draft social posts...",
"tools": [{"type": "agent_toolset_20260401"}]
}'
One Agent definition, versioned and reused. The brain, separated from any single task.
The Environment — a private office
Think: a clean desk in a locked room
A pre-built workspace with the right software already installed and locked doors to systems the worker shouldn't touch. Same room shape, fresh for every workday.
Label: create an environment
curl -X POST https://api.anthropic.com/v1/environments \
-H "x-api-key: $ANTHROPIC_API_KEY" \
-H "anthropic-version: 2026-04-01" \
-H "content-type: application/json" \
-d '{
"os": "ubuntu-22.04",
"packages": ["python3.12", "pandas2.2.0"],
"networking": "limited",
"allowed_hosts": ["api.internal-data.com"]
}'
A reproducible container — secure, isolated, predictable. Your core systems stay untouched.
The Session — a workday
Think: a desk that remembers
A worker clocks in, does the job, takes a break — and when they return, the papers are still on the desk. Sessions checkpoint when idle and resume exactly where they left off.
Label: start a session
curl -X POST https://api.anthropic.com/v1/sessions \
-H "x-api-key: $ANTHROPIC_API_KEY" \
-H "anthropic-version: 2026-04-01" \
-d '{"agent_id": "agt_...", "environment_id": "env_..."}'
# send a message
curl -X POST https://api.anthropic.com/v1/sessions/$ID/messages \
-H "x-api-key: $ANTHROPIC_API_KEY" \
-d '{"content": "Draft next weeks campaign."}'
# container checkpoints on idle. resume tomorrow.
Long jobs don't need to fit in one conversation. State survives sleep.
Skills — a table of contents, not a textbook
Think: scanning the index
Skills are folders of expertise. The agent scans the titles, opens only the chapters it needs, ignores the rest. The whole library is available; the context window stays light.
Label: attach a skill
curl -X PATCH https://api.anthropic.com/v1/agents/$AGENT_ID \
-H "x-api-key: $ANTHROPIC_API_KEY" \
-H "anthropic-version: 2026-04-01" \
-d '{
"version": 3,
"skills": [
{"type": "anthropic", "skill": "docx"},
{"type": "custom", "skill_id": "skl_brand_voice"}
]
}'
Progressive disclosure: load on demand, not all at once. Deep expertise without token bloat.
Vaults — a safe deposit box
Think: the lock vs the key
The Agent knows the shape of the lock — it knows it needs Slack. The Session brings the user's actual key. Build the product once; serve thousands of users without ever co-mingling their credentials.
Label: store a credential, then use it
# store the user's credential in a vault
curl -X POST https://api.anthropic.com/v1/vaults/$VAULT_ID/credentials \
-H "x-api-key: $ANTHROPIC_API_KEY" \
-d '{"name": "slack_oauth", "value": "xoxb-..."}'
# attach it at session creation — agent never sees the secret
curl -X POST https://api.anthropic.com/v1/sessions \
-H "x-api-key: $ANTHROPIC_API_KEY" \
-d '{"agent_id": "agt_...", "vault_ids": ["vlt_steve_slack"]}'
Manage your product at the agent level. Manage your users at the session level.
Outcomes — replace the back-and-forth
Direct prompt
You ask. It answers. You read it, decide if it's right, and re-ask until it is.
You are the grader. You can't go to dinner.
Outcome
You state the rubric once. An independent grader checks each draft and sends it back until it passes.
You read the final draft only.
Label: define an outcome
curl -X POST https://api.anthropic.com/v1/sessions/$ID/outcomes \
-H "x-api-key: $ANTHROPIC_API_KEY" \
-H "anthropic-version: 2026-04-01" \
-d '{
"rubric": "10 LinkedIn posts. Each under 280 chars. Each ends with a question.",
"max_iterations": 5
}'
# returns: status = satisfied | needs_revision | max_iterations_reached
Conversation becomes work the moment you can name "done."
Sandboxes — your kitchen, their chef
Think: bring-your-own-kitchen
Anthropic is the head chef sending tickets to a queue. The line cook works in your kitchen — your knives, your pantry, your health inspector. Tool calls, file writes, and network egress stay on infrastructure you control.
Label: declare a self-hosted environment, run the worker
# 1. create the environment
curl -X POST https://api.anthropic.com/v1/environments \
-H "x-api-key: $ANTHROPIC_API_KEY" \
-H "anthropic-beta: managed-agents-2026-04-01" \
-d '{"name":"self-hosted","config":{"type":"self_hosted"}}'
# 2. start a worker on your box (scoped env key, not the org key)
export ANTHROPIC_ENVIRONMENT_KEY="sk-ant-oat01-..."
ant beta:worker poll --workdir /workspace
# 3. point a session at it
curl -X POST https://api.anthropic.com/v1/sessions \
-d '{"agent":"agt_...","environment_id":"env_..."}'
Your agent. Your perimeter. Our brain. Fits HIPAA, SOC2, on-prem residency, intranet APIs. Cookbooks ship for Cloudflare, Daytona, Docker, Modal, Vercel.
MCP Connector — the universal adapter
Think: socket vs power cord
The Agent declares which sockets exist — GitHub, Linear, Slack, Notion, your private server. The Session brings the power cord — a vault-held OAuth token. One agent definition, the entire MCP ecosystem.
Label: declare an MCP server on the agent, attach creds at the session
# the agent knows the shape; no secrets here
curl -X POST https://api.anthropic.com/v1/agents \
-H "x-api-key: $ANTHROPIC_API_KEY" \
-d '{
"name":"GitHub Assistant",
"model":"claude-opus-4-7",
"mcp_servers":[
{"type":"url","name":"github",
"url":"https://api.githubcopilot.com/mcp/"}
],
"tools":[
{"type":"agent_toolset_20260401"},
{"type":"mcp_toolset","mcp_server_name":"github"}
]
}'
# session brings the user's token via a vault
curl -X POST https://api.anthropic.com/v1/sessions \
-d '{"agent":"agt_...","vault_ids":["vlt_user_gh"]}'
Your laptop already speaks MCP. Ship the same servers to production — swap the runtime, keep the tools.
Webhooks — call me when it's done
Think: a tap on the shoulder
You don't sit and wait. You hand off the job, go to dinner, and the agent calls you back when the artifact is ready. Hours of work happen in the background.
Label: register a webhook + receive it
# tell the platform where to call you
curl -X POST https://api.anthropic.com/v1/webhooks \
-H "x-api-key: $ANTHROPIC_API_KEY" \
-d '{"url": "https://yourapp.com/hook", "events": ["session.outcome.satisfied"]}'
# later, you receive:
# POST https://yourapp.com/hook
# { "id": "evt_...", "type": "session.outcome.satisfied", "session_id": "sess_..." }
# fetch the artifact with a GET on receipt.
Long jobs no longer block humans. Work that finishes itself.
Permissions — how much rope?
- Always ask — human approves every action — training wheels
- Ask once — approve at the start of a session, then run free
- Always allow — read-only or well-tested tasks — full autonomy
Label: set a permission policy
curl -X PATCH https://api.anthropic.com/v1/sessions/$ID/tools/slack \
-H "x-api-key: $ANTHROPIC_API_KEY" \
-d '{"permission_policy": "always_ask"}'
Trust isn't binary. Turn it tool-by-tool, agent-by-agent, as confidence grows.
A crew, not a soloist
Manager
└── Coordinator Agent
├── Drafter Agent # writes the post copy
└── Reviewer Agent # checks tone + brand
Coordinator
The manager. Splits the work, hands out tasks, gathers results.
Specialists
Each agent has its own job description, tools, and rubric.
Shared substrate
Same office, same files. Agents work in parallel without stepping on each other.
Parallelization and specialization. More hands, sharper work.
Steve ships a week of campaigns over the weekend
The human story
- Friday 5:30pm — Steve kicks off the agent: "Draft next week's launch posts for LinkedIn, X, and Instagram."
- 5:35pm — He closes his laptop and goes home for the weekend.
- Monday 8:00am — His phone buzzes. The agent is done.
- 8:10am — He reviews 15 posts and 5 images. Approves them. Coffee.
Under the hood
POST /sessionswith the social-asset-generator agent- Session checkpoints when Steve disconnects
- Grader returns
satisfiedon iteration 3 - Webhook fires; Steve gets the artifact link
The agent worked most of the weekend. Steve worked ten minutes.
What just changed
Work that finishes itself. Permissions you control. Memory that survives the meeting.
Ten patterns you can ship Monday
Each is a pre-built starting point — a job description, a toolbelt, a system prompt — ready to clone and customize.
- Research — Blank Agent · Deep Researcher · Structured Extractor
- Marketing & Ops — Social Asset Generator · Sprint Retro · Field Monitor
- Customer — Support Agent · Support-to-Eng · Contract Tracker · Data Analyst
Three groups. Steal what fits.
Turn information into answers
Blank Agent
The core toolset, nothing more. A foundation to build any custom agent from scratch.
no MCP
Deep Researcher
Breaks a question into sub-questions, hunts authoritative sources, synthesizes with citations.
no MCP
Structured Extractor
Messy text in, typed JSON out. Validated against your schema.
no MCP
Best when the input is text or web data and the output is structured truth.
Ship the recurring work
Social Asset Generator
Drafts posts across platforms, generates images, schedules the week.
Figma · Buffer · Slack
Sprint Retro Facilitator
Pulls a closed sprint, synthesizes themes, writes the retro doc before the meeting.
Linear · Slack
Field Monitor
Scans blogs on a topic, writes a weekly "what changed" brief.
Notion
Best when work spans multiple tools and happens on a cadence.
Closer to the customer
Support Agent
Answers from docs and the knowledge base. Escalates when it's stuck.
Notion · Slack
Support-to-Eng Escalator
Reads an Intercom thread, reproduces the bug, files a Jira ticket with repro steps.
Intercom · Atlassian · Slack
Contract Tracker
Extracts clauses, sets deadline reminders, tracks obligations in Asana.
Box · Asana
Data Analyst
Loads, explores, visualizes. Answers ad-hoc questions from datasets.
Amplitude
Best when there's a human on the other end waiting for an answer.
Social Asset Generator — the full template
Label: social-asset-generator.yaml
name: Social asset generator
model: claude-sonnet-4-6
system: |
You draft a week of social posts
across LinkedIn, X, and Instagram
with images and schedules them.
1. Read the brand brief
2. Draft posts per platform tone
3. Generate images in Figma
4. Schedule via Buffer
5. Notify the team in Slack
mcp_servers:
- figma
- buffer
- slack
tools:
- agent_toolset_20260401
Why this template
claude-sonnet-4-6 — fast and cost-effective. This work is volume, not depth.
Three MCP servers — the toolbelt is the whole point. Each one is a tab a marketer would otherwise switch between.
Numbered system prompt — five clear steps. The agent has a playbook, not a vibe.
Clone it. Swap Buffer for your scheduler. Ship Monday.
You came for chatbots.
You're leaving with a workforce.
/claude-api — the docs write the code
Think: built-in foreman
A bundled Skill in Claude Code that activates on any Anthropic SDK work — and explicitly on Managed Agents. It knows the beta header, the agent → session pattern, vaults, sandboxes, MCP wiring, and seven language SDKs.
Label: scaffold a production agent from your terminal
# in Claude Code:
/claude-api managed-agents-onboard
# interactive walkthrough:
# → picks your language (TS, Python, Go, Ruby, Java, PHP, cURL)
# → emits Agent create + Session create boilerplate
# → wires mcp_servers, vaults, environments, tools
# → sets the right anthropic-beta header
# bonus
/claude-api migrate ./src to claude-opus-4-7
The agent that built your codebase knows how to deploy itself into it. From prompt to
ant beta:worker pollin one session.
BRAID
Create and Observe Claude Managed Agents
Clone Monday.
What BRAID is
A Claude Code skill that turns the eight tools we just walked through into slash-commands and folder templates.
You describe a flow in a sentence. BRAID writes the agent YAML, wires the MCP servers, provisions the vault, and hands you a run command.
The six tools, by another name
- Agent — the job description
- Environment — the private office
- Session — the workday that survives sleep
- Skills — table of contents, not textbook
- Vaults — agent knows the lock; session brings the key
- Outcomes — the grader
⚠️ Experimental. Research project. APIs and flows will change. Not for production.
Six commands
Inside Claude Code, BRAID exposes one slash-command surface for the whole lifecycle — scaffold, run, inspect, consolidate, tear down.
# describe what you want in a sentence
/braid create "3-shot fundraiser site for a dog rescue"
# or pick a template directly
/braid create --template fundraiser-video-site --name my-flow
# provision env, vault, stores, agents
/braid setup my-flow
# start (or resume) a streaming run
/braid run my-flow "brief goes here"
# inspect / pick / kill in-flight sessions
/braid sessions my-flow --pick
# consolidate past runs into a memory store
/braid dream my-flow --sessions 10
One command per verb. No glue code.
Templates, examples, and host-side hooks
Start from a working flow
flows/_templates/ — fill-in-the-blank starters: fundraiser-video-site, static-site-builder-deployer, image-series-with-memory, url-to-ad-set, viral-video-ad.
flows/_examples/ — runnable reference flows: fundraiser (Fal MCP + Vercel), snapshots (reflection-into-memory), gauntlet-ads (Playwright screenshots), hiking-boots, fraction-blocks.
Credentialed work stays on the host
run:
post_session_hook:
command: bun run ".../post-hooks/vercel-deploy.ts"
env_passthrough: [VERCEL_TOKEN]
timeout_ms: 300000
The agent drafts and renders inside the sandbox. The post-session hook runs on the host with a strictly scoped token after the session ends. VERCEL_TOKEN never enters the agent's context.
Vaults for the agent. Hooks for the deploy. The right credential in the right room.