From Chatbots to Digital Workers

Building Autonomous Infrastructure with Claude Managed Agents

The call

From chat to work

The shift is small in words, large in consequence:

yesterday   →   "answer my question"
today       →   "finish this job"

A chatbot returns text. A digital worker returns a completed artifact — a draft, a ticket, a report, a closed loop.

The core abstraction

A paradigm shift

	Messages API	Managed Agents
Infrastructure	You build the loop, manage the sandbox, handle tools by hand.	Pre-built agent harness running in managed cloud infrastructure.
State & Memory	Stateless. You resend the whole story every time.	Stateful sessions. Filesystem and history survive sleep.
Capability	Answers and fine-grained control.	Long-running async work with built-in tools (Bash, files, web).

From hand-rolled loops over stateless prompts to managed, stateful agents that finish.

The mentor's gift

Eight tools on the table

Agent

The job description. Who they are, what they're allowed to touch.

Environment

The private office. Clean desk, locked doors, pre-installed software.

Session

The workday. Starts, takes breaks, comes back with the papers still on the desk.

Skills

A table of contents — not a textbook. Read only the chapters you need.

Vaults

A safe deposit box. Agent knows the lock; the session brings the key.

Outcomes

The grader. Checks the work against the rubric until it's right.

Sandboxes

Your kitchen, their chef. Tool calls run on infrastructure you control.

MCP Connector

The universal adapter. One agent, the entire MCP ecosystem.

The next eight slides go one tool at a time. Analogy first. Code second.

Layer 1

The Agent — a job description

Think: hiring paperwork

An Agent is who the worker is and what tools the role is allowed to touch. Same Agent can be hired into many jobs — the description doesn't change between shifts.

Label: create an agent

curl -X POST https://api.anthropic.com/v1/agents \
  -H "x-api-key: $ANTHROPIC_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "anthropic-beta: managed-agents-2026-04-01" \
  -H "content-type: application/json" \
  -d '{
    "name": "social-asset-generator",
    "model": {"id": "claude-opus-4-7"},
    "system": "You draft social posts...",
    "tools": [{"type": "agent_toolset_20260401"}]
  }'

One Agent definition, versioned and reused. The brain, separated from any single task.

Layer 2

The Environment — a private office

Think: a clean desk in a locked room

A pre-built workspace with the right software already installed and locked doors to systems the worker shouldn't touch. Same room shape, fresh for every workday.

Label: create an environment

curl -X POST https://api.anthropic.com/v1/environments \
  -H "x-api-key: $ANTHROPIC_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "anthropic-beta: managed-agents-2026-04-01" \
  -H "content-type: application/json" \
  -d '{
    "os": "ubuntu-22.04",
    "packages": ["python3.12", "pandas==2.2.0"],
    "networking": "limited",
    "allowed_hosts": ["api.internal-data.com"]
  }'

A reproducible container — secure, isolated, predictable. Your core systems stay untouched.

Layer 3

The Session — a workday

Think: a desk that remembers

A worker clocks in, does the job, takes a break — and when they return, the papers are still on the desk. Sessions checkpoint when idle and resume exactly where they left off.

Label: start a session

curl -X POST https://api.anthropic.com/v1/sessions \
  -H "x-api-key: $ANTHROPIC_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "anthropic-beta: managed-agents-2026-04-01" \
  -H "content-type: application/json" \
  -d '{"agent": "agt_...", "environment_id": "env_..."}'

# send a message
curl -X POST https://api.anthropic.com/v1/sessions/$ID/messages \
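  -H "x-api-key: $ANTHROPIC_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "anthropic-beta: managed-agents-2026-04-01" \
  -H "content-type: application/json" \
  -d '{"content": "Draft the campaign for next week."}'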

# container checkpoints on idle. resume tomorrow.

Long jobs don't need to fit in one conversation. State survives sleep.

Layer 4

Skills — a table of contents, not a textbook

Think: scanning the index

Skills are folders of expertise. The agent scans the titles, opens only the chapters it needs, ignores the rest. The whole library is available; the context window stays light.

Label: attach a skill

curl -X PATCH https://api.anthropic.com/v1/agents/$AGENT_ID \
  -H "x-api-key: $ANTHROPIC_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "anthropic-beta: managed-agents-2026-04-01" \
  -H "content-type: application/json" \
  -d '{
    "version": 3,
    "skills": [
      {"type": "anthropic", "skill": "docx"},
      {"type": "custom", "skill_id": "skl_brand_voice"}
    ]
  }'

Progressive disclosure: load on demand, not all at once. Deep expertise without token bloat.
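
A minimal local sketch of the index-scan idea, not the platform's loader. It assumes each skill is a folder holding a SKILL.md file whose first line is a title; the function names list_titles and load_skill are hypothetical.

```shell
#!/bin/sh
# list_titles DIR: print "<folder>: <first line of SKILL.md>" for every skill.
# Only one line per skill is read, so the index stays small.
list_titles() {
  for f in "$1"/*/SKILL.md; do
    [ -f "$f" ] || continue
    printf '%s: %s\n' "$(basename "$(dirname "$f")")" "$(head -n 1 "$f")"
  done
}

# load_skill DIR NAME: print the full body of one skill, on demand.
load_skill() {
  cat "$1/$2/SKILL.md"
}
```

The index costs one line per skill; the full text is paid for only when a skill is opened.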

Layer 5

Vaults — a safe deposit box

Think: the lock vs the key

The Agent knows the shape of the lock: it knows it needs Slack. The Session brings the user's actual key. Build the product once; serve thousands of users without ever commingling their credentials.

Label: store a credential, then use it

# store the user's credential in a vault
curl -X POST https://api.anthropic.com/v1/vaults/$VAULT_ID/credentials \
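  -H "x-api-key: $ANTHROPIC_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "anthropic-beta: managed-agents-2026-04-01" \
  -H "content-type: application/json" \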
  -d '{"name": "slack_oauth", "value": "xoxb-..."}'

# attach it at session creation — agent never sees the secret
curl -X POST https://api.anthropic.com/v1/sessions \
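  -H "x-api-key: $ANTHROPIC_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "anthropic-beta: managed-agents-2026-04-01" \
  -H "content-type: application/json" \
  -d '{"agent": "agt_...", "vault_ids": ["vlt_steve_slack"]}'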

Manage your product at the agent level. Manage your users at the session level.

Layer 6

Outcomes — replace the back-and-forth

Direct prompt

You ask. It answers. You read it, decide if it's right, and re-ask until it is.

You are the grader. You can't go to dinner.

Outcome

You state the rubric once. An independent grader checks each draft and sends it back until it passes.

You read the final draft only.

Label: define an outcome

curl -X POST https://api.anthropic.com/v1/sessions/$ID/outcomes \
  -H "x-api-key: $ANTHROPIC_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "anthropic-beta: managed-agents-2026-04-01" \
  -H "content-type: application/json" \
  -d '{
    "rubric": "10 LinkedIn posts. Each under 280 chars. Each ends with a question.",
    "max_iterations": 5
  }'
# returns: status = satisfied | needs_revision | max_iterations_reached

Conversation becomes work the moment you can name "done."
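
To make "name done" concrete, here is a local sketch of the rubric above as a shell check. It is not the platform's grader. It assumes one post per line and drafts saved as draft_1.txt, draft_2.txt, and so on; check_rubric and run_outcome are hypothetical names.

```shell
#!/bin/sh
# check_rubric FILE: 10 posts, each under 280 chars, each ending with "?".
# Note: ${#post} counts bytes rather than characters in some shells.
check_rubric() {
  file=$1
  count=0
  while IFS= read -r post || [ -n "$post" ]; do
    count=$((count + 1))
    if [ "${#post}" -ge 280 ]; then
      echo "needs_revision: post $count has ${#post} chars"
      return 1
    fi
    case $post in
      *\?) ;;
      *) echo "needs_revision: post $count does not end with a question"
         return 1 ;;
    esac
  done < "$file"
  if [ "$count" -ne 10 ]; then
    echo "needs_revision: expected 10 posts, found $count"
    return 1
  fi
  echo "satisfied"
}

# run_outcome DIR MAX: grade drafts in order, stop at the first pass or after MAX.
run_outcome() {
  dir=$1 max=$2 i=1
  while [ "$i" -le "$max" ]; do
    if [ -f "$dir/draft_$i.txt" ] && check_rubric "$dir/draft_$i.txt" >/dev/null; then
      echo "satisfied on iteration $i"
      return 0
    fi
    i=$((i + 1))
  done
  echo "max_iterations_reached"
  return 1
}
```

A rubric this mechanical can be checked by a script; fuzzier criteria such as tone are where a model-based grader earns its keep.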

Layer 7

Sandboxes — your kitchen, their chef

Think: bring-your-own-kitchen

Anthropic is the head chef sending tickets to a queue. The line cook works in your kitchen — your knives, your pantry, your health inspector. Tool calls, file writes, and network egress stay on infrastructure you control.

Label: declare a self-hosted environment, run the worker

# 1. create the environment
curl -X POST https://api.anthropic.com/v1/environments \
  -H "x-api-key: $ANTHROPIC_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "anthropic-beta: managed-agents-2026-04-01" \
  -H "content-type: application/json" \
  -d '{"name":"self-hosted","config":{"type":"self_hosted"}}'

# 2. start a worker on your box (scoped env key, not the org key)
export ANTHROPIC_ENVIRONMENT_KEY="sk-ant-oat01-..."
ant beta:worker poll --workdir /workspace

# 3. point a session at it
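curl -X POST https://api.anthropic.com/v1/sessions \
  -H "x-api-key: $ANTHROPIC_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "anthropic-beta: managed-agents-2026-04-01" \
  -H "content-type: application/json" \
  -d '{"agent":"agt_...","environment_id":"env_..."}'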

Your agent. Your perimeter. Our brain. Suited to HIPAA- and SOC 2-scoped workloads, on-prem data residency, and intranet-only APIs. Cookbooks ship for Cloudflare, Daytona, Docker, Modal, and Vercel.

Layer 8

MCP Connector — the universal adapter

Think: socket vs power cord

The Agent declares which sockets exist — GitHub, Linear, Slack, Notion, your private server. The Session brings the power cord — a vault-held OAuth token. One agent definition, the entire MCP ecosystem.

Label: declare an MCP server on the agent, attach creds at the session

# the agent knows the shape; no secrets here
curl -X POST https://api.anthropic.com/v1/agents \
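  -H "x-api-key: $ANTHROPIC_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "anthropic-beta: managed-agents-2026-04-01" \
  -H "content-type: application/json" \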
  -d '{
    "name":"GitHub Assistant",
    "model":"claude-opus-4-7",
    "mcp_servers":[
      {"type":"url","name":"github",
       "url":"https://api.githubcopilot.com/mcp/"}
    ],
    "tools":[
      {"type":"agent_toolset_20260401"},
      {"type":"mcp_toolset","mcp_server_name":"github"}
    ]
  }'

# session brings the user's token via a vault
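curl -X POST https://api.anthropic.com/v1/sessions \
  -H "x-api-key: $ANTHROPIC_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "anthropic-beta: managed-agents-2026-04-01" \
  -H "content-type: application/json" \
  -d '{"agent":"agt_...","vault_ids":["vlt_user_gh"]}'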

Your laptop already speaks MCP. Ship the same servers to production — swap the runtime, keep the tools.

Async

Webhooks — call me when it's done

Think: a tap on the shoulder

You don't sit and wait. You hand off the job, go to dinner, and the agent calls you back when the artifact is ready. Hours of work happen in the background.

Label: register a webhook + receive it

# tell the platform where to call you
curl -X POST https://api.anthropic.com/v1/webhooks \
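  -H "x-api-key: $ANTHROPIC_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "anthropic-beta: managed-agents-2026-04-01" \
  -H "content-type: application/json" \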
  -d '{"url": "https://yourapp.com/hook", "events": ["session.outcome.satisfied"]}'

# later, you receive:
# POST https://yourapp.com/hook
# { "id": "evt_...", "type": "session.outcome.satisfied", "session_id": "sess_..." }
# fetch the artifact with a GET on receipt.

Long jobs no longer block humans. Work that finishes itself.
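
On the receiving side, the handler only has to read the event type and the session id, then decide whether to fetch. A rough sketch, assuming the flat event shape shown above; json_field and handle_event are hypothetical helpers, and the sed extraction is only good for flat, single-line JSON. A real handler would use a proper JSON parser and authenticate the request.

```shell
#!/bin/sh
# json_field KEY: read a flat JSON object on stdin, print the string value of KEY.
json_field() {
  sed -n "s/.*\"$1\"[[:space:]]*:[[:space:]]*\"\([^\"]*\)\".*/\1/p"
}

# handle_event BODY: route one webhook payload by its "type" field.
handle_event() {
  body=$1
  event_type=$(printf '%s' "$body" | json_field type)
  session=$(printf '%s' "$body" | json_field session_id)
  case $event_type in
    session.outcome.satisfied) echo "fetch artifact for $session" ;;
    *)                         echo "ignore $event_type" ;;
  esac
}
```

Anything the handler does not recognize is ignored rather than acted on, so new event types cannot trigger a fetch by accident.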

Trust

Permissions — how much rope?

Always ask — human approves every action — training wheels
Ask once — approve at the start of a session, then run free
Always allow — read-only or well-tested tasks — full autonomy

Label: set a permission policy

curl -X PATCH https://api.anthropic.com/v1/sessions/$ID/tools/slack \
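  -H "x-api-key: $ANTHROPIC_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "anthropic-beta: managed-agents-2026-04-01" \
  -H "content-type: application/json" \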
  -d '{"permission_policy": "always_ask"}'

Trust isn't binary. Turn it tool-by-tool, agent-by-agent, as confidence grows.
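
The three levels reduce to one small decision. A sketch of that logic, not the platform's implementation: always_ask appears in the call above, while ask_once and always_allow are assumed spellings of the other two levels, and needs_approval is a hypothetical name.

```shell
#!/bin/sh
# needs_approval POLICY APPROVED_THIS_SESSION(yes|no): print "ask" or "run".
needs_approval() {
  case $1 in
    always_ask)   echo ask ;;
    ask_once)     if [ "$2" = yes ]; then echo run; else echo ask; fi ;;
    always_allow) echo run ;;
    *)            echo ask ;;  # unknown policy: fail closed and ask a human
  esac
}
```

Failing closed on an unknown policy means a typo costs an extra prompt, not an unapproved action.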

The apex

A crew, not a soloist

Manager
└── Coordinator Agent
    ├── Drafter Agent        # writes the post copy
    └── Reviewer Agent       # checks tone + brand

Coordinator

The manager. Splits the work, hands out tasks, gathers results.

Specialists

Each agent has its own job description, tools, and rubric.

Shared substrate

Same office, same files. Agents work in parallel without stepping on each other.

Parallelization and specialization. More hands, sharper work.
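
The shape of that hand-off can be sketched with plain background jobs. Everything here is a stand-in: draft and review are local functions, not agents, and no API is called. Each drafter writes only its own file in the shared directory, which is one simple way to work in parallel without stepping on each other.

```shell
#!/bin/sh
# draft PLATFORM DIR: stand-in for a Drafter agent; writes only its own file.
draft() {
  echo "draft for $1" > "$2/$1.txt"
}

# review FILE: stand-in for a Reviewer agent; reads one finished draft.
review() {
  if [ -s "$1" ]; then echo "ok: $(cat "$1")"; else echo "missing: $1"; fi
}

# coordinate DIR PLATFORM...: split the work, run drafters in parallel,
# wait for all of them, then gather the reviews in a fixed order.
coordinate() {
  shared=$1; shift
  for platform in "$@"; do
    draft "$platform" "$shared" &
  done
  wait
  for platform in "$@"; do
    review "$shared/$platform.txt"
  done
}
```

A call such as `coordinate "$(mktemp -d)" linkedin x instagram` prints one review line per platform, in the order given.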

A day in the life

Steve ships a week of campaigns over the weekend

The human story

Friday 5:30pm — Steve kicks off the agent: "Draft next week's launch posts for LinkedIn, X, and Instagram."
5:35pm — He closes his laptop and goes home for the weekend.
Monday 8:00am — His phone buzzes. The agent is done.
8:10am — He reviews 15 posts and 5 images. Approves them. Coffee.

Under the hood

POST /sessions with the social-asset-generator agent
Session checkpoints when Steve disconnects
Grader returns satisfied on iteration 3
Webhook fires; Steve gets the artifact link

The agent worked most of the weekend. Steve worked ten minutes.

What just changed

Work that finishes itself. Permissions you control. Memory that survives the meeting.

Take this with you

Ten patterns you can ship Monday

Each is a pre-built starting point — a job description, a toolbelt, a system prompt — ready to clone and customize.

Research & Extraction: Blank Agent · Deep Researcher · Structured Extractor
Marketing & Ops: Social Asset Generator · Sprint Retro Facilitator · Field Monitor
Customer & Revenue: Support Agent · Support-to-Eng Escalator · Contract Tracker · Data Analyst

Three groups. Steal what fits.

Group 1 — Research & Extraction

Turn information into answers

Blank Agent

The core toolset, nothing more. A foundation to build any custom agent from scratch.

no MCP

Deep Researcher

Breaks a question into sub-questions, hunts authoritative sources, synthesizes with citations.

no MCP

Structured Extractor

Messy text in, typed JSON out. Validated against your schema.

no MCP

Best when the input is text or web data and the output is structured truth.

Group 2 — Marketing & Ops

Ship the recurring work

Social Asset Generator

Drafts posts across platforms, generates images, schedules the week.

Figma · Buffer · Slack

Sprint Retro Facilitator

Pulls a closed sprint, synthesizes themes, writes the retro doc before the meeting.

Linear · Slack

Field Monitor

Scans blogs on a topic, writes a weekly "what changed" brief.

Notion

Best when work spans multiple tools and happens on a cadence.

Group 3 — Customer & Revenue

Closer to the customer

Support Agent

Answers from docs and the knowledge base. Escalates when it's stuck.

Notion · Slack

Support-to-Eng Escalator

Reads an Intercom thread, reproduces the bug, files a Jira ticket with repro steps.

Intercom · Atlassian · Slack

Contract Tracker

Extracts clauses, sets deadline reminders, tracks obligations in Asana.

Box · Asana

Data Analyst

Loads, explores, visualizes. Answers ad-hoc questions from datasets.

Amplitude

Best when there's a human on the other end waiting for an answer.

Spotlight

Social Asset Generator — the full template

Label: social-asset-generator.yaml

name: Social asset generator
model: claude-sonnet-4-6
system: |
  You draft a week of social posts
  across LinkedIn, X, and Instagram
  with images and schedule them.

  1. Read the brand brief
  2. Draft posts per platform tone
  3. Generate images in Figma
  4. Schedule via Buffer
  5. Notify the team in Slack
mcp_servers:
  - figma
  - buffer
  - slack
tools:
  - agent_toolset_20260401

Why this template

claude-sonnet-4-6 — fast and cost-effective. This work is volume, not depth.

Three MCP servers — the toolbelt is the whole point. Each one is a tab a marketer would otherwise switch between.

Numbered system prompt — five clear steps. The agent has a playbook, not a vibe.

Clone it. Swap Buffer for your scheduler. Ship Monday.

You came for chatbots.

You're leaving with a workforce.

On-ramp

/claude-api — the docs write the code

Think: built-in foreman

A bundled Skill in Claude Code that activates on any Anthropic SDK work — and explicitly on Managed Agents. It knows the beta header, the agent → session pattern, vaults, sandboxes, MCP wiring, and seven language SDKs.

Label: scaffold a production agent from your terminal

# in Claude Code:
/claude-api managed-agents-onboard

# interactive walkthrough:
#  → picks your language (TS, Python, Go, Ruby, Java, PHP, cURL)
#  → emits Agent create + Session create boilerplate
#  → wires mcp_servers, vaults, environments, tools
#  → sets the right anthropic-beta header

# bonus
/claude-api migrate ./src to claude-opus-4-7

The agent that built your codebase knows how to deploy itself into it. From prompt to ant beta:worker poll in one session.

Demo

github.com/mattwoodco/braid

BRAID

Create and Observe Claude Managed Agents

Clone Monday.

BRAID

What BRAID is

A Claude Code skill that turns the eight tools we just walked through into slash-commands and folder templates.

You describe a flow in a sentence. BRAID writes the agent YAML, wires the MCP servers, provisions the vault, and hands you a run command.

Six of the eight tools, by another name

Agent — the job description
Environment — the private office
Session — the workday that survives sleep
Skills — table of contents, not textbook
Vaults — agent knows the lock; session brings the key
Outcomes — the grader

⚠️ Experimental. Research project. APIs and flows will change. Not for production.

BRAID

Six commands

Inside Claude Code, BRAID exposes one slash-command surface for the whole lifecycle — scaffold, run, inspect, consolidate, tear down.

# describe what you want in a sentence
/braid create "3-shot fundraiser site for a dog rescue"

# or pick a template directly
/braid create --template fundraiser-video-site --name my-flow

# provision env, vault, stores, agents
/braid setup my-flow

# start (or resume) a streaming run
/braid run my-flow "brief goes here"

# inspect / pick / kill in-flight sessions
/braid sessions my-flow --pick

# consolidate past runs into a memory store
/braid dream my-flow --sessions 10

One command per verb. No glue code.

BRAID

Templates, examples, and host-side hooks

Start from a working flow

flows/_templates/ — fill-in-the-blank starters: fundraiser-video-site, static-site-builder-deployer, image-series-with-memory, url-to-ad-set, viral-video-ad.

flows/_examples/ — runnable reference flows: fundraiser (Fal MCP + Vercel), snapshots (reflection-into-memory), gauntlet-ads (Playwright screenshots), hiking-boots, fraction-blocks.

Credentialed work stays on the host

run:
  post_session_hook:
    command: bun run ".../post-hooks/vercel-deploy.ts"
    env_passthrough: [VERCEL_TOKEN]
    timeout_ms: 300000

The agent drafts and renders inside the sandbox. The post-session hook runs on the host with a strictly scoped token after the session ends. VERCEL_TOKEN never enters the agent's context.

Vaults for the agent. Hooks for the deploy. The right credential in the right room.
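
What env_passthrough buys you can be shown with the standard env utility. This is a sketch of the idea, not BRAID's implementation: run_hook is a hypothetical name, and timeout_ms is not modeled. The hook starts from an empty environment and receives only PATH and the one named token.

```shell
#!/bin/sh
# run_hook COMMAND: run COMMAND with a scrubbed environment.
# env -i clears every inherited variable; only PATH and VERCEL_TOKEN are passed.
run_hook() {
  env -i PATH="$PATH" VERCEL_TOKEN="$VERCEL_TOKEN" sh -c "$1"
}
```

Any other secret exported in the host shell is invisible to the hook, so a deploy script cannot leak what it never received.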