OpenAI Agents SDK Native Sandbox and Manifest Guide

OpenAI shipped the important Agents SDK update on April 15, 2026, and the tell was the install line in the launch post: pip install "openai-agents>=0.14.0" (OpenAI). That version line matters. This was not a new prompt template or another function-calling wrapper. It was OpenAI moving file work, shell work, patching, sandbox lifecycle, and workspace description into SDK-level primitives.

For developers building coding agents, document agents, data-cleanup agents, or repo-maintenance bots, the design shift is simple: stop making every team rebuild the same brittle harness around Docker, temp directories, tool schemas, file staging, and retry logic. The SDK now gives you a model-native harness, sandbox-native execution, filesystem tools, MCP, AGENTS.md, shell access, apply_patch, and a Manifest abstraction for describing portable workspaces.

Architecture diagram showing a user request flowing into Agents SDK Runner, then SandboxAgent, Manifest, capabilities la

The change: from tool calls to a real workspace

The original Agents SDK was already useful for orchestration: agents, tools, handoffs, guardrails, tracing. The April update adds the missing runtime shape for agents that need to work with files over time.

OpenAI describes the updated SDK as helping developers build agents that can inspect files, run commands, edit code, and work on long-horizon tasks inside controlled sandbox environments (OpenAI). The phrase “controlled workspace” is the key. A serious file-working agent needs more than a list of tools. It needs a root directory, mounted inputs, output locations, a shell, permissions, snapshots, and a way to resume when the container dies.

Before this update, a typical production setup looked like this:

create a temp workspace
copy files into it
expose read, write, shell, and patch tools
validate paths manually
start a sandbox or container
collect artifacts
snapshot state if the job runs long
translate all of that into model-facing instructions

That glue code is where many agent projects quietly get messy. The new SDK turns much of it into first-class configuration.

OpenAI’s launch post listed built-in support for Blaxel, Cloudflare, Daytona, E2B, Modal, Runloop, and Vercel as sandbox providers, with a “bring your own sandbox” path as well (OpenAI). That is seven hosted providers at launch, plus local development paths.

Manifest is the portability layer

The Manifest abstraction is the most practical part of the release. It describes what the sandbox workspace should contain before the model starts working.

In the Python docs, Sandbox Agents are marked beta, require Python 3.10 or higher, and are presented as a way to give the model a persistent workspace where it can search document sets, edit files, run commands, generate artifacts, and resume from saved sandbox state (OpenAI Agents SDK Python docs).

A compact Python shape looks like this:

from agents import Runner
from agents.run import RunConfig
from agents.sandbox import Manifest, SandboxAgent, SandboxRunConfig
from agents.sandbox.entries import LocalDir
from agents.sandbox.sandboxes.unix_local import UnixLocalSandboxClient

agent = SandboxAgent(
    name="Repo maintainer",
    model="gpt-5.5",
    instructions="Read repo/task.md, edit with apply_patch, then run the targeted test.",
    default_manifest=Manifest(entries={"repo": LocalDir(src="./repo")}),
)

result = await Runner.run(
    agent,
    "Fix the failing test and summarize the change.",
    run_config=RunConfig(
        sandbox=SandboxRunConfig(client=UnixLocalSandboxClient())
    ),
)

The important part is not the syntax. It is the contract. The Manifest can describe local files, directories, Git repos, synthetic files, environment variables, users, groups, output directories, and remote storage mounts. The JavaScript docs say Manifest entry paths are workspace-relative, cannot be absolute, and cannot escape the workspace with .., which is exactly the kind of boring constraint you want enforced by the runtime rather than remembered in every prompt (OpenAI Agents SDK JS docs).

Compact visual table comparing Manifest entry types: file, dir, localDir, gitRepo, S3, GCS, Azure Blob, R2; rows show so

Capabilities: shell, filesystem, skills, memory, compaction

A SandboxAgent is not just a normal agent with a temp folder. It carries sandbox-specific capabilities.

The JS concepts docs list built-in capabilities including shell(), filesystem(), skills(), memory(), and compaction() (OpenAI Agents SDK JS docs). Defaults matter here: the docs state that Capabilities.default() includes filesystem, shell, and compaction. That means the common coding-agent loop is no longer a pile of bespoke tool definitions.

The filesystem capability exposes patch-style file edits. The shell capability exposes command execution inside the sandbox session. Skills let you progressively disclose specialized instructions or procedures. Memory and compaction help longer runs keep useful state without stuffing every prior token back into the next turn.

This matches how strong coding agents actually work. They inspect. They run a command. They edit a file. They run a smaller command. They inspect the diff. They summarize what changed. If your harness treats each step as an unrelated API call, the model spends too much attention reconstructing its world. A sandbox session gives the model a place to stand.

AGENTS.md also fits naturally into this model. The open AGENTS.md site describes it as a Markdown format for guiding coding agents and says it is used by more than 60,000 open-source projects (AGENTS.md). That file should contain build commands, test instructions, style rules, and repo-specific warnings. In the sandbox world, AGENTS.md becomes workspace-local operating context rather than a giant prompt pasted into every task.

Python-first, TypeScript catching up

At launch, this was Python-first. TechCrunch reported on April 15 that the new harness and sandbox capabilities were launching first in Python, with TypeScript support planned later (TechCrunch). PyPI backs up the date: openai-agents versions 0.14.0 and 0.14.1 were uploaded on April 15, 2026 (PyPI).

As of June 16, 2026, the practical picture is more balanced. The official JS docs now include beta Sandbox Agents, require Node.js 22 or higher, and show Manifest, SandboxAgent, UnixLocalSandboxClient, Docker support, and hosted provider clients through @openai/agents-extensions (OpenAI Agents SDK JS quickstart). The JS docs also note Deno and Bun can work when package resolution and runtime APIs are compatible.

Area	Python	TypeScript / JavaScript
Launch status on Apr 15	First supported path	Planned later
Current sandbox docs	Beta, Python 3.10+	Beta, Node.js 22+
Local sandbox	`UnixLocalSandboxClient`	`UnixLocalSandboxClient`
Docker sandbox	`openai-agents[docker]`	`DockerSandboxClient`
Hosted providers	Supported via SDK integrations	`@openai/agents-extensions` provider paths

That does not mean the two ecosystems are identical. Python was the original launch surface, and many examples still land there first. TypeScript now has enough official surface area to prototype real sandbox agents, but hosted-provider details, PTY behavior, mounts, and lifecycle support still need careful reading per backend.

How I would structure a file-working agent now

The mistake is to treat the sandbox as a magic safety box. It is a runtime boundary, not a product spec. You still need to design the workspace.

A clean structure looks like this:

repo/: the working tree or mounted repository
task.md: the task spec the model must read first
inputs/: read-only documents, datasets, screenshots, or logs
output/: the only place final generated artifacts should go
AGENTS.md: build, test, style, and safety instructions
sandbox user: a non-root identity where the backend supports it
Manifest env: non-secret config persisted by default, secrets marked ephemeral

The Manifest should describe inputs. The agent instructions should describe workflow. The user prompt should describe the one-off task. Keep those separate. The JS docs explicitly warn against stuffing long reference material into instructions when it belongs in the Manifest (OpenAI Agents SDK JS docs).

For production, pick the backend based on the blast radius, not the demo path. Unix-local is fine for development. Docker is a better default when you need repeatability. Hosted providers make sense when you need clean isolation per run, remote execution, scaling, or provider-specific snapshot behavior. The JS clients docs state that hosted provider support varies and developers should check provider docs for environment variables, ports, PTY, snapshots, and cleanup behavior (OpenAI Agents SDK JS clients).

The ecosystem read

This update is important because it standardizes the shape of file-working agents. The industry already converged on a few primitives: MCP for external tools, AGENTS.md for repo instructions, shell for real inspection, patching for reviewable edits, and sandboxes for containment. OpenAI’s Agents SDK now packages those pieces into a runtime developers can actually compose.

The sharp edge remains permissions. A sandboxed agent with broad network, writable mounts, long-lived credentials, and vague instructions can still do damage. The Manifest helps because it makes workspace inputs and grants visible. It does not remove the need for approval policies, secret hygiene, dependency pinning, and artifact review.

The best use case today is not “agent does everything.” It is narrower and more valuable: give the model a bounded workspace, a clear task file, repo-local instructions, shell and patch tools, and one explicit verification path. Let it work like a junior engineer in a disposable environment. Then inspect the diff.

That is a much healthier abstraction than hand-rolling another half-sandbox around a chat loop.

Readers who want to try these models hands-on can call them via onehop with an OpenAI-compatible API by changing one base_url. It is cheaper than first-party, and new accounts get $10 free credit with no card required: call Claude and other models on onehop, or sign up for $10 free credit.

OpenAI Agents SDK Native Sandbox and Manifest Guide

The change: from tool calls to a real workspace

Manifest is the portability layer

Capabilities: shell, filesystem, skills, memory, compaction

Python-first, TypeScript catching up

How I would structure a file-working agent now

The ecosystem read

Related reading

Claude Code Loop Engineering: How to Build an Agent That Actually Finishes

Google Antigravity CLI vs Gemini CLI: What Developers Need to Migrate Before June 18, 2026

Call Qwen3.7 Plus with the OpenAI SDK via DashScope Compatible Mode