bashkit

Agentic coding tools for Vercel AI SDK. Give AI agents the ability to execute code, read/write files, and perform coding tasks in a sandboxed environment.

Overview

bashkit provides a set of tools that work with the Vercel AI SDK to enable agentic coding capabilities. It gives AI models like Claude the ability to:

Execute bash commands in a persistent shell
Read files and list directories
Create and write files
Edit existing files with string replacement
Search for files by pattern
Search file contents with regex
Spawn sub-agents for complex tasks
Track task progress with todos
Search the web and fetch URLs
Load skills on-demand via the Agent Skills standard

Breaking Changes in v0.4.0

Nullable Types for OpenAI Compatibility

All optional tool parameters now use .nullable() instead of .optional() in Zod schemas. This change enables compatibility with OpenAI's structured outputs, which require all properties to be in the required array.

What changed:

Tool input types changed from T | undefined to T | null
Exported interfaces (QuestionOption, StructuredQuestion) use T | null
AI models will send explicit null values instead of omitting properties

Migration:

// Before v0.4.0
const option: QuestionOption = { label: "test", description: undefined };

// v0.4.0+
const option: QuestionOption = { label: "test", description: null };

Why this matters:

Works with both OpenAI and Anthropic models
OpenAI structured outputs require nullable (not optional) fields
Anthropic/Claude handles nullable fields correctly
The ?? operator handles both null and undefined, so runtime behavior is unchanged
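
As a small illustration of the last point, the sketch below reads an optional field through `??`. The `descriptionOf` helper and its fallback string are hypothetical, and `QuestionOption` is reduced to the two fields shown in the migration example; the real interface may have more.

```typescript
// Reduced shape based on the migration example above (assumption: the real
// interface may carry additional fields).
interface QuestionOption {
  label: string;
  description: string | null;
}

// Hypothetical helper: `??` falls back for both null and undefined, so code
// written against the pre-v0.4.0 types behaves the same at runtime.
function descriptionOf(option: {
  label: string;
  description?: string | null | undefined;
}): string {
  return option.description ?? '(no description)';
}

const currentOption: QuestionOption = { label: 'test', description: null };
const legacyOption = { label: 'test', description: undefined };

console.log(descriptionOf(currentOption)); // "(no description)"
console.log(descriptionOf(legacyOption)); // "(no description)"
console.log(descriptionOf({ label: 'test', description: '' })); // "" (an empty string is kept)
```

Note that `??` differs from `||` here: an empty string is a real value and is not replaced by the fallback.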

Installation

bun add bashkit ai zod

For web tools, also install:

bun add parallel-web

Quick Start

With Filesystem Access (Desktop Apps, Local Scripts, Servers)

When you have direct filesystem access, use LocalSandbox:

import { createAgentTools, createLocalSandbox } from 'bashkit';
import { anthropic } from '@ai-sdk/anthropic';
import { generateText, stepCountIs } from 'ai';

// Create a local sandbox (runs directly on your filesystem)
const sandbox = createLocalSandbox({ cwd: '/tmp/workspace' });

// Create tools bound to the sandbox
const { tools } = createAgentTools(sandbox);

// Use with Vercel AI SDK
const result = await generateText({
  model: anthropic('claude-sonnet-4-5'),
  tools,
  prompt: 'Create a simple Express server in server.js',
  stopWhen: stepCountIs(10),
});

console.log(result.text);

// Cleanup
await sandbox.destroy();

Without Filesystem Access (Web/Serverless Environments)

When you're in a web or serverless environment without filesystem access, use VercelSandbox or E2BSandbox:

import { createAgentTools, createVercelSandbox } from 'bashkit';
import { anthropic } from '@ai-sdk/anthropic';
import { streamText, stepCountIs } from 'ai';

// Create a Vercel sandbox (isolated Firecracker microVM)
// Note: async - automatically sets up ripgrep for Grep tool
const sandbox = await createVercelSandbox({
  runtime: 'node22',
  resources: { vcpus: 2 },
});

const { tools } = createAgentTools(sandbox);

const result = streamText({
  model: anthropic('claude-sonnet-4-5'),
  messages,
  tools,
  stopWhen: stepCountIs(10),
});

// Cleanup (only after the stream has been fully consumed)
await sandbox.destroy();

Available Tools

Default Tools (always included)

Tool	Purpose	Key Inputs
`Bash`	Execute shell commands	`command`, `timeout?`, `description?`
`Read`	Read files or list directories	`file_path`, `offset?`, `limit?`
`Write`	Create/overwrite files	`file_path`, `content`
`Edit`	Replace strings in files	`file_path`, `old_string`, `new_string`, `replace_all?`
`Glob`	Find files by pattern	`pattern`, `path?`
`Grep`	Search file contents	`pattern`, `path?`, `output_mode?`, `-i?`, `-C?`

Optional Tools (via config)

Tool	Purpose	Config Key
`AskUser`	Ask user clarifying questions	`askUser: true`
`EnterPlanMode`	Enter planning/exploration mode	`planMode: true`
`ExitPlanMode`	Exit planning mode with a plan	`planMode: true`
`Skill`	Execute skills	`skill: { skills }`
`WebSearch`	Search the web	`webSearch: { apiKey }`
`WebFetch`	Fetch URL and process with AI	`webFetch: { apiKey, model }`

Workflow Tools (created separately)

Tool	Purpose	Factory
`Task`	Spawn sub-agents	`createTaskTool({ model, tools, subagentTypes? })`
`TodoWrite`	Track task progress	`createTodoWriteTool(state, config?, onUpdate?)`

Web Tools (require `parallel-web` peer dependency)

Tool	Purpose	Factory
`WebSearch`	Search the web	`createWebSearchTool({ apiKey })`
`WebFetch`	Fetch URL and process with AI	`createWebFetchTool({ apiKey, model })`

Sandbox Types

LocalSandbox

Runs commands directly on your filesystem. Use when you have filesystem access (desktop apps, local scripts, servers you control).

import { createLocalSandbox } from 'bashkit';

const sandbox = createLocalSandbox({ cwd: '/tmp/workspace' });

VercelSandbox

Runs in isolated Firecracker microVMs on Vercel's infrastructure. Use when you don't have filesystem access (web apps, serverless functions, browser environments).

import { createVercelSandbox } from 'bashkit';

// Async - automatically installs ripgrep for Grep tool
const sandbox = await createVercelSandbox({
  runtime: 'node22',
  resources: { vcpus: 2 },
  // ensureTools: true (default) - auto-setup ripgrep
  // ensureTools: false - skip for faster startup if you don't need Grep
});

// Sandbox ID available immediately after creation
console.log(sandbox.id); // Sandbox ID for reconnection

// Later: reconnect to the same sandbox (fast - ripgrep already installed)
const reconnected = await createVercelSandbox({
  sandboxId: 'existing-sandbox-id',
});

E2BSandbox

Runs in E2B's cloud sandboxes. Requires @e2b/code-interpreter peer dependency. Use when you don't have filesystem access and need E2B's features.

import { createE2BSandbox } from 'bashkit';

// Async - automatically installs ripgrep for Grep tool
const sandbox = await createE2BSandbox({
  apiKey: process.env.E2B_API_KEY,
  // ensureTools: true (default) - auto-setup ripgrep
  // ensureTools: false - skip for faster startup if you don't need Grep
});

// Sandbox ID available immediately after creation
console.log(sandbox.id); // "sbx_abc123..."

// Later: reconnect to the same sandbox (fast - ripgrep already installed)
const reconnected = await createE2BSandbox({
  apiKey: process.env.E2B_API_KEY,
  sandboxId: 'sbx_abc123...', // Reconnect to existing sandbox
});

Configuration

You can configure tools with security restrictions and limits, and enable optional tools:

const { tools, planModeState } = createAgentTools(sandbox, {
  // Enable optional tools
  askUser: true,
  planMode: true, // Enables EnterPlanMode and ExitPlanMode
  skill: {
    skills: discoveredSkills, // From discoverSkills()
  },
  webSearch: {
    apiKey: process.env.PARALLEL_API_KEY,
  },
  webFetch: {
    apiKey: process.env.PARALLEL_API_KEY,
    model: anthropic('claude-haiku-4'),
  },

  // Tool-specific config
  tools: {
    Bash: {
      timeout: 30000,
      blockedCommands: ['rm -rf', 'curl'],
      maxOutputLength: 10000,
    },
    Read: {
      allowedPaths: ['/workspace/**'],
    },
    Write: {
      maxFileSize: 1_000_000, // 1MB limit
    },
  },
});

Configuration Options

Global Config

defaultTimeout (number): Default timeout for all tools in milliseconds
workingDirectory (string): Default working directory for the sandbox

Per-Tool Config

timeout (number): Tool-specific timeout
maxFileSize (number): Maximum file size in bytes (Write)
maxOutputLength (number): Maximum output length (Bash)
allowedPaths (string[]): Restrict file operations to specific paths
blockedCommands (string[]): Block commands containing these strings (Bash)
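
To make the `blockedCommands` semantics concrete, here is a minimal sketch of a substring check, assuming the option works as the description above says ("commands containing these strings"). The `findBlockedPattern` helper is hypothetical and is not part of bashkit's API.

```typescript
// Hypothetical helper: returns the first blocked pattern found in the
// command, or null if no pattern matches.
function findBlockedPattern(command: string, blockedCommands: string[]): string | null {
  for (const pattern of blockedCommands) {
    if (command.includes(pattern)) return pattern;
  }
  return null;
}

const blockedPatterns = ['rm -rf', 'curl'];

console.log(findBlockedPattern('ls -la', blockedPatterns)); // null
console.log(findBlockedPattern('rm -rf /tmp/build', blockedPatterns)); // "rm -rf"
console.log(findBlockedPattern('rm  -rf /tmp/build', blockedPatterns)); // null (the extra space evades the match)
```

The last call shows why a substring list is best treated as a guardrail rather than a security boundary; isolation should come from the sandbox itself.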

AI SDK Tool Options (v6+)

All tools support AI SDK v6 tool options:

const { tools } = createAgentTools(sandbox, {
  tools: {
    Bash: {
      timeout: 30000,
      // AI SDK v6 options
      needsApproval: true, // Require user approval before execution
      strict: true, // Strict schema validation
      providerOptions: { /* provider-specific options */ },
    },
    Write: {
      // Dynamic approval based on input
      needsApproval: async ({ file_path }) => {
        return file_path.includes('package.json');
      },
    },
  },
});

needsApproval (boolean | function): Require user approval before tool execution
strict (boolean): Enable strict schema validation
providerOptions (object): Provider-specific tool options

Sub-agents with Task Tool

The Task tool spawns new agents for complex subtasks:

import { createTaskTool } from 'bashkit';

const taskTool = createTaskTool({
  model: anthropic('claude-sonnet-4-5'),
  tools: sandboxTools,
  subagentTypes: {
    research: {
      model: anthropic('claude-haiku-4'), // Cheaper model for research
      systemPrompt: 'You are a research specialist. Find information only.',
      tools: ['Read', 'Grep', 'Glob'], // Limited tools
    },
    coding: {
      systemPrompt: 'You are a coding expert. Write clean code.',
      tools: ['Read', 'Write', 'Edit', 'Bash'],
    },
  },
});

// Add to tools
const allTools = { ...sandboxTools, Task: taskTool };

The parent agent calls Task like any other tool:

// Agent decides to delegate:
{ tool: "Task", args: {
  description: "Research API patterns",
  prompt: "Find best practices for REST APIs",
  subagent_type: "research"
}}

Dynamic Agents

You can create custom agents on the fly by passing system_prompt and/or tools directly, without predefined subagent types:

// Agent creates a specialized agent dynamically:
{ tool: "Task", args: {
  description: "Analyze security vulnerabilities",
  prompt: "Review the auth code for security issues",
  subagent_type: "custom",
  system_prompt: "You are a security expert. Focus on OWASP top 10 vulnerabilities.",
  tools: ["Read", "Grep", "Glob"]
}}

This is useful when:

The parent agent needs to create specialized agents based on context
You want agents to delegate with custom instructions
Predefined subagent types don't fit the task

Streaming Sub-agent Activity to UI

Pass a streamWriter to stream real-time sub-agent activity to the UI:

import { createUIMessageStream } from 'ai';

const stream = createUIMessageStream({
  execute: async ({ writer }) => {
    const taskTool = createTaskTool({
      model,
      tools: sandboxTools,
      streamWriter: writer, // Enable real-time streaming
      subagentTypes: { ... },
    });

    // Use with streamText
    const result = streamText({
      model,
      tools: { Task: taskTool },
      ...
    });

    writer.merge(result.toUIMessageStream());
  },
});

When streamWriter is provided:

Uses streamText internally (instead of generateText)
Emits data-subagent events to the UI stream:
- start - Sub-agent begins work
- tool-call - Each tool the sub-agent uses (with args)
- done - Sub-agent finished
- complete - Full messages array for UI access

These appear in message.parts on the client as { type: "data-subagent", data: SubagentEventData }.

Important: The TaskOutput returned to the lead agent does NOT include messages (to avoid context bloat). The UI accesses the full conversation via the streamed complete event.

Context Management

Conversation Compaction

Automatically summarize conversations when they approach the model's context limit:

import { compactConversation, MODEL_CONTEXT_LIMITS } from 'bashkit';

let compactState = { conversationSummary: '' };

const result = await compactConversation(messages, {
  maxTokens: MODEL_CONTEXT_LIMITS['claude-sonnet-4-5'],
  summarizerModel: anthropic('claude-haiku-4'), // Fast/cheap model
  compactionThreshold: 0.85, // Trigger at 85% usage
  protectRecentMessages: 10, // Keep last 10 messages intact
}, compactState);

messages = result.messages;
compactState = result.state;
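
The `compactionThreshold` arithmetic can be sketched as follows. This is an illustration only: `exceedsThreshold` is a hypothetical helper, the 200,000-token limit is an assumed number, and bashkit's own check may count tokens or compare against the threshold differently.

```typescript
// Hypothetical helper: true once the used share of the context window
// reaches the configured threshold.
function exceedsThreshold(
  usedTokens: number,
  maxTokens: number,
  compactionThreshold: number,
): boolean {
  return usedTokens / maxTokens >= compactionThreshold;
}

const assumedLimit = 200_000; // assumed context limit, for illustration only

console.log(exceedsThreshold(150_000, assumedLimit, 0.85)); // false (75% used)
console.log(exceedsThreshold(180_000, assumedLimit, 0.85)); // true (90% used)
```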

Context Status Monitoring

Monitor context usage and inject guidance so agents do not rush their work as the context window fills up:

import {
  getContextStatus,
  contextNeedsCompaction,
  compactConversation,
  MODEL_CONTEXT_LIMITS,
} from 'bashkit';

const status = getContextStatus(messages, MODEL_CONTEXT_LIMITS['claude-sonnet-4-5']);

if (status.guidance) {
  // Inject into system prompt
  system = `${system}\n\n<context_status>${status.guidance}</context_status>`;
}

if (contextNeedsCompaction(status)) {
  // Trigger compaction
  const compacted = await compactConversation(messages, config, state);
}

Tool Result Caching

Cache tool execution results to avoid repeated expensive operations:

const { tools } = createAgentTools(sandbox, {
  // Enable caching with defaults (LRU, 5min TTL)
  cache: true,
});

Cache Configuration Options

const { tools } = createAgentTools(sandbox, {
  cache: {
    // Custom TTL (default: 5 minutes)
    ttl: 10 * 60 * 1000,

    // Enable debug logging
    debug: true,

    // Per-tool control (defaults: Read, Glob, Grep, WebFetch, WebSearch)
    Read: true,
    Glob: true,
    Grep: false,  // Disable for this tool

    // Enable caching for tools not cached by default
    Bash: true,  // Use with caution - has side effects
  },
});

Default Cached Tools

By default, these read-only tools are cached when cache: true:

Read - File reading
Glob - File pattern matching
Grep - Content searching
WebFetch - URL fetching
WebSearch - Web searches

Tools with side effects (Bash, Write, Edit) are NOT cached by default but can be enabled.
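
For readers unfamiliar with the "LRU with TTL" default mentioned above, the following is a minimal, synchronous sketch of the idea. It is not bashkit's `LRUCacheStore`; the class name, constructor arguments, and cache keys are all hypothetical.

```typescript
// Minimal LRU cache with a per-entry TTL (illustrative only).
class TinyLruTtlCache<V> {
  private entries = new Map<string, { value: V; expiresAt: number }>();
  private maxSize: number;
  private ttlMs: number;
  private now: () => number;

  // `now` is injectable so the clock can be controlled in examples and tests.
  constructor(maxSize: number, ttlMs: number, now: () => number = () => Date.now()) {
    this.maxSize = maxSize;
    this.ttlMs = ttlMs;
    this.now = now;
  }

  get(key: string): V | undefined {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (entry.expiresAt <= this.now()) {
      this.entries.delete(key); // entry has expired
      return undefined;
    }
    // Re-insert to mark as most recently used (Map keeps insertion order).
    this.entries.delete(key);
    this.entries.set(key, entry);
    return entry.value;
  }

  set(key: string, value: V): void {
    this.entries.delete(key);
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
    if (this.entries.size > this.maxSize) {
      // Evict the least recently used entry: the first key in insertion order.
      const oldest = this.entries.keys().next().value;
      if (oldest !== undefined) this.entries.delete(oldest);
    }
  }

  size(): number {
    return this.entries.size;
  }
}

let fakeClock = 0;
const demoCache = new TinyLruTtlCache<string>(2, 5 * 60 * 1000, () => fakeClock);

demoCache.set('Read:/a.ts', 'contents of a');
demoCache.set('Read:/b.ts', 'contents of b');
demoCache.get('Read:/a.ts'); // touch a, so b becomes least recently used
demoCache.set('Read:/c.ts', 'contents of c'); // over capacity: evicts b

console.log(demoCache.get('Read:/b.ts')); // undefined (evicted)
console.log(demoCache.get('Read:/c.ts')); // "contents of c"

fakeClock += 5 * 60 * 1000; // advance to the TTL boundary
console.log(demoCache.get('Read:/c.ts')); // undefined (expired)
```

The TTL is what keeps cached `Read` or `Grep` results from going stale for long after a `Write` or `Edit` changes the underlying files.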

Custom Cache Store

Implement your own cache backend (e.g., Redis):

import type { CacheStore } from 'bashkit';

const redisStore: CacheStore = {
  async get(key) {
    const data = await redis.get(key);
    return data ? JSON.parse(data) : undefined;
  },
  async set(key, entry) {
    await redis.set(key, JSON.stringify(entry));
  },
  async delete(key) {
    await redis.del(key);
  },
  async clear() {
    await redis.flushdb();
  },
  size() {
    return redis.dbsize();
  },
};

const { tools } = createAgentTools(sandbox, {
  cache: redisStore,
});

Standalone Cached Wrapper

Wrap individual tools with caching:

import { cached, LRUCacheStore } from 'bashkit';

const cachedTool = cached(myTool, 'MyTool', {
  ttl: 5 * 60 * 1000,
  debug: true,
});

// Check cache stats
console.log(await cachedTool.getStats());
// { hits: 5, misses: 2, hitRate: 0.71, size: 2 }

// Clear cache
await cachedTool.clearCache();

Prompt Caching

Enable Anthropic prompt caching to reduce costs on repeated prefixes:

import { wrapLanguageModel } from 'ai';
// AI SDK v6+
import { anthropicPromptCacheMiddleware } from 'bashkit';
// AI SDK v5
// import { anthropicPromptCacheMiddlewareV2 } from 'bashkit';

const model = wrapLanguageModel({
  model: anthropic('claude-sonnet-4-5'),
  middleware: anthropicPromptCacheMiddleware,
});

// Check cache stats in result
console.log({
  cacheCreation: result.providerMetadata?.anthropic?.cacheCreationInputTokens,
  cacheRead: result.providerMetadata?.anthropic?.cacheReadInputTokens,
});

Context Layer

bashkit ships a context layer that handles two concerns most agent loops end up reinventing:

Static system prompt assembly — discover project docs (AGENTS.md / CLAUDE.md), collect environment info (cwd, shell, platform, git branch), and build tool guidance. Runs once at init so the system prompt stays stable for Anthropic prompt caching.
Dynamic per-step layers — intercept every tool call with beforeExecute gates (plan mode, custom allow/deny) and afterExecute transforms (output truncation, redirection hints, optional disk stash). Compose into an AI SDK prepareStep with auto-compaction and context-status monitoring.

Building a System Prompt

import { buildSystemContext, createLocalSandbox } from 'bashkit';

const sandbox = createLocalSandbox({ cwd: process.cwd() });

const context = await buildSystemContext(sandbox, {
  instructions: true,       // walk up from cwd, load AGENTS.md / CLAUDE.md
  environment: true,        // inject <environment_context> XML
  toolGuidance: {
    tools: ['Bash', 'Read', 'Write', 'Edit', 'Grep', 'Glob'],
  },
});

// context.combined -> ready to drop into streamText({ system })
// context.instructions / context.environment / context.toolGuidance -> individual sections
// context.meta.instructionSources -> which files were discovered

Call this once at init. The output must stay stable across turns for prompt caching to work — never regenerate it mid-conversation.

Tool Execution Layers

import {
  applyContextLayers,
  createExecutionPolicy,
  createOutputPolicy,
  createAgentTools,
  createLocalSandbox,
} from 'bashkit';

const sandbox = createLocalSandbox({ cwd: '/tmp/workspace' });
const { tools, planModeState } = createAgentTools(sandbox, { planMode: true });

const wrappedTools = applyContextLayers(tools, [
  // Gate: block Bash/Write/Edit while plan mode is active
  createExecutionPolicy(planModeState),

  // Transform: truncate oversized results, inject redirection hints,
  // optionally stash full output to disk
  createOutputPolicy({
    maxOutputLength: 30_000,
    redirectionThreshold: 20_000,
    stashOutput: {
      sandbox,
      tools: ['Bash', 'Grep'],  // only these get full output stashed
    },
  }),
]);

Layers compose: beforeExecute gates run in order and the first rejection wins, while afterExecute transforms are piped, each receiving the previous layer's output. Custom layers only need to implement the ContextLayer interface; see src/context/AGENTS.md for the full contract.
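
The composition rule (gates in order with first rejection winning, then transforms piped) can be sketched as below. The `SketchLayer` shape, `runWithLayers`, and both example layers are hypothetical and deliberately simplified to synchronous functions; the real `ContextLayer` contract is the one documented in src/context/AGENTS.md and may differ.

```typescript
// Hypothetical, simplified layer shape for illustration only.
interface SketchLayer {
  beforeExecute?: (toolName: string, input: unknown) => { error: string } | undefined;
  afterExecute?: (toolName: string, output: string) => string;
}

function runWithLayers(
  toolName: string,
  input: unknown,
  execute: (input: unknown) => string,
  layers: SketchLayer[],
): string {
  // Gates run in order; the first rejection short-circuits execution.
  for (const layer of layers) {
    const rejection = layer.beforeExecute?.(toolName, input);
    if (rejection) return `Blocked: ${rejection.error}`;
  }
  // Transforms are piped: each layer receives the previous layer's output.
  let output = execute(input);
  for (const layer of layers) {
    if (layer.afterExecute) output = layer.afterExecute(toolName, output);
  }
  return output;
}

const planModeActive = true;

const planGate: SketchLayer = {
  beforeExecute: (toolName) =>
    planModeActive && ['Bash', 'Write', 'Edit'].includes(toolName)
      ? { error: `${toolName} is disabled in plan mode` }
      : undefined,
};

const truncateLayer: SketchLayer = {
  afterExecute: (_toolName, output) =>
    output.length > 10 ? `${output.slice(0, 10)}... [truncated]` : output,
};

console.log(runWithLayers('Bash', {}, () => 'ok', [planGate, truncateLayer]));
// "Blocked: Bash is disabled in plan mode"
console.log(runWithLayers('Read', {}, () => 'a'.repeat(25), [planGate, truncateLayer]));
// "aaaaaaaaaa... [truncated]"
```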

prepareStep Composition

import { createPrepareStep, MODEL_CONTEXT_LIMITS } from 'bashkit';

const prepareStep = createPrepareStep({
  compaction: {
    maxTokens: MODEL_CONTEXT_LIMITS['claude-sonnet-4-5'],
    summarizerModel: anthropic('claude-haiku-4'),
    compactionThreshold: 0.85,
  },
  contextStatus: {
    maxTokens: MODEL_CONTEXT_LIMITS['claude-sonnet-4-5'],
  },
  planModeState,   // injects a plan-mode hint as a user message
});

await streamText({
  model,
  tools: wrappedTools,
  system: context.combined,  // from buildSystemContext
  messages,
  prepareStep,
});

Important: createPrepareStep never touches system — it only modifies messages. That's load-bearing for Anthropic prompt caching. If you extend it via the extend callback, do not set system either.

Agent Skills

bashkit supports the Agent Skills standard - an open format for giving agents new capabilities and expertise. Skills are folders containing a SKILL.md file with instructions that agents can load on-demand.

Note: Skill discovery is designed for LocalSandbox use cases where the agent has access to the user's filesystem. For cloud sandboxes (VercelSandbox/E2B), you would bundle skills with your app and inject them directly into the system prompt.

Progressive Disclosure

Skills use progressive disclosure to keep context lean:

At startup: Only skill metadata (name, description, path) is loaded (~50-100 tokens per skill)
On activation: Agent reads the full SKILL.md via the Read tool when needed

Discovering Skills

When using LocalSandbox, skills are discovered from:

.skills/ in the project directory (highest priority)
~/.bashkit/skills/ for user-global skills

This allows agents to pick up project-specific skills and user-installed skills automatically.

import { discoverSkills, skillsToXml } from 'bashkit';

// Discover skills (metadata only - fast, low context)
const skills = await discoverSkills();

// Or with custom paths
const customSkills = await discoverSkills({
  paths: ['.skills', '/path/to/shared/skills'],
  cwd: '/my/project',
});

Using Skills with Agents

Inject skill metadata into the system prompt using XML format (recommended for Claude):

import { discoverSkills, skillsToXml, createAgentTools, createLocalSandbox } from 'bashkit';
import { anthropic } from '@ai-sdk/anthropic';
import { generateText, stepCountIs } from 'ai';

const skills = await discoverSkills();
const sandbox = createLocalSandbox({ cwd: '/tmp/workspace' });
const { tools } = createAgentTools(sandbox);

const result = await generateText({
  model: anthropic('claude-sonnet-4-5'),
  tools,
  system: `You are a coding assistant.

${skillsToXml(skills)}

When a task matches a skill, use the Read tool to load its full instructions from the location path.`,
  prompt: 'Extract text from invoice.pdf',
  stopWhen: stepCountIs(10),
});

// Agent will call Read({ file_path: "/path/to/.skills/pdf-processing/SKILL.md" })
// when it decides to use the pdf-processing skill

Creating Skills

Create a folder with a SKILL.md file:

.skills/
└── pdf-processing/
    └── SKILL.md

The SKILL.md file has YAML frontmatter and markdown instructions:

---
name: pdf-processing
description: Extract text and tables from PDF files, fill forms, merge documents.
license: MIT
compatibility: Requires poppler-utils
metadata:
  author: my-org
  version: "1.0"
---

# PDF Processing

## When to use this skill
Use when the user needs to work with PDF files...

## How to extract text
1. Use pdftotext for text extraction...

Required fields:

name: 1-64 chars, lowercase letters, numbers, and hyphens. Must match folder name.
description: 1-1024 chars. Describes when to use this skill.

Optional fields:

license: License info
compatibility: Environment requirements
metadata: Arbitrary key-value pairs
allowed-tools: Space-delimited list of pre-approved tools (experimental)
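
The required-field rules above can be checked before a skill is shipped. The validator below is a hypothetical sketch built only from those rules; bashkit's `parseSkillMetadata` may enforce them differently or not at all.

```typescript
// Hypothetical validator: returns a list of rule violations (empty if valid).
function validateSkillFields(
  skillName: string,
  description: string,
  folderName: string,
): string[] {
  const errors: string[] = [];
  if (!/^[a-z0-9-]{1,64}$/.test(skillName)) {
    errors.push('name must be 1-64 chars of lowercase letters, numbers, and hyphens');
  }
  if (skillName !== folderName) {
    errors.push('name must match the folder name');
  }
  if (description.length < 1 || description.length > 1024) {
    errors.push('description must be 1-1024 chars');
  }
  return errors;
}

console.log(validateSkillFields('pdf-processing', 'Extract text from PDFs.', 'pdf-processing'));
// []
console.log(validateSkillFields('PDF_Processing', '', 'pdf-processing'));
// all three rules are reported
```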

Using Remote Skills

Fetch complete skill folders from GitHub repositories, including all scripts and resources:

import { fetchSkill, fetchSkills, setupAgentEnvironment } from 'bashkit';

// Fetch a complete skill folder from Anthropic's official skills repo
const pdfSkill = await fetchSkill('anthropics/skills/pdf');
// Returns a SkillBundle:
// {
//   name: 'pdf',
//   files: {
//     'SKILL.md': '...',
//     'scripts/extract_text.py': '...',
//     'forms.md': '...',
//     // ... all files in the skill folder
//   }
// }

// Or batch fetch multiple skills
const remoteSkills = await fetchSkills([
  'anthropics/skills/pdf',
  'anthropics/skills/web-research',
]);
// Returns: { 'pdf': SkillBundle, 'web-research': SkillBundle }

// Use with setupAgentEnvironment - writes all files to .skills/
const config = {
  skills: {
    ...remoteSkills,                    // SkillBundles (all files)
    'my-custom': myCustomSkillContent,  // Inline string (just SKILL.md)
  },
};

const { skills } = await setupAgentEnvironment(sandbox, config);
// Creates: .skills/pdf/SKILL.md, .skills/pdf/scripts/*, etc.

GitHub reference format: owner/repo/skillName

anthropics/skills/pdf → fetches all files from https://github.com/anthropics/skills/tree/main/skills/pdf
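
As a rough sketch of how such a reference maps to a URL: the `skillRefToUrl` helper below is hypothetical, and the `main` branch and `skills/` subfolder are inferred from the single example above, so they may not hold for every repository.

```typescript
// Hypothetical helper: turns "owner/repo/skillName" into a GitHub tree URL.
function skillRefToUrl(ref: string): string {
  const parts = ref.split('/');
  if (parts.length !== 3 || parts.some((part) => part.length === 0)) {
    throw new Error(`Expected owner/repo/skillName, got "${ref}"`);
  }
  const [owner, repo, skillName] = parts;
  // Assumption: skills live under skills/ on the main branch.
  return `https://github.com/${owner}/${repo}/tree/main/skills/${skillName}`;
}

console.log(skillRefToUrl('anthropics/skills/pdf'));
// "https://github.com/anthropics/skills/tree/main/skills/pdf"
```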

API Reference

// Discover skills from filesystem
discoverSkills(options?: DiscoverSkillsOptions): Promise<SkillMetadata[]>

// Fetch complete skill folders from GitHub
fetchSkill(ref: string): Promise<SkillBundle>
fetchSkills(refs: string[]): Promise<Record<string, SkillBundle>>

// SkillBundle type
interface SkillBundle {
  name: string;
  files: Record<string, string>;  // relative path -> content
}

// Generate XML for system prompts
skillsToXml(skills: SkillMetadata[]): string

// Parse a single SKILL.md file
parseSkillMetadata(content: string, skillPath: string): SkillMetadata

Setting Up Agent Environments

For cloud sandboxes (VercelSandbox/E2B), use setupAgentEnvironment to create workspace directories and seed skills.

import {
  setupAgentEnvironment,
  skillsToXml,
  createAgentTools,
  createVercelSandbox
} from 'bashkit';

// Define your environment config
const config = {
  workspace: {
    notes: 'files/notes/',
    outputs: 'files/outputs/',
  },
  skills: {
    'web-research': `---
name: web-research
description: Research topics using web search and save findings.
---
# Web Research
Use WebSearch to find information...
`,
  },
};

// Create sandbox and set up environment
const sandbox = await createVercelSandbox({}); // async factory, must be awaited
const { skills } = await setupAgentEnvironment(sandbox, config);

// Build prompt using the same config (stays in sync!)
const systemPrompt = `You are a research assistant.

**ENVIRONMENT:**
- Save notes to: ${config.workspace.notes}
- Save outputs to: ${config.workspace.outputs}

${skillsToXml(skills)}
`;

// Create tools and run
const { tools } = createAgentTools(sandbox);

const result = await generateText({
  model: anthropic('claude-sonnet-4-5'),
  tools,
  system: systemPrompt,
  messages,
});

What setupAgentEnvironment Does

Creates workspace directories - All paths in config.workspace are created
Seeds skills - Skills in config.skills are written to .skills/ directory
Returns skill metadata - For use with skillsToXml()

Using with Subagents

Use the same config for subagent prompts:

const taskTool = createTaskTool({
  model,
  tools,
  subagentTypes: {
    researcher: {
      systemPrompt: `You are a researcher.
Save findings to: ${config.workspace.notes}`,
      tools: ['WebSearch', 'Write'],
    },
    'report-writer': {
      systemPrompt: `Read from: ${config.workspace.notes}
Save reports to: ${config.workspace.outputs}`,
      tools: ['Read', 'Glob', 'Write'],
    },
  },
});

Sandbox Interface

bashkit uses a bring-your-own-sandbox architecture. You can implement custom sandboxes:

interface Sandbox {
  exec(command: string, options?: ExecOptions): Promise<ExecResult>;
  readFile(path: string): Promise<string>;
  writeFile(path: string, content: string): Promise<void>;
  readDir(path: string): Promise<string[]>;
  fileExists(path: string): Promise<boolean>;
  isDirectory(path: string): Promise<boolean>;
  destroy(): Promise<void>;

  // Optional: Sandbox ID for reconnection (cloud providers only)
  readonly id?: string;

  // Path to ripgrep binary (set by ensureSandboxTools)
  rgPath?: string;
}

The id property is available on cloud sandboxes (E2B, Vercel) after creation. Use it to persist the sandbox ID and reconnect later.

The rgPath property is set by ensureSandboxTools() (called automatically during sandbox creation). It points to the ripgrep binary for the Grep tool. Supports x86_64 and ARM64 architectures.

Custom Sandbox Example

import type { Sandbox } from 'bashkit';

class DockerSandbox implements Sandbox {
  // Your implementation
  async exec(command: string) { /* ... */ }
  async readFile(path: string) { /* ... */ }
  // ... other methods
}

const sandbox = new DockerSandbox();
const { tools } = createAgentTools(sandbox);

Architecture

┌─────────────────────────────────────┐
│   Your Next.js App / Script         │
│                                     │
│   ┌─────────────────────────────┐   │
│   │  Vercel AI SDK              │   │
│   │  (streamText/generateText)  │   │
│   └──────────┬──────────────────┘   │
│              │                      │
│   ┌──────────▼──────────────────┐   │
│   │  bashkit Tools              │   │
│   │  (Bash, Read, Write, etc)   │   │
│   └──────────┬──────────────────┘   │
│              │                      │
│   ┌──────────▼──────────────────┐   │
│   │  Sandbox                    │   │
│   │  (Local/Vercel/E2B/Custom)  │   │
│   └─────────────────────────────┘   │
└─────────────────────────────────────┘

Flow:

User sends prompt to AI via Vercel AI SDK
AI decides it needs to use a tool (e.g., create a file)
Tool receives the call and executes via the Sandbox
Result returns to AI, which continues or completes

Design Principles

Bring Your Own Sandbox: Start with LocalSandbox for dev, swap in VercelSandbox/E2BSandbox for production
Type-Safe: Full TypeScript support with proper type inference
Configurable: Security controls and limits at the tool level
Vercel AI SDK Native: Uses standard tool() format
Composable: Mix and match tools, utilities, and middleware as needed

Examples

See the examples/ directory for complete working examples:

basic.ts - Full example with todos, sub-agents, and prompt caching
test-tools.ts - Testing individual tools
test-web-tools.ts - Web search and fetch examples

API Reference

`createAgentTools(sandbox, config?)`

Creates a set of agent tools bound to a sandbox instance.

Parameters:

sandbox (Sandbox): Sandbox instance for code execution
config (AgentConfig, optional): Configuration for tools and web tools

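Returns: An object whose tools property holds tool definitions compatible with the Vercel AI SDK, alongside related state such as planModeState (see the Configuration example)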

Sandbox Factories

createLocalSandbox(config?) - Local execution sandbox (sync)
createVercelSandbox(config?) - Vercel Firecracker sandbox (async, auto-installs ripgrep)
createE2BSandbox(config?) - E2B cloud sandbox (async, auto-installs ripgrep)
ensureSandboxTools(sandbox) - Manually setup tools (called automatically by default)

Workflow Tools

createTaskTool(config) - Spawn sub-agents for complex tasks
createTodoWriteTool(state, config?, onUpdate?) - Track task progress

Optional Tools (also available via config)

createAskUserTool(config?) - Emit a deferred AskUser tool call for the client
createEnterPlanModeTool(state) - Enter planning/exploration mode
createExitPlanModeTool(state, onPlanSubmit?) - Exit planning mode with a plan
createSkillTool(skills) - Execute loaded skills

Utilities

compactConversation(messages, config, state) - Summarize long conversations
getContextStatus(messages, maxTokens, config?) - Monitor context usage
pruneMessagesByTokens(messages, config?) - Remove old messages
estimateMessagesTokens(messages) - Estimate token count

Skills

discoverSkills(options?) - Discover skills from filesystem (metadata only)
skillsToXml(skills) - Generate XML for system prompts
parseSkillMetadata(content, path) - Parse a SKILL.md file

Setup

setupAgentEnvironment(sandbox, config) - Set up workspace directories and seed skills

Middleware

anthropicPromptCacheMiddleware - Enable prompt caching for Anthropic models (AI SDK v6+)
anthropicPromptCacheMiddlewareV2 - Enable prompt caching for Anthropic models (AI SDK v5)

Context Layer

buildSystemContext(sandbox, config?) - Assemble instructions + environment + tool guidance into a system prompt
discoverInstructions(sandbox, config?) - Walk up from cwd loading AGENTS.md / CLAUDE.md files
collectEnvironment(sandbox, config?) / formatEnvironment(env) - Capture and format cwd/shell/platform/git state
buildToolGuidance(config) - Generate one-line hints for registered tools
withContext(tool, name, layers) / applyContextLayers(tools, layers) - Wrap tools with gate + transform layers
createExecutionPolicy(planModeState, config?) - Plan-mode + custom gate ContextLayer
createOutputPolicy(config?) - Truncation + redirection hints + optional disk stash ContextLayer
createPrepareStep(config) - Compose compaction + context-status + plan-mode hints into an AI SDK PrepareStepFunction

Future Roadmap

The following features are planned for future releases:

Agent Profiles Loader

Load pre-configured subagent types from JSON/TypeScript configs:

// .bashkit/agents.json
{
  "subagentTypes": {
    "research": {
      "systemPrompt": "You are a research specialist...",
      "tools": ["Read", "Grep", "Glob", "WebSearch"]
    },
    "coding": {
      "systemPrompt": "You are a coding expert...",
      "tools": ["Read", "Write", "Edit", "Bash"]
    }
  }
}

Helper function to auto-load profiles:

import { createTaskToolWithProfiles } from 'bashkit';

const taskTool = createTaskToolWithProfiles({
  model,
  tools,
  profilesPath: '.bashkit/agents.json', // Auto-loads
});

This will make it easy to:

Share agent configurations across projects
Standardize agent patterns within teams
Quickly set up specialized agents for different tasks

Contributing

Contributions welcome! Please open an issue or PR.

License

MIT
