diff --git a/docs/features/image-input.md b/docs/features/image-input.md
index aa3bf2f6..79c80d2b 100644
--- a/docs/features/image-input.md
+++ b/docs/features/image-input.md
@@ -1,6 +1,9 @@
 # Image Input
 
-Send images to Copilot sessions by attaching them as file attachments. The runtime reads the file from disk, converts it to base64 internally, and sends it to the LLM as an image content block — no manual encoding required.
+Send images to Copilot sessions as attachments. There are two ways to attach images:
+
+- **File attachment** (`type: "file"`) — provide an absolute path; the runtime reads the file from disk, converts it to base64, and sends it to the LLM.
+- **Blob attachment** (`type: "blob"`) — provide base64-encoded data directly; useful when the image is already in memory (e.g., screenshots, generated images, or data from an API).
 
 ## Overview
 
@@ -25,11 +28,12 @@ sequenceDiagram
 | Concept | Description |
 |---------|-------------|
 | **File attachment** | An attachment with `type: "file"` and an absolute `path` to an image on disk |
-| **Automatic encoding** | The runtime reads the image, converts it to base64, and sends it as an `image_url` block |
+| **Blob attachment** | An attachment with `type: "blob"`, base64-encoded `data`, and a `mimeType` — no disk I/O needed |
+| **Automatic encoding** | For file attachments, the runtime reads the image and converts it to base64 automatically |
 | **Auto-resize** | The runtime automatically resizes or quality-reduces images that exceed model-specific limits |
 | **Vision capability** | The model must have `capabilities.supports.vision = true` to process images |
 
-## Quick Start
+## Quick Start — File Attachment
 
 Attach an image file to any message using the file attachment type. The path must be an absolute path to an image on disk.
 
@@ -215,9 +219,190 @@ await session.SendAsync(new MessageOptions
+## Quick Start — Blob Attachment
+
+When you already have image data in memory (e.g., a screenshot captured by your app, or an image fetched from an API), use a blob attachment to send it directly without writing to disk.
+
+
+Node.js / TypeScript
+
+```typescript
+import { CopilotClient } from "@github/copilot-sdk";
+
+const client = new CopilotClient();
+await client.start();
+
+const session = await client.createSession({
+    model: "gpt-4.1",
+    onPermissionRequest: async () => ({ kind: "approved" }),
+});
+
+const base64ImageData = "..."; // your base64-encoded image
+await session.send({
+    prompt: "Describe what you see in this image",
+    attachments: [
+        {
+            type: "blob",
+            data: base64ImageData,
+            mimeType: "image/png",
+            displayName: "screenshot.png",
+        },
+    ],
+});
+```
+
+
+
+
+Python
+
+```python
+from copilot import CopilotClient
+from copilot.types import PermissionRequestResult
+
+client = CopilotClient()
+await client.start()
+
+session = await client.create_session({
+    "model": "gpt-4.1",
+    "on_permission_request": lambda req, inv: PermissionRequestResult(kind="approved"),
+})
+
+base64_image_data = "..."  # your base64-encoded image
+await session.send({
+    "prompt": "Describe what you see in this image",
+    "attachments": [
+        {
+            "type": "blob",
+            "data": base64_image_data,
+            "mimeType": "image/png",
+            "displayName": "screenshot.png",
+        },
+    ],
+})
+```
+
+
+
+
+Go
+
+
+```go
+package main
+
+import (
+	"context"
+	copilot "github.com/github/copilot-sdk/go"
+)
+
+func main() {
+	ctx := context.Background()
+	client := copilot.NewClient(nil)
+	client.Start(ctx)
+
+	session, _ := client.CreateSession(ctx, &copilot.SessionConfig{
+		Model: "gpt-4.1",
+		OnPermissionRequest: func(req copilot.PermissionRequest, inv copilot.PermissionInvocation) (copilot.PermissionRequestResult, error) {
+			return copilot.PermissionRequestResult{Kind: copilot.PermissionRequestResultKindApproved}, nil
+		},
+	})
+
+	base64ImageData := "..."
+	mimeType := "image/png"
+	displayName := "screenshot.png"
+	session.Send(ctx, copilot.MessageOptions{
+		Prompt: "Describe what you see in this image",
+		Attachments: []copilot.Attachment{
+			{
+				Type: copilot.Blob,
+				Data: &base64ImageData,
+				MIMEType: &mimeType,
+				DisplayName: &displayName,
+			},
+		},
+	})
+}
+```
+
+
+```go
+mimeType := "image/png"
+displayName := "screenshot.png"
+session.Send(ctx, copilot.MessageOptions{
+	Prompt: "Describe what you see in this image",
+	Attachments: []copilot.Attachment{
+		{
+			Type: copilot.Blob,
+			Data: &base64ImageData, // base64-encoded string
+			MIMEType: &mimeType,
+			DisplayName: &displayName,
+		},
+	},
+})
+```
+
+
+
+
+.NET
+
+
+```csharp
+using GitHub.Copilot.SDK;
+
+public static class BlobAttachmentExample
+{
+    public static async Task Main()
+    {
+        await using var client = new CopilotClient();
+        await using var session = await client.CreateSessionAsync(new SessionConfig
+        {
+            Model = "gpt-4.1",
+            OnPermissionRequest = (req, inv) =>
+                Task.FromResult(new PermissionRequestResult { Kind = PermissionRequestResultKind.Approved }),
+        });
+
+        var base64ImageData = "...";
+        await session.SendAsync(new MessageOptions
+        {
+            Prompt = "Describe what you see in this image",
+            Attachments = new List<UserMessageDataAttachmentsItem>
+            {
+                new UserMessageDataAttachmentsItemBlob
+                {
+                    Data = base64ImageData,
+                    MimeType = "image/png",
+                    DisplayName = "screenshot.png",
+                },
+            },
+        });
+    }
+}
+```
+
+```csharp
+await session.SendAsync(new MessageOptions
+{
+    Prompt = "Describe what you see in this image",
+    Attachments = new List<UserMessageDataAttachmentsItem>
+    {
+        new UserMessageDataAttachmentsItemBlob
+        {
+            Data = base64ImageData,
+            MimeType = "image/png",
+            DisplayName = "screenshot.png",
+        },
+    },
+});
+```
+
+
+
 ## Supported Formats
 
-Supported image formats include JPG, PNG, GIF, and other common image types. The runtime reads the image from disk and converts it as needed before sending to the LLM. Use PNG or JPEG for best results, as these are the most widely supported formats.
+Supported image formats include JPG, PNG, GIF, and other common image types. For file attachments, the runtime reads the image from disk and converts it as needed. For blob attachments, you provide the base64 data and MIME type directly. Use PNG or JPEG for best results, as these are the most widely supported formats.
 
 The model's `capabilities.limits.vision.supported_media_types` field lists the exact MIME types it accepts.
 
@@ -283,10 +468,10 @@ These image blocks appear in `tool.execution_complete` event results. See the [S
 |-----|---------|
 | **Use PNG or JPEG directly** | Avoids conversion overhead — these are sent to the LLM as-is |
 | **Keep images reasonably sized** | Large images may be quality-reduced, which can lose important details |
-| **Use absolute paths** | The runtime reads files from disk; relative paths may not resolve correctly |
-| **Check vision support first** | Sending images to a non-vision model wastes tokens on the file path without visual understanding |
-| **Multiple images are supported** | Attach several file attachments in one message, up to the model's `max_prompt_images` limit |
-| **Images are not base64 in your code** | You provide a file path — the runtime handles encoding, resizing, and format conversion |
+| **Use absolute paths for file attachments** | The runtime reads files from disk; relative paths may not resolve correctly |
+| **Use blob attachments for in-memory data** | When you already have base64 data (e.g., screenshots, API responses), blob avoids unnecessary disk I/O |
+| **Check vision support first** | Sending images to a non-vision model wastes tokens without visual understanding |
+| **Multiple images are supported** | Attach several attachments in one message, up to the model's `max_prompt_images` limit |
 | **SVG is not supported** | SVG files are text-based and excluded from image processing |
 
 ## See Also
diff --git a/docs/features/streaming-events.md b/docs/features/streaming-events.md
index 81b27f80..d03ed95f 100644
--- a/docs/features/streaming-events.md
+++ b/docs/features/streaming-events.md
@@ -639,7 +639,7 @@ The user sent a message. Recorded for the session timeline.
 |------------|------|----------|-------------|
 | `content` | `string` | ✅ | The user's message text |
 | `transformedContent` | `string` | | Transformed version after preprocessing |
-| `attachments` | `Attachment[]` | | File, directory, selection, or GitHub reference attachments |
+| `attachments` | `Attachment[]` | | File, directory, selection, blob, or GitHub reference attachments |
 | `source` | `string` | | Message source identifier |
 | `agentMode` | `string` | | Agent mode: `"interactive"`, `"plan"`, `"autopilot"`, or `"shell"` |
 | `interactionId` | `string` | | CAPI interaction ID |
diff --git a/dotnet/README.md b/dotnet/README.md
index bdb3e8da..c5b3857b 100644
--- a/dotnet/README.md
+++ b/dotnet/README.md
@@ -265,21 +265,35 @@ session.On(evt =>
 
 ## Image Support
 
-The SDK supports image attachments via the `Attachments` parameter. You can attach images by providing their file path:
+The SDK supports image attachments via the `Attachments` parameter. You can attach images by providing their file path, or by passing base64-encoded data directly using a blob attachment:
 
 ```csharp
+// File attachment — runtime reads from disk
 await session.SendAsync(new MessageOptions
 {
     Prompt = "What's in this image?",
     Attachments = new List<UserMessageDataAttachmentsItem>
     {
-        new UserMessageDataAttachmentsItem
+        new UserMessageDataAttachmentsItemFile
         {
-            Type = UserMessageDataAttachmentsItemType.File,
             Path = "/path/to/image.jpg"
         }
     }
 });
+
+// Blob attachment — provide base64 data directly
+await session.SendAsync(new MessageOptions
+{
+    Prompt = "What's in this image?",
+    Attachments = new List<UserMessageDataAttachmentsItem>
+    {
+        new UserMessageDataAttachmentsItemBlob
+        {
+            Data = base64ImageData,
+            MimeType = "image/png",
+        }
+    }
+});
 ```
 
 Supported image formats include JPG, PNG, GIF, and other common image types. The agent's `view` tool can also read images directly from the filesystem, so you can also ask questions like:
diff --git a/dotnet/src/Generated/SessionEvents.cs b/dotnet/src/Generated/SessionEvents.cs
index c497038c..c64428e8 100644
--- a/dotnet/src/Generated/SessionEvents.cs
+++ b/dotnet/src/Generated/SessionEvents.cs
@@ -1870,6 +1870,22 @@ public partial class UserMessageDataAttachmentsItemGithubReference : UserMessage
     public required string Url { get; set; }
 }
 
+public partial class UserMessageDataAttachmentsItemBlob : UserMessageDataAttachmentsItem
+{
+    [JsonIgnore]
+    public override string Type => "blob";
+
+    [JsonPropertyName("data")]
+    public required string Data { get; set; }
+
+    [JsonPropertyName("mimeType")]
+    public required string MimeType { get; set; }
+
+    [JsonIgnore(Condition = JsonIgnoreCondition.WhenWritingNull)]
+    [JsonPropertyName("displayName")]
+    public string? DisplayName { get; set; }
+}
+
 [JsonPolymorphic(
     TypeDiscriminatorPropertyName = "type",
     UnknownDerivedTypeHandling = JsonUnknownDerivedTypeHandling.FallBackToBaseType)]
@@ -1877,6 +1893,7 @@ public partial class UserMessageDataAttachmentsItemGithubReference : UserMessage
 [JsonDerivedType(typeof(UserMessageDataAttachmentsItemDirectory), "directory")]
 [JsonDerivedType(typeof(UserMessageDataAttachmentsItemSelection), "selection")]
 [JsonDerivedType(typeof(UserMessageDataAttachmentsItemGithubReference), "github_reference")]
+[JsonDerivedType(typeof(UserMessageDataAttachmentsItemBlob), "blob")]
 public partial class UserMessageDataAttachmentsItem
 {
     [JsonPropertyName("type")]
@@ -2365,6 +2382,7 @@ public enum PermissionCompletedDataResultKind
 [JsonSerializable(typeof(UserInputRequestedEvent))]
 [JsonSerializable(typeof(UserMessageData))]
 [JsonSerializable(typeof(UserMessageDataAttachmentsItem))]
+[JsonSerializable(typeof(UserMessageDataAttachmentsItemBlob))]
 [JsonSerializable(typeof(UserMessageDataAttachmentsItemDirectory))]
 [JsonSerializable(typeof(UserMessageDataAttachmentsItemDirectoryLineRange))]
 [JsonSerializable(typeof(UserMessageDataAttachmentsItemFile))]
diff --git a/go/README.md b/go/README.md
index 4cc73398..6bccdbb1 100644
--- a/go/README.md
+++ b/go/README.md
@@ -178,9 +178,10 @@ Event types: `SessionLifecycleCreated`, `SessionLifecycleDeleted`, `SessionLifec
 
 ## Image Support
 
-The SDK supports image attachments via the `Attachments` field in `MessageOptions`. You can attach images by providing their file path:
+The SDK supports image attachments via the `Attachments` field in `MessageOptions`. You can attach images by providing their file path, or by passing base64-encoded data directly using a blob attachment:
 
 ```go
+// File attachment — runtime reads from disk
 _, err = session.Send(context.Background(), copilot.MessageOptions{
 	Prompt: "What's in this image?",
 	Attachments: []copilot.Attachment{
@@ -190,6 +191,19 @@ _, err = session.Send(context.Background(), copilot.MessageOptions{
 		},
 	},
 })
+
+// Blob attachment — provide base64 data directly
+mimeType := "image/png"
+_, err = session.Send(context.Background(), copilot.MessageOptions{
+	Prompt: "What's in this image?",
+	Attachments: []copilot.Attachment{
+		{
+			Type: copilot.Blob,
+			Data: &base64ImageData,
+			MIMEType: &mimeType,
+		},
+	},
+})
 ```
 
 Supported image formats include JPG, PNG, GIF, and other common image types. The agent's `view` tool can also read images directly from the filesystem, so you can also ask questions like:
diff --git a/go/generated_session_events.go b/go/generated_session_events.go
index 86f5066f..dd70282d 100644
--- a/go/generated_session_events.go
+++ b/go/generated_session_events.go
@@ -475,6 +475,10 @@ type Attachment struct {
 	Title *string `json:"title,omitempty"`
 	// URL to the referenced item on GitHub
 	URL *string `json:"url,omitempty"`
+	// Base64-encoded content
+	Data *string `json:"data,omitempty"`
+	// MIME type of the inline data
+	MIMEType *string `json:"mimeType,omitempty"`
 }
 
 // Optional line range to scope the attachment to a specific section of the file
@@ -854,6 +858,7 @@ const (
 type AttachmentType string
 
 const (
+	Blob AttachmentType = "blob"
 	Directory AttachmentType = "directory"
 	File AttachmentType = "file"
 	GithubReference AttachmentType = "github_reference"
diff --git a/nodejs/README.md b/nodejs/README.md
index 78a535b7..8b1c585d 100644
--- a/nodejs/README.md
+++ b/nodejs/README.md
@@ -297,9 +297,10 @@ See `SessionEvent` type in the source for full details.
 
 ## Image Support
 
-The SDK supports image attachments via the `attachments` parameter. You can attach images by providing their file path:
+The SDK supports image attachments via the `attachments` parameter. You can attach images by providing their file path, or by passing base64-encoded data directly using a blob attachment:
 
 ```typescript
+// File attachment — runtime reads from disk
 await session.send({
     prompt: "What's in this image?",
     attachments: [
@@ -309,6 +310,18 @@ await session.send({
         },
     ],
 });
+
+// Blob attachment — provide base64 data directly
+await session.send({
+    prompt: "What's in this image?",
+    attachments: [
+        {
+            type: "blob",
+            data: base64ImageData,
+            mimeType: "image/png",
+        },
+    ],
+});
 ```
 
 Supported image formats include JPG, PNG, GIF, and other common image types. The agent's `view` tool can also read images directly from the filesystem, so you can also ask questions like:
diff --git a/nodejs/src/generated/session-events.ts b/nodejs/src/generated/session-events.ts
index cf87e102..67ada797 100644
--- a/nodejs/src/generated/session-events.ts
+++ b/nodejs/src/generated/session-events.ts
@@ -994,6 +994,24 @@ export type SessionEvent =
            */
           url: string;
         }
+      | {
+          /**
+           * Attachment type discriminator
+           */
+          type: "blob";
+          /**
+           * Base64-encoded content
+           */
+          data: string;
+          /**
+           * MIME type of the inline data
+           */
+          mimeType: string;
+          /**
+           * User-facing display name for the attachment
+           */
+          displayName?: string;
+        }
     )[];
     /**
      * Origin of this message, used for timeline filtering (e.g., "skill-pdf" for skill-injected messages that should be hidden from the user)
diff --git a/nodejs/src/types.ts b/nodejs/src/types.ts
index acda50fe..188d7539 100644
--- a/nodejs/src/types.ts
+++ b/nodejs/src/types.ts
@@ -836,7 +836,7 @@ export interface MessageOptions {
     prompt: string;
 
     /**
-     * File, directory, or selection attachments
+     * File, directory, selection, or blob attachments
      */
     attachments?: Array<
         | {
@@ -859,6 +859,12 @@ export interface MessageOptions {
               };
               text?: string;
           }
+        | {
+              type: "blob";
+              data: string;
+              mimeType: string;
+              displayName?: string;
+          }
     >;
 
     /**
diff --git a/python/README.md b/python/README.md
index 5b87bb04..65b606ef 100644
--- a/python/README.md
+++ b/python/README.md
@@ -234,9 +234,10 @@ async def edit_file(params: EditFileParams) -> str:
 
 ## Image Support
 
-The SDK supports image attachments via the `attachments` parameter. You can attach images by providing their file path:
+The SDK supports image attachments via the `attachments` parameter. You can attach images by providing their file path, or by passing base64-encoded data directly using a blob attachment:
 
 ```python
+# File attachment — runtime reads from disk
 await session.send({
     "prompt": "What's in this image?",
     "attachments": [
@@ -246,6 +247,18 @@ await session.send({
         }
     ]
 })
+
+# Blob attachment — provide base64 data directly
+await session.send({
+    "prompt": "What's in this image?",
+    "attachments": [
+        {
+            "type": "blob",
+            "data": base64_image_data,
+            "mimeType": "image/png",
+        }
+    ]
+})
 ```
 
 Supported image formats include JPG, PNG, GIF, and other common image types. The agent's `view` tool can also read images directly from the filesystem, so you can also ask questions like:
diff --git a/python/copilot/__init__.py b/python/copilot/__init__.py
index f5f7ed0b..937a4ef5 100644
--- a/python/copilot/__init__.py
+++ b/python/copilot/__init__.py
@@ -8,9 +8,13 @@ from .session import CopilotSession
 from .tools import define_tool
 from .types import (
+    Attachment,
     AzureProviderOptions,
+    BlobAttachment,
     ConnectionState,
     CustomAgentConfig,
+    DirectoryAttachment,
+    FileAttachment,
     GetAuthStatusResponse,
     GetStatusResponse,
     MCPLocalServerConfig,
@@ -27,6 +31,7 @@
     PingResponse,
     ProviderConfig,
     ResumeSessionConfig,
+    SelectionAttachment,
     SessionConfig,
     SessionContext,
     SessionEvent,
@@ -42,11 +47,15 @@
 __version__ = "0.1.0"
 
 __all__ = [
+    "Attachment",
     "AzureProviderOptions",
+    "BlobAttachment",
     "CopilotClient",
     "CopilotSession",
     "ConnectionState",
     "CustomAgentConfig",
+    "DirectoryAttachment",
+    "FileAttachment",
     "GetAuthStatusResponse",
     "GetStatusResponse",
     "MCPLocalServerConfig",
@@ -63,6 +72,7 @@
     "PingResponse",
     "ProviderConfig",
     "ResumeSessionConfig",
+    "SelectionAttachment",
     "SessionConfig",
     "SessionContext",
     "SessionEvent",
diff --git a/python/copilot/generated/session_events.py b/python/copilot/generated/session_events.py
index 1b442530..4c6caf19 100644
--- a/python/copilot/generated/session_events.py
+++ b/python/copilot/generated/session_events.py
@@ -186,6 +186,7 @@ def to_dict(self) -> dict:
 
 
 class AttachmentType(Enum):
+    BLOB = "blob"
     DIRECTORY = "directory"
     FILE = "file"
     GITHUB_REFERENCE = "github_reference"
@@ -232,6 +233,12 @@ class Attachment:
     url: str | None = None
     """URL to the referenced item on GitHub"""
 
+    data: str | None = None
+    """Base64-encoded content"""
+
+    mime_type: str | None = None
+    """MIME type of the inline data"""
+
     @staticmethod
     def from_dict(obj: Any) -> 'Attachment':
         assert isinstance(obj, dict)
@@ -247,7 +254,9 @@ def from_dict(obj: Any) -> 'Attachment':
         state = from_union([from_str, from_none], obj.get("state"))
         title = from_union([from_str, from_none], obj.get("title"))
         url = from_union([from_str, from_none], obj.get("url"))
-        return Attachment(type, display_name, line_range, path, file_path, selection, text, number, reference_type, state, title, url)
+        data = from_union([from_str, from_none], obj.get("data"))
+        mime_type = from_union([from_str, from_none], obj.get("mimeType"))
+        return Attachment(type, display_name, line_range, path, file_path, selection, text, number, reference_type, state, title, url, data, mime_type)
 
     def to_dict(self) -> dict:
         result: dict = {}
@@ -274,6 +283,10 @@ def to_dict(self) -> dict:
             result["title"] = from_union([from_str, from_none], self.title)
         if self.url is not None:
             result["url"] = from_union([from_str, from_none], self.url)
+        if self.data is not None:
+            result["data"] = from_union([from_str, from_none], self.data)
+        if self.mime_type is not None:
+            result["mimeType"] = from_union([from_str, from_none], self.mime_type)
         return result
 
 
diff --git a/python/copilot/types.py b/python/copilot/types.py
index f094666c..15ef05c2 100644
--- a/python/copilot/types.py
+++ b/python/copilot/types.py
@@ -65,8 +65,19 @@ class SelectionAttachment(TypedDict):
     text: NotRequired[str]
 
 
+class BlobAttachment(TypedDict):
+    """Inline base64-encoded content attachment (e.g. images)."""
+
+    type: Literal["blob"]
+    data: str
+    """Base64-encoded content"""
+    mimeType: str
+    """MIME type of the inline data"""
+    displayName: NotRequired[str]
+
+
 # Attachment type - union of all attachment types
-Attachment = FileAttachment | DirectoryAttachment | SelectionAttachment
+Attachment = FileAttachment | DirectoryAttachment | SelectionAttachment | BlobAttachment
 
 
 # Options for creating a CopilotClient
diff --git a/test/scenarios/prompts/attachments/README.md b/test/scenarios/prompts/attachments/README.md
index 8c8239b2..d61a26e5 100644
--- a/test/scenarios/prompts/attachments/README.md
+++ b/test/scenarios/prompts/attachments/README.md
@@ -11,19 +11,36 @@ Demonstrates sending **file attachments** alongside a prompt using the Copilot S
 
 ## Attachment Format
 
+### File Attachment
+
 | Field | Value | Description |
 |-------|-------|-------------|
 | `type` | `"file"` | Indicates a local file attachment |
 | `path` | Absolute path to file | The SDK reads and sends the file content to the model |
 
+### Blob Attachment
+
+| Field | Value | Description |
+|-------|-------|-------------|
+| `type` | `"blob"` | Indicates an inline data attachment |
+| `data` | Base64-encoded string | The file content encoded as base64 |
+| `mimeType` | MIME type string | The MIME type of the data (e.g., `"image/png"`) |
+| `displayName` | *(optional)* string | User-facing display name for the attachment |
+
 ### Language-Specific Usage
 
-| Language | Attachment Syntax |
-|----------|------------------|
+| Language | File Attachment Syntax |
+|----------|------------------------|
 | TypeScript | `attachments: [{ type: "file", path: sampleFile }]` |
 | Python | `"attachments": [{"type": "file", "path": sample_file}]` |
 | Go | `Attachments: []copilot.Attachment{{Type: "file", Path: sampleFile}}` |
 
+| Language | Blob Attachment Syntax |
+|----------|------------------------|
+| TypeScript | `attachments: [{ type: "blob", data: base64Data, mimeType: "image/png" }]` |
+| Python | `"attachments": [{"type": "blob", "data": base64_data, "mimeType": "image/png"}]` |
+| Go | `Attachments: []copilot.Attachment{{Type: copilot.Blob, Data: &data, MIMEType: &mime}}` |
+
 ## Sample Data
 
 The `sample-data.txt` file contains basic project metadata used as the attachment target: