Add --run-in-docker to skill-validator to run Copilot CLI in a docker container #176

Draft
caaavik-msft wants to merge 1 commit into dotnet:main from caaavik-msft:caaavik/docker-copilot-server
@caaavik-msft
Contributor

Summary

This PR adds an optional Docker execution mode to skill-validator so agent runs, judges, and setup commands can execute in an isolated container instead of directly on the host machine.

Motivation

The main use case is local development, but it might also be useful in CI if we want to build on top of it. While building some skills, I found that some weaker models made destructive changes to my host system to accomplish the task (e.g. reinstalling .NET). With this change, agents and judges run inside a container that can access only the files they need, which are bind-mounted from the host machine. This does not add any security measures for network isolation.

Implementation

This makes use of the --headless mode for running copilot as described here: https://github.com/github/copilot-sdk/blob/main/docs/guides/setup/backend-services.md.

Docker mode requires a GITHUB_TOKEN to be present. It is passed into the container so that the Copilot CLI inside it can authenticate to the Copilot API. The README has an example showing how to get this token with gh auth token. If you have multiple gh accounts (e.g. personal and enterprise), you can use gh auth token --user <name> instead.
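For anyone wrapping the validator in a script, the token lookup described above could be sketched as follows. This is a hypothetical helper, not code from this PR: the function name resolve_github_token, the fallback order (environment first, then gh), and the injectable run parameter are all assumptions made for illustration. Only gh auth token and its --user flag come from the description.

```python
import os
import subprocess
from typing import Callable, Mapping, Optional, Sequence


def _run_gh(argv: Sequence[str]) -> str:
    """Run a gh command and return its trimmed stdout (raises on failure)."""
    result = subprocess.run(list(argv), check=True, capture_output=True, text=True)
    return result.stdout.strip()


def resolve_github_token(
    env: Mapping[str, str] = os.environ,
    user: Optional[str] = None,
    run: Callable[[Sequence[str]], str] = _run_gh,
) -> str:
    """Hypothetical helper: prefer GITHUB_TOKEN, else fall back to `gh auth token`."""
    token = env.get("GITHUB_TOKEN", "").strip()
    if token:
        return token

    argv = ["gh", "auth", "token"]
    if user:
        # Select a specific account when several are logged in.
        argv += ["--user", user]

    token = run(argv)
    if not token:
        raise RuntimeError("gh auth token returned an empty token")
    return token
```

The run parameter exists only so the lookup can be exercised without gh installed; a real wrapper would likely call gh directly.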

A Dockerfile is included in the repo to use as the base image:

FROM mcr.microsoft.com/dotnet/sdk:10.0 AS build

ARG COPILOT_SDK_VERSION
RUN dotnet new console -o /tmp/dl \
    && dotnet add /tmp/dl package GitHub.Copilot.SDK --version $COPILOT_SDK_VERSION \
    && dotnet build /tmp/dl -c Release \
    && cp /tmp/dl/bin/Release/net10.0/runtimes/*/native/copilot /usr/local/bin/copilot \
    && chmod +x /usr/local/bin/copilot \
    && rm -rf /tmp/dl

RUN copilot --version

This ensures that we use the exact same Copilot CLI binary that is shipped with the SDK. The SDK version is resolved programmatically inside the SkillValidator so it is kept in sync. It places the copilot binary at /usr/local/bin/copilot inside the container.

To handle path mapping/translation in docker mode, all temp/work directories are placed inside a single directory in the TMP folder, and that entire directory is mounted into the container with read-write access. This makes it easy to translate paths between their host and container equivalents when needed. Skill directories are mounted into the container with read-only access, and only the directories being evaluated are mounted.
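Because everything lives under one mounted root, path translation reduces to swapping a prefix. The sketch below illustrates that idea only; the MountMap class, its method names, and the example paths (/tmp/skill-validator-abc on the host, /work in the container) are hypothetical and not taken from the PR. It also assumes POSIX-style paths on both sides, so Windows hosts would need extra handling.

```python
from pathlib import PurePosixPath


class MountMap:
    """Hypothetical prefix-swap mapping between one host root and one container root."""

    def __init__(self, host_root: str, container_root: str) -> None:
        self.host_root = PurePosixPath(host_root)
        self.container_root = PurePosixPath(container_root)

    def to_container(self, host_path: str) -> str:
        # relative_to raises ValueError if the path is outside the mounted root.
        rel = PurePosixPath(host_path).relative_to(self.host_root)
        return str(self.container_root / rel)

    def to_host(self, container_path: str) -> str:
        rel = PurePosixPath(container_path).relative_to(self.container_root)
        return str(self.host_root / rel)


mounts = MountMap("/tmp/skill-validator-abc", "/work")
inside = mounts.to_container("/tmp/skill-validator-abc/run1/out.txt")
outside = mounts.to_host(inside)
```

Rejecting paths outside the root, rather than passing them through unchanged, makes it obvious when something refers to a host location the container cannot see.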

The container publishes port 4321 on a randomised host port (-p 0:4321), which is resolved afterwards using docker port. The container is always cleaned up after finishing, including on ProcessExit and CancelKeyPress events.
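Resolving the randomised port comes down to reading the host port out of the docker port output. The following is a minimal sketch of that parsing step, not the validator's actual code: parse_docker_port is a hypothetical name, and the sample output lines reflect the usual docker port format (for example 0.0.0.0:49153 and [::]:49153, or 4321/tcp -> 0.0.0.0:49153 when no port argument is given).

```python
def parse_docker_port(output: str) -> int:
    """Return the host port from the first usable line of `docker port` output."""
    for line in output.splitlines():
        line = line.strip()
        if not line:
            continue
        # Drop an optional "4321/tcp -> " prefix, keeping only the host address.
        address = line.split("->")[-1].strip()
        # Split on the last colon so IPv6 forms such as "[::]:49153" also work.
        _, sep, port = address.rpartition(":")
        if sep and port.isdigit():
            return int(port)
    raise ValueError("no host port found in docker port output")


host_port = parse_docker_port("0.0.0.0:49153\n[::]:49153\n")
```

The input would typically be the stdout of docker port <container> 4321, run once the container has started.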

Future Extensibility

I have a proof of concept working locally, which I chose not to push for now to keep this PR simple. It runs each agent in its own container rather than using a single container for all agents and judges. This would reduce the risk of one agent modifying the environment and affecting other evaluations, if that sounds desirable, but it does mean that each agent would use a separate CopilotClient rather than a single shared CopilotClient.
