Collaborator

@JAORMX JAORMX commented Nov 14, 2025

This proposal enables automatic MCP registry population based on Gateway API resources, making it easy for administrators to expose and discover MCP servers.

Summary

The operator will watch Gateway API HTTPRoute resources (annotated with toolhive.stacklok.dev/registry-export: "true") to discover which MCP servers are externally accessible. It constructs the full endpoint URL by traversing to the parent Gateway resource and creates registry entries in the upstream MCP Registry format.
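
As a rough illustration of the URL construction described above, the following Python sketch builds an endpoint from an HTTPRoute and its parent Gateway, both represented as plain dictionaries. The function name `build_endpoint_url` and the simplifications (first listener, first hostname, first path match, no wildcard handling) are assumptions made for illustration and are not the operator's implementation; the field names follow the Gateway API resource schema.

```python
from typing import Optional


def build_endpoint_url(route: dict, gateway: dict) -> Optional[str]:
    """Build an external URL from an HTTPRoute and its parent Gateway.

    Simplified on purpose: only the first listener, hostname, and path
    match are considered (an assumption of this sketch).
    """
    listener = gateway["spec"]["listeners"][0]
    scheme = "https" if listener["protocol"] == "HTTPS" else "http"

    # Prefer the route's hostname, then the listener's, then a Gateway address.
    candidates = list(route["spec"].get("hostnames") or [])
    if listener.get("hostname"):
        candidates.append(listener["hostname"])
    addresses = gateway.get("status", {}).get("addresses", [])
    candidates.extend(addr["value"] for addr in addresses)
    if not candidates:
        return None  # nothing externally reachable is known yet
    host = candidates[0]

    # Omit the port when it is the default for the scheme.
    port = listener["port"]
    default_port = 443 if scheme == "https" else 80
    netloc = host if port == default_port else f"{host}:{port}"

    # Path of the first match in the first rule; default to "/".
    rules = route["spec"].get("rules") or [{}]
    matches = rules[0].get("matches") or [{}]
    path = matches[0].get("path", {}).get("value", "/")
    return f"{scheme}://{netloc}{path}"
```

A real controller would also need to select the listener referenced by the route's `parentRefs` and decide what to do with wildcard hostnames, which this sketch ignores.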

Key Features

  • Gateway API integration: Watches HTTPRoute resources for annotation-based discovery
  • Explicit opt-in: Only HTTPRoutes with the registry-export annotation are discovered
  • Annotation fallback: Direct URL annotation for non-Gateway API users
  • Rich metadata: Tier, tools, and tags via annotations
  • Upstream format: Uses modelcontextprotocol/registry format with ToolHive extensions
  • Event-based tracking: Kubernetes Events instead of status field bloat

Two Discovery Methods

  1. Primary (Gateway API): Annotate HTTPRoute → operator discovers endpoint
  2. Fallback (Direct URL): Annotate MCP resource → operator uses explicit URL
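
The choice between the two methods can be sketched as a small decision function. This is a minimal illustration, not the operator's code: `discover_endpoint` and the `resolve_route_url` callback are hypothetical names, and it assumes that the presence of the registry-url annotation alone is enough to opt an MCP resource in, which the proposal does not state explicitly.

```python
from typing import Callable, Optional

EXPORT_ANNOTATION = "toolhive.stacklok.dev/registry-export"
URL_ANNOTATION = "toolhive.stacklok.dev/registry-url"


def discover_endpoint(
    kind: str,
    annotations: dict,
    resolve_route_url: Optional[Callable[[], Optional[str]]] = None,
) -> Optional[str]:
    """Return the URL to publish, or None if the resource is not exported."""
    if kind == "HTTPRoute":
        # Primary method: only explicitly annotated routes are considered.
        if annotations.get(EXPORT_ANNOTATION) != "true":
            return None
        # The URL is derived from the route and its parent Gateway.
        return resolve_route_url() if resolve_route_url else None
    # Fallback method: the MCP resource carries the URL directly (assumed
    # here to be sufficient on its own as an opt-in).
    return annotations.get(URL_ANNOTATION)
```

For example, `discover_endpoint("MCPServer", {URL_ANNOTATION: "https://mcp.example.com/github"})` returns the annotated URL unchanged, while an HTTPRoute without the export annotation is skipped.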

Annotations

Discovery control:

  • toolhive.stacklok.dev/registry-export: Enable discovery
  • toolhive.stacklok.dev/registry-url: Direct URL (fallback)
  • toolhive.stacklok.dev/registry-name: Override entry name

Metadata:

  • toolhive.stacklok.dev/registry-tier: "Official" | "Community" | "Partner"
  • toolhive.stacklok.dev/registry-tools: Comma-separated tool names
  • toolhive.stacklok.dev/registry-tags: Comma-separated tags
  • toolhive.stacklok.dev/registry-description: Human-readable description
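
To make the annotation semantics concrete, here is a hedged Python sketch of how these annotations might be parsed into a metadata record. The `RegistryMetadata` type and `parse_metadata` function are hypothetical, and rejecting an unknown tier with an error is an assumption; the proposal does not say how invalid values are handled.

```python
from dataclasses import dataclass, field
from typing import List, Optional

PREFIX = "toolhive.stacklok.dev/registry-"
VALID_TIERS = ("Official", "Community", "Partner")


@dataclass
class RegistryMetadata:
    name: Optional[str] = None
    description: Optional[str] = None
    tier: Optional[str] = None
    tools: List[str] = field(default_factory=list)
    tags: List[str] = field(default_factory=list)


def _split_csv(value: str) -> List[str]:
    """Split a comma-separated annotation value, dropping empty items."""
    return [item.strip() for item in value.split(",") if item.strip()]


def parse_metadata(annotations: dict) -> RegistryMetadata:
    """Collect registry metadata from a resource's annotations."""
    tier = annotations.get(PREFIX + "tier")
    if tier is not None and tier not in VALID_TIERS:
        # Assumption: invalid tiers are rejected rather than ignored.
        raise ValueError(f"unsupported registry tier: {tier!r}")
    return RegistryMetadata(
        name=annotations.get(PREFIX + "name"),
        description=annotations.get(PREFIX + "description"),
        tier=tier,
        tools=_split_csv(annotations.get(PREFIX + "tools", "")),
        tags=_split_csv(annotations.get(PREFIX + "tags", "")),
    )
```

Missing annotations simply leave the corresponding field empty, so a resource that only sets registry-export still yields a valid (if sparse) record.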

Design Decisions

  • Separation of concerns: Ingress config stays separate from MCP resources (follows ToolHive design principles)
  • Explicit control: Administrators choose what to expose via annotations
  • Progressive rollout: Create HTTPRoute, test it, then add annotation when ready
  • Standard format: Uses upstream MCP Registry format for compatibility

Implementation Phases

  1. Phase 1: Direct URL annotation discovery
  2. Phase 2: Gateway API HTTPRoute discovery
  3. Phase 3: Production hardening (cross-namespace validation, cleanup)

🤖 Generated with Claude Code

This proposal enables the ToolHive Kubernetes operator to automatically
populate the MCP registry with externally accessible endpoints for
MCPServer, MCPRemoteProxy, and VirtualMCPServer resources.

Key features:
- Gateway API HTTPRoute annotation-based discovery (primary approach)
- Direct URL annotation fallback (for non-Gateway API users)
- Explicit opt-in model for administrator control
- Upstream MCP registry format with ToolHive publisher extensions
- Kubernetes Events for discovery tracking

The approach maintains clear separation between ingress configuration
and MCP resource definitions, following ToolHive's design principles.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
@JAORMX JAORMX force-pushed the proposal/operator-registry-gateway-api-integration branch from 364577d to 7bf59f7 Compare November 14, 2025 17:57

codecov bot commented Nov 17, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 55.30%. Comparing base (c94268a) to head (7bf59f7).
⚠️ Report is 11 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2591      +/-   ##
==========================================
- Coverage   55.30%   55.30%   -0.01%     
==========================================
  Files         309      309              
  Lines       29129    29129              
==========================================
- Hits        16111    16109       -2     
- Misses      11585    11587       +2     
  Partials     1433     1433              

Member

@rdimitrov rdimitrov left a comment

Overall LGTM but let's make sure we get this reviewed by the others too before going through with the implementation 👍

}
},
"_meta": {
"io.modelcontextprotocol.registry/official": {
Member

This part is probably obsolete: that metadata should only be present when the MCP server comes from the upstream registry.

Collaborator

@dmartinol dmartinol left a comment

Nice proposal, it reminds me of a full MCP gateway now 🤔


**Metadata Annotations** (on HTTPRoute or MCP resource):
4. **`toolhive.stacklok.dev/registry-tier`**: Server classification ("Official", "Community", "Partner")
5. **`toolhive.stacklok.dev/registry-tools`**: Comma-separated list of tool names (e.g., "create_pr,merge_pr,list_issues")
Collaborator

Why should we repeat the tools and tags if the registry already advertises them for the registered server? (For example, the server with the OCI package should already be there, right?)
Maybe the annotations should instead link the MCPServer to the registered server package by carrying the server's name and version?
BTW: any tool configs should of course also be taken into account for filtering and renaming.

Collaborator

Also, please review the possible overlaps with #2105.
For example, how many remotes are we going to advertise in the registry for a single MCPServer? I guess one for in-cluster consumers and, optionally, one for external consumers (this proposal).

2. Extracts the base MCP path from HTTPRoute rules
3. Traverses to parent Gateway resource(s) to get external hostname/IP
4. Constructs the full external endpoint URL
5. Creates an entry in the MCPRegistry pointing to this external endpoint
Collaborator

I think this is the detail we need to hash out. The fallback approach is slightly simpler: by the time the Operator creates the MCP server, it already knows the URL because it's provided as an annotation. The Operator can therefore send a single request to the registry server that publishes information about the newly deployed and running MCP server, including its externally accessible URL.

With the primary approach, I'm not sure of the sequence flow. The Operator will publish information to the registry server when it creates the MCP server, but the externally accessible URL may not be known at that time because the user may create the Gateway API resources later on. That requires a subsequent request to the registry server reporting that an externally accessible URL now exists and should be saved in the database.

I'm not saying this is a problem; I just think we need to iron out the specifics of the requests that will be sent in each scenario. Possibly a sequence diagram would help here? This also needs to be factored into the registry server API design that @blkt has been steering: if we do want to allow subsequent requests, what does the API look like for that?
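
To make the two sequences concrete, here is a small Python sketch that models them against an in-memory stand-in for the registry server. Everything in it is hypothetical: `InMemoryRegistry` and its `upsert` method are invented for illustration and do not reflect the actual or planned registry server API, which is exactly the open question above.

```python
from typing import Dict, List, Optional, Tuple


class InMemoryRegistry:
    """Hypothetical stand-in for the registry server (not its real API)."""

    def __init__(self) -> None:
        self.entries: Dict[str, dict] = {}
        self.requests: List[Tuple[str, Optional[str]]] = []

    def upsert(self, name: str, external_url: Optional[str] = None) -> dict:
        """Create the entry if needed; set the URL only when one is given."""
        self.requests.append((name, external_url))
        entry = self.entries.setdefault(name, {"name": name, "external_url": None})
        if external_url is not None:
            entry["external_url"] = external_url
        return entry


registry = InMemoryRegistry()

# Fallback flow: the URL annotation is present at creation time,
# so a single request carries everything.
registry.upsert("server-a", "https://mcp.example.com/a")

# Primary flow: the first request is sent when the MCP server is created,
# before any Gateway API resources exist.
registry.upsert("server-b")
# Later, once the user creates and annotates an HTTPRoute, a second
# request records the externally accessible URL.
registry.upsert("server-b", "https://mcp.example.com/b")
```

In this model the fallback flow costs one request and the primary flow costs two, with the entry for `server-b` lacking an external URL in between.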


1. Detects the annotated HTTPRoute and validates it's accepted by the Gateway controller
2. Extracts the base MCP path from HTTPRoute rules
3. Traverses to parent Gateway resource(s) to get external hostname/IP
Collaborator

  1. Do we think users will be OK with the Operator being able to read the Gateway API resources and traverse them to build a URL? For MCP-related resources it's perhaps not a problem, but for others it seems a bit "grabby".
  2. Do we need to traverse to the Gateway resource to get the hostname? The hostnames are provided in the HTTPRoute, right? Perhaps we don't need to go any further, to avoid grabbing what we don't need (see point 1). Or are you saying that we may need to traverse because the hostnames are optional in the HTTPRoute and can contain wildcards?
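
For context on the second point, the Gateway API matches HTTPRoute hostnames against the listener hostname, treating a leading `*.` as a suffix match. The sketch below is a simplified Python model of that intersection, with hypothetical function names, to show when the route alone is enough and when the Gateway would have to be consulted; it is not a complete implementation of the specification's matching rules.

```python
from typing import List, Optional


def _covers(pattern: str, host: str) -> bool:
    """True if `host` (exact or wildcard) falls under `pattern`."""
    if pattern.startswith("*."):
        # "*.example.com" covers "a.example.com" but not "example.com".
        return host.endswith(pattern[1:])
    return host == pattern


def effective_hostnames(
    route_hostnames: List[str], listener_hostname: Optional[str]
) -> List[str]:
    """Hostnames a route is reachable on through one listener."""
    if not route_hostnames:
        # Route hostnames are optional: fall back to the listener's.
        return [listener_hostname] if listener_hostname else []
    if listener_hostname is None:
        return list(route_hostnames)
    result = []
    for host in route_hostnames:
        if _covers(listener_hostname, host):
            result.append(host)  # route hostname is the more specific one
        elif _covers(host, listener_hostname):
            result.append(listener_hostname)  # listener is more specific
        # otherwise the route hostname does not match and is ignored
    return result
```

When the route lists concrete hostnames, the result can be computed without the Gateway only if one trusts that the listener accepts them; an empty or wildcard route hostname yields nothing publishable without the listener's value.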

toolhive.stacklok.dev/registry-export: "true"
toolhive.stacklok.dev/registry-name: "github-production"

# Registry metadata
Collaborator

Do we need these extra bits around "Registry metadata"? I'm wondering if we should instead allow registry-specific information to be specified somewhere else. The HTTPRoute is concerned with ingress, so it doesn't feel like the best place to also hold registry entry metadata. The "Discovery control" annotation toolhive.stacklok.dev/registry-export makes more sense here because it denotes that a user wants this HTTPRoute's information to be published to the registry.


**Event Types:**
- `EndpointDiscovered`: When a new endpoint is discovered via HTTPRoute or annotation
- `EndpointUpdated`: When an endpoint URL changes (Gateway address change, HTTPRoute update)
Collaborator

How would it know if the endpoint URL changes? Will it store the endpoint URL somewhere to do comparisons?
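
One possible answer, sketched in Python purely as an illustration: keep the last published URL per registry entry and compare it on each reconcile. The `EndpointTracker` class is hypothetical, and where the previous URL would actually live (the registry entry itself, an annotation, or elsewhere) is left open; only the event names come from the proposal.

```python
from typing import Dict, Optional


class EndpointTracker:
    """Remembers the last published URL per entry (storage is an assumption)."""

    def __init__(self) -> None:
        self._last_published: Dict[str, str] = {}

    def observe(self, name: str, url: str) -> Optional[str]:
        """Return the event type to emit, or None if nothing changed."""
        previous = self._last_published.get(name)
        if previous == url:
            return None
        self._last_published[name] = url
        return "EndpointDiscovered" if previous is None else "EndpointUpdated"
```

An in-memory map like this would not survive an operator restart, so a real design would need the previous value to be recoverable from persistent state.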
