feat: carry Foundry agent and resource definitions in azure.yaml#8779
Conversation
…dry service target
jongio
left a comment
There was a problem hiding this comment.
Solid architectural change. The decomposition of Foundry resources into individual azure.yaml service entries with uses: ordering is well-designed, and the fallback collectors (sibling service wins, bundled agent config is the legacy fallback) are cleanly implemented with good test coverage. A few observations below.
jongio
left a comment
There was a problem hiding this comment.
Incremental review (22 commits since last pass at 01463ff). CI is green, overall architecture is clean. The new pkg/foundry package is well-designed: proper cycle detection, depth bounds, error typing, and good test coverage. The resource-service split across per-host extensions is consistent and well-factored.
Prior unresolved items (4 comments from my earlier review remain open with no replies):
sanitizeServiceNameonly strips spaces (resource_services.go:193)- Path traversal in
resolveSkillInstructions(skills service_target.go:180) - Silent yaml.Marshal skip (agent_definition.go:348)
- Silent empty connName skip (resource_services.go:87)
I won't re-post these, but they still apply to the current HEAD.
New observations on the pkg/foundry package:
-
ExpandEnvsentinel restoration correctness (templating.go:68-70): The sentinel is guaranteed absent from the input, but env var expansion runs after masking. If an env var resolves to a string containing the sentinel prefix (azdFoundryTemplateSpan_0_),strings.Replacewill match it and inject a Foundry span where it doesn't belong. Extremely unlikely in practice, but the correctness invariant ("Foundry expressions are always returned byte-for-byte") is violated in that case. A simple fix: after expansion, verify each placeholder appears exactly once before replacing, or use a sentinel that encodes a random token per call. -
looksLikeInstructionsPathheuristic (includes.go:262-268): Inline prose that ends with.mdor.txt(e.g., "Follow the steps in README.md") will be incorrectly classified as a file path and rebased. The current check (single-line + suffix) is a reasonable heuristic, but consider also requiring a path separator or./prefix, or at least documenting the false-positive risk so authors of$ref'd files know to avoid ambiguous instructions values. -
Nil map-value guard (resource_services.go:222-229, 255-261, 287-293): The
collectProjectDeployments/collectConnections/collectToolboxescollectors iterate the services map and dereference each value's Host field. If a map ever contains a nil value (unlikely from azd core, but possible from test code or future refactors), this panics. A nil guard insortedServiceswould be cheap insurance.
All three are low-to-medium severity. The overall approach is solid; the design of ResolveFileRefs + YAMLDocument as a read/write pair sharing path logic is thoughtful. Ship it once the prior items are addressed.
jongio
left a comment
There was a problem hiding this comment.
Incremental review (1 commit since last pass at e59e938). The fix commit d6fd4da is clean and well-scoped:
-
Legacy host restored:
microsoft.foundryadded toFoundryServiceHostsfor backward compat. Tests cover mixed-host ambiguity detection. Schemarefcorrectly points to/main/...microsoft.foundry.json. -
Whitespace endpoint trimming:
strings.TrimSpaceapplied consistently infoundryServiceEndpoint,warnNetworkIgnoredInBrownfield, and the synthesizer's brownfield check. Blank endpoints now correctly trigger greenfield mode. Test added. -
Missing brownfield warning:
warnNetworkIgnoredInBrownfieldwas only called on the synthesizer path. Now also called on the on-disk Bicep path whenendpoint:is set. Good catch.
No concerns. CI is green for the core pipelines; agents-public build is still pending (expected timing). LGTM on this increment.
Azure Dev CLI Install InstructionsInstall scriptsMacOS/Linux
bash: pwsh: WindowsPowerShell install MSI install Standalone Binary
MSI
Documentationlearn.microsoft.com documentationtitle: Azure Developer CLI reference
|
jongio
left a comment
There was a problem hiding this comment.
Incremental review (5 new commits since d6fd4da). All five address prior review feedback:
- Path traversal rejection in
resolveSkillInstructionsis correct: checks beforefilepath.Join, covers\and/separators, has test coverage for nested traversal (sub/../../escape.md). - Warnings for empty sanitized names are user-facing on stderr, consistent with
WarnLegacyAgentShape. - Schema docs now accurately reflect that remote URLs aren't supported.
azure.ai.*/schemas/*.jsonglob ensures cross-extension `` resolution works offline.- Marshal failure logging makes the validation skip visible during troubleshooting.
No new issues found in this increment.
| @@ -1,7 +1,7 @@ | |||
| // Copyright (c) Microsoft Corporation. All rights reserved. | |||
There was a problem hiding this comment.
Why are these files moved into the top level azd pkg?
There was a problem hiding this comment.
These files (the ref resolver, the azure.yaml edit helper, and the variable expander) were previously in azure.ai.agents/internal/project/. The problem: the connections, routines, and toolboxes extensions also need ref resolution and variable expansion for their own azure.yaml service entries. Keeping these in agents' internal package would force a circular dependency or require other extensions to import agents internals.
pkg/foundry is the shared-helpers layer accessible to all Foundry extensions without cross-extension coupling.
There was a problem hiding this comment.
These helpers (the $ref resolver, the edit helper, and the ${VAR} expander) used to live in agents'' internal/ package. Connections, routines, and toolboxes now need them too, and Go does not allow importing another extension''s internal/. pkg/foundry gives them one shared home instead of duplicated copies.
| // instead of service-level properties. Tracks migration of older Foundry | ||
| // projects off the legacy shape. | ||
| FoundryAgentLegacyConfigKey = AttributeKey{ | ||
| Key: attribute.Key("foundry.agent.legacy_config_shape"), |
There was a problem hiding this comment.
What actually sets this? Is this something the azd folks are alright with being added to telemetry?
There was a problem hiding this comment.
It's set in the project telemetry collection (cli/azd/cmd/actions.go area) when legacyAgentConfig is detected during project loading. The flag fires when a project still uses the deprecated config: block on its agent service entry instead of the new service-level properties introduced in this PR.
Classification is SystemMetadata / FeatureInsight, which is the standard azd tier for tracking feature-adoption signals (same tier as ProjectServiceHostsKey and ProjectServiceLanguagesKey). It doesn't capture user content, just a boolean indicating the project is on the old shape.
There was a problem hiding this comment.
Small correction: it is set in pkg/project/project.go during Load(), not actions.go. When a project loads with an azure.ai.agent service still using the old config: block, we tag a boolean. No user content, same tier as the existing ProjectServiceHostsKey.
| @@ -0,0 +1,496 @@ | |||
| // Copyright (c) Microsoft Corporation. All rights reserved. | |||
There was a problem hiding this comment.
What is the plan to stop supporting old formats? Adding this file now gives at least a third place where we have agent objects defined, making it all that much more difficult to make sure things are being updated in the right places. Would be good if we have an in place plan to switch completely and remove all of the extra logic
There was a problem hiding this comment.
The upcoming release will continue to support the old formats. We can consider deprecating them in the following version
| @@ -1,6 +1,6 @@ | |||
| { | |||
There was a problem hiding this comment.
Why do we need both these files, and the individual schemas in the other extensions?
There was a problem hiding this comment.
The agents extension bundles peer-extension schemas (Connection.json, Toolbox.json, etc.) so that $ + ref resolution in agent definitions can validate those shapes offline without requiring the connections/toolboxes/skills extensions to be installed at validation time.
The individual extension schemas (e.g., in azure.ai.connections/schemas/) remain the canonical source. These copies in azure.ai.agents/schemas/ are resolved via the azure.ai.*/schemas/*.json glob at build time, enabling cross-extension validation within a single extension's scope.
There was a problem hiding this comment.
They are two different shapes. agents/schemas/Connection.json is a nested item inside the legacy microsoft.foundry host (one entry composing arrays). azure.ai.connections/.../azure.ai.connection.json is the new host: azure.ai.connection service, where each connection is its own service. Both exist because this PR keeps the legacy host working alongside the new per resource hosts. They are not copies; the shapes differ.
Summary
azure.yamlbecomes the single source of truth for a Foundry agent and its companion resources. The agent definition moves off disk into the agent's service entry, and every Foundry resource — the project and its model deployments, connections, toolboxes, skills, and routines — becomes a first-class service with a realazd deploytarget. Existing projects keep working through legacy fallbacks that emit deprecation signals.What changed
azure.yaml. Carried inline on theazure.ai.agentservice entry instead of a separate on-diskagent.yaml.initstops writing that file; every agent command (deploy, run, listen, update, endpoint show) reads from the service entry. Projects on the old on-disk shape still load, with a deprecation warning and a telemetry signal.host: azure.ai.<kind>entries, each owned by its extension.azd deploynow performs a real upsert per resource (previously placeholders);initwrites the entries and wires their provision/deploy order.endpoint:). When a Foundry service setsendpoint:, provisioning is skipped and azd connects to that project instead of creating one.${VAR}from the azd environment at deploy time, while server-side${{...}}placeholders are preserved.azure.yamlJSON schema (stable + alpha) gains per-host validation for all six Foundry hosts; the legacy singlemicrosoft.foundryhost stays accepted so existing projects keep validating.$refresolver, theazure.yamledit helper, and the${VAR}expander move into core (pkg/foundry) so all Foundry extensions share one implementation.Before merge
The
connections,routines, andtoolboxesextensions temporarilyreplacethe azd core dependency with the in-tree checkout to build against the newpkg/foundryhelpers. Remove these directives once the core change lands and bump the published dependency.