Improve codegen use for headers and QS params in support of multiple scenario beta headers #351
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
The original problem
In the course of investigating beta feature integration, I found that our current strategy to invariantly add the
OpenAI-Betaheader via a Pipeline policy is flawed: operations that expect the header require an exact value match, and e.g. attempting to provideOpenAI-Beta: assistants=v2to realtime orOpenAI-Beta: realtime=v1to vector stores will result in an error.This prompted me to encode the header values in the spec's operation definition directly, which led to a bit of a cascading set of changes along the way.
Common spec representation of operation inputs
Rather than adding a redefinition of the same query string literal in each operation, I defined an alias and spread it into the operation. This worked well, and prompted me to attempt the same for the pagination parameters (limit/order/after/before) repeated many times across Assistants and VectorStores -- we know from newer operations that these are only going to become more common.
Code consolidation of collection options
Simplifying the spec exposed a new problem: without redefinition of the parameter in each and every operation, the emitter only creates the spread type once, seemingly for the last applied instance. So Using
...PageOrderQueryParameteracross operations resulted inListFilesInVectorStoreBatchOrderbeing emitted, but no other instances across all of the affected operations.Rather than copy it, I'd like to propose here that we consolidate.
Prior to this change, we had six virtually identical collection options types accompanied by six virtually identical extensible enums. This is appearing in newer operations and even being applied to older ones like /files, so we're quickly going to arrive at a point where this pattern will bloat our library with dozens of duplicative public types that carry no intrinsic benefit -- it's now abundantly clear that we can count on the same limit/order/after/before pattern to get used again and again.
That said, the consolidated
OpenAIPageOptionsandOpenAIPageOrderare still marked as[Experimental]here, as the only applied usage is currently in those operations. We have a backlog of other things it'd retrofit onto.Use code-generated pipeline messages by letting them be virtual
In the process of adding the headers into Assistants and Vector Stores, I rediscovered that we'd currently been suppressing the code-generated pipeline message creation and redefining everything by hand, instead. I saw no functional deltas and thus removed all the customization to align with the spec (and automatically get the new header!). This worked very well.
Then I saw that it broke Azure. The reason we'd been redefining every single message creation method was likely to make it possible for Azure to override it -- a pattern we were using inconsistently but still taking a dependency on.
To address this, I added a standalone targeted PS script (easily transferred into the PR for plugin use) that makes the emitted pipeline creation methods
internal virtualinstead ofinternal. This allows Azure to do what it needs to.Finally, I modified the internal constructor use of operation subclients to hold a parent client reference (e.g.
CreateBatchFileJobOperationhas a back ref to aVectorStoreClient) instead of the minimalClientPipeline. By doing so, not only are the operation subclients able to reference the message creation methods they need in their codegen destination, but Azure is also able to entirely sidestep needing to define derived operation types -- by overriding the message creation in the parent client, the base library's operation types calling "parent client's message creation method" will automatically do the right thing. Lots of superfluous code dies this way.