Sequential (Streaming) media types and link to registry #4518

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed · wants to merge 2 commits
125 changes: 125 additions & 0 deletions src/oas.md
Some examples of possible media type definitions:

```
application/vnd.github.v3.patch
```

#### Media Type Registry

While the [Schema Object](#schema-object) is designed to describe and validate JSON, several other media types are commonly used in APIs.
Requirements regarding support for other media types are documented in this Media Types section and in several Object sections later in this specification.
For convenience and future extensibility, these are cataloged in the OpenAPI Initiative's [Media Type Registry](https://spec.openapis.org/registry/media-type/), which indicates where in this specification the relevant requirements can be found.

#### Sequential Media Types

Several media types exist to transport a sequence of values, separated by some delimiter, either as a single document or as multiple documents representing chunks of a logical stream.
Depending on the media type, the values could either be in another existing format such as JSON, or in a custom format specific to the sequential media type.

Implementations MUST support modeling sequential media types with the [Schema Object](#schema-object) by treating the sequence as an array with the same items and ordering as the sequence.
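As a non-normative sketch of this requirement (Python assumed; the function name is illustrative, not part of any specification), a JSON Lines document might be converted to the array form used with the Schema Object like this:

```python
import json

def jsonl_to_array(document: str) -> list:
    """Parse a JSON Lines document into a Python list, preserving the
    order of the entries, so the result can be validated as an array."""
    return [json.loads(line) for line in document.splitlines() if line.strip()]

doc = '{"id": 1}\n{"id": 2}\n'
print(jsonl_to_array(doc))  # -> [{'id': 1}, {'id': 2}]
```

An implementation would then validate the resulting list against the Schema Object exactly as it would validate any other JSON array.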
**Contributor** commented:
This wording is confusing to me, and doesn't seem to reflect the requirement that the Schema Object modeling the sequence must itself be of type: array.

**Member Author** replied:

There is no requirement that the Schema Object include type: array, although it would be a good practice.

What we're talking about here is not so much what to put in the Schema Object, but what data structure to convert the document to in order to use it with the Schema Object.

Implementations don't get that from the Schema Object; they get it from these requirements, so it would be an error on the implementation's part to pass anything but an array here. Of course, it's good practice to include type: array, and if you have other tools that depend on the type keyword and aren't paying attention to the media type with which the Schema Object is used, then you have to include it. But there's no requirement for it to be in the Schema Object.

**Member Author** replied:

@duncanbeevers my most recent commit (after a force-push that was a re-base of the unchanged original commit) added some clarification here, please see if that helps!

This requirement applies to the in-memory data structure corresponding to a sequential media type document, and does not change the behavior or restrict the capabilities of the Schema Object itself.

##### Working With Indefinite-Length Streams

In addition to regular document-style use, sequential media types can be used to represent some portion of a stream that may not have a well-defined beginning or end.
In such use cases, either the client or server makes a decision to work with one or more elements in the sequence at a time, but this subsequence is not a complete array in the sense of normal JSON arrays.

OpenAPI Description authors are responsible for avoiding JSON Schema keywords that rely on a beginning (for relative positioning) or an ending (to determine whether a threshold has been reached or a limit exceeded) when the sequence is intended to represent a subsequence of a larger stream; such keywords include `prefixItems`, `minItems`, `maxItems`, `contains`, `minContains`, and `maxContains`.
If such keywords are used, their behavior remains well-defined but may be counter-intuitive for users that expect them to apply to the stream as a whole rather than each subsequence as it is processed.
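As a hypothetical illustration (not taken from the specification), consider `maxItems: 2` while a stream is processed in chunks: each chunk is validated as its own complete array, so the limit never applies to the stream as a whole.

```python
def satisfies_max_items(chunk: list, max_items: int) -> bool:
    """Check the maxItems constraint against one chunk, which is how a
    validator sees a subsequence: as an ordinary, complete array."""
    return len(chunk) <= max_items

# A stream delivered as two chunks of two items each:
chunks = [[1, 2], [3, 4]]
# Every chunk satisfies maxItems: 2 ...
print(all(satisfies_max_items(c, 2) for c in chunks))  # True
# ... even though the stream as a whole carries four items.
print(sum(len(c) for c in chunks))  # 4
```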
**@ThomasRooney** commented on Apr 2, 2025:

I personally wonder whether this trade-off is worth the slight confusion. The modelling of JSONL/SSE in OpenAPI that I've personally seen has always been for an indefinite-length stream, and I feel it might be a bit confusing for OAS authors and tool vendors to represent those as a `type: array`.

An alternative modelling is to have the schema model purely the JSON within the stream, and to validate the schema against each entry.

```yaml
paths:
  /users/export:
    get:
      tags:
        - Users
      summary: Export user data in JSONL format
      description: >
        This endpoint returns user data in JSONL format, with each line containing a complete user record.
        This format is ideal for large datasets that need to be processed one record at a time.
      responses:
        '200':
          description: User data in JSONL format
          content:
            application/jsonl:
              schema:
                $ref: '#/components/schemas/User'
        '400':
          description: Invalid request
        '500':
          description: Internal server error
components:
  schemas:
    User:
      type: object
      required: [id, name, email]
      properties:
        id:
          type: string
          format: uuid
          description: Unique identifier for the user
        name:
          type: string
          description: User's full name
        email:
          type: string
          format: email
          description: User's email address
        age:
          type: integer
          description: User's age
        city:
          type: string
          description: User's city of residence
```

This approach has a few advantages for both JSONL and SSE. For JSONL, it:

  1. Matches the majority (I think all?) of examples I've come across in the wild from internal APIs.
  2. Is slightly simpler for tooling vendors to reason about.

**Note:** At Speakeasy, one of the client SDK generators, we convert the schema into a native type in each language, with `application/jsonl` indicating purely the serialization/deserialization layer and the wrapping of the operation into some kind of `Stream<T>` response (where `T` is the subschema) in an SDK method. I.e. since `type: array` isn't directly exposed to users of an SDK, going with the proposed modelling we'd need to "unwrap"/special-case schemas at the top level, as those impact the `Stream` rather than the JSON within the stream. It might become similarly "messy" to implement for other vendors such as API gateways and documentation vendors.

An alternative modelling that supports/indicates a finite-length JSONL response (note: we haven't actually seen any such APIs yet, but my variant proposal otherwise closes the door on them) could be to represent that information within a new entry under the Media Type Object, perhaps following the example set by the Encoding Object:

```yaml
paths:
  /users/export:
    get:
      tags:
        - Users
      summary: Export user data in JSONL format
      description: >
        This endpoint returns user data in JSONL format, with each line containing a complete user record.
        This format is ideal for large datasets that need to be processed one record at a time.
      responses:
        '200':
          description: User data in JSONL format
          content:
            application/jsonl:
              stream: # applicable for streaming media types only
                maxItems: 2
              schema:
                $ref: '#/components/schemas/User'
        '400':
          description: Invalid request
        '500':
          description: Internal server error
```

For SSE, there are also advantages. Consider the special fields `data`, `id`, and `event` defined by the `text/event-stream` media type. It's commonly modelled with something like this:

```yaml
paths:
  /stock-updates:
    get:
      tags:
        - ServerSentEvents
      summary: Subscribe to real-time stock market updates
      description: >
       This endpoint streams real-time stock updates to the client using server-sent events (SSE).
       The client must establish a persistent HTTP connection to receive updates.
      responses:
        '200':
          description: Stream of real-time stock updates
          content:
            text/event-stream:
              schema:
                $ref: '#/components/schemas/StockStream'
        '400':
          description: Invalid request
        '500':
          description: Internal server error
components:
  schemas:
    StockStream:
      type: object
      description: A server-sent event containing stock market update content
      required: [id, event, data]
      properties:
        id:
          type: string
          description: Unique identifier for the stock update event
        event:
          type: string
          const: stock_update
          description: Event type
        data:
          $ref: '#/components/schemas/StockUpdate'

    StockUpdate:
      type: object
      properties:
        symbol:
          type: string
          description: Stock ticker symbol
        price:
          type: string
          description: Current stock price
          example: "100.25"
```

By continuing to represent the stream this way, we could open the door to richer modelling of the top level properties to also fit into the "encoding" object.

E.g. consider the "sentinel" event: something popularised by the AI/LLM APIs, which send `[DONE]` as the last SSE data chunk. By avoiding wrapping the stream in `type: array`, we could enable the description of these media-type-specific entries in a standardized way through `encoding`, which would gracefully degrade if a tooling vendor doesn't understand the syntax, because it's highly localized rather than "tainting" the JSON Schema in the response body:

```yaml
paths:
  /stock-updates:
    get:
      tags:
        - ServerSentEvents
      summary: Subscribe to real-time stock market updates
      description: >
       This endpoint streams real-time stock updates to the client using server-sent events (SSE).
       The client must establish a persistent HTTP connection to receive updates.
      responses:
        '200':
          description: Stream of real-time stock updates
          content:
            text/event-stream:
              encoding:
                event:
                  sentinel: '[DONE]'
              stream:
                maxItems: 10
              schema:
                $ref: '#/components/schemas/StockStream'
        '400':
          description: Invalid request
        '500':
          description: Internal server error
```

By modelling it as type: array, it feels to me like we'd close the door on additional modelling of the top level fields outside of JSON Schema or extensions associated with the media type.

**Member Author** replied:

@ThomasRooney first, let me apologize for not tagging you in the original PR comment; I knew I was missing someone!

I'm going to take a while to think through this further, and also tag @gregsdennis who asked about this direction on Slack.

For now, I'll just state a few important principles that are guiding me here:

- We model media types, and not protocols implemented on top of media types. There's nothing wrong with modeling protocols, but it can't be done by repurposing the media type layer. It would need a new mechanism, and that's too big of a change for 3.2, which needs to ship by this summer. Really, that would be better as a companion specification, as it is beyond the current scope of the OAS.
- The challenge here is that there's nothing in any of the JSON media types that says that every entry MUST be in the same format. If that were the case, then yes, the natural modeling would be to just model the single entry type. But we need to work with the media types as written, not as would make them more convenient. text/event-stream will tend to be more uniform, but there's no guarantee that someone won't use it in an unexpected way.
- I feel like you're focusing on the response use case, but there are request use cases where the JSONL being sent is closer to a normal document.
- I'm not that fixated on prefixItems, and in fact I think a more common relevant use would be to use maxItems as a way to limit the chunk size, although I do not know if that is ever actually done.
- The Encoding Object is problematic for far too many reasons to get into here, and is due for a re-think in 3.3.


##### Sequential JSON

For any media type where the items in the sequence are JSON values, no conversion beyond treating the sequence as an array is required.
JSON Text Sequences (`application/json-seq` and the `+json-seq` suffix, [[?RFC7464]]), JSON Lines (`application/jsonl`), and NDJSON (`application/x-ndjson`) are all in this category.
Note that the media types for JSON Lines and NDJSON are not registered with the IANA, but are in common use.

The following example uses `application/json-seq`, but would be identical aside from the media type for `application/jsonl` or `application/x-ndjson`. It models a finite stream consisting of a single metadata document followed by an indefinite number of data documents containing numeric measurements with units:

```YAML
content:
  application/json-seq:
    schema:
      type: array
      prefixItems:
      - $comment: Metadata for all subsequent data documents
        type: object
        required:
        - subject
        - dateCollected
        properties:
          subject:
            type: string
          dateCollected:
            type: string
            format: date-time
      items:
        $comment: A JSON document holding data
        type: object
        required:
        - measurement
        - unit
        properties:
          measurement:
            type: number
          unit:
            type: string
```
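As a rough sketch (Python assumed; error handling omitted), an implementation might convert an `application/json-seq` document to the array form by splitting on the record separator that RFC 7464 places before each JSON text:

```python
import json

RS = "\x1e"  # RFC 7464 record separator preceding each JSON text

def json_seq_to_array(document: str) -> list:
    """Split an application/json-seq document into a list of parsed
    JSON values, preserving their order."""
    return [json.loads(record) for record in document.split(RS) if record.strip()]

doc = RS + '{"measurement": 1.5, "unit": "m"}\n' + RS + '{"measurement": 2.5, "unit": "m"}\n'
print(json_seq_to_array(doc))
# -> [{'measurement': 1.5, 'unit': 'm'}, {'measurement': 2.5, 'unit': 'm'}]
```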

##### Server-Sent Event Streams

The `text/event-stream` media type from the [HTML specification](https://html.spec.whatwg.org/multipage/iana.html#text/event-stream), which is also not IANA-registered, uses a custom named-field format for its items.
Field names can be repeated within an item to allow splitting the value across multiple lines; such split values MUST be treated the same as if they were a single field, with newlines added as required by the `text/event-stream` specification.

Field value types MUST be handled as specified by the `text/event-stream` specification (e.g. the `retry` field value is modeled as a JSON number that is expected to be of JSON Schema `type: integer`), and fields not given an explicit value type MUST be handled as strings.

The `text/event-stream` specification requires that fields with unknown names, as well as `id` fields whose value contains `U+0000 NULL`, be ignored.
These fields SHOULD NOT be present in the data used with the Schema Object.

For example, the following `text/event-stream` document:

```EVENTSTREAM
event: add
data: This data is formatted
data: across two lines
retry: 5

event: add
data: 1234.5678
unknown-field: this is ignored
```

is equivalent to this JSON instance for the purpose of working with the Schema Object:

```JSON
[
  {
    "event": "add",
    "data": "This data is formatted\nacross two lines",
    "retry": 5
  },
  {
    "event": "add",
    "data": "1234.5678"
  }
]
```

Note that `"1234.5678"` is a string, which avoids problems with number sizes and precision.
See [Data Type Format](#data-type-format) for options for handling numbers transported as strings.
Note also the newline inserted in the string in the first entry, and the absence of the field labeled `unknown-field` in the second entry.
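This conversion can be sketched in code (Python assumed; a simplified, non-normative parser that handles only the fields shown, treats comment lines as unknown fields, and ignores both):

```python
def event_stream_to_array(document: str) -> list:
    """Convert a text/event-stream document into the array form used
    with the Schema Object. Repeated fields are joined with newlines,
    retry is coerced to an integer, and unknown fields are dropped."""
    known = {"data", "event", "id", "retry"}
    events, current = [], {}
    for line in document.splitlines() + [""]:
        if line == "":  # a blank line dispatches the accumulated event
            if current:
                events.append(current)
                current = {}
            continue
        name, _, value = line.partition(":")
        if value.startswith(" "):  # a single leading space is stripped
            value = value[1:]
        if name not in known:  # comments (empty name) and unknown fields
            continue
        if name == "retry":
            current[name] = int(value)
        elif name in current:
            current[name] += "\n" + value
        else:
            current[name] = value
    return events
```

Running this over the example document above yields the JSON array shown, including the joined `data` lines and the dropped `unknown-field`.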

The following Schema Object is a generic schema for the `text/event-stream` media type as documented by the HTML specification as of the time of this writing:

```YAML
type: array
items:
  type: object
  required:
  - data
  properties:
    data:
      type: string
    event:
      type: string
    id:
      type: string
    retry:
      type: integer
```

Some users of `text/event-stream` use a format such as JSON for field values, particularly the `data` field.
Use JSON Schema's keywords for working with the [contents of string-encoded data](https://www.ietf.org/archive/id/draft-bhutton-json-schema-validation-01.html#name-a-vocabulary-for-the-conten), particularly `contentMediaType` and `contentSchema`, to describe and validate such fields with more detail than string-related validation keywords such as `pattern` can support.
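For illustration (a non-normative sketch; note that in JSON Schema 2020-12 the content keywords are annotations rather than assertions, so validators do not decode these values automatically), a tool honoring `contentMediaType: application/json` on the `data` field might decode it like this:

```python
import json

def decode_json_content(value: str):
    """Decode a string field whose schema carries
    contentMediaType: application/json."""
    return json.loads(value)

print(decode_json_content("1234.5678"))           # the JSON number 1234.5678
print(decode_json_content('{"symbol": "ACME"}'))  # -> {'symbol': 'ACME'}
```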

### HTTP Status Codes

The HTTP Status Codes are used to indicate the status of the executed operation.