Proposal to add a 'version' value into the message object to support a more granular versioning #1068

Open
IsmaelMartinez opened this issue Sep 23, 2024 · 22 comments
Labels
💭 Strawman (RFC 0) RFC Stage 0 (See CONTRIBUTING.md)

Comments

@IsmaelMartinez

Introduction

This proposal is to add a version field into the message object. This is to allow a more granular versioning.

Problem Statement

Currently, versioning in AsyncAPI is done at the application level. This inherently implies that all messages are of the same version, or that events do not hold a version at all. This approach works when an application fully controls its specifications/messages, with no cross references from other applications. OpenAPI specs are an example.

However, in environments with continuous development and/or fast iterations, application-level versioning loses its benefit. Versioning becomes more important at the message level. Without granular versioning, any change in the application version applies to all messages it sends and receives, which can be problematic as shown in the example.

Example

Payment-service (v1.0.0) receives [billing-agreement-created...], sends [payment-success].
Comms-service (v1.0.0) receives [payment-success, payment-failed, billing-agreement-created, address-changed..] sends [communication-send...]

All the events received and sent are then version 1.0.0.

If the communication-send message needs a new field (e.g., adding SMS as a type), the comms-service would become v1.1.0, and all its messages would then be version 1.1.0.

This change would require updating all other AsyncAPI files for payment-service and others, as the comms-service would be otherwise consuming message versions no-one is producing.

Proposed Solution

The proposal is to add an optional version on the message object level to allow for granular versioning.

This in turn will provide more visibility into what has changed, as an application version can change without modifying any of the underlying messages it sends and/or receives.

Example outcome

With the proposed solution, we can fix the message version in the comms-service for messages that haven't changed:

  • Comms-service (v1.1.0) receives [payment-success (v1.0.0), payment-failed (v1.0.0), billing-agreement-created (v1.0.0), address-changed (v1.0.0)..] sends [communication-send (v1.1.0)...]
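
A rough sketch of the difference between the two approaches, using plain Python dictionaries rather than real AsyncAPI documents. The message names come from the example above; the helper functions and the data layout are made up purely for illustration.

```python
# Hypothetical model: the messages the comms-service sends or receives.
COMMS_MESSAGES = [
    "payment-success",
    "payment-failed",
    "billing-agreement-created",
    "address-changed",
    "communication-send",
]


def app_level_versions(messages, app_version):
    """Every message inherits the application version."""
    return {name: app_version for name in messages}


def message_level_versions(messages, base_version, overrides):
    """Each message keeps its own version; only overridden ones change."""
    return {name: overrides.get(name, base_version) for name in messages}


# Adding SMS support changes only communication-send.
app_level = app_level_versions(COMMS_MESSAGES, "1.1.0")
per_message = message_level_versions(
    COMMS_MESSAGES, "1.0.0", {"communication-send": "1.1.0"}
)

# Messages whose version moved even though their content did not.
spurious = [
    name for name in COMMS_MESSAGES if app_level[name] != per_message[name]
]
print(spurious)
```

With application-level versioning, the four received messages appear to have changed; with message-level versioning, only communication-send does.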

Alternatives

  • Using a new message for each version as exemplified in https://eventstack.tech/posts/versioning-is-easy . The downside is that you still need to increase the application version to publish the new version of the application, which in turn means all other messages increase their version.
  • Using message traits might be able to provide a version, but it feels like more of a "weak" link.
  • Using the specification extensions, like x-version or x-message-version, can also give us a version to use. While this might work for some extensions, it would make conflicting standardisation likely, with 2 or 3 ways competing to be the standard version.

NOTE: Not sure if this relates in part to #432 . I made it a different proposal as it seems granular enough.

@IsmaelMartinez IsmaelMartinez added the 💭 Strawman (RFC 0) RFC Stage 0 (See CONTRIBUTING.md) label Sep 23, 2024

@fmvilas
Member

fmvilas commented Sep 23, 2024

I'm all in with this proposal. Just want to leave a note on the following:

This inherently implies that all messages are of the same version, or that events do not hold a version at all.

I think it's more like the latter. The version of the application has nothing to do with the version of the messages. This surely can be clarified.

@IsmaelMartinez
Author

Sure, in my case the implementation of this in EventCatalog has assumed that events have a version, which has created the problem for me.

I agree the spec doesn't imply events have a version. Do you want to edit my message/proposal ? Not sure how these contributions work in that respect, and whether the proposal gets edited or not. Thanks a lot!

Also, if this gets approved, I am happy to make the changes in the parser, spec files, etc. It shouldn't be a big job (famous last words)...

@smoya
Member

smoya commented Sep 23, 2024

I +1 this. I believe it is a good addition @IsmaelMartinez 💯

@fmvilas
Member

fmvilas commented Sep 23, 2024

Do you want to edit my message/proposal ?

No, I think that's fine. Just wanted to clarify it 👍

@jonaslagoni
Member

jonaslagoni commented Sep 24, 2024

Sounds like a good proposal yea 👍 Maybe make global version optional 🤔

@IsmaelMartinez
Author

IsmaelMartinez commented Sep 24, 2024

Sounds like a good proposal yea 👍 Maybe make global version optional 🤔

Can that be for another RFC ? Otherwise changes can grow arms, legs and tentacles ;)

There doesn't seem to be any opposition to the matter. Reading the CONTRIBUTING.md, I understand there are a few promotion steps that need to happen and then a few changes to the parser and other projects.

I would be happy to start making those PRs just to have them ready for when that promotion happens. Just let me know if I am jumping the gun. Ta

@derberg
Member

derberg commented Oct 1, 2024

Good to see some movement in the spec proposals, thanks @IsmaelMartinez

I saw your PR and have some questions/requests - I put these here in the issue as for now the PR is very small without much detail.

There is an old and similar issue, not sure if you had a chance to see it: #697

It is schema versioning related - but seems to be the same topic, just a different level of defining version.

If the communication-send message needs a new field (e.g., adding SMS as a type), the comms-service would become v1.1.0, and all its messages would then be version 1.1.0.

Application level version is application-related and does not imply the version change on the message level.
Have a look at the description of the issue I linked. It is not always the way you described. You may have an app that sends multiple messages, that differ a bit, and in the majority of cases, each version goes to a different channel.

With the above, we already end up with the option to have a version maybe on the message level, maybe on the schema level, but maybe also on the channel level. Have a look for example at below case study (channel names in the example AsyncAPI document) and notice how Adeo also specifies the version in the name (address in v3) of the channel.

https://www.asyncapi.com/casestudies/adeogroup

To complicate things even more, you can see that a version of the schema, as in the case of CloudEvents, may also be defined at the schema level because integration systems require it in the payload -> https://www.asyncapi.com/casestudies/adeogroup#versioning-of-schemas

With all this EDA complexity that we have to deal with, and different patterns around - does it make sense to somehow arbitrarily say that versioning should be defined on the message level? Of course, I might be missing something and have wrong assumptions - sorry for that.

Sorry but just wanted to drop all these questions to make sure we clarify some things first.

Also, for me a must-have for this proposal is some real examples of AsyncAPI documents with this new version flag. If you could take some time and show some of your real AsyncAPI documents and how they would benefit from this new flag - anonymize them to the maximum if needed. What is the benefit? Is it for docs readers, so that they can see a kind of drop-down and switch between message definitions in different versions? But messages are usually shown in the context of an operation. Would you have a case where one operation might receive two different message versions, like success-v1 or success-v2?

This is what travelling is for, to sit at the airport for a few hours and find time to interact in GitHub issues 😄

Thanks again a lot for the proposal 🙏🏼 🙏🏼 🙏🏼 and please do not feel discouraged about all these questions 🙏🏼

@IsmaelMartinez
Author

Hi @derberg , Thanks for the long reply and apologies I didn't find the related issue.

I think we have different use cases in there, but I would suggest that this change focuses only on the message part. Currently there isn't a standard way to specify a version for a message in AsyncAPI, which means there is divergence in how people implement versioning.

Application level version is application-related and does not imply the version change on the message level.

My description applies to EventCatalog, but as you have highlighted, it seems to be a broader issue that other people are tackling differently. In EventCatalog they have accepted the application version as the version of all events. This is incorrect and has driven me into this quest to try to bring some sense to the madness.

From my understanding, adding an optional version at the messageObject level can help standardise this area. I don't think there is a standard way to do this in OpenAPI, JSONSchema, Avro, CloudEvents, etc. OpenAPI has a similar version for the document/file, but not for the objects. The same applies to CloudEvents, which only versions the spec (I believe it is called specversion).

Reading those Adeo links, they seem to have gone for specifying the version within their Avro messages (so outside of the AsyncAPI specification). While this is OK, and is one way to do it, it is outside of AsyncAPI's remit/power, which would mean having many different ways of doing this.

Regarding the schema spec version (like openapi 3.0.0), I think it is currently OK as it is. We could discuss this as a side quest, but as with channels and others, that would end up being a never-ending discussion. I think the name version is OK as it is within the message object. If we need a spec version in the future, that can be called specVersion or similar.

With all this EDA complexity that we have to deal with, and different patterns around - does it make sense to somehow arbitrarily say that versioning should be defined on the message level? Of course, I might be missing something and have wrong assumptions - sorry for that.

I don't think we are saying that versioning should be defined on the message level, but might be defined on the message level. The current approach, shared by all the specs is "it is not my problem", so I think there is an opportunity to do something about it.

I will get some popcorn and read the linked issue later when I have some time, and see if I can put together some examples (I suspect I will probably be drinking whisky by the end of it). I do appreciate the feedback and understand this is a complex and controversial topic, but I also think it is extremely important and, again, a plus to have.

Happy to set up a call so we can discuss this a bit further. Sometimes writing is not the best way to move things forward.

Again, thanks a lot for the comments and links, and have a safe journey!

@IsmaelMartinez
Author

Just read the other issue and it seems the consensus was going towards a similar approach (but using a metadata object, so using the data/metadata pattern). Happy going that direction, but that would probably mean moving other parts into it.

@GreenRover
Collaborator

In most cases it is good practice to use the tolerant reader pattern. I am assuming you are referring to semantic-versioning-styled versions.

The minor and patch level versions are only interesting for debugging use cases, because extra fields can safely be ignored by the tolerant reader / receiver.

But major changes have to be rolled out in steps so as not to bind the life cycles of receiver and sender together. This means either the sender has to send v1 and v2, or the receiver has to listen on both and process what is sent.

Therefore, software written in typed languages needs to find the major version before parsing the message. This is why the major version has to be part of the channel / topic name or a header field (depending on the messaging solution / transport system used).
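
As a minimal sketch of that idea: the receiver reads the major version from a header before parsing, then hands the body to a tolerant handler for that major version. The header name `x-message-version`, the handler functions and the payload fields are all assumptions made for illustration; they are not part of AsyncAPI or of any particular broker.

```python
import json


def handle_v1(payload):
    # Tolerant reader: pick only the fields this handler needs, ignore the rest.
    return {"amount": payload["amount"]}


def handle_v2(payload):
    # Assumed breaking change for illustration: amount became value + currency.
    return {"amount": payload["value"], "currency": payload["currency"]}


# One handler per supported major version.
HANDLERS = {"1": handle_v1, "2": handle_v2}


def receive(headers, body):
    """Pick a handler from the major version before interpreting the payload."""
    major = headers["x-message-version"].split(".")[0]
    handler = HANDLERS.get(major)
    if handler is None:
        raise ValueError(f"unsupported major version: {major}")
    return handler(json.loads(body))


# A 1.3.0 message with an extra field is still handled by the v1 reader.
print(receive({"x-message-version": "1.3.0"}, '{"amount": 10, "note": "extra"}'))
```

Minor and patch changes never reach the dispatch table; only a new major version needs a new handler.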

@IsmaelMartinez
Author

In most cases it is good practice to use the tolerant reader pattern. I am assuming you are referring to semantic-versioning-styled versions.

The minor and patch level versions are only interesting for debugging use cases, because extra fields can safely be ignored by the tolerant reader / receiver.

But major changes have to be rolled out in steps so as not to bind the life cycles of receiver and sender together. This means either the sender has to send v1 and v2, or the receiver has to listen on both and process what is sent.

Therefore, software written in typed languages needs to find the major version before parsing the message. This is why the major version has to be part of the channel / topic name or a header field (depending on the messaging solution / transport system used).

I think that then works well for putting the version at the message level. It is outside of the payload and a property of the message envelope. That should make it easier to locate the versions, for example if you need to parse $ref objects.

Version should not be tied to semver IMO, as I personally only find tracking breaking changes useful in EDA. It should support any version, which is why I think a string should do the job.

Let me put together some examples to show the problems and options this provides.

components:
  messages:
    Banana:
      version: '1'
      name: 'banana'
      payload:
        something: 'yeah'
        ...
    Banana_v2:
      version: '2'
      name: 'banana'
      payload:
        schemaFormat: 'application/schema+json;version=draft-07'
        "$ref": 'BananaSchema-v2.schema.json'
    UniquePlatano:
      payload:
        key: value
        ...

I can see a 'small' caveat/issue with this solution, as the message identifier needs to be different for each version (to avoid clashes). I think that is ok to leave to the user to decide.

It is then possible to have 2 messages with different version by using the combination for name and version, but both aren't required, so this is more of a tool/implementation decision.

To summarise, you can keep what you have now, which doesn't include a version:

components:
  messages:
    Banana_v1_or_wathever_you_want_in_here:
    ...
    Banana_e197238b-f384-4d1f-8b90-aa403a2a384b:
    ...

you can have versions on those messages:

components:
  messages:
    Banana_v1_e197238b-f384-4d1f-8b90-aa403a2a3811:
       version: '1.0.0'
       ...

You can even add version and name, so you can know when a message representation is a versioned version of another message

components:
  messages:
    Banana_v1:
       version: '1.0.0'
       name: 'banana'
    Banana_e197238b-f384-4d1f-8b90-aa403a2a384b:
       version: '2.0.0'
       name: 'banana'
       ...

But it is true that adding only the version just gives you the version of the message; it doesn't tell you much about the message itself or its relationship with other messages.

What do you think lovely people? I am probably missing a lot of caveats and happy to learn about them! Happy to be completely wrong also, so do shout!

@derberg
Member

derberg commented Oct 2, 2024

yeah, you have Banana at version 1 and Banana_v2 at version 2 but these can be 2 different version of totally different bananas, and of course, someone can do Banana_v2 at version 1 and then what does that even mean. This is why I'm asking what is the benefit of that flag.

Also, regarding the related issue of schema versioning - this is why in this other discussion it was proposed to have versioning on schema level. This way you see that one message has multiple schema versions, and because they are part of one message, it is clear what is the relation.

The problem with versioning in the schema level only is that then you lose access to all the other message-level flags that are useful, like for example deprecated or even examples.

@IsmaelMartinez
Author

I feel it goes better in the message object rather than the schema. Mainly because it would be more difficult to apply to external schemas. Like when importing JSONSchema, OpenAPI, Avro or similar.

Maybe we can use the version and name?!?. We can require both if a version is provided?!? Use the name as in the blog? I need to think about it.

I understand this is a complex issue. We all do it differently, as we are pragmatic, and have different problems. Happy to explain how we do it, if that helps. The questions I have currently are:

What do we need to support a message that has multiple versions of a schema? Is this the right question? What do you think?

@derberg
Member

derberg commented Oct 3, 2024

What do we need to support a message that has multiple versions of a schema? Is this the right question? What do you think?

but is it the right direction? As I wrote, if we go schema-versioning-way then we lose access to all the other properties of message object, like for example deprecated. I can imagine pretty quickly that someone wants to have multiple versions of a message, and mark one of these as deprecated. With versioning on schema level - not possible.

message level versioning is better, but again, what is the value if you still cannot set a relation between messages

@IsmaelMartinez
Author

message level versioning is better, but again, what is the value if you still cannot set a relation between messages

Why not set that relationship with this change then? Even without the relationship I see it being in a better position, but thanks to this talk I am now thinking more and more about making name required IF a version is provided. That way we get that relationship.

The options I see on this:

  • We enforce providing a name IF a version is provided, but not the other way around. I believe this is just modifying this PR with the dependencies condition:
    "properties": {
      ...
      "version": {
        "type": "string",
        "description": "A version for the message. Requires a name."
      },
      "name": {
        "type": "string",
        "description": "Name of the message."
      },
      ...
    },
    "dependencies": {
      "version": ["name"]
    }
  • We can highlight that combination, but not enforce it. This is basically just changing the description to something like:
    "A version for the message. It can be used in combination with the name to have multiple version of a message"
  • or we can just say that versioning of events/schemas is not part of the specification, and suggest to people they use the x- attributes.

My preference goes in that order for the following reasons.

  • If a user starts using the version, it would be good to directly guide them to use the name also. Note that the name is only required IF a version is provided.
  • It feels a bit clearer, and we are guiding the user (making it easier and standard), but they have plenty of freedom to use this or just continue with their happy lives without versioning.
  • As you said, it allows for providing examples for multiple versions, and deprecating some messages.
  • I think that combination would work for all use cases, but happy to hear otherwise. If there is a use case that clashes with version+name, then option 2 comes into play.
  • I dislike option 3, as that means each tool/solution that needs versioning will do it differently, and that would add unnecessary complexity to already complex systems.
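
To illustrate what the first option would enforce, here is a small hand-rolled sketch of the property-dependency rule (a real validator would apply the JSON Schema `dependencies` keyword, which later drafts renamed `dependentRequired`). The function name `check_dependencies` is hypothetical and not part of any AsyncAPI tooling.

```python
def check_dependencies(obj, dependencies):
    """Return the properties that are missing according to a dependencies map.

    If a trigger key is present in obj, every property listed for it
    must be present too.
    """
    missing = []
    for trigger, required in dependencies.items():
        if trigger in obj:
            missing.extend(p for p in required if p not in obj)
    return missing


# The condition from the first option: version requires name.
DEPENDENCIES = {"version": ["name"]}

print(check_dependencies({"version": "1", "name": "banana"}, DEPENDENCIES))  # []
print(check_dependencies({"version": "1"}, DEPENDENCIES))  # ['name']
print(check_dependencies({"name": "banana"}, DEPENDENCIES))  # []
```

A name without a version stays valid, so existing documents would not be affected.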

What do you all think? I think we are getting somewhere with the conversation, so I do appreciate the time and effort and understand this might take some time.

@chrispatmore

chrispatmore commented Oct 11, 2024

Hi there,

This is a very interesting proposal and discussion. I'm really keen to see where it goes, but it's not completely clear to me exactly what issue we're trying to solve with this. I will try to explain my confusion below (I hope I don't make it worse!).

First to articulate my understanding and context, to make sure I'm not missing anything as much as anything else:

As I understand AsyncAPI documents they're there to describe what an application that implements the document should do. i.e. it should publish x message to y channel, or receive foo message from bar channel etc.

When an application is implemented it will use a version (application version) of the AsyncAPI document for this, either using some generated code or manually. At the point this is done the application is committing to working with a fixed set of schemas from that document. e.g. the application publishes Dog messages following the Dog schema and this schema might have a version.

If there were more than one version of the schema the publish operations would surely (I am assuming here) only use one version (until the application itself is updated, then it might switch to the next version. One application would not publish two messages using different schemas to the same channel). Whereas a receive operation might have to deal with multiple versions of a schema as there could be multiple applications out there of different versions, and those versions may be compatible or not (i.e. if semver 1.1.1, 1.1.2 and 2.0.0)

Also, if we look at OpenAPI, which has parallels here, it's interesting that it also does not attempt to solve this problem for requests / responses. In a single version of an OpenAPI document each path has only one definition for its request and response schemas, and within the lifecycle of the document you are expected not to break those schemas so that older clients don't break. If a breaking change is required, typically either a /v2/api (or similar) or a header is used to allow the client to switch in its own time before the old behaviour is removed.

Finally in my long preamble (sorry) there is the question of documentation via the AsyncAPI, this is information that does not impact the implementation of the application but is useful meta that can be used to give information to a developer or architect etc. who wants to understand what this application does (description comes to mind)

Given I have bored you to death with talking to myself, onto my confusion:

It seems to me from the issue description that this is really a problem of documentation via the AsyncAPI document (please help me understand if I am wrong). From the use case being described, it seems you are looking to show people what different versions of messages are flowing around the architecture and how they vary, and that this cannot sensibly be done just using the application version, which is true.

Additionally, I don't see how (from the application developer's perspective) knowing the version of the message helps me implement the application. If I am doing a publish, I will just use whatever singular schema I have been given in the document for that message for that operation. If I am doing a receive, I will ensure I handle potentially multiple versions on that channel, where each message could really be two schema versions of the same thing, which I can do like:

channels:
  lightingMeasured:
    address: 'smartylighting.streetlights.1.0.event.{streetlightId}.lighting.measured'
    messages:
      lightMeasuredv1:
        $ref: '#/components/messages/lightMeasuredv1'
      lightMeasuredv2:
        $ref: '#/components/messages/lightMeasuredv2'
operations:
  receiveLightMeasurement:
    action: receive
    channel:
      $ref: '#/channels/lightingMeasured'
    messages:
      - $ref: '#/channels/lightingMeasured/messages/lightMeasuredv1'
      - $ref: '#/channels/lightingMeasured/messages/lightMeasuredv2'
components:
  messages:
    lightMeasuredv1:
      name: lightMeasuredv1
      title: Light measured
      contentType: application/json
      payload:
        $ref: '#/components/schemas/lightMeasuredPayloadv1'
    lightMeasuredv2:
      name: lightMeasuredv2
      title: Light measured
      contentType: application/json
      payload:
        $ref: '#/components/schemas/lightMeasuredPayloadv2'

So it's not completely clear to me what we are discussing is really changing / adding / fixing. And I would really appreciate it if you could help me understand. Looking forward to it!

Also, again, this is a really great discussion

@IsmaelMartinez
Author

IsmaelMartinez commented Oct 12, 2024

Hi @chrispatmore,

Just to make sure we are all on the same page, the current proposal has evolved to:

Allow for an optional version field that will enforce providing a name when present (that would be my preferred option currently)

Thanks for the long reply and taking the time to read through the messages.

Looking first at the section where you articulate your understanding

With regards to the application version, I see it more as a document spec/contract version. It has nothing to do with my application version, as otherwise I would need to modify it multiple times a day, making it more noisy than useful.

Moving on to publishing multiple versions of a message, I don't think we can mandate that publishers only publish one version. We do incremental rollouts, meaning we move from version A to version B slowly, so a publisher might be publishing versions A and B to facilitate that transition. Sometimes it is easier to do it in the consumers (allow them to consume both), but sometimes it makes sense for the producers to do some of the job.

For OpenAPI, I believe you can use oneOf to provide multiple responses (or accept multiple requests), so you can handle versioning that way if you wish. I think for them it would have been more difficult to standardise this, as there were already various options for versioning. That means consumer/processing applications need to work with x type or y type of versioning, making it pretty difficult to work with all (if any).

I do see this as useful for automating documentation, but also for helping with the implementation, as having a version can help avoid or understand problems that a version change can introduce.

Now trying to help you understand where I am thinking

From my point of view, I think it would be invaluable to include a version in the messages to give that context to the consumers/processors of the specification. Using your example, a consumer of that asyncapi specification needs to know that the version format is <messageName>v<version> and is in the messageObject key.

For a human this is fine as we know that we have 2 versions of lightMeasured (v1 and v2). For libraries consuming/processing the spec, they will see lightMeasuredv1 and lightMeasuredv2 as different messages that have nothing to do with one another.

Updating your example with the proposal, it should look like this:

channels:
  lightingMeasured:
    address: 'smartylighting.streetlights.1.0.event.{streetlightId}.lighting.measured'
    messages:
      lightMeasuredv1:
        $ref: '#/components/messages/lightMeasuredv1'
      lightMeasuredv2:
        $ref: '#/components/messages/lightMeasuredv2'
operations:
  receiveLightMeasurement:
    action: receive
    channel:
      $ref: '#/channels/lightingMeasured'
    messages:
      - $ref: '#/channels/lightingMeasured/messages/lightMeasuredv1'
      - $ref: '#/channels/lightingMeasured/messages/lightMeasuredv2'
components:
  messages:
    lightMeasuredv1:
      name: 'lightMeasured'
      version: '1'
      title: Light measured
      contentType: application/json
      payload:
        $ref: '#/components/schemas/lightMeasuredPayloadv1'
    lightMeasuredv2:
      name: 'lightMeasured'
      version: '2'
      title: Light measured
      contentType: application/json
      payload:
        $ref: '#/components/schemas/lightMeasuredPayloadv2'

This will allow applications to understand that both lightMeasuredv1 and lightMeasuredv2 are the same message (with name lightMeasured) but have different versions '1' and '2' (it is a string, so the version can be whatever you want).
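
As a minimal sketch of how a tool could use that, assuming the proposed `version` field exists and the messages have already been loaded into a dictionary (the `versions_by_name` helper is hypothetical, not an existing parser API):

```python
from collections import defaultdict

# Trimmed-down copy of components.messages from the example above.
MESSAGES = {
    "lightMeasuredv1": {"name": "lightMeasured", "version": "1"},
    "lightMeasuredv2": {"name": "lightMeasured", "version": "2"},
}


def versions_by_name(messages):
    """Group message keys by name, indexed by their version string."""
    grouped = defaultdict(dict)
    for key, message in messages.items():
        # Messages without both fields are left out; versioning is optional.
        if "name" in message and "version" in message:
            grouped[message["name"]][message["version"]] = key
    return dict(grouped)


print(versions_by_name(MESSAGES))
# {'lightMeasured': {'1': 'lightMeasuredv1', '2': 'lightMeasuredv2'}}
```

No parsing of the message key is needed, so the key can follow any naming convention.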

This will allow things like:

  • detecting multiple versions of a message
  • checking for the differences between schemas/messages
  • creating changelogs
  • adding rules for automatically detecting breaking changes (version is just a string)
  • having a history of event changes
  • understanding if different versions of messages are going via different channels
  • etc

I am sure there will be other benefits; the list is just a few things that come to mind.
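
For instance, a changelog or breaking-change check between two versions of the same message could be sketched roughly like this. The payload schemas below are invented for illustration (the thread never shows the contents of lightMeasuredPayloadv1 or lightMeasuredPayloadv2), and the breaking-change rule is an assumption, not something the proposal defines.

```python
def payload_changes(old_schema, new_schema):
    """Summarise property-level differences between two JSON-Schema-like dicts."""
    old_props = set(old_schema.get("properties", {}))
    new_props = set(new_schema.get("properties", {}))
    old_required = set(old_schema.get("required", []))
    new_required = set(new_schema.get("required", []))
    return {
        "added": sorted(new_props - old_props),
        "removed": sorted(old_props - new_props),
        "newly_required": sorted(new_required - old_required),
    }


def is_breaking(changes):
    # Assumed rule: removing a field or requiring a new one is a breaking change.
    return bool(changes["removed"] or changes["newly_required"])


# Hypothetical payloads for versions '1' and '2' of the same message.
PAYLOAD_V1 = {
    "properties": {"lumens": {"type": "integer"}, "sentAt": {"type": "string"}},
    "required": ["lumens"],
}
PAYLOAD_V2 = {
    "properties": {
        "lumens": {"type": "integer"},
        "sentAt": {"type": "string"},
        "unit": {"type": "string"},
    },
    "required": ["lumens", "unit"],
}

changes = payload_changes(PAYLOAD_V1, PAYLOAD_V2)
print(changes, is_breaking(changes))
```

The comparison only becomes possible once a tool knows that the two payloads belong to the same message, which is what name plus version would provide.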

I think it would be incredibly useful, and should help standardise that area. Without it, it is a free-for-all and there isn't really one solution that solves the problem for all.

Consumer applications could use x- to achieve this, or a messageKey format like <message>@<version>, or others.

But that then means each could decide it is best to do it a different way and use different formats or x- keys, which I think would lead us to something like this:

https://www.explainxkcd.com/wiki/images/6/60/standards.png

Again, extremely useful discussion and happy to take it into a call to see if I am missing something from AsyncAPI and/or how others use asyncAPI that might make this more of a problem than a helpful solution.

@chrispatmore

chrispatmore commented Oct 14, 2024

Thanks for the reply @IsmaelMartinez, it's certainly helping me to get a handle on this. It definitely seems as though this is looking to define a new standard for how people should document the versions of their messages using AsyncAPI, not because it cannot be done, but because there are too many competing options and it makes it hard to produce useful tooling. However we definitely want to avoid the problem you have indicated of producing just another unhelpful option.

Therefore, if we want to do this, we need to make sure that what we choose is sensible, suitable and useful in the majority of use cases, accepting that it might not fit everyone's needs or all use cases! Which means I would like to understand the use cases we are trying to tackle a little better.

First can you elaborate on your publishing of multiple versions, I still don't follow the scenario there. As I see it, these are the options you could be indicating, and what I see the problem being (which is why I don't understand the case):

  • 1 publisher publishes 2 messages (1 of each version) to 1 channel
    • This is duplicating data (which may be acceptable in some scenarios)
    • The receiver has to cope with both anyway
  • 1 publisher publishes 2 messages (1 of each version) to 2 channels (1 to each, v1 / v2 channel)
    • This is duplicating data
    • In AsyncAPI spec, this is two different operations, so as far as one operation is concerned this only deals with one message
  • 1 publisher publishes 1 message (of a version determined by some other runtime factor) to 1 channel
    • I think this is out of our hands (or should be) as you can't sensibly document the runtime conditions for choosing one over the other in the AsyncAPI document
  • 2 publishers publish 1 message each (at different versions) to 1 channel
    • I would consider this standard BAU, each could have been built off different revisions (avoiding version here, but a bit on that below) of the whole document with only one operation
  • 2 publishers publish 1 message each (at different versions) to 2 channels (1 to each, v1 / v2 channel)
    • This is a more separated version of the above

w.r.t version (application version) the doc states:

Provides the version of the application API (not to be confused with the specification version).

Therefore I think we should assume that most people use it as stated, as the version of their application API, and that if you are changing the API a lot on a daily basis, and publishing those updates every time, then that version will change a lot. I don't think that's a bad thing, just the mark of an active API. It would be interesting to understand the scenarios where the message version changes would / would not impact the application version in this proposal.

Additionally I still don't see this as impacting the actual implementation of the application (this is NOT a bad thing). I can see how it would have a big impact on tooling, all the other things you listed, and be useful for potentially highlighting problems from a version change. But these (IMO) are all documentation / meta, after the document has been coded there would be no difference in the implementation brought about by the introduction of these changes, this is pretty normal, but I think it's important to be clear on it. e.g. using the above short doc example you updated, the behaviour of the application is defined via the different operations, and the links through to the various channels / messages. The inclusion of the new version field (+ required name) does not change this.

Sorry for being so long winded about everything! just really helps me to get clarity, especially when using message threads. I think a call could be good, but I like to make sure I'm joining the call with sufficient context and understanding, and not using people's valuable time getting that in the call.

EDIT - added NOT as I missed it and it completely changed the meaning! If you read this in the email update (which won't have it) I am sorry

@IsmaelMartinez
Author

Hi @chrispatmore, I am still thinking about this, just been busy lately.

Thanks for clarifying the application API; that makes sense. We tend to only make the version matter when we are stable enough, which tends to remove the use of versioning for most services.

We currently use channels as just a generic object. We don't define much in it as we just use EventBridge, and with AWS-CDK IaC you are already doing the definition part in code.

We use versioning and we do phased deployments, parallel runs, shadow runs and the like. As such, we send messages of different versions in production. Most times this lasts just a few minutes while we are deploying a change, but sometimes it can run for a long period, as we also use it to run multiple test cases in parallel.

As such, having metadata from the event is almost vital for us. I agree it is metadata, like the correlationId, and maybe that approach might be a better one.

Another option is to include an optional metadata object, like correlationId, that holds a regex applied to the message identifier to identify the version. That way people can optionally use it to state that, in lightMeasuredv2, lightMeasured is the name and v2 is the version. Not sure, what do you think?

@chrispatmore

That's alright I know the feeling!

I think after this discussion and clarity my vote would be for a new meta / metadata section in the message object which can contain version information and / or relation information. Much like was being discussed near the end of #697 .

I think that including it in the message key like lightMeasuredv2 is a requirement in order to separate messages and implement the application, but trying to use that string for meta will prove fragile and limiting, and annoying for people who've done it differently.

Additionally, I think not putting it inside a "metadata" section would imply a level of importance / relevance to the application implementation that I think would cause more confusion or bad practice than the problems it would solve. e.g. I can see someone writing / generating an app that's trying to use some weird envvar or logic to conditionally switch the message version being sent, rather than ensuring their consumers are also updated to handle the new changes / making sure the changes are non-breaking etc. Basically, the standard should net improve the situation, not create more space for confusion.

Finally, w.r.t. schema version, you could extend the argument to include the same new section at that point too. Add in a metadata section that someone can optionally use to provide more info in a standardised way if people want it. But there is no requirement to do so. Any implementation based on the version should continue to use fixed name keys like animalSchemav1 for clear mapping.

@IsmaelMartinez
Author

IsmaelMartinez commented Nov 10, 2024

Thanks for taking the time to reply and for your thoughts. I do like the meta/data separation of concerns on events.

I'll try to put together an example of what I understand your proposal would be, assuming you only want to move the version into the metadata field:

(only putting the components section as the other part doesn't change)

components:
  messages:
    lightMeasuredv1:
      name: 'lightMeasured'
      metadata: 
        version: '1'
      title: Light measured
      contentType: application/json
      payload:
        $ref: '#/components/schemas/lightMeasuredPayloadv1'
    lightMeasuredv2:
      name: 'lightMeasured'
      metadata: 
        version: '2'
      title: Light measured
      contentType: application/json
      payload:
        $ref: '#/components/schemas/lightMeasuredPayloadv2'

From my understanding of your proposal, we would still need a way to identify the name/version mapping of a message, but would move the version into a metadata object.

The use of the optional regex would add extra context to the message keys, which people can use as they wish, and would remove the need for separate version and name fields, as both would be implicit in the keys.
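As a rough illustration of that name/version mapping, here is a minimal Python sketch of how tooling might group the messages above. It assumes the `components.messages` section has already been loaded into a plain dict, and that the proposed `metadata.version` field exists (it is only a proposal in this thread, not part of the current spec). The function name `versions_by_name` is made up for the example.

```python
from collections import defaultdict

# Hypothetical in-memory form of the components.messages section above.
# `metadata.version` is the field proposed in this thread, not a spec field.
messages = {
    "lightMeasuredv1": {"name": "lightMeasured", "metadata": {"version": "1"}},
    "lightMeasuredv2": {"name": "lightMeasured", "metadata": {"version": "2"}},
}


def versions_by_name(messages):
    """Group message keys by `name`, indexed by `metadata.version`."""
    grouped = defaultdict(dict)
    for key, message in messages.items():
        version = message.get("metadata", {}).get("version")
        if version is None:
            continue  # messages without a version are skipped in this sketch
        grouped[message["name"]][version] = key
    return dict(grouped)


index = versions_by_name(messages)
print(index)
```

With this shape, the relationship between the two keys comes from the `name` and `metadata.version` values, not from parsing the key strings.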

Putting an example of what I mean:

(only putting the components section as the other part doesn't change much - also I haven't tested the regex)

components:
  messages:
    lightMeasured@v1:
      nameVersionRegEx: '^(?<name>[a-zA-Z]+)@(?<version>v\d+)$'
      contentType: application/json
      payload:
        $ref: '#/components/schemas/lightMeasuredPayloadv1'
    lightMeasured@v2:
      nameVersionRegEx: '^(?<name>[a-zA-Z]+)@(?<version>v\d+)$'
      contentType: application/json
      payload:
        $ref: '#/components/schemas/lightMeasuredPayloadv2'

That solves 3 problems:

  • The version is in the key field, which has to change anyway under any of the other options we have discussed. It is also the most visible part of a message.
  • People can decide on the regex format they want to use. I guess the limit is the size of the key field.
  • The regex would allow us to understand the relationship between events using the name/version mapping.
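To make the last bullet concrete, here is a small, untested-in-production Python sketch of how a tool could apply such a regex to message keys. `nameVersionRegEx` is the hypothetical field from the example above, and `split_key` is a made-up helper name. Note that Python spells named groups `(?P<name>...)`, so the pattern is rewritten accordingly; the `(?<name>...)` form in the example is the JavaScript/.NET spelling.

```python
import re

# The pattern from the example above, in Python's named-group syntax.
NAME_VERSION = re.compile(r"^(?P<name>[a-zA-Z]+)@(?P<version>v\d+)$")


def split_key(key):
    """Return (name, version) for a matching key, or None if it does not match."""
    match = NAME_VERSION.match(key)
    if match is None:
        return None
    return match.group("name"), match.group("version")


# "turnOnOff" is an assumed example of a key that carries no version.
keys = ["lightMeasured@v1", "lightMeasured@v2", "turnOnOff"]
parsed = {key: split_key(key) for key in keys}
print(parsed)
```

Keys that do not match are simply treated as unversioned here. One caveat worth checking against the spec: the Components Object restricts keys to `^[a-zA-Z0-9\.\-_]+$`, so a separator such as `@` may need to be swapped for an allowed character like `.` or `_`.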

While I see the point of having a metadata object and a payload (or data) object, I think this is a bigger conversation to be had that would imply moving some of the other fields into it (like name, correlationId, title, contentType, etc).

I would be happy to contribute to that conversation, but I think it would be wise not to turn this version conversation into a meta(data) conversation, as the scope of the latter is much bigger, and adding a way to determine a version now could help drive that conversation depending on how people use it.


For the record, we do use the metadata object in our events, and have the version in it (and name). For reference, we include the following fields:

  • eventId (aka name) - required
  • eventVersion (aka version) - optional
  • traceId (aka correlationId) - optional
  • timestamp (when the event was generated) - required
  • tenantId - optional
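For illustration only, the field list above could be modelled roughly like this in Python. The field names and their required/optional status come from the list; the string types, the ISO-8601 timestamp in the usage line, and the names `EventMetadata` and `parse_metadata` are assumptions made for the sketch.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class EventMetadata:
    eventId: str                        # aka name, required
    timestamp: str                      # when the event was generated, required
    eventVersion: Optional[str] = None  # aka version, optional
    traceId: Optional[str] = None       # aka correlationId, optional
    tenantId: Optional[str] = None      # optional


def parse_metadata(raw: dict) -> EventMetadata:
    """Build EventMetadata from a dict, rejecting missing required fields."""
    missing = [field for field in ("eventId", "timestamp") if field not in raw]
    if missing:
        raise ValueError(f"missing required metadata fields: {missing}")
    return EventMetadata(
        eventId=raw["eventId"],
        timestamp=raw["timestamp"],
        eventVersion=raw.get("eventVersion"),
        traceId=raw.get("traceId"),
        tenantId=raw.get("tenantId"),
    )


meta = parse_metadata(
    {"eventId": "lightMeasured", "timestamp": "2024-11-10T12:00:00Z", "eventVersion": "2"}
)
print(meta)
```

A consumer could then branch on `meta.eventVersion` during a parallel or shadow run without inspecting the payload.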

What do people think? Do you want to turn this conversation into a broader metadata/payload separation, or keep it focused on how to allow a mapping between event and version? Also, I don't like regex, but I see it as the simplest way to solve that mapping. I am all ears for other options.
