Distributed consistent flag evaluations #396

sebastian-zahrhuber · 2024-09-23T08:39:30Z

sebastian-zahrhuber
Sep 23, 2024

Within my master thesis, I am currently investigating how distributed consistent flag evaluations could be implemented with OpenFeature and therefore, I want to share my ideas with you.

Imagine a service architecture where requests are forwarded to multiple downstream services. In this scenario, it could be important that all services down the call stack evaluate a shared feature flag to the same value, even if it was changed in the meantime.

Distributed consistent flag evaluations with multiple downstream services

To implement this, I see the following options available:

Transporting the evaluated flag value over the request
If the OpenFeature SDKs would support a versioning concept, transporting the feature flag version to evaluate over the request
Using Epoch Markers (see this paper section 3)
Using Fast Control Messages (see this paper section 4)

Since the OpenFeature SDKs do not yet support a versioning concept, the best option currently implementable, in my opinion, is transporting the evaluated flag value via the request down the stream. Therefore, a Baggage header could be used.

With this approach, the service has the responsibility to forward the baggage header and the service needs custom logic to use the baggage header instead of evaluating the feature flag itself.
To avoid having to implement this logic in all services, a further development of the SDK's would be to use this baggage header as a parameter for OpenFeature SDK flag evaluation. The SDK would then decide if a value should be used from the baggage header or evaluated from the provider. This way, the service would not have the responsibility to check the header.

Distributed consistent flag evaluations with asynchronous Execute/Poll pattern

A special extension is an asynchronous execute/poll pattern, where an operation is started with an execute request and the status of the operation can be queried with a poll request. Here too, it can be important that the shared feature flag is evaluated to the same value during the poll request as it was during the execute request.
To ensure distributed consistent flag evaluation in this case as well, I have looked into two possible approaches:

Distributed consistent flag evaluations using a cache for the first service + baggage header for downstream services

For the Execute request, the first service in the call stack fetches all feature flag configurations via OFREP with a given evaluation context. For all the downstream services, the evaluated flag values are forwarded via the Baggage header.
To support the asynchronous Execute/Poll pattern, the evaluated flags can be stored in a key-value store like Redis which lives in the same cluster.
For the Poll request, the first service in the call stack fetches the evaluated flags from the key-value store, uses these flags, and forwards them again via the Baggage header to the downstream services.

This approach would offer best performance since only one OFREP call needs to be made and the only one cache storage/read access is necessary. However, the downside is that it is only possible to store feature flag values within the scope of the shared feature flags, as only one service has access to the cache.

Distributed consistent flag evaluations using a cache for all services

The other option would be to have access to the cache for all services. This raises questions such as the support of multiple scopes, or whether only feature flags of the shared scope or the feature flags of all scopes should be consistent for the execute/poll pattern. Depending on this, one must also be cautious about data leaks on the cache side.
The downside of this approach would be the poorer performance due to multiple accesses to Redis and possibly multiple OFREP calls.

dabeeeenster · 2024-09-25T13:17:13Z

dabeeeenster
Sep 25, 2024
Maintainer

Just to add - we (Flagsmith) use epoch markers as part of our real time streaming infrastructure and it has worked well; its simple, reliable and easy to understand.

0 replies

moredip · 2024-09-25T18:19:18Z

moredip
Sep 25, 2024
Maintainer

Imagine a service architecture where requests are forwarded to multiple downstream services. In this scenario, it could be important that all services down the call stack evaluate a shared feature flag to the same value, even if it was changed in the meantime.

I’ve heard this hypothetical scenario being discussed a few times, but I’ve never seen it happen in the wild. I’m sort of skeptical that you’d really see different distributed services wanting to evaluate the same feature flag.

I’ve definitely seen a service at the top of the stack change a parameter that’s passed down to another service - for example, it checks whether a given user has access to same-day shipping in their market, and changes the parameters to an email template, or something - but IMO you don't want multiple services to be coordinating their behavior via a shared feature flag. It's much better to make that explicit by having the first service evaluate the flag and then pass the relevant parameters directly when calling the downstream service(s)

1 reply

AYM1607 Nov 5, 2024

I’ve heard this hypothetical scenario being discussed a few times, but I’ve never seen it happen in the wild. I’m sort of skeptical that you’d really see different distributed services wanting to evaluate the same feature flag.

FWIW, we (Azure Kubernetes Service) have some use cases for this. I agree with you that ideally the "first" service would make some changes to parameters or the data model which are consumed downstream. However, there are some instances in which treating the feature flags themselves the data model is convenient.

dabeeeenster · 2024-09-25T18:26:27Z

dabeeeenster
Sep 25, 2024
Maintainer

I think sometimes people theoretically over-think this problem too. Rolling server cluster upgrades will cause some traffic to get vN and some to get vN+1 but I've never, ever heard anyone worry or care about that situation.

1 reply

lukas-reining Sep 26, 2024
Maintainer

I think sometimes people theoretically over-think this problem too. Rolling server cluster upgrades will cause some traffic to get vN and some to get vN+1 but I've never, ever heard anyone worry or care about that situation.

There a several situations in which you would have a problem with this.
This is part of why some service meshes have sticky sessions and persistent paths in call chains.
Sometimes there are applications that can just not live with this situation otherwise yet.

lukas-reining · 2024-09-26T11:53:09Z

lukas-reining
Sep 26, 2024
Maintainer

Imagine a service architecture where requests are forwarded to multiple downstream services. In this scenario, it could be important that all services down the call stack evaluate a shared feature flag to the same value, even if it was changed in the meantime.

I’ve heard this hypothetical scenario being discussed a few times, but I’ve never seen it happen in the wild. I’m sort of skeptical that you’d really see different distributed services wanting to evaluate the same feature flag.

I have definitely seen that at a customer @moredip, and thinking of "coordinated feature deployments" I think this is not a bad pattern.
There, we just accepted a really small amount of requests to fail and be retried if the flag changed in between evaluations during the processing chain.

There we just used baggage headers to transport the context to the downstream services.

0 replies

lukas-reining · 2024-09-26T12:49:05Z

lukas-reining
Sep 26, 2024
Maintainer

Distributed consistent flag evaluations with multiple downstream services

@sebastian-zahrhuber I really like the idea and actually we implemented similar ideas in projects.

Transporting the evaluated flag value over the request

This could be a good solution.

In this case, we would have to accept that the set of context values could be different between the one the upstream service used for evaluation and the one used by the service.
In special cases this would mean that the downstream service would have evaluated differently than the upstream one just because of the context, but it would make the behavior really predictable.
To make that clear, we can even add an evaluation reason like UPSTREAM_EVALUATION which would help indicate this.
Compared to epoch markers or flag versions, this could then also be completely handled by the SDK instead of flagging services having to implement something like epochs.

I really appreciate that flag value propagation is completely transparent for the the services up- and downstream. They can evaluate as if they are the only service needing this flag but still benefit from consistent flags throughout the whole call chain.
Also it reduces evaluations and may save some time and network.

Therefore, a Baggage header could be used.

We also used it in several projects to add some transaction specific context coming from the frontend to chained services.

When handling a request, the SDK could just save the evaluated flags in a similar mechanism we use for evaluation context.
Getting it from incoming requests and attaching it to outgoing requests could become relatively complicated, but it is exactly what OTEL is doing so it should be solvable.

I would be really in favor of going this route.
I have seen demand for this and think that baggage is the least complex but still powerful solution for this.

Once we see a tendency that other are in favor of this too, I would love to build an experimental version for the JS SDKs.

(Edit) I see one huge problem: How do we make sure that the baggage is only set by a trusted party? I guess this could be a blocker. If a client can set baggage (which it can always do), and this flag is used for "permission toggles", we can not trust the flag information.
As a solution e.g. we could only accept baggage from other servers that we trust. In this case we would not accept baggage from a frontend. But how can we make sure that the sending upstream service is trusted and not a client without having something like mTLS or so required?
Another way could be to sign the flag values sent to the upstream service when evaluating and send them to it. The downstream services could then simply check the signature of the trusted flagging service by using the flagging services PK. It could actually be in the format of a JWT. But this requires some implementation by the flagging service which would be again a huge downside.

As an addition to this, I would also like to add an optional baggage context, which as said, we actually used in several projects to be able to define a specific context value in the client and use it down the call chain. But this would use the same mechanisms as the one for the flag value propagation.

Distributed consistent flag evaluations with asynchronous Execute/Poll pattern

I see the point but I have seen this problem way less then the other one, my main concern is that most of the proposed solutions either require knowledge about this condition in the first entity of the call chain (Version 1) or a separate cache which I think is hard to spec as OpenFeature. Also it makes this condition opaque to the services/clients.
Version 1 also requires the client to know about the flags that will be used by the downstream services even though only two downstream services need those flags to be evaluated.

9 replies

toddbaert Sep 30, 2024
Maintainer

JWTs?

Relevant: w3c/baggage#12

lukas-reining Sep 30, 2024
Maintainer

I mentioned JWTs up in the long answer @toddbaert :)
Interesting to see that linked issue, I did not find this before!

Another way could be to sign the flag values sent to the upstream service when evaluating and send them to it. The downstream services could then simply check the signature of the trusted flagging service by using the flagging services PK. It could actually be in the format of a JWT. But this requires some implementation by the flagging service which would be again a huge downside.

What do you think of that part from above?

toddbaert Oct 1, 2024
Maintainer

I think signing is the obvious solution, and JWTs would make it easier, but I understand why baggage didn't specify it themselves.

We're in a bit of a different situation than baggage. I think for a specification for this purpose to be useful, it would have to define a structure/payload of some kind. I think JWTs are ideal for this, so they are an interesting candidate.

But this requires some implementation by the flagging service which would be again a huge downside.

The nice thing about JWTs is that they are just base64 - you can verify the signature or not. In fact, there's an alg: none header value that tells a consumer that there's no signing at all.

lukas-reining Oct 3, 2024
Maintainer

I think signing is the obvious solution, and JWTs would make it easier, but I understand why baggage didn't specify it themselves.

We're in a bit of a different situation than baggage. I think for a specification for this purpose to be useful, it would have to define a structure/payload of some kind. I think JWTs are ideal for this, so they are an interesting candidate.

Definitely @toddbaert! And for the structure it could still be something like the evaluation details etc.

The nice thing about JWTs is that they are just base64 - you can verify the signature or not. In fact, there's an alg: none header value that tells a consumer that there's no signing at all.

But to me this is the problem. The only reason to use JWTs is that we could verify the signature of our trusted server we expect the value to come from. The same as it is in OAuth and similars.
But this would mean that flagging servers would have to provide signed evaluation values and also provide their keys as JWKs. Again like OAuth. I really like this solution but I see the problem in the need for servers to implement this. This could be something for OFREP maybe, but for other services it could be a problem.

The second thing I see as problematic would be that some flagging services have a namespace concept. I am wondering how we would solve the namespace a defines flag x while namespace b also defines flag x. Currently some services determine which namespace to use by a context value or by including a custom header in the implementation. If we use the evaluation details as the value of such propagated flag values, how would we decide if flag x is from the namespace we expect?
This is really challenging to me but I hope we can resolve it.

toddbaert Oct 24, 2024
Maintainer

The second thing I see as problematic would be that some flagging services have a namespace concept. I am wondering how we would solve the namespace a defines flag x while namespace b also defines flag x. Currently some services determine which namespace to use by a context value or by including a custom header in the implementation. If we use the evaluation details as the value of such propagated flag values, how would we decide if flag x is from the namespace we expect?
This is really challenging to me but I hope we can resolve it.

Great point @lukas-reining . We officially now have a concept of a flag-set which might be good for this purpose. We could add this as a prefix to the flag, or perhaps even better, as an additional prop in our payload to indicate a scope/namespace; that way systems could easily ignore or it or use it as they require.

lukas-reining · 2024-10-24T17:56:33Z

lukas-reining
Oct 24, 2024
Maintainer

We officially now have a concept of a flag-set which might be good for this purpose.

Oh I did not see that one, nice point!

We could add this as a prefix to the flag, or perhaps even better, as an additional prop in our payload to indicate a scope/namespace; that way systems could easily ignore or it or use it as they require.

But we would have to have the concept in the SDKs too then right? Maybe generally a thing to add next to targetingKey? What do you think @toddbaert? If we had that, the problem would be, besides some possible edge cases like "more dimensional flag sets", gone.

Should we open an issue to discuss that in the spec repo?

0 replies

AYM1607 · 2024-11-05T23:20:17Z

AYM1607
Nov 5, 2024

Happy to see that there's already a conversation around this! Thanks for the initial writeup @sebastian-zahrhuber.

A counter argument for transporting evaluated flag values is payload size, our in-house flagging system uses a yaml file as the source of data which is now 2MB+ for all our flags and rules.

The reason why the total size is relevant for us is that, for better or worse, the context schema that's applied to the rules can vary by service, thus we need to transport all the rules and not just evaluated values.

I can't disclose too much information, but epoch markers are the synchronization mechanism that makes the most sense for us. We plan to implement it internally either way, but would love to adopt OpenFeature and contribute if possible.

0 replies

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

OpenFeature

Distributed consistent flag evaluations #396

{{title}}

Replies: 7 comments 11 replies

{{title}}

{{title}}

{{title}}

{{title}}

{{title}}

{{editor}}'s edit

{{editor}}'s edit

{{title}}

{{editor}}'s edit

{{editor}}'s edit

{{title}}

{{editor}}'s edit

{{editor}}'s edit

{{title}}

{{editor}}'s edit

{{editor}}'s edit

{{title}}

{{editor}}'s edit

{{editor}}'s edit

{{title}}

{{title}}

{{title}}

{{title}}

{{editor}}'s edit

{{editor}}'s edit

{{title}}

Select a reply

OpenFeature

Distributed consistent flag evaluations #396

sebastian-zahrhuber Sep 23, 2024

Distributed consistent flag evaluations with multiple downstream services

Distributed consistent flag evaluations with asynchronous Execute/Poll pattern

Distributed consistent flag evaluations using a cache for the first service + baggage header for downstream services

Distributed consistent flag evaluations using a cache for all services

Replies: 7 comments · 11 replies

dabeeeenster Sep 25, 2024 Maintainer

moredip Sep 25, 2024 Maintainer

AYM1607 Nov 5, 2024

dabeeeenster Sep 25, 2024 Maintainer

lukas-reining Sep 26, 2024 Maintainer

lukas-reining Sep 26, 2024 Maintainer

lukas-reining Sep 26, 2024 Maintainer

toddbaert Sep 30, 2024 Maintainer

lukas-reining Sep 30, 2024 Maintainer

toddbaert Oct 1, 2024 Maintainer

lukas-reining Oct 3, 2024 Maintainer

toddbaert Oct 24, 2024 Maintainer

lukas-reining Oct 24, 2024 Maintainer

AYM1607 Nov 5, 2024

sebastian-zahrhuber
Sep 23, 2024

Replies: 7 comments 11 replies

dabeeeenster
Sep 25, 2024
Maintainer

moredip
Sep 25, 2024
Maintainer

dabeeeenster
Sep 25, 2024
Maintainer

lukas-reining Sep 26, 2024
Maintainer

lukas-reining
Sep 26, 2024
Maintainer

lukas-reining
Sep 26, 2024
Maintainer

toddbaert Sep 30, 2024
Maintainer

lukas-reining Sep 30, 2024
Maintainer

toddbaert Oct 1, 2024
Maintainer

lukas-reining Oct 3, 2024
Maintainer

toddbaert Oct 24, 2024
Maintainer

lukas-reining
Oct 24, 2024
Maintainer

AYM1607
Nov 5, 2024