feat: multi-provider strategies skip providers returning reason=DISABLED#386
feat: multi-provider strategies skip providers returning reason=DISABLED#386jonathannorris wants to merge 3 commits into
Conversation
There was a problem hiding this comment.
Code Review
This pull request updates the documentation for the Multi-Provider strategies—First Match, First Successful, and Comparison—to include logic for skipping providers that are intentionally disabled (reason=DISABLED). A review comment suggests clarifying the description of the First Successful Strategy by removing the redundant mention of FLAG_NOT_FOUND, as it is already classified as an error and excluded from successful results.
|
If people don’t use flagd, they’ll be affected by this change for no reason. It would be better to simply explain how to create a custom strategy to preserve the previous behavior for those who use flagd and really want old results. |
|
Good point worth addressing. This change isn't specific to flagd. More broadly, I think the right question is: should Anyone who wants |
There was a problem hiding this comment.
Pull request overview
This PR updates the Multi-Provider strategy semantics in the spec so that provider results indicating an intentionally disabled flag (reason = DISABLED) are treated as “no value” and therefore fall through to subsequent providers, consistent with FLAG_NOT_FOUND.
Changes:
- Update First Match strategy to skip providers that return
reason=DISABLED(in addition toFLAG_NOT_FOUND). - Update First Successful strategy to skip
reason=DISABLEDwhen searching for a provider result to use. - Update Comparison strategy to exclude
FLAG_NOT_FOUNDandreason=DISABLEDproviders from comparisons.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
I agree with the idea that this change makes sense, but I understand @erka 's hesitation. After seeing the comment from @jonathannorris here, I think it's something we probably want, but I have to agree this is a breaking change, though admittedly with a "secondary" part of the SDK. It's pretty niche, and I do think the newly proposed behavior follows the principle of least astonishment. cc @thomaspoignant @beeme1mr @kinyoklion @lukas-reining @oxddr cc @erka |
I would prefer to have a separate strategy for disabled status. This clearly breaks behaviour for existing customers. Something like Now, if we treat as a bug in the Multi-provider spec, I have no objections. The intent in this case shows it as a new feature, hence my hesitation with this one. |
I'd argue this is a bug rather than a new feature. The Multi-Provider spec already defines Consider the expected end-user experience. If my-feature doesn't exist in Provider A, Multi-Provider falls through to Provider B. If my-feature is disabled in Provider A, why would the behaviour be any different? From the user's perspective, Provider A is saying "I have nothing for you" in both cases. Stopping at Provider A and returning a code default, rather than checking Provider B, would be surprising. I'm personally pretty against sprawling to more strategies, that feels like punting on making a decision if this is an issue or not. |
|
There is an option 2 in the
If flagd implements that in future? Or what if someone else is doing it? If the flag is off, it could return a different preconfigured value than the one hardcoded in the code. The adverse impact of this change may be broader than leaving it as it is. |
lukas-reining
left a comment
There was a problem hiding this comment.
That's semantically the same as FLAG_NOT_FOUND from the Multi-Provider's perspective.
I do not agree that this is semantically the same, especially for First Match.
Flag not found really means that there is no information at all, while disabled can have the lastest information but it specifically states being disabled.
More broadly, I think the right question is: should reason=DISABLED be a fall-through by default in Multi-Provider? I think it should. DISABLED means "this provider has intentionally turned this flag off"; it's an operational state, not a resolved value.
If that is true, we would not have to have both from my point of view.
To me DISABLED can definitely carry the current value.
Falling through to the next provider is the behavior that makes sense in migration, rollout, and multi-source scenarios, which is exactly what Multi-Provider is designed for.
I agree with this. To me, it feels reasonable to do this change.
But also it is a breaking change to me.
|
@lukas-reining @erka I'll concede the semantic point: That said, for Multi-Provider fall-through the distinction doesn't change the right behavior. The question is whether the provider is actively serving the flag, not whether a value is technically present in the response. |
|
There is SuperFly, and they use a Go multi-provider setup with LD. They configure two LD providers in their multiprovider with different URLs - one local and one external for failover. If the disabled flag returns a value different from the hard-coded default, this becomes a breaking change for them. Similarly, if the system performs local site evaluation (detecting disabled) and then falls back to remote site evaluation, that would be also a change with possible incorrect outcomes, while doubling time and cost. |
|
@erka Agreed this would be a breaking change for that specific setup. But I'd argue how the Multi-Provider handled I also think that failover topology is better served by a custom strategy built on the |
Signed-off-by: Jonathan Norris <jonathan.norris@dynatrace.com>
Signed-off-by: Jonathan Norris <jonathan.norris@dynatrace.com>
Signed-off-by: Jonathan Norris <jonathan.norris@dynatrace.com>
3856764 to
8183b5c
Compare
|
I don't see DISABLED as an error or a skipped step. To me, it's a successful evaluation result - a kill switch for a feature. I'm also not sure what the experience would be like for users leveraging multiprovider setups to migrate between providers. If an issue occurs in their application and a feature needs to be turned off immediately, they'll have to disable the flag in both providers to ensure its off. Analytics may also lose important context, as records that would previously have been marked as DISABLED could end up as DEFAULT, making it harder to understand historically why those evaluation results occurred. I believe I may not be aware of some important details about this change. That said, I think the simplest workaround for go-sdk would be to rename the existing strategies (making them custom) and document the migration path clearly in the release notes, so users can preserve the behavior if that's what their setup relies on. Would this approach work for you? @jonathannorris |
Summary
reason=DISABLEDmeans a provider has intentionally turned a flag off; it's an operational state, not a resolved value. In a Multi-Provider context, this is semantically equivalent toFLAG_NOT_FOUND— the provider isn't returning a value, so the Multi-Provider should fall through to the next one.This is a widespread pattern across the OpenFeature provider ecosystem. Flipt (Go, JS, .NET), Flagsmith (Go, Java, JS, .NET), LaunchDarkly, Optimizely, GO Feature Flag, OFREP, DevCycle, and the Dynatrace providers all return
reason=DISABLEDwhen a flag is intentionally off.This PR updates the three standard strategies to treat
reason=DISABLEDconsistently as a fall-through signal, the same asFLAG_NOT_FOUND:reason=DISABLEDreason=DISABLEDwhen looking for a provider with a valuereason=DISABLEDfrom the comparisonThe immediate trigger for this change is the flagd disabled flag evaluation ADR, which moves flagd from returning a
FLAG_DISABLEDerror to a successfulreason=DISABLEDresult. Without this spec update, that change would silently break Multi-Provider fall-through for flagd users.