Enable EVP flagevaluation system tests for Python #7156

Draft: leoromanovsky wants to merge 14 commits into main from leo.romanovsky/ffe-evp-flagevaluation-enable-python

@leoromanovsky leoromanovsky commented Jun 17, 2026

Motivation

tests/ffe/test_flag_eval_evp.py covers the EVP flagevaluation contract for server-side SDKs. A focused local run for Python now passes against the implementation PR, covering aggregation, bounded context payloads, and natural degraded-tier overflow.

Changes

  • Enables tests/ffe/test_flag_eval_evp.py for Python in manifests/python.yml at v4.12.0rc1.
  • Sets the FFE scenario Python flush interval to 60 seconds so the degradation test exercises one aggregation window.
  • Leaves the shared test expectations from "Add EVP flagevaluation system tests" (#7146) intact.
  • Keeps this as a sibling PR based on leo.romanovsky/ffe-evp-flagevaluation-system-tests, not stacked on other language enablement branches.

Decisions

  • This PR enables only Python; other languages are enabled by separate sibling PRs once each has its own passing local run as evidence.
  • Degradation is exercised naturally by exceeding the production per-flag full-tier cap; no test-only cap override is required.
  • Existing OTel metric coverage stays in tests/ffe/test_flag_eval_metrics.py.

Validation

  • ./build.sh python -w flask-poc - PASS.
  • ./run.sh FEATURE_FLAGGING_AND_EXPERIMENTATION --library python --weblog flask-poc -k "test_flag_eval_evp" - PASS, 8 passed, 2577 deselected in 59.91s.
  • System-test context: Agent 7.80.1; library python@4.12.0-rc1; artifact b9102dea0c5d812f6fecee87eb1fea54d06c09c1; weblog flask-poc; Linux x86_64.
  • Agent payload evidence:
    • evp-count-flag: 1 event, total evaluation_count=5.
    • evp-burst-aggregation-flag: 1 event, total evaluation_count=512.
    • evp-high-cardinality-aggregation-flag: 128 events, total evaluation_count=128.
    • evp-degradation-flag: 10001 events, total evaluation_count=10050, full_total=10000, degraded_total=50.
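
The event counts above can be reproduced with a small model of one aggregation window. This is only a sketch for reading the evidence, not the SDK's implementation. It assumes that evaluations aggregate per (flag, context) pair, that the per-flag full-tier cap is 10000 (inferred from full_total=10000), and that overflow collapses into a single degraded event per flag. The names `aggregate`, `summarize`, and `FULL_TIER_CAP` are hypothetical.

```python
from collections import Counter

# Assumption: inferred from full_total=10000 in the payload evidence,
# not read from the SDK source.
FULL_TIER_CAP = 10_000


def aggregate(evaluations, cap=FULL_TIER_CAP):
    """Model one flush window for a stream of (flag, context_key) evaluations.

    Returns {flag: {"full": Counter(context_key -> count), "degraded": int}}.
    """
    windows = {}
    for flag, context_key in evaluations:
        bucket = windows.setdefault(flag, {"full": Counter(), "degraded": 0})
        full = bucket["full"]
        if context_key in full or len(full) < cap:
            # Known context, or room left in the full tier: count it there.
            full[context_key] += 1
        else:
            # Full tier is at the cap: fold the evaluation into the degraded tier.
            bucket["degraded"] += 1
    return windows


def summarize(bucket):
    """Collapse one flag's window into the totals quoted in the evidence."""
    full_total = sum(bucket["full"].values())
    degraded_total = bucket["degraded"]
    # One event per full-tier context, plus one degraded event if any overflow.
    events = len(bucket["full"]) + (1 if degraded_total else 0)
    return {
        "events": events,
        "evaluation_count": full_total + degraded_total,
        "full_total": full_total,
        "degraded_total": degraded_total,
    }


if __name__ == "__main__":
    flag = "evp-degradation-flag"
    stream = [(flag, f"user-{i}") for i in range(10_050)]
    print(summarize(aggregate(stream)[flag]))
```

Under these assumptions, 10050 evaluations with distinct contexts give 10000 full-tier events plus one degraded event carrying the remaining 50, which matches the 10001 events reported for evp-degradation-flag. The same model gives 1 event for 512 evaluations of one context and 128 events for 128 distinct contexts.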

github-actions Bot commented Jun 17, 2026

CODEOWNERS have been resolved as:

manifests/python.yml                                                    @DataDog/apm-python @DataDog/asm-python
utils/_context/_scenarios/__init__.py                                   @DataDog/system-tests-core

datadog-prod-us1-3 Bot commented Jun 17, 2026

⚠️ Warnings

🚦 19 Pipeline jobs failed

Testing the test | System Tests (dotnet, dev) / End-to-end #2 / poc 2   View in Datadog   GitHub Actions

🧪 5 Tests failed

tests.ffe.test_dynamic_evaluation.Test_FFE_RC_Down_From_Start.test_ffe_rc_down_from_start[poc] from system_tests_suite   View in Datadog
AssertionError: Flag evaluation failed: {"type":"https://tools.ietf.org/html/rfc9110#section-15.5.1","title":"One or more validation errors occurred.","status":400,"errors":{"TargetingKeys":["The TargetingKeys field is required."]},"traceId":"00-a69709c89743c3537b25417d6549890d-73d4aa1debe0c229-00"}
assert 400 == 200
 +  where 400 = HttpResponse(status_code:400, headers:{'Content-Type': 'application/problem+json; charset=utf-8', 'Date': 'Wed, 17 Jun...ngKeys":["The TargetingKeys field is required."]},"traceId":"00-a69709c89743c3537b25417d6549890d-73d4aa1debe0c229-00"}).status_code
 +    where HttpResponse(status_code:400, headers:{'Content-Type': 'application/problem+json; charset=utf-8', 'Date': 'Wed, 17 Jun...ngKeys":["The TargetingKeys field is required."]},"traceId":"00-a69709c89743c3537b25417d6549890d-73d4aa1debe0c229-00"}) = <tests.ffe.test_dynamic_evaluation.Test_FFE_RC_Down_From_Start object at 0x7f4ec97566f0>.r

self = <tests.ffe.test_dynamic_evaluation.Test_FFE_RC_Down_From_Start object at 0x7f4ec97566f0>

    def test_ffe_rc_down_from_start(self):
        """Test that default value is returned when RC is down from start."""
        # Verify tracer received 503 from RC
...
tests.ffe.test_dynamic_evaluation.Test_FFE_RC_Unavailable.test_ffe_rc_unavailable_graceful_degradation[poc] from system_tests_suite   View in Datadog
AssertionError: Baseline request failed: {"type":"https://tools.ietf.org/html/rfc9110#section-15.5.1","title":"One or more validation errors occurred.","status":400,"errors":{"TargetingKeys":["The TargetingKeys field is required."]},"traceId":"00-fb57de908ef1291c77216cbe83d24b8e-d65142856a63cd96-00"}
assert 400 == 200
 +  where 400 = HttpResponse(status_code:400, headers:{'Content-Type': 'application/problem+json; charset=utf-8', 'Date': 'Wed, 17 Jun...ngKeys":["The TargetingKeys field is required."]},"traceId":"00-fb57de908ef1291c77216cbe83d24b8e-d65142856a63cd96-00"}).status_code
 +    where HttpResponse(status_code:400, headers:{'Content-Type': 'application/problem+json; charset=utf-8', 'Date': 'Wed, 17 Jun...ngKeys":["The TargetingKeys field is required."]},"traceId":"00-fb57de908ef1291c77216cbe83d24b8e-d65142856a63cd96-00"}) = <tests.ffe.test_dynamic_evaluation.Test_FFE_RC_Unavailable object at 0x7f4ec9757ce0>.baseline_eval

self = <tests.ffe.test_dynamic_evaluation.Test_FFE_RC_Unavailable object at 0x7f4ec9757ce0>

    def test_ffe_rc_unavailable_graceful_degradation(self):
        """Test that cached flag configs continue working when RC is unavailable."""
        # Verify tracer received 503 from RC
...
View all 5 test failures

Testing the test | System Tests (dotnet, dev) / End-to-end #2 / uds 2   View in Datadog   GitHub Actions

🧪 5 Tests failed

tests.ffe.test_dynamic_evaluation.Test_FFE_RC_Down_From_Start.test_ffe_rc_down_from_start[uds] from system_tests_suite   View in Datadog
AssertionError: Flag evaluation failed: {"type":"https://tools.ietf.org/html/rfc9110#section-15.5.1","title":"One or more validation errors occurred.","status":400,"errors":{"TargetingKeys":["The TargetingKeys field is required."]},"traceId":"00-30517cc944891146913397d5675323c7-b532a671497324bb-00"}
assert 400 == 200
 +  where 400 = HttpResponse(status_code:400, headers:{'Content-Type': 'application/problem+json; charset=utf-8', 'Date': 'Wed, 17 Jun...ngKeys":["The TargetingKeys field is required."]},"traceId":"00-30517cc944891146913397d5675323c7-b532a671497324bb-00"}).status_code
 +    where HttpResponse(status_code:400, headers:{'Content-Type': 'application/problem+json; charset=utf-8', 'Date': 'Wed, 17 Jun...ngKeys":["The TargetingKeys field is required."]},"traceId":"00-30517cc944891146913397d5675323c7-b532a671497324bb-00"}) = <tests.ffe.test_dynamic_evaluation.Test_FFE_RC_Down_From_Start object at 0x7fee28c0f8c0>.r

self = <tests.ffe.test_dynamic_evaluation.Test_FFE_RC_Down_From_Start object at 0x7fee28c0f8c0>

    def test_ffe_rc_down_from_start(self):
        """Test that default value is returned when RC is down from start."""
        # Verify tracer received 503 from RC
...
tests.ffe.test_dynamic_evaluation.Test_FFE_RC_Unavailable.test_ffe_rc_unavailable_graceful_degradation[uds] from system_tests_suite   View in Datadog
AssertionError: Baseline request failed: {"type":"https://tools.ietf.org/html/rfc9110#section-15.5.1","title":"One or more validation errors occurred.","status":400,"errors":{"TargetingKeys":["The TargetingKeys field is required."]},"traceId":"00-180a836884298814410c7c28bfee1756-96fc02b5d3e4aa9f-00"}
assert 400 == 200
 +  where 400 = HttpResponse(status_code:400, headers:{'Content-Type': 'application/problem+json; charset=utf-8', 'Date': 'Wed, 17 Jun...ngKeys":["The TargetingKeys field is required."]},"traceId":"00-180a836884298814410c7c28bfee1756-96fc02b5d3e4aa9f-00"}).status_code
 +    where HttpResponse(status_code:400, headers:{'Content-Type': 'application/problem+json; charset=utf-8', 'Date': 'Wed, 17 Jun...ngKeys":["The TargetingKeys field is required."]},"traceId":"00-180a836884298814410c7c28bfee1756-96fc02b5d3e4aa9f-00"}) = <tests.ffe.test_dynamic_evaluation.Test_FFE_RC_Unavailable object at 0x7fee28c0f980>.baseline_eval

self = <tests.ffe.test_dynamic_evaluation.Test_FFE_RC_Unavailable object at 0x7fee28c0f980>

    def test_ffe_rc_unavailable_graceful_degradation(self):
        """Test that cached flag configs continue working when RC is unavailable."""
        # Verify tracer received 503 from RC
...
View all 5 test failures

Testing the test | System Tests (dotnet, prod) / End-to-end #2 / poc 2   View in Datadog   GitHub Actions

🧪 5 Tests failed

tests.ffe.test_dynamic_evaluation.Test_FFE_RC_Down_From_Start.test_ffe_rc_down_from_start[poc] from system_tests_suite   View in Datadog
AssertionError: Flag evaluation failed: {"type":"https://tools.ietf.org/html/rfc9110#section-15.5.1","title":"One or more validation errors occurred.","status":400,"errors":{"TargetingKeys":["The TargetingKeys field is required."]},"traceId":"00-fb21223f0cf09e48ed232d7336e6a83b-3fd4319bd44583b6-00"}
assert 400 == 200
 +  where 400 = HttpResponse(status_code:400, headers:{'Content-Type': 'application/problem+json; charset=utf-8', 'Date': 'Wed, 17 Jun...ngKeys":["The TargetingKeys field is required."]},"traceId":"00-fb21223f0cf09e48ed232d7336e6a83b-3fd4319bd44583b6-00"}).status_code
 +    where HttpResponse(status_code:400, headers:{'Content-Type': 'application/problem+json; charset=utf-8', 'Date': 'Wed, 17 Jun...ngKeys":["The TargetingKeys field is required."]},"traceId":"00-fb21223f0cf09e48ed232d7336e6a83b-3fd4319bd44583b6-00"}) = <tests.ffe.test_dynamic_evaluation.Test_FFE_RC_Down_From_Start object at 0x7f1849f3a150>.r

self = <tests.ffe.test_dynamic_evaluation.Test_FFE_RC_Down_From_Start object at 0x7f1849f3a150>

    def test_ffe_rc_down_from_start(self):
        """Test that default value is returned when RC is down from start."""
        # Verify tracer received 503 from RC
...
tests.ffe.test_dynamic_evaluation.Test_FFE_RC_Unavailable.test_ffe_rc_unavailable_graceful_degradation[poc] from system_tests_suite   View in Datadog
AssertionError: Baseline request failed: {"type":"https://tools.ietf.org/html/rfc9110#section-15.5.1","title":"One or more validation errors occurred.","status":400,"errors":{"TargetingKeys":["The TargetingKeys field is required."]},"traceId":"00-c57cc1bfe506b636bd26793523200ea9-38f40f706d72c60e-00"}
assert 400 == 200
 +  where 400 = HttpResponse(status_code:400, headers:{'Content-Type': 'application/problem+json; charset=utf-8', 'Date': 'Wed, 17 Jun...ngKeys":["The TargetingKeys field is required."]},"traceId":"00-c57cc1bfe506b636bd26793523200ea9-38f40f706d72c60e-00"}).status_code
 +    where HttpResponse(status_code:400, headers:{'Content-Type': 'application/problem+json; charset=utf-8', 'Date': 'Wed, 17 Jun...ngKeys":["The TargetingKeys field is required."]},"traceId":"00-c57cc1bfe506b636bd26793523200ea9-38f40f706d72c60e-00"}) = <tests.ffe.test_dynamic_evaluation.Test_FFE_RC_Unavailable object at 0x7f1849f3a090>.baseline_eval

self = <tests.ffe.test_dynamic_evaluation.Test_FFE_RC_Unavailable object at 0x7f1849f3a090>

    def test_ffe_rc_unavailable_graceful_degradation(self):
        """Test that cached flag configs continue working when RC is unavailable."""
        # Verify tracer received 503 from RC
...
View all 5 test failures

View all 19 failed jobs.

ℹ️ Info

No other issues found (see more)

❄️ No new flaky tests detected

🔗 Commit SHA: 7ec9df5

Base automatically changed from leo.romanovsky/ffe-evp-flagevaluation-system-tests to main June 17, 2026 15:34