Add CSS error sampler coverage (TS021) #7171

Open
ichinaski wants to merge 3 commits into main from ichinaski/css-tests-phase-c

Conversation

@ichinaski ichinaski commented Jun 18, 2026

Contributor

Motivation

Phase C (APMSP-3049) of the Client-Side Stats (CSS) v1.2.0 coverage: test the interaction between CSS and the agent error sampler. The agent error sampler keeps a portion of error traces, so a tracer with CSS enabled must still send every trace containing an error, even when sampling would otherwise drop it.

Changes

Adds Test_Error_Sampler in tests/stats/test_stats.py and a new TRACE_STATS_COMPUTATION_ERROR_SAMPLER scenario (DD_TRACE_SAMPLE_RATE=0 with CSS enabled). The test asserts stats are still computed and error traces are still sent to the agent. Active for golang, python, dotnet and java; marked missing_feature for ruby (no stats computed when sampling drops all traces), php (errors counted in stats but error traces dropped client-side) and cpp (no client-side stats).
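
The expected behavior can be illustrated with a toy model. This is only a sketch under stated assumptions, not the tracer's or the test's actual implementation: the Trace class, the flush_with_sample_rate_zero function and the "/error" resource are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass
class Trace:
    """Hypothetical, simplified trace: a resource name and an error flag."""

    resource: str
    has_error: bool


def flush_with_sample_rate_zero(traces):
    """Toy model of a tracer with CSS enabled and DD_TRACE_SAMPLE_RATE=0.

    Returns (stats, sent). Stats count every trace, while only traces
    containing an error are sent to the agent.
    """
    stats = {"hits": 0, "errors": 0}
    sent = []
    for trace in traces:
        # Stats are computed before the sampling decision drops anything.
        stats["hits"] += 1
        stats["errors"] += int(trace.has_error)
        # Sampling alone keeps nothing at rate 0; error traces are still
        # forwarded so the agent error sampler can see them.
        if trace.has_error:
            sent.append(trace)
    return stats, sent


# Five droppable requests plus one failing request (hypothetical endpoint).
traces = [Trace("GET /", False) for _ in range(5)] + [Trace("GET /error", True)]
stats, sent = flush_with_sample_rate_zero(traces)
```

In this model all six requests appear in the stats, but only the error trace reaches the agent, which is the pair of properties the new test is meant to check.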

Workflow

  1. ⚠️ Create your PR as draft ⚠️
  2. Work on your PR until the CI passes
  3. Mark it as ready for review
    • Test logic is modified? -> Get a review from RFC owner.
    • Framework is modified, or non-obvious usage of it -> Get a review from R&P team.

🚀 Once your PR is reviewed and the CI is green, you can merge it!

🛟 #apm-shared-testing 🛟

Reviewer checklist

  • Anything but tests/ or manifests/ is modified? I have the approval from R&P team
  • A docker base image is modified?
    • the relevant build-XXX-image label is present
  • A scenario is added, removed or renamed?

Add Test_Error_Sampler end-to-end test asserting that traces containing
errors are still sent to the agent when sampling would drop them, with a
new TRACE_STATS_COMPUTATION_ERROR_SAMPLER scenario. Active for golang,
python, dotnet and java; marked missing_feature for ruby and php (with
reasons) and cpp (no client-side stats).

github-actions Bot commented Jun 18, 2026

Contributor

CODEOWNERS have been resolved as:

.github/workflows/run-end-to-end.yml                                    @DataDog/system-tests-core
manifests/cpp.yml                                                       @DataDog/dd-trace-cpp
manifests/cpp_httpd.yml                                                 @DataDog/dd-trace-cpp
manifests/cpp_nginx.yml                                                 @DataDog/dd-trace-cpp
manifests/java.yml                                                      @DataDog/asm-java @DataDog/apm-java
manifests/php.yml                                                       @DataDog/apm-php @DataDog/asm-php
manifests/ruby.yml                                                      @DataDog/ruby-guild @DataDog/asm-ruby
tests/schemas/test_schemas.py                                           @DataDog/system-tests-core
tests/stats/test_stats.py                                               @DataDog/system-tests-core
utils/_context/_scenarios/__init__.py                                   @DataDog/system-tests-core

datadog-prod-us1-3 Bot commented Jun 18, 2026

Contributor

⚠️ Warnings

🚦 6 Pipeline jobs failed

Testing the test | System Tests (php, dev) / End-to-end #1 / apache-mod-7.1-zts 1 (GitHub Actions)

🧪 1 Test failed

tests.ffe.test_exposures.Test_FFE_Exposure_Events.test_ffe_multiple_remote_config_files[apache-mod-7.1-zts] from system_tests_suite
AssertionError: Timed out waiting for exposure event for flags ['test-flag-1', 'test-flag-2'] and subject 'test-user-multi'
assert False
 +  where False = <bound method ProxyBasedInterfaceValidator.wait_for of AgentInterfaceValidator('agent')>(<function wait_for_exposure_event.<locals>.<lambda> at 0x7f5b07ad6de0>, timeout=30)
 +    where <bound method ProxyBasedInterfaceValidator.wait_for of AgentInterfaceValidator('agent')> = AgentInterfaceValidator('agent').wait_for
 +      where AgentInterfaceValidator('agent') = interfaces.agent

self = <tests.ffe.test_exposures.Test_FFE_Exposure_Events object at 0x7f5b30c8ed50>

    def test_ffe_multiple_remote_config_files(self):
        """Test that FFE correctly handles multiple remote config files with different flags."""
...

DataDog/system-tests | ruby-app-deployment-mode.amd64.DOA9: [public.ecr.aws/lts/ubuntu:22.04, linux/amd64, 3.1.7] (GitLab)

DataDog/system-tests | ruby-app-deployment-mode.arm64.DOA9: [public.ecr.aws/lts/ubuntu:22.04, linux/arm64, 3.1.7] (GitLab)

View all 6 failed jobs.

ℹ️ Info

No other issues found (see more)

❄️ No new flaky tests detected

🔗 Commit SHA: 66d900a

vertx does not flag 500 responses as errors, so no error span is produced
and the error-sampler assertion cannot hold on those weblogs.
@ichinaski ichinaski marked this pull request as ready for review June 18, 2026 12:27
@ichinaski ichinaski requested review from a team as code owners June 18, 2026 12:27
@ichinaski ichinaski requested review from claponcet, jandro996, tabgok and ygree and removed request for a team June 18, 2026 12:27

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 496bf72a1c

Comment thread tests/stats/test_stats.py
Comment on lines +636 to +638
# Droppable P0 traffic: with sample rate 0 these are dropped from traces but still counted in stats.
for _ in range(5):
    weblog.get("/")

[P2] Assert that non-error sampled traces are dropped

Because this loop discards the successful request handles, the test below only proves that an error trace was emitted; it never proves that DD_TRACE_SAMPLE_RATE=0 actually dropped non-error P0 traces. In contexts like Node.js, where manifests/nodejs.yml already records that DD_TRACE_SAMPLE_RATE=0 still emits P0 traces, this scenario can pass while the prerequisite sampling behavior is broken, giving a false activation signal. Please retain these requests and assert they have no emitted spans, or mark unsupported libraries accordingly.
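
A minimal sketch of the stronger assertion suggested here, using plain dictionaries in place of the framework's captured data. The spans_for_requests helper, the "rid" key and the request ids are hypothetical; a real test would use whatever request-to-span lookup the test framework provides.

```python
def spans_for_requests(emitted_spans, request_ids):
    """Return the emitted spans whose (hypothetical) request id is in request_ids."""
    wanted = set(request_ids)
    return [span for span in emitted_spans if span["rid"] in wanted]


# Retain the handles of the successful requests instead of discarding them.
ok_request_ids = [f"ok-{i}" for i in range(5)]
error_request_id = "err-0"

# Hypothetical capture: only the error trace reached the agent.
emitted_spans = [{"rid": "err-0", "error": 1}]

dropped = spans_for_requests(emitted_spans, ok_request_ids)
kept = spans_for_requests(emitted_spans, [error_request_id])

# Prerequisite: sample rate 0 really dropped the non-error P0 traces.
assert dropped == []
# Behavior under test: the error trace was still sent.
assert len(kept) == 1 and kept[0]["error"] == 1
```

With the first assertion in place, a library that still emits P0 traces at sample rate 0 would fail the test rather than pass it on the error trace alone.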

@ichinaski ichinaski Jun 18, 2026

Contributor Author

The false-activation risk doesn't apply to the libraries this test runs on.

Comment thread tests/stats/test_stats.py
def test_error_traces_always_sent(self):
    """Test that error traces are sent and stats are computed when sample rate is 0."""
    # Client-side stats are computed despite sampling being disabled.
    stats_requests = list(interfaces.library.get_data("/v0.6/stats"))

[P2] Add the new stats scenario to the schema allow-list

This new scenario reads /v0.6/stats payloads, but the auxiliary schema test's APMSP-2158 allow-list in tests/schemas/test_schemas.py only includes the existing trace-stats scenarios, not trace_stats_computation_error_sampler. When this runs for libraries covered by that known schema bug, Test_DdtraceSchemas.test_library will treat the same stats schema errors as new failures for this scenario. Please add this scenario to the existing known-bug condition or fix the schema before enabling it.
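
One way such a known-bug condition could look, as a sketch only. The actual condition in tests/schemas/test_schemas.py is not shown in this thread, so the set name, the function name and the "trace_stats_computation" entry are assumptions; only trace_stats_computation_error_sampler and the /v0.6/stats endpoint come from this PR.

```python
# Hypothetical allow-list of scenarios affected by the known stats schema bug.
KNOWN_STATS_SCHEMA_BUG_SCENARIOS = frozenset(
    {
        "trace_stats_computation",  # assumed name of an existing trace-stats scenario
        "trace_stats_computation_error_sampler",  # scenario added by this PR
    }
)


def is_known_stats_schema_bug(scenario_name: str, endpoint: str) -> bool:
    """Return True when a schema error should be attributed to the known bug."""
    return (
        endpoint == "/v0.6/stats"
        and scenario_name in KNOWN_STATS_SCHEMA_BUG_SCENARIOS
    )
```

Keeping the scenario names in a single set means a future trace-stats scenario only needs one new entry to be covered by the same condition.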

…P-2158)

The new TRACE_STATS_COMPUTATION_ERROR_SAMPLER scenario reads /v0.6/stats,
so it is subject to the same known schema bug as the other trace-stats
scenarios; add it to the existing allow-list.

@tabgok tabgok left a comment

Contributor

approved from IDM's standpoint
