Gitlab pipeline for End to End and parametric scenarios (WIP)#6651

Draft
nccatoni wants to merge 215 commits into main from nccatoni/gitlab

Conversation

nccatoni commented Mar 31, 2026

Collaborator

Motivation

Changes

Potential issue

Rebuilding CI images is susceptible to a race condition if two PRs that modify different dependencies of the image are merged to main without first being rebased on one another. The proper fix would be a merge queue. In the current configuration the race is unlikely, and running the workflow on main should catch it fairly quickly.

Workflow

  1. ⚠️ Create your PR as draft ⚠️
  2. Work on your PR until the CI passes
  3. Mark it as ready for review
    • Test logic is modified? -> Get a review from the RFC owner.
    • Framework is modified, or non-obvious usage of it? -> Get a review from the R&P team.

🚀 Once your PR is reviewed and the CI is green, you can merge it!

🛟 #apm-shared-testing 🛟

Reviewer checklist

  • Anything but tests/ or manifests/ is modified? I have the approval from the R&P team
  • A docker base image is modified?
    • the relevant build-XXX-image label is present
  • A scenario is added, removed or renamed?

github-actions Bot commented Mar 31, 2026

Contributor

CODEOWNERS have been resolved as:

mirror_images.lock.yaml                                                 @DataDog/system-tests-core
mirror_images.yaml                                                      @DataDog/system-tests-core
tests/test_the_test/test_build_pipeline.py                              @DataDog/system-tests-core
tests/test_the_test/test_external_gitlab_pipeline.py                    @DataDog/system-tests-core
tests/test_the_test/test_gitlab_pipeline_structure.py                   @DataDog/system-tests-core
utils/_context/_image_mirror.py                                         @DataDog/system-tests-core
utils/ci/__init__.py                                                    @DataDog/system-tests-core
utils/ci/gitlab/__init__.py                                             @DataDog/system-tests-core
utils/ci/gitlab/build_pipeline.py                                       @DataDog/system-tests-core
utils/ci/gitlab/docker/git.Dockerfile                                   @DataDog/system-tests-core
utils/ci/gitlab/docker/system-tests.Dockerfile                          @DataDog/system-tests-core
utils/ci/gitlab/main.yml                                                @DataDog/system-tests-core
utils/ci/gitlab/section.sh                                              @DataDog/system-tests-core
utils/ci/gitlab/ssi.yml                                                 @DataDog/system-tests-core
utils/ci/gitlab/system-tests.yml.j2                                     @DataDog/system-tests-core
utils/ci/gitlab/templates.yml                                           @DataDog/system-tests-core
utils/scripts/mirror_rewrite_dockerfile.py                              @DataDog/system-tests-core
utils/scripts/update_mirror_images.py                                   @DataDog/system-tests-core
.gitlab-ci.yml                                                          @DataDog/system-tests-core
.yamlfmt                                                                @DataDog/system-tests-core
.yamllint                                                               @DataDog/system-tests-core
format.sh                                                               @DataDog/system-tests-core
requirements.txt                                                        @DataDog/system-tests-core
tests/test_the_test/test_ci_orchestrator.py                             @DataDog/system-tests-core
tests/test_the_test/test_compute_libraries_and_scenarios.py             @DataDog/system-tests-core
utils/_context/containers.py                                            @DataDog/system-tests-core
utils/build/build.sh                                                    @DataDog/system-tests-core
utils/docker_fixtures/_test_agent.py                                    @DataDog/system-tests-core
utils/scripts/ci_orchestrators/external_gitlab_pipeline.py              @DataDog/system-tests-core
utils/scripts/ci_orchestrators/gitlab_exporter.py                       @DataDog/system-tests-core
utils/scripts/ci_orchestrators/workflow_data.py                         @DataDog/system-tests-core
utils/scripts/compute-workflow-parameters.py                            @DataDog/system-tests-core
utils/scripts/compute_libraries_and_scenarios.py                        @DataDog/system-tests-core

nccatoni force-pushed the nccatoni/gitlab branch 6 times, most recently from 17e7dfc to 047676e on March 31, 2026 15:41

datadog-datadog-prod-us1 Bot commented Mar 31, 2026


Pipelines

⚠️ Warnings

🚦 2 Pipeline jobs failed

Testing the test | lint / lint

Testing the test | all-jobs-are-green

This comment will be updated automatically if new data arrives.
🔗 Commit SHA: 33cbc55

nccatoni force-pushed the nccatoni/gitlab branch 6 times, most recently from c7fe9a2 to ad45877 on April 1, 2026 09:45
nccatoni force-pushed the nccatoni/gitlab branch 2 times, most recently from 9384467 to ed748cf on April 1, 2026 10:46
nccatoni and others added 29 commits June 11, 2026 14:02
- Add retain_line_breaks_single: true to .yamlfmt so yamlfmt preserves
  single blank lines between top-level keys.
- Drop utils/ci/gitlab/ from yamlfmt scope: GitLab's $[[ ]] interpolation
  syntax is not valid YAML and causes yamlfmt to abort on those files.
  yamllint (not yamlfmt) is the lint tool covering that directory.
- Restore blank lines stripped by yamlfmt in main.yml, ssi.yml, and
  templates.yml (commit 612d10e ran yamlfmt on them before the scope
  was narrowed).
- Apply ruff format to test_build_pipeline.py, test_external_gitlab_pipeline.py,
  test_gitlab_pipeline_structure.py, and build_pipeline.py; remove two
  stale noqa: E501 directives that became redundant after reformatting.

Replace the global 'indentation: disable' added in 612d10e with:
- A per-rule 'indentation.ignore' covering only .gitlab-ci.yml, which
  uses intentional mixed indentation per GitLab CI conventions.
- A top-level 'ignore' block for files that cannot be linted by yamllint
  at all: templates.yml (contains ANSI escape codes for CI section folding),
  main.yml (embedded newlines in block scalars break YAML parsing), and
  system-tests.yml.j2 (Jinja2 template, not plain YAML).
- ssi.yml passes yamllint clean with the indentation rule enabled.
- manifests/ indentation coverage is fully restored.

Remove unjustified yamllint ignore for templates.yml/main.yml/system-tests.yml.j2:
those files lint cleanly. Drop trailing blank line in main.yml flagged by the
empty-lines rule.

Replace the piped `mirror_images_list.py | xargs ... add` workflow with a
self-contained `python utils/scripts/update_mirror_images.py` that:
  * enumerates every image required by the CI scenarios,
  * records them in mirror_images.yaml (mirror_images.py `add`),
  * resolves digests into mirror_images.lock.yaml (mirror_images.py `lock`),
  * never pushes/mirrors anything (commit both files yourself).

The mirror_images_check CI job runs it with --skip-lock (no registry creds
needed) and fails on mirror_images.yaml drift. Restores the 93-image
mirror_images.yaml that was dropped during the branch squash.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Add a main-only (and manual) job that runs `mirror_images.py mirror --dry-run`
against the committed lock. The dry-run reads only the destination registry and
exits 0 even when images are missing, so the job greps its output for
"needs copy" and fails if any locked image is absent from MIRROR_DEST_REGISTRY
(a non-zero exit from registry auth/network errors also fails it).

Factor the crane+uv setup into a shared .mirror_images_registry_job template,
and drop the in-CI re-lock from mirror_images_mirror so mirror and verify both
operate off the same committed digests.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
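
The "grep for needs copy" check described in this commit could be sketched in Python roughly as follows. This is only an illustration: the function names are hypothetical, the exact dry-run output format is an assumption (only the "needs copy" marker comes from the commit message), and the actual job presumably does this in shell.

```python
import subprocess
import sys

# Marker quoted in the commit message; the rest of each output line is an assumption.
MARKER = "needs copy"


def missing_images(dry_run_output: str) -> list[str]:
    """Return the output lines reporting an image that is absent from the mirror."""
    return [line for line in dry_run_output.splitlines() if MARKER in line]


def verify(command: list[str]) -> int:
    """Run a dry-run command and turn its output into a pass/fail exit code.

    A non-zero exit from the command itself (auth or network error) is
    propagated; otherwise the result is 1 if any line contains the marker.
    """
    result = subprocess.run(command, capture_output=True, text=True, check=False)
    if result.returncode != 0:
        return result.returncode
    missing = missing_images(result.stdout)
    for line in missing:
        print(line, file=sys.stderr)
    return 1 if missing else 0
```

The point of the design is that the dry-run's own exit code cannot be trusted to signal missing images, so the output has to be inspected separately from the exit status.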
Replace mirror_images_check / mirror_images_mirror / mirror_images_verify with
one job that, on every push:
  * regenerates the image list and fails on mirror_images.yaml drift,
  * pushes any locked image missing from the mirror (mirror checks the
    destination first, so already-present images are skipped).

It never updates the lock file. crane copies registry-to-registry over HTTPS,
so no docker-in-docker is needed and it runs on the framework runner image.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Add an opt-in image-mirror layer so CI pulls images from
registry.ddbuild.io/system-tests/mirror instead of docker.io/ghcr.io/etc.

- utils/_context/_image_mirror.py: mirror_image(ref) maps a source ref to its
  mirror target from mirror_images.lock.yaml when USE_IMAGE_MIRROR is truthy;
  passthrough otherwise (off by default, safe where the mirror is unreachable).
- containers.py: ImageInfo rewrites its ref to the mirror and, on pull failure,
  falls back to the original registry.
- _test_agent.py: TestAgentFactory image routed through the mirror.
- build.sh + utils/scripts/mirror_rewrite_dockerfile.py: when enabled, weblog
  FROM base images are rewritten to the mirror at build time (stage names and
  unmirrored refs untouched).
- templates.yml: enable USE_IMAGE_MIRROR=1 for e2e jobs.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
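
For readers skimming the log, here is a minimal sketch of the opt-in mirror behaviour this commit describes. It is not the code in utils/_context/_image_mirror.py: the lock is modelled as a plain dict from source ref to mirror ref (the real lock file layout is not shown in this PR page), the set of truthy values is an assumption, and `pull_with_fallback` and `rewrite_from_lines` are hypothetical helper names.

```python
from __future__ import annotations

import os
from collections.abc import Callable, Mapping

# Assumption: which values count as "truthy" for USE_IMAGE_MIRROR.
_TRUTHY = {"1", "true", "yes", "on"}


def mirror_image(ref: str, lock: Mapping[str, str], env: Mapping[str, str] | None = None) -> str:
    """Map a source ref to its mirror target; pass through when disabled or unknown."""
    env = os.environ if env is None else env
    if env.get("USE_IMAGE_MIRROR", "").strip().lower() not in _TRUTHY:
        return ref
    return lock.get(ref, ref)


def pull_with_fallback(ref: str, lock: Mapping[str, str], pull: Callable[[str], None],
                       env: Mapping[str, str] | None = None) -> str:
    """Pull from the mirror first; on failure retry against the original registry."""
    target = mirror_image(ref, lock, env)
    try:
        pull(target)
        return target
    except Exception:
        if target == ref:  # nothing to fall back to
            raise
        pull(ref)
        return ref


def rewrite_from_lines(dockerfile: str, lock: Mapping[str, str],
                       env: Mapping[str, str] | None = None) -> str:
    """Rewrite FROM base images found in the lock; stage names and other refs stay as-is."""
    out = []
    for line in dockerfile.splitlines():
        parts = line.split()
        if len(parts) >= 2 and parts[0].upper() == "FROM":
            idx = 1
            while idx < len(parts) and parts[idx].startswith("--"):  # e.g. --platform=...
                idx += 1
            if idx < len(parts):
                parts[idx] = mirror_image(parts[idx], lock, env)
                line = " ".join(parts)
        out.append(line)
    return "\n".join(out)
```

Because an unmirrored ref or a build-stage name is simply absent from the lock, the same lookup leaves both untouched, which is what keeps the layer safe to enable everywhere.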
…ror job

- main.yml: .run_test_pipeline_base gains an optional `needs: mirror_images`
  so the e2e build/run jobs only start once the mirror is populated. Optional
  so repos that include this template without a mirror_images job are unaffected.
- .gitlab-ci.yml: move mirror_images to the e2e stage (so earlier-DAG jobs can
  depend on it) and give it Docker Hub auth (crane auth login) since it is now
  the only job pulling the mirror sources from Docker Hub.
- Drop `docker_auth: true` so the generated test jobs no longer authenticate to
  Docker Hub (they pull from the mirror); the shared docker_auth plumbing stays
  dormant (default false) for other repos.
- Remove dead DOCKER_LOGIN exports from check_merge_labels (it never logs in).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The generated build/run/parametric jobs read binaries from binaries/, but a
consumer's build job may publish its artifact elsewhere (e.g. dd-trace-py's
"build linux" puts wheels in pywheels/). Add a binaries_artifact_path input
(threaded main.yml -> build_pipeline.py -> system-tests.yml.j2) that, when set
alongside binaries_artifact, copies $CI_PROJECT_DIR/<path>/. into binaries/
before building/running. Default empty = unchanged behaviour.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
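
The copy step described above can be illustrated with a short Python sketch. The generated jobs most likely do this with a shell `cp`; `stage_binaries` is a hypothetical name, and the "set alongside binaries_artifact" condition is reduced here to "path is non-empty".

```python
import shutil
from pathlib import Path


def stage_binaries(project_dir: Path, artifact_path: str, dest_name: str = "binaries") -> list[str]:
    """Copy the contents of <project_dir>/<artifact_path> into <project_dir>/binaries.

    An empty artifact_path means "unchanged behaviour": nothing is copied.
    Returns the sorted entry names present in the destination afterwards.
    """
    if not artifact_path:
        return []
    source = project_dir / artifact_path
    dest = project_dir / dest_name
    # dirs_exist_ok=True merges into an existing binaries/ directory (Python 3.8+).
    shutil.copytree(source, dest, dirs_exist_ok=True)
    return sorted(entry.name for entry in dest.iterdir())
```

For a consumer such as the dd-trace-py case mentioned in the commit, the call would be `stage_binaries(Path(project_dir), "pywheels")`.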
Generated child-pipeline jobs cannot needs: a parent pipeline job by name alone.
When binaries_artifact_path is set (meaning the artifact comes from the parent),
emit `pipeline: $PARENT_PIPELINE_ID` on the needs entry so GitLab resolves it
correctly. Pass PARENT_PIPELINE_ID: $CI_PIPELINE_ID from the trigger variables
in .run_test_pipeline_base so the child pipeline has the value available.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
binaries_artifact names a job in the parent pipeline by definition, so
pipeline: $PARENT_PIPELINE_ID belongs on every binaries_artifact needs entry,
not only when binaries_artifact_path is also set.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
The trigger jobs (.run_test_pipeline_base) now need the binaries_artifact
job (optional) so the child pipeline doesn't start before the artifact is
ready. Without this the child pipeline fired immediately and the
cross-pipeline needs failed to retrieve artifacts.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
Job names containing ':' and '[' (e.g. parallel matrix instance names like
'build linux: [amd64, cp311-cp311, ...]') are special YAML characters and
must be quoted, otherwise the generated pipeline YAML is invalid.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
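
One way to quote such names when generating YAML from Python is sketched below; it is not necessarily what build_pipeline.py or the Jinja2 template does. It relies on the fact that a JSON string literal is also a valid YAML double-quoted scalar. The helper names and the shortened job name in the test are illustrative only.

```python
import json


def quote_job_name(name: str) -> str:
    """Return the job name as a double-quoted YAML scalar.

    JSON string syntax is a subset of YAML's double-quoted style, so
    json.dumps handles ':' and '[' as well as embedded quotes and backslashes.
    """
    return json.dumps(name, ensure_ascii=False)


def render_needs_entry(job: str) -> str:
    """Render one `needs:` list item referencing the given job."""
    return f"- job: {quote_job_name(job)}"
```

Quoting unconditionally is simpler than detecting which names contain special characters, and plain names remain valid when quoted.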
build_test_pipeline needs the binaries_artifact job (optional, so it's
a no-op when the input is empty). Since run_test_pipeline_* always waits
for build_test_pipeline before triggering the child pipeline, this
guarantees the artifact is ready before any child job tries to download it
via pipeline: $PARENT_PIPELINE_ID.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
build_test_pipeline only needs to wait for the job to finish before
generating the child pipeline; it never uses the artifacts itself.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
$CI_PARENT_PIPELINE_ID is provided automatically by GitLab in any triggered
child pipeline. The manually-passed PARENT_PIPELINE_ID was redundant and
potentially empty if not resolved correctly, causing the artifact download to
fail. Since needs:pipeline: waits for the referenced job itself, build_test_pipeline
also doesn't need to wait for binaries_artifact.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
needs:pipeline: in child jobs does NOT wait — it just tries to download
the artifact and fails if the upstream job is not done yet. build_test_pipeline
must therefore wait for the binaries_artifact job (optional, no-op when the
input is empty) so the artifact is guaranteed to exist before any child
pipeline job runs.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
The generated child pipeline jobs use pipeline: $UPSTREAM_PIPELINE_ID
to reference the parent pipeline's binaries artifact. This variable was
never being set, so the artifact download had an empty pipeline reference
and failed.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
Different weblogs need different python version wheels. Add binaries_artifacts
(semicolon-separated, since job names contain commas) as the full list of
upstream artifact jobs to download from in the generated child pipeline.
The existing single-job binaries_artifact remains the timing gate.

Falls back to [binaries_artifact] when binaries_artifacts is empty, so
existing callers that only set binaries_artifact are unaffected.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
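
The fallback rule and the cross-pipeline `needs` entries discussed in the last few commits might look like this in Python. This is a sketch under assumptions: the function names are hypothetical, the entries are modelled as plain dicts before YAML rendering, and `$UPSTREAM_PIPELINE_ID` is used as the default only because it is the variable named in the most recent commit on the topic.

```python
def artifact_jobs(binaries_artifact: str, binaries_artifacts: str) -> list[str]:
    """Resolve the list of upstream jobs to download artifacts from.

    binaries_artifacts is semicolon-separated because job names may contain
    commas; when it is empty, fall back to the single binaries_artifact job.
    """
    jobs = [job.strip() for job in binaries_artifacts.split(";") if job.strip()]
    if jobs:
        return jobs
    return [binaries_artifact] if binaries_artifact else []


def needs_entries(jobs: list[str], pipeline_var: str = "$UPSTREAM_PIPELINE_ID") -> list[dict]:
    """Build one cross-pipeline `needs` entry per upstream job."""
    return [{"pipeline": pipeline_var, "job": job, "artifacts": True} for job in jobs]
```

Existing callers that only set binaries_artifact get a one-element list, so the generated pipeline is unchanged for them.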
datadog-system-tests-org Bot commented Jun 18, 2026


Pipelines

⚠️ Warnings

🚦 2 Pipeline jobs failed

Testing the test | all-jobs-are-green

Testing the test | lint / lint

This comment will be updated automatically if new data arrives.
🔗 Commit SHA: 33cbc55
