feat(applicationlayer): WAF v3 (Coraza WASM) render — kube-controllers plumbing + admission webhook#4821

Merged: electricjesus merged 12 commits into tigera:master from electricjesus:seth/applicationlayer-render-v3 on Jun 12, 2026

Conversation

@electricjesus electricjesus commented May 19, 2026

Member

Summary

Operator-side render for WAF v3 (Coraza WASM) on calico-kube-controllers. Pairs with the merged reconcilers (tigera/calico-private#11834) and the in-process SecLang validating admission webhook (tigera/calico-private#12141, EV-6657). Design: tigera/designs#25 (PMREQ-384).

Review scope — 3 commits, ~470 lines:

| Commit | What |
|---|---|
| feat(api): add GatewayAPI WAF extension gating field | GatewayAPI.spec.extensions.waf.state (Enabled/Disabled, default off) + deepcopy + CRD |
| feat(applicationlayer): render WAF v3 + in-process admission webhook | gateway_waf.go (webhook Service + VWC), kube-controllers env/RBAC/cert/port, coraza-wasm component |
| feat(applicationlayer): wire WAF v3 render + webhook into installation controller | cert issuance + render wiring in core controller |

Everything is gated on GatewayAPI.spec.extensions.waf.state == Enabled — non-WAF / OSS / WAF-disabled installs render a byte-identical kube-controllers Deployment.
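
A nil-safe gate helper along the following lines keeps the default-off behavior in one place. This is only a sketch: the struct shapes are simplified, hypothetical stand-ins for the real GatewayAPI types, and `isWAFEnabled` is an illustrative name, not the signature of the helper added in this PR.

```go
package main

import "fmt"

// WAFState and the structs below are simplified, hypothetical stand-ins
// for the GatewayAPI CR types; only the gating path is modeled.
type WAFState string

const (
	WAFEnabled  WAFState = "Enabled"
	WAFDisabled WAFState = "Disabled"
)

type WAFExtension struct {
	State *WAFState
}

type Extensions struct {
	WAF *WAFExtension
}

type GatewayAPISpec struct {
	Extensions *Extensions
}

type GatewayAPI struct {
	Spec GatewayAPISpec
}

// isWAFEnabled reports whether spec.extensions.waf.state is Enabled.
// A nil CR or any unset field along the path counts as disabled.
func isWAFEnabled(g *GatewayAPI) bool {
	if g == nil || g.Spec.Extensions == nil || g.Spec.Extensions.WAF == nil {
		return false
	}
	s := g.Spec.Extensions.WAF.State
	return s != nil && *s == WAFEnabled
}

func main() {
	on := WAFEnabled
	enabled := &GatewayAPI{Spec: GatewayAPISpec{
		Extensions: &Extensions{WAF: &WAFExtension{State: &on}},
	}}
	fmt.Println(isWAFEnabled(nil), isWAFEnabled(&GatewayAPI{}), isWAFEnabled(enabled))
}
```

Treating every unset field as disabled is what lets non-WAF installs skip the whole render path without any extra checks at the call sites.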

Merge order (why hold merge): the webhook render is failurePolicy: Fail against the in-process server on :9443, which ships in tigera/calico-private#12141 (open). Merging this before cp#12141 is in the hashrelease kube-controllers image would brick wafplugins/wafpolicies writes once WAF is enabled. Review now; merge lands right after cp#12141. The plumbing half pairs with already-merged cp#11834.

WAF SecLang admission webhook — in-process (calico-kube-controllers)

The webhook runs in-process inside the existing calico-kube-controllers Pod (the applicationlayer manager's webhook server) — not a standalone Deployment. pkg/render/applicationlayer/gateway_waf.go renders only:

  • a Service (tigera-waf-webhook) fronting the kube-controllers Pod (:443 → :9443), and
  • a ValidatingWebhookConfiguration intercepting CREATE/UPDATE on wafplugins + wafpolicies at /validate-waf, failurePolicy: Fail (the reconciler backstop is status-only, so the webhook is the hard admission gate), caBundle = the operator CA.

No dedicated ServiceAccount/ClusterRole/ClusterRoleBinding/Deployment — the webhook reuses the kube-controllers ServiceAccount + ClusterRole.
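
The webhook contract described above can be summarized in a few lines of Go. The sketch below uses small local structs rather than the upstream Kubernetes API types (an assumption made to keep it dependency-free), and the function and constant names are illustrative; only the values (ports, path, failure policy, resources, Service DNS name) come from the description.

```go
package main

import "fmt"

// ServicePort and Webhook are simplified stand-ins for the Service and
// ValidatingWebhookConfiguration fields discussed above.
type ServicePort struct {
	Port       int32 // port the apiserver dials on the Service
	TargetPort int32 // container port in the kube-controllers Pod
}

type Webhook struct {
	ServiceName   string
	Namespace     string
	Path          string
	FailurePolicy string
	Operations    []string
	Resources     []string
	CABundle      []byte
}

const (
	serviceName         = "tigera-waf-webhook"
	namespace           = "calico-system"
	containerPort int32 = 9443
)

// wafWebhook returns the Service port mapping and the webhook contract.
func wafWebhook(operatorCA []byte) (ServicePort, Webhook) {
	port := ServicePort{Port: 443, TargetPort: containerPort}
	hook := Webhook{
		ServiceName:   serviceName,
		Namespace:     namespace,
		Path:          "/validate-waf",
		FailurePolicy: "Fail",
		Operations:    []string{"CREATE", "UPDATE"},
		Resources:     []string{"wafplugins", "wafpolicies"},
		CABundle:      operatorCA,
	}
	return port, hook
}

// servingCertDNSName is the in-cluster name the serving cert is issued for.
func servingCertDNSName() string {
	return fmt.Sprintf("%s.%s.svc", serviceName, namespace)
}

func main() {
	port, hook := wafWebhook([]byte("operator-ca-pem"))
	fmt.Printf("%d -> %d %s %s\n", port.Port, port.TargetPort, hook.Path, hook.FailurePolicy)
	fmt.Println(servingCertDNSName())
}
```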

kube-controllers plumbing (gated on GatewayAPI.spec.extensions.waf.state == Enabled)

  • WASM_IMAGE / WASM_PULL_SECRET / WASM_CA_CERT env + ENABLED_CONTROLLERS=applicationlayer (names verified against the merged reconcilers' manager.go).
  • WAF reconciler RBAC: the 6 applicationlayer.projectcalico.org resources + /status + /finalizers, EnvoyExtensionPolicy CRUD, events (core + events.k8s.io), secret/ConfigMap replication, Gateway/HTTPRoute reads for targetRef validation, namespaces patch/update for the upcoming per-namespace waf-id-range annotation.
  • The webhook serving cert (WAFWebhookServerTLS, issued for tigera-waf-webhook.calico-system.svc) mounted into the Pod; WAF_WEBHOOK_CERT_DIR env; container port 9443.
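
One way to picture the gated env plumbing is a function that returns the base env untouched when WAF is off and appends the WAF variables otherwise. This is a sketch under assumptions: `EnvVar`, `WAFConfig`, and `kubeControllersEnv` are hypothetical names, the sample values in `main` are placeholders, and the `ENABLED_CONTROLLERS` handling is left out.

```go
package main

import "fmt"

// EnvVar is a stand-in for the Kubernetes container env var type.
type EnvVar struct{ Name, Value string }

// WAFConfig carries the values threaded in when the gate is on.
// A nil *WAFConfig means the WAF extension is disabled.
type WAFConfig struct {
	WASMImage  string
	PullSecret string
	CACert     string // optional; omitted from the env when empty
	CertDir    string
}

// kubeControllersEnv returns base unchanged when waf is nil, so the
// rendered Deployment does not differ for WAF-disabled installs.
func kubeControllersEnv(base []EnvVar, waf *WAFConfig) []EnvVar {
	if waf == nil {
		return base
	}
	// Copy first so the caller's slice is never mutated.
	env := append([]EnvVar{}, base...)
	env = append(env,
		EnvVar{"WASM_IMAGE", waf.WASMImage},
		EnvVar{"WASM_PULL_SECRET", waf.PullSecret},
		EnvVar{"WAF_WEBHOOK_CERT_DIR", waf.CertDir},
	)
	if waf.CACert != "" {
		env = append(env, EnvVar{"WASM_CA_CERT", waf.CACert})
	}
	return env
}

func main() {
	base := []EnvVar{{"DATASTORE_TYPE", "kubernetes"}}
	waf := &WAFConfig{
		WASMImage:  "registry.example.com/envoy-proxy:tag", // placeholder
		PullSecret: "tigera-waf-pull-secret",
		CACert:     "tigera-waf-ca-bundle",
		CertDir:    "/waf-webhook-tls", // placeholder path
	}
	for _, e := range kubeControllersEnv(base, waf) {
		fmt.Printf("%s=%s\n", e.Name, e.Value)
	}
}
```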

Controller wiring (installation core controller)

Issues the serving cert via CertificateManager, materializes it into calico-system through the existing CertificateManagement render, threads it into the kube-controllers config, and renders the webhook Service + ValidatingWebhookConfiguration — all behind the WAF-enabled gate.

Test plan

  • go test ./pkg/render/applicationlayer/... ./pkg/render/kubecontrollers/... — webhook contract (resources/path/failurePolicy/caBundle), kube-controllers cert mount + env + port, RBAC.
  • go build ./..., go vet, gofmt clean; full Operator CI green (2424 tests).
  • RBAC audited against the merged reconcilers' kubebuilder markers + client-call sweep — no missing grants (incl. the controller-runtime cached-client list/watch for Gateway/HTTPRoute Gets, and */finalizers for OCP OwnerReferencesPermissionEnforcement).
  • End-to-end admission round-trip on a cluster (apiserver → in-process webhook TLS handshake) once cp#12141 is in a hashrelease image.

Release Note

Add operator render for the WAF v3 (Coraza WASM) SecLang validating admission
webhook — served in-process by calico-kube-controllers — plus the WASM_* env
vars and serving-cert / RBAC plumbing (paired with tigera/calico-private#11834
and #12141). The existing WAF v1 (sidecar / ModSecurity) render path is untouched.

@marvin-tigera marvin-tigera added this to the v1.43.0 milestone May 19, 2026
@electricjesus electricjesus added the hold merge Do not merge label May 19, 2026
@electricjesus electricjesus force-pushed the seth/applicationlayer-render-v3 branch from 4c8c92b to 46b19ae Compare May 19, 2026 13:25
@electricjesus electricjesus force-pushed the seth/applicationlayer-render-v3 branch 2 times, most recently from 96a55a7 to 2f38001 Compare May 29, 2026 14:50
@electricjesus electricjesus marked this pull request as ready for review May 29, 2026 15:54
@electricjesus electricjesus requested a review from a team as a code owner May 29, 2026 15:54
@electricjesus electricjesus force-pushed the seth/applicationlayer-render-v3 branch from 1690e71 to 2f38001 Compare June 4, 2026 21:22
@electricjesus electricjesus changed the title feat(applicationlayer): WAF v3 render + kube-controllers config plumbing feat(applicationlayer): WAF v3 (Coraza WASM) render — kube-controllers plumbing + admission webhook Jun 4, 2026
@rene-dekker

Member

Nit: the code comments could be cut down, especially where they add no information beyond what you get from reading the code.
Everyone has AI agents now, so readers can also consult those to help them read and navigate the code.

Comment thread api/v1/gatewayapi_types.go Outdated
Comment thread hack/gen-versions/enterprise.go.tpl Outdated
variant: enterpriseVariant,
}
{{- end }}
{{ with index .Components "coraza-wasm" }}

Member

Can you remind me if we could put this in an existing image and then change the run cmd when it is used. What was the motivation for an extra image again?

Member Author

We needed this extra image because it is an OCI image that the Envoy Gateway controller consumes and distributes to the gateway proxies. It's a scratch OCI image with just one file, a .wasm file containing the Coraza engine + rules. It needs to be small so it distributes fast; I think we set our benchmark at roughly 30 MB, and I believe we're well under that.

Member

Every extra image adds maintenance for both us and the customer. We could add the WASM file to an existing image such as envoyproxy and avoid an additional pull altogether, which would also perform better. File loading is supported too, so we could even point to a file path rather than an image path.

Comment thread pkg/common/common.go Outdated
Comment thread pkg/controller/installation/core_controller.go
Comment thread pkg/controller/installation/core_controller.go Outdated
Comment thread pkg/controller/installation/core_controller.go Outdated
Comment thread pkg/render/kubecontrollers/kube-controllers.go Outdated
Comment thread pkg/render/kubecontrollers/kube-controllers.go
Comment thread pkg/render/applicationlayer/gateway_waf.go
Comment thread pkg/render/kubecontrollers/kube-controllers.go Outdated
Comment thread pkg/controller/installation/core_controller.go Outdated
Comment thread pkg/render/applicationlayer/gateway_waf.go
electricjesus added a commit to electricjesus/operator that referenced this pull request Jun 6, 2026
…rollers component

Review feedback (rene-dekker, tigera#4821):
- Move the webhook Service + ValidatingWebhookConfiguration out of the core
  controller passthrough into the kube-controllers component, so the objects
  are emitted as objectsToDelete when the WAF extension is disabled or the
  GatewayAPI CR is removed. Add deletion test coverage.
- Export WAFWebhookContainerPort from pkg/render/applicationlayer and use it
  for the container port and NetworkPolicy ingress rule instead of duplicated
  9443 constants.
- Use gatewayapi.GetGatewayAPI for the WAF gate so the legacy tigera-secure
  CR name is handled (and a default/tigera-secure duplicate degrades).
- Drop the nil-guard around the WAF webhook KeyPairOption (the
  certificate-management render skips nil key pairs).
- Log when multiple imagePullSecrets are configured and only the first is
  used for the WAF wasm OCI pull.
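
The create/delete split in the first bullet can be sketched as a function that always knows the webhook objects and routes them to one list or the other depending on the gate. The `Object` type, the function name, and the ValidatingWebhookConfiguration name are hypothetical; the real component works with rendered Kubernetes objects.

```go
package main

import "fmt"

// Object is a hypothetical stand-in for a rendered Kubernetes object.
type Object struct{ Kind, Name string }

// wafWebhookObjects returns the objects to create and the objects to
// delete. When the gate is off (extension disabled or GatewayAPI CR
// removed), the same objects are returned for deletion so that stale
// webhook resources are cleaned up rather than left behind.
func wafWebhookObjects(wafEnabled bool) (toCreate, toDelete []Object) {
	objs := []Object{
		{Kind: "Service", Name: "tigera-waf-webhook"},
		// The webhook configuration name below is an assumption.
		{Kind: "ValidatingWebhookConfiguration", Name: "tigera-waf-webhook"},
	}
	if wafEnabled {
		return objs, nil
	}
	return nil, objs
}

func main() {
	create, del := wafWebhookObjects(false)
	fmt.Println(len(create), len(del))
}
```

Cleanup matters here in particular because a leftover failurePolicy: Fail webhook with no server behind it would keep rejecting writes to the resources it intercepts.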
electricjesus added a commit to electricjesus/operator that referenced this pull request Jun 6, 2026
Review feedback (rene-dekker, tigera#4821): nothing consumes the constant, and the
calico-private side has since moved from the ingress-gateway-addons feature
gate to a binary license-validity check (kube-controllers applicationlayer
LicenseGate), so the feature string has no remaining consumer. Also take the
iff->if comment suggestion.
electricjesus added a commit to electricjesus/operator that referenced this pull request Jun 6, 2026
…AF wasm pull secret

Review feedback (rene-dekker, tigera#4821): rather than copying only the first
Installation pull secret into tigera-waf-pull-secret, merge the registry
auths of every Installation pull secret into it. The EnvoyExtensionPolicy
image source takes a single pullSecretRef, so a merged secret is the only way
to honor multiple pull secrets for the Coraza wasm OCI pull (e.g. the Tigera
pull secret plus credentials for a private registry mirror).

First secret in Installation order wins on duplicate registry entries; the
merged map marshals with sorted keys so the rendered bytes are deterministic
across reconciles. Unparseable secrets are skipped and logged rather than
failing the reconcile. Legacy dockercfg-type secrets are supported.
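
The merge rules in this commit message (first secret wins, sorted keys, skip unparseable input, accept the legacy layout) can be sketched with the standard library alone. This is not the operator's implementation: the function names are hypothetical, the inputs are raw secret payloads rather than Secret objects, and the legacy format is detected by a missing top-level "auths" key, which is a simplification.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// dockerConfig mirrors the .dockerconfigjson layout: {"auths": {registry: entry}}.
type dockerConfig struct {
	Auths map[string]json.RawMessage `json:"auths"`
}

// parseAuths accepts the modern layout and falls back to the legacy
// dockercfg layout, where the registry map is the top-level object.
func parseAuths(raw []byte) (map[string]json.RawMessage, error) {
	var cfg dockerConfig
	if err := json.Unmarshal(raw, &cfg); err != nil {
		return nil, err
	}
	if cfg.Auths != nil {
		return cfg.Auths, nil
	}
	var legacy map[string]json.RawMessage
	if err := json.Unmarshal(raw, &legacy); err != nil {
		return nil, err
	}
	return legacy, nil
}

// mergePullSecrets merges registry auths in input order. The first
// entry for a registry wins; unparseable payloads are counted and
// skipped (real code would log them) instead of failing the merge.
func mergePullSecrets(secrets [][]byte) (merged []byte, skipped int, err error) {
	auths := map[string]json.RawMessage{}
	for _, raw := range secrets {
		parsed, perr := parseAuths(raw)
		if perr != nil {
			skipped++
			continue
		}
		for registry, entry := range parsed {
			if _, exists := auths[registry]; !exists {
				auths[registry] = entry
			}
		}
	}
	// encoding/json writes map keys in sorted order, so the output
	// bytes are stable across calls with the same input.
	merged, err = json.Marshal(dockerConfig{Auths: auths})
	return merged, skipped, err
}

func main() {
	a := []byte(`{"auths":{"quay.io":{"auth":"YQ=="}}}`)
	b := []byte(`{"auths":{"quay.io":{"auth":"Yg=="},"mirror.example.com":{"auth":"Yw=="}}}`)
	out, skipped, err := mergePullSecrets([][]byte{a, []byte("not json"), b})
	fmt.Println(string(out), skipped, err)
}
```

Deterministic output matters because a render that produces different bytes on each reconcile would cause the Secret to be rewritten continuously.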
@electricjesus electricjesus force-pushed the seth/applicationlayer-render-v3 branch from 5912b6e to 5eaab67 Compare June 12, 2026 12:14
Add spec.extensions.waf.state (+ IsWAFGatewayExtensionEnabled helper) to the
GatewayAPI CR to gate the WAF v3 (Gateway API add-on) surface, default-off.
Regenerate deepcopy + CRD manifest.

Refs EV-6657
Render the WAF v3 (Coraza WASM) surface on calico-kube-controllers, gated on the
GatewayAPI WAF extension:

- WASM_IMAGE/WASM_PULL_SECRET/WASM_CA_CERT env, ENABLED_CONTROLLERS, reconciler
  RBAC (wafpolicies/plugins, EnvoyExtensionPolicy, events, secret replication),
  coraza-wasm component (config/enterprise_versions.yml + gen-versions template +
  generated enterprise.go) + GatewayAddonsFeature constant.
- In-process WAF SecLang validating admission webhook: a Service fronting the
  kube-controllers Pod + ValidatingWebhookConfiguration (wafplugins/wafpolicies,
  /validate-waf, FailurePolicy=Fail, caBundle=operator CA); the serving-cert
  mount + WAF_WEBHOOK_CERT_DIR env + container port 9443; and namespaces
  patch/update RBAC for the waf-id-range annotation.

Refs EV-6657
…n controller

Gate on GatewayAPI.spec.extensions.waf.state, issue the webhook serving cert for
the tigera-waf-webhook Service DNS (materialized into calico-system via the
existing CertificateManagement render), thread it into the kube-controllers
config, and render the webhook Service + ValidatingWebhookConfiguration.

Refs EV-6657
…k :9443 (EV-6386)

The WAF admission webhook serves on :9443 in kube-controllers, but the
calico-system kube-controllers NetworkPolicy only allowed ingress on :9094, so
default-deny dropped the apiserver -> webhook request and the ValidatingWebhook
timed out (failurePolicy=Fail) -- blocking all WAFPolicy/WAFPlugin writes. Add a
:9443 ingress rule, gated on the GatewayAPI WAF extension being enabled.
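
The fix amounts to one more allowed ingress port, added only when the gate is on. A minimal sketch, assuming a flat list of ports stands in for the NetworkPolicy ingress rules and using hypothetical names:

```go
package main

import "fmt"

const (
	existingIngressPort     int32 = 9094 // the port the policy already allowed
	wafWebhookContainerPort int32 = 9443 // in-process webhook server port
)

// kubeControllersIngressPorts returns the ports the kube-controllers
// NetworkPolicy should allow. The webhook port is added only when the
// WAF extension is enabled, so other installs keep the original policy.
func kubeControllersIngressPorts(wafEnabled bool) []int32 {
	ports := []int32{existingIngressPort}
	if wafEnabled {
		ports = append(ports, wafWebhookContainerPort)
	}
	return ports
}

func main() {
	fmt.Println(kubeControllersIngressPorts(false), kubeControllersIngressPorts(true))
}
```

Sharing one constant between the container port and the policy rule, as a later commit in this PR does with WAFWebhookContainerPort, keeps the two from drifting apart.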
…ull (EV-6386)

The WAF reconciler replicated WASM_PULL_SECRET (the install pull secret,
tigera-pull-secret) into tenant namespaces, but the GatewayAPI render also
copies tigera-pull-secret there (operator-managed) so the replica conflicts
(ReplicaUnmanaged) and WAFPolicies are blocked. Provision + replicate a
dedicated tigera-waf-pull-secret (renamed copy of the install pull secret)
instead, avoiding the clash.
… (EV-6386)

Symmetric to the tigera-waf-pull-secret fix: WASM_CA_CERT pointed at the
operator-managed tigera-ca-bundle, which the GatewayAPI render also copies into
tenant namespaces, so the WAF reconciler's replica clashed (ReplicaUnmanaged).
Provision + replicate a dedicated tigera-waf-ca-bundle copy instead.
…nfigMap (EV-6386)

The WAF reconciler replicates the WASM_CA_CERT ConfigMap (tigera-waf-ca-bundle)
into tenant namespaces for the Coraza wasm registry TLS check, but the source
copy was never created (left as a TODO), so reconcile failed with
'source configmap calico-system/tigera-waf-ca-bundle not found'.

Provision it in the core controller as a renamed copy of the trusted CA bundle
-- the full TrustedBundle is available there, unlike the read-only interface the
kube-controllers render sees. Gate WASM_CA_CERT on the provisioned ConfigMap.
@electricjesus electricjesus force-pushed the seth/applicationlayer-render-v3 branch from 5eaab67 to 2af9beb Compare June 12, 2026 12:32
…proxy image (EV-6386)

The Coraza WAF wasm is baked into the gateway envoy-proxy image (its final
layer), so there is no separate coraza-wasm image to ship. Resolve WASM_IMAGE
from ComponentGatewayAPIEnvoyProxy -- the same image the gateway data plane
already runs -- and drop the standalone ComponentCorazaWASM component and its
enterprise_versions.yml pin. Addresses review feedback on op#4821.
…(EV-6386)

Completes the standalone coraza-wasm removal: the gen-versions template still
defined ComponentCorazaWASM, so gen-versions regenerated it into enterprise.go
and validate-gen-versions/dirty-check failed. The wasm now ships baked into the
envoy-proxy image, resolved via ComponentGatewayAPIEnvoyProxy.
@electricjesus electricjesus merged commit 99479ce into tigera:master Jun 12, 2026
6 checks passed