feat(applicationlayer): WAF observability — gateway audit-log capture + Felix/fluentd event log (EV-6650)#4895

Draft
electricjesus wants to merge 5 commits into tigera:master from electricjesus:seth/waf-audit-log-capture-ev6650

Conversation

@electricjesus

Member

Description

Operator render for the Gateway API WAF observability path (EV-6650), split out of #4821:

  • Audit-log capture (gateway proxy render), in pkg/render/gatewayapi/gateway_api.go:
    • EnvoyProxy.Spec.Logging.Level = {default: warn, wasm: info} — surfaces the Coraza WASM filter's AuditLog: lines in Envoy's application log.
    • EnvoyProxy.Spec.ExtraArgs += --log-path /access_logs/envoy.log — redirects the application log onto the existing access-logs emptyDir (ExtraArgs rather than a container-args Patch, which would replace EG's generated args; the path sits directly under /access_logs because Envoy doesn't create --log-path parent dirs).
    • WAF_AUDIT_LOG_PATH=/access_logs/envoy.log on the l7-log-collector, which tails the file and forwards WAF decision records via PolicySync.ReportWAF.
  • Felix + fluentd event-log legs: WAFEventLogsFileEnabled (FelixConfiguration) and WAF_LOG_FILE (fluentd), so WAF block / would-block decisions land in the tigera_secure_ee_waf index.
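
As a rough illustration of the audit-log capture settings listed above, the Go sketch below applies them to a simplified stand-in struct. The `envoyProxySpec` type, the `applyWAFAuditLogCapture` helper, and the choice to pass `--log-path` and its value as two separate ExtraArgs elements are assumptions made for this example; they are not the operator's render code or the real Envoy Gateway API types.

```go
package main

import (
	"fmt"
	"path"
)

// envoyProxySpec is a hypothetical, simplified stand-in for the EnvoyProxy
// fields named in the description. It is NOT the Envoy Gateway API type.
type envoyProxySpec struct {
	LogLevels map[string]string // Envoy component -> log level
	ExtraArgs []string          // extra arguments appended to Envoy's generated args
}

// accessLogsMount is the mount point of the shared access-logs emptyDir.
const accessLogsMount = "/access_logs"

// applyWAFAuditLogCapture sets {default: warn, wasm: info} and appends
// --log-path. It rejects paths that are not directly under the mount,
// because Envoy does not create parent directories for --log-path.
func applyWAFAuditLogCapture(spec *envoyProxySpec, logPath string) error {
	if path.Dir(logPath) != accessLogsMount {
		return fmt.Errorf("log path %q must sit directly under %s", logPath, accessLogsMount)
	}
	if spec.LogLevels == nil {
		spec.LogLevels = map[string]string{}
	}
	spec.LogLevels["default"] = "warn" // keep the rest of Envoy quiet
	spec.LogLevels["wasm"] = "info"    // surface the WASM filter's AuditLog: lines
	// Appending preserves whatever extra args were already present
	// (splitting flag and value into two elements is an assumption).
	spec.ExtraArgs = append(spec.ExtraArgs, "--log-path", logPath)
	return nil
}

func main() {
	spec := &envoyProxySpec{}
	if err := applyWAFAuditLogCapture(spec, "/access_logs/envoy.log"); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(spec.LogLevels, spec.ExtraArgs)
}
```

The path check encodes the constraint from the description: a log file in a subdirectory of /access_logs would need a directory that Envoy will not create.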

Stacked on #4821: the first 3 commits belong to #4821, so review only the last 2, "capture WAF (Coraza) audit log from the gateway proxy" and "enable Felix + fluentd WAF event log for the gateway data plane". Will rebase onto master once #4821 merges.

Draft / hold: the calico-private producer pair (tigera/calico-private#12142) is not merged yet — the env vars rendered here have no consumer until it lands.

Cluster-validated on a live Envoy Gateway proxy: SQLi → 403 (Block-mode WAFPolicy) → Coraza AuditLog:{...} JSON captured at [info][wasm] in /access_logs/envoy.log.

Release Note

Add operator render for Gateway API WAF observability (EV-6650): capture the Coraza audit log from the gateway proxy, and enable the Felix + fluentd legs (WAFEventLogsFileEnabled, WAF_LOG_FILE) so WAF block / would-block decisions land in the tigera_secure_ee_waf index.

For PR author

  • Tests for change.
  • If changing pkg/apis/, run make gen-files
  • If changing versions, run make gen-versions

For PR reviewers

A note for code reviewers - all pull requests must have the following:

  • Milestone set according to targeted release.
  • Appropriate labels:
    • release-note-required
    • docs-pr-required

Linked

Add spec.extensions.waf.state (+ IsWAFGatewayExtensionEnabled helper) to the
GatewayAPI CR to gate the WAF v3 (Gateway API add-on) surface, default-off.
Regenerate deepcopy + CRD manifest.

Refs EV-6657
Render the WAF v3 (Coraza WASM) surface on calico-kube-controllers, gated on the
GatewayAPI WAF extension:

- WASM_IMAGE/WASM_PULL_SECRET/WASM_CA_CERT env, ENABLED_CONTROLLERS, reconciler
  RBAC (wafpolicies/plugins, EnvoyExtensionPolicy, events, secret replication),
  coraza-wasm component (config/enterprise_versions.yml + gen-versions template +
  generated enterprise.go) + GatewayAddonsFeature constant.
- In-process WAF SecLang validating admission webhook: a Service fronting the
  kube-controllers Pod + ValidatingWebhookConfiguration (wafplugins/wafpolicies,
  /validate-waf, FailurePolicy=Fail, caBundle=operator CA); the serving-cert
  mount + WAF_WEBHOOK_CERT_DIR env + container port 9443; and namespaces
  patch/update RBAC for the waf-id-range annotation.

Refs EV-6657
…n controller

Gate on GatewayAPI.spec.extensions.waf.state, issue the webhook serving cert for
the tigera-waf-webhook Service DNS (materialized into calico-system via the
existing CertificateManagement render), thread it into the kube-controllers
config, and render the webhook Service + ValidatingWebhookConfiguration.

Refs EV-6657
Wire the EnvoyProxy render so the data-plane Envoy proxy captures the Coraza
WAF filter's audit decision log (EV-6650 WAF observability):

- Tune EnvoyProxy.Spec.Logging.Level to {default: warn, wasm: info} so the
  wasm component's "AuditLog:" lines (emitted via proxywasm.LogInfo) surface
  in Envoy's application log while the rest stays quiet. Envoy Gateway passes
  arbitrary component keys through to --component-log-level, and Envoy
  recognises "wasm".
- Append --log-path /access_logs/envoy.log via EnvoyProxy.Spec.ExtraArgs to
  redirect Envoy's application log to a file on the existing access-logs
  emptyDir (already mounted in both the envoy container, which writes it, and
  the l7-log-collector, which reads it). ExtraArgs is used rather than a
  container-args Patch, which would replace Envoy Gateway's generated args.
  The file is directly under /access_logs (not a subdirectory) because Envoy
  does not create --log-path parent directories.
- Set WAF_AUDIT_LOG_PATH=/access_logs/envoy.log on the l7-log-collector init
  container so it can tail the file and forward WAF decision records via
  PolicySync.ReportWAF.
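
For reviewers unfamiliar with the consumer side, the sketch below shows one way a collector could pick the audit record out of a line in /access_logs/envoy.log. It is illustrative only: the `extractAuditLog` helper, the sample log-line prefix, and the JSON field in the sample are assumptions for this example, and the real l7-log-collector logic lives in the calico-private change, not in this PR.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// auditLogMarker is the prefix the Coraza WASM filter puts before its JSON
// audit record, as described in the commit message.
const auditLogMarker = "AuditLog:"

// extractAuditLog returns the JSON payload that follows the "AuditLog:"
// marker in a log line. The boolean is false when the marker is absent or
// the remainder of the line is not valid JSON (for example a truncated line).
func extractAuditLog(line string) (json.RawMessage, bool) {
	i := strings.Index(line, auditLogMarker)
	if i < 0 {
		return nil, false
	}
	payload := strings.TrimSpace(line[i+len(auditLogMarker):])
	if !json.Valid([]byte(payload)) {
		return nil, false
	}
	return json.RawMessage(payload), true
}

func main() {
	// Hypothetical sample lines; the exact Envoy prefix and the JSON schema
	// of the audit record are assumptions made for illustration.
	lines := []string{
		`[2025-01-01 00:00:00.000][1][info][wasm] AuditLog:{"action":"block"}`,
		`[2025-01-01 00:00:00.001][1][warning][config] unrelated message`,
	}
	for _, l := range lines {
		if rec, ok := extractAuditLog(l); ok {
			fmt.Println("audit record:", string(rec))
		}
	}
}
```

Keeping every other component at warn means most lines in the file fail the marker check cheaply, so a consumer only pays the JSON validation cost on the wasm info lines.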

Refs EV-6650
…gateway data plane

The gateway data-plane WAF (design-25) emits Coraza audit events that the
l7-collector forwards to Felix via ReportWAF. For those events to reach
Elasticsearch they need the same Felix -> waf.log -> fluentd -> linseed
pipeline the legacy ApplicationLayer WAF uses, but two of its enablement
knobs were never wired for the gateway path:

- FelixConfiguration.WAFEventLogsFileEnabled gates Felix's ReportWAF handler
  and the waf.log file reporter; without it ReportWAF returns
  "WAFEvents disabled". The ApplicationLayer controller already owns this
  field, so OR in the GatewayAPI WAF extension state (and add a GatewayAPI
  watch so toggling it re-reconciles). Also set it in the TPROXYMode
  upgrade-workaround branch, since it is an independent field.
- fluentd-node's in_tail_waf_logs source is gated by the WAF_LOG_FILE env,
  which the operator never set. Set it alongside FLOW_LOG_FILE / DNS_LOG_FILE;
  the path is always present and the file only exists when a WAF producer is
  enabled.
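
The two enablement knobs described above can be sketched as follows. This is a minimal illustration, not the controller code: the `envVar` type, both helper names, and the file paths used in `main` are hypothetical, and only the flag and env-var names come from the commit message.

```go
package main

import "fmt"

// envVar is a hypothetical stand-in for a container environment variable.
type envVar struct {
	Name  string
	Value string
}

// wafEventLogsFileEnabled mirrors the OR described above: Felix's WAF event
// log is needed when either the legacy ApplicationLayer WAF or the
// GatewayAPI WAF extension is enabled.
func wafEventLogsFileEnabled(applicationLayerWAF, gatewayWAFExtension bool) bool {
	return applicationLayerWAF || gatewayWAFExtension
}

// fluentdLogFileEnv always emits WAF_LOG_FILE next to FLOW_LOG_FILE and
// DNS_LOG_FILE. Setting it unconditionally is safe because the file itself
// only exists when a WAF producer is enabled.
func fluentdLogFileEnv(flowPath, dnsPath, wafPath string) []envVar {
	return []envVar{
		{Name: "FLOW_LOG_FILE", Value: flowPath},
		{Name: "DNS_LOG_FILE", Value: dnsPath},
		{Name: "WAF_LOG_FILE", Value: wafPath},
	}
}

func main() {
	fmt.Println(wafEventLogsFileEnabled(false, true))
	// Placeholder paths, chosen only for this example.
	for _, e := range fluentdLogFileEnv("/example/flows.log", "/example/dns.log", "/example/waf.log") {
		fmt.Printf("%s=%s\n", e.Name, e.Value)
	}
}
```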

Refs EV-6650

3 participants