From 2bdad9dfd4585757dc828b56c4dac7353acf337d Mon Sep 17 00:00:00 2001
From: yugstar
Date: Tue, 16 Jun 2026 18:47:16 +0530
Subject: [PATCH] docs: add cluster-wide inhibition example to alerting tutorial

The Alertmanager docs mention that inhibition can suppress alerts from
an entire cluster, but (as noted in #1353) there is no concrete example
showing how.

As suggested on the issue, this adds an "Inhibiting alerts from an
entire cluster" section to the "Alerting based on metrics" tutorial. It
shows an inhibit_rules configuration that mutes every alert sharing the
same cluster label as a firing ClusterUnreachable alert, and explains
source_matchers, target_matchers, and the equal label list.

Closes #1353

Signed-off-by: yugstar
---
 docs/tutorials/alerting_based_on_metrics.md | 26 ++++++++++++++++++++-
 1 file changed, 25 insertions(+), 1 deletion(-)

diff --git a/docs/tutorials/alerting_based_on_metrics.md b/docs/tutorials/alerting_based_on_metrics.md
index acd0299b9..9b9151cc3 100644
--- a/docs/tutorials/alerting_based_on_metrics.md
+++ b/docs/tutorials/alerting_based_on_metrics.md
@@ -81,4 +81,28 @@ Open [http://localhost:9090/rules](http://localhost:9090/rules) in your browser
-Similarly Alertmanager can be configured with other receivers to notify when an alert is firing.
\ No newline at end of file
+Similarly Alertmanager can be configured with other receivers to notify when an alert is firing.
+
+## Inhibiting alerts from an entire cluster
+
+When a whole cluster (or instance) becomes unreachable, you usually don't want a separate notification for every alert that fires as a consequence. Alertmanager's [inhibition](/docs/alerting/latest/alertmanager/#inhibition) feature lets a single "cluster is down" alert mute all the dependent alerts coming from that same cluster, so you receive one meaningful notification instead of a flood.
+
+Inhibition is configured with `inhibit_rules` in `alertmanager.yml`. The following rule mutes every alert that shares the same `cluster` label value as a firing `ClusterUnreachable` alert:
+
+> alertmanager.yml
+
+```yaml
+inhibit_rules:
+  - source_matchers:
+      - 'alertname = "ClusterUnreachable"'
+    target_matchers:
+      - 'alertname != "ClusterUnreachable"'
+    equal:
+      - 'cluster'
+```
+
+- `source_matchers` selects the alert that suppresses others while it is firing (here, `ClusterUnreachable`).
+- `target_matchers` selects the alerts to mute. `ClusterUnreachable` is excluded so that the source alert itself is never a target of this rule.
+- `equal` lists the labels whose values must match between the source and target alerts for the inhibition to apply. Alerts are muted only when they share the **same** `cluster` value, so an outage in one cluster never hides alerts from another.
+
+For this to work, both the `ClusterUnreachable` alert and the alerts you want to mute must carry a `cluster` label, for example set on your alerting rules or added through `external_labels`. Alertmanager treats a missing label like an empty value, so if the label is absent on both sides the rule still applies and a `ClusterUnreachable` alert without a `cluster` label would mute every other alert that also lacks one. Because `ClusterUnreachable` is excluded from `target_matchers`, this rule never mutes the source alert itself.
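
The `external_labels` option mentioned in the closing paragraph could look roughly like the fragment below. It is a minimal sketch of the `global` block in `prometheus.yml` and is not part of this patch; the cluster name `prod-eu-1` is a hypothetical placeholder. Prometheus attaches its external labels to the alerts it sends to Alertmanager, so alerts from that server would carry the same `cluster` value.

```yaml
# prometheus.yml (fragment, illustrative only)
global:
  external_labels:
    # Hypothetical cluster name; pick one distinct value per cluster.
    cluster: prod-eu-1
```

If `ClusterUnreachable` is evaluated by a Prometheus server running outside the affected cluster, it would presumably need the affected cluster's name set as a `cluster` label on the alerting rule itself, so that the `equal` check lines up with the alerts it is meant to mute.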