feat: Support version history and rollback for traffic rules #1477

Open
mochengqian wants to merge 6 commits into apache:develop from mochengqian:feat/Support-version-history-and-rollback-for-traffic-rules

@mochengqian
Contributor

@mochengqian mochengqian commented May 22, 2026

Closes #1473.
Please view the detailed test report at https://notion-next-iota-amber-43.vercel.app/article/dubbo-admin-reviewer-report

1. What this PR delivers

Dubbo Admin currently saves traffic-rule edits with a destructive overwrite. There is no history, no audit trail for upstream-registry pushes, and no way to recover from a bad edit. This PR adds an immutable release ledger plus one-click rollback for the three governor-managed rule kinds — Condition Route, Tag Route, and Dynamic Config (Configurator).

The approach was picked after evaluating five candidate solutions during design:

| | Solution | This PR? |
|---|---|---|
| A | Immutable release ledger in the admin's local DB. Every publish, upstream push and rollback is appended as a row; rollback re-publishes an old snapshot, never overwriting. | Yes. This is what the PR ships. |
| B | Two-phase publish with a reviewer / approver workflow before the change reaches the registry. | Deferred: needs concrete user demand before designing the state machine. |
| C | Structural JSON diff in the admin UI. | Folded into A: implemented via Monaco diff. |
| D | GitOps: rules persisted in Git, the registry pulls from Git. | Deferred: a major infrastructure shift, not v1-scale. |
| E | Pure event sourcing: derive current rule state by replaying an event log. | Deferred: orthogonal to the current store model. |

User-visible capabilities

  • Each rule-detail page gets a History drawer with source (ADMIN / UPSTREAM / ROLLBACK / BOOTSTRAP), operator, timestamp and reason.
  • Diff against current uses Monaco diff to compare any past version with the live one.
  • Rollback takes a required reason; the old snapshot is re-published through the normal governor path and recorded as a new ROLLBACK row. The old row is never mutated.
  • Concurrent editors get a sticky 409 Reload notification instead of a silent overwrite.
  • Default retention is 5 versions per rule (configurable via versioning.maxVersionsPerRule).

2. How it works

UI publish ┐
           ├─► BeginMutationIntent ─► ResourceManager/governor/registry ─► CommitIntent
Rollback  ┘                                      │
                                                 ▼
                                  rule_version (immutable) + rule_version_meta (current)

ZK push ─► EventBus.ResourceChangedEvent ─► RuleVersionSubscriber ─► InsertVersion
                                                 │
                                                 ▼
                                 source = UPSTREAM, author from
                                 event.Context()["source-registry"] when present

Crash / de-sync ─► open intent ─► repair or abandon operator endpoints
  • ADMIN / ROLLBACK attribution is intent-based. Admin writes create an intent before the governor mutation, mark it applied after the mutation succeeds, then commit it into the immutable ledger. Subscriber echoes with the same content hash attach to the open intent instead of inserting a second row.
  • UPSTREAM attribution is event-context based. Registry-originated pushes with no matching open intent are recorded as source=UPSTREAM; ZK currently sets events.SourceRegistryContextKey, while Nacos / Apollo source labelling is deferred.
  • Rollback = re-publish the old snapshot. It flows through the normal governor mutation path and commits a new row with source=ROLLBACK and rolled_back_from_id pointing at the source version. The original row is never modified.
  • version_no is monotonic and not reused after trim — users always see strictly increasing version labels.
  • The ledger is an effective state change log, not a verbatim API-call log. Identical consecutive content snapshots are deduped so registry resync echoes and idempotent rewrites do not flood history, while every distinct rule state is preserved.
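
The append rules in the bullets above (dedupe of identical consecutive snapshots, a monotonic version_no, and trim to the retention cap) can be sketched as a small in-memory ledger. This is a minimal illustration, not the PR's store: the `Version` and `Ledger` types, their fields, and the `Append` method are hypothetical names, and the content hash is passed in as an opaque string.

```go
package main

import "fmt"

// Version is one immutable ledger row. Field names are illustrative only.
type Version struct {
	VersionNo   int64
	ContentHash string
	Source      string // ADMIN | UPSTREAM | ROLLBACK | BOOTSTRAP
}

// Ledger holds the retained rows for a single rule plus a counter that is
// never decremented, so version numbers are not reused after a trim.
type Ledger struct {
	rows          []Version
	lastVersionNo int64
	maxVersions   int
}

// Append inserts a row unless the hash equals the newest row's hash.
// It reports whether a row was actually inserted.
func (l *Ledger) Append(hash, source string) (Version, bool) {
	if n := len(l.rows); n > 0 && l.rows[n-1].ContentHash == hash {
		return l.rows[n-1], false // identical consecutive snapshot: deduped
	}
	l.lastVersionNo++
	v := Version{VersionNo: l.lastVersionNo, ContentHash: hash, Source: source}
	l.rows = append(l.rows, v)
	// Hard-delete the oldest rows once the retention cap is exceeded.
	if over := len(l.rows) - l.maxVersions; over > 0 {
		l.rows = append([]Version(nil), l.rows[over:]...)
	}
	return v, true
}

func main() {
	l := &Ledger{maxVersions: 2}
	l.Append("h1", "BOOTSTRAP")
	l.Append("h1", "UPSTREAM") // resync echo, deduped
	l.Append("h2", "ADMIN")
	l.Append("h3", "UPSTREAM") // exceeds the cap, version 1 is trimmed
	for _, v := range l.rows {
		fmt.Println(v.VersionNo, v.Source)
	}
}
```

With a cap of 2, the retained rows are versions 2 and 3; version 1 is gone, but the next insert would still be labelled 4.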

3. Scope

In this PR

  • Backend

    • New pkg/core/versioning package: types, memory + GORM stores, service, subscriber, component, and admin mutation intents.
    • New pkg/config/versioning config block with sane defaults and validation. Disabled by default in this release — set versioning.enabled: true to turn it on (see §8).
    • Bootstrap scan that emits a BOOTSTRAP baseline row for every existing rule (idempotent).
    • 4 new REST endpoints × 3 rule kinds, plus 2 cross-cutting operator endpoints for intent reconciliation (/rule-version-intents/:intentId/{repair,abandon}) to recover ledger state when an admin write crashes mid-flight or de-syncs from the registry.
    • Optional expectedVersionId precondition on existing PUT / POST / DELETE.
    • events.SourceRegistryContextKey constant + ZKConfigEventSubscriber emits registry context.
  • Frontend

    • ui-vue3/src/views/traffic/_shared/: RuleHistoryDrawer.vue, RuleHistoryPanel.vue, RuleDiffEditor.vue, ruleVersion.ts.
    • History panel + Monaco diff + rollback modal wired into all three rule pages.
    • 409 conflicts surfaced through notification with duration: 0 and a Reload action.
  • Tests

    • Unit tests across normalize / memory & GORM store / subscriber sources / intent repair and abandon / rollback paths / config validation / ZK delete nil-guard.
    • One end-to-end rollback drill (e2e_rollback_drill_test.go) covering bootstrap → intent-backed admin edit → upstream push → intent-backed rollback → overflow trim → audit-chain assertions.

Out of scope for this PR

| Item | Status | Plan |
|---|---|---|
| AffinityRoute integration | Tracked | Will be done by @mochengqian as an immediate follow-up. AffinityRoute is not currently in governor.RuleResourceKinds, so its write path bypasses the governor that this ledger hooks into. Once it is brought onto the governor path, all four kinds will share the same versioning code. |
| Non-ZK registry source labelling (Nacos / Apollo / ...) | Follow-up | Only ZKConfigEventSubscriber currently emits events.SourceRegistryContextKey. Other registry subscribers fall back to author system:upstream instead of system:nacos / system:apollo. The wiring to fix this is one line per subscriber and is left for a separate PR to keep this one bounded. |
| Remove frontend post-write sleep | Follow-up | Existing traffic-rule pages still wait briefly after writes. Replacing that with deterministic readback needs a smoke re-run, so it is intentionally outside this PR's final cleanup. |
| Batch bootstrap inserts | Follow-up | The v1 bootstrap scan is idempotent and simple, but still O(rules) sequential inserts. Large-install optimization is tracked separately. |

4. Locked design decisions

| # | Choice | Rationale |
|---|---|---|
| 1 | Rollback reason is required | Most important audit-trail field; enforced front + back. |
| 2 | Effective state-change ledger | Consecutive identical content snapshots are deduped; this keeps registry resync noise out of history without losing distinct rule states. |
| 3 | AffinityRoute deferred | Not in governor.RuleResourceKinds; out of scope. |
| 4 | Monaco diff | Reuses an editor dependency already shipped. |
| 5 | Hard delete on trim | Matches the "keep last 5" product semantics; keeps tables small. |
| 6 | expectedVersionId is weak-CAS but intent-aware | Existing CRUD remains backward-compatible when omitted. When provided, the rule lock plus open-intent check catches concurrent admin writes before the meta pointer alone would; a stale writer receives 409 VERSION_LEDGER_PENDING or VERSION_CONFLICT. |

5. API surface

New endpoints (one set per rule kind: condition-rule, tag-rule, configurator)

GET    /api/v1/{kind}/:ruleName/versions
GET    /api/v1/{kind}/:ruleName/versions/:versionId
GET    /api/v1/{kind}/:ruleName/versions/:versionId/diff?against=current|<id>
POST   /api/v1/{kind}/:ruleName/versions/:versionId/rollback
       body: { "reason": "<required>", "expectedVersionId": <optional int64> }

diff?against=current requires the rule to still have a current live version. If the rule has since been deleted, diff-vs-current returns rule-version not found; callers can still diff against an explicit historical version id.

Operator endpoints (cross-cutting, intent reconciliation)

When an admin write crashes between "intent created" and "intent committed/failed", a pending intent can be left dangling and block subsequent edits on the same rule with 409 VERSION_LEDGER_PENDING. Two endpoints let an operator reconcile:

POST   /api/v1/rule-version-intents/:intentId/repair
       Re-applies the intent if its hash matches the current registry state,
       or commits the existing version row if the intent was already applied.
POST   /api/v1/rule-version-intents/:intentId/abandon
       body: { "reason": "<required>" }
       Marks a pending intent as failed so writes on that rule can proceed.
       Refuses to abandon an intent whose content actually matches the current
       rule (use repair instead).

The UI surfaces both as Repair / Abandon buttons inside the VERSION_LEDGER_PENDING notification.
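
The repair / abandon rules described above can be read as a pair of guards on a pending intent. The sketch below is a simplified interpretation under stated assumptions: the `Intent` type, the status labels, the error values, and the `Repair` / `Abandon` functions are all hypothetical, and the real endpoints also handle the "already applied" case, which is omitted here.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// Intent is a hypothetical, reduced view of a mutation intent.
type Intent struct {
	ID          int64
	ContentHash string
	Status      string // "PENDING", "COMMITTED" or "FAILED" (labels assumed)
	Reason      string
}

var (
	ErrReasonRequired = errors.New("abandon requires a reason")
	ErrUseRepair      = errors.New("intent matches the current rule; use repair instead")
	ErrHashMismatch   = errors.New("intent does not match the current rule; repair refused")
)

// Repair commits an intent whose content is what the registry now holds.
func Repair(in *Intent, currentHash string) error {
	if in.ContentHash != currentHash {
		return ErrHashMismatch
	}
	in.Status = "COMMITTED"
	return nil
}

// Abandon marks an intent as failed so later writes on the rule can proceed.
// It refuses when the intent's content is actually live.
func Abandon(in *Intent, currentHash, reason string) error {
	if strings.TrimSpace(reason) == "" {
		return ErrReasonRequired
	}
	if in.ContentHash == currentHash {
		return ErrUseRepair
	}
	in.Status, in.Reason = "FAILED", reason
	return nil
}

func main() {
	in := &Intent{ID: 7, ContentHash: "abc", Status: "PENDING"}
	fmt.Println(Abandon(in, "abc", "admin crashed mid-write")) // refused: content is live
	err := Repair(in, "abc")
	fmt.Println(err, in.Status)
}
```

The two guards are deliberately mirror images: an intent whose hash matches the live rule can only be repaired, and one that does not match can only be abandoned.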

Existing endpoints (backward-compatible)

PUT / POST / DELETE /api/v1/{kind}/:ruleName accept an optional ?expectedVersionId=<id>.
Omit → unchanged behavior. Provide → mismatch returns:

HTTP/1.1 409 Conflict
Content-Type: application/json
{"code":"VERSION_CONFLICT","currentVersionId":5,"message":"rule version conflict"}
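
As a rough illustration of the precondition, the sketch below checks `expectedVersionId` against a current version id and writes the 409 body shown above. It uses only the standard library; `checkExpectedVersion`, `conflictBody` and `tryWrite` are hypothetical names, the 400 response for a malformed id is an assumption, and the real handlers also consult the rule lock and open intents, which is not modelled here.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strconv"
)

// conflictBody mirrors the JSON shape of the VERSION_CONFLICT response.
type conflictBody struct {
	Code             string `json:"code"`
	CurrentVersionID int64  `json:"currentVersionId"`
	Message          string `json:"message"`
}

// checkExpectedVersion reports whether the write may proceed.
// An omitted parameter keeps the old unconditional behavior.
func checkExpectedVersion(w http.ResponseWriter, r *http.Request, current int64) bool {
	raw := r.URL.Query().Get("expectedVersionId")
	if raw == "" {
		return true
	}
	expected, err := strconv.ParseInt(raw, 10, 64)
	if err != nil {
		// Assumption: malformed ids are rejected as a client error.
		http.Error(w, "expectedVersionId must be an integer", http.StatusBadRequest)
		return false
	}
	if expected == current {
		return true
	}
	w.Header().Set("Content-Type", "application/json")
	w.WriteHeader(http.StatusConflict)
	_ = json.NewEncoder(w).Encode(conflictBody{
		Code:             "VERSION_CONFLICT",
		CurrentVersionID: current,
		Message:          "rule version conflict",
	})
	return false
}

// tryWrite runs the check against an in-memory request and recorder.
func tryWrite(target string, current int64) (bool, int) {
	req := httptest.NewRequest(http.MethodPut, target, nil)
	rec := httptest.NewRecorder()
	ok := checkExpectedVersion(rec, req, current)
	return ok, rec.Code
}

func main() {
	ok, code := tryWrite("/api/v1/tag-rule/demo?expectedVersionId=4", 5)
	fmt.Println(ok, code) // stale writer: false 409
}
```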

Feature flag

With versioning.enabled=false, all new endpoints return 503 + {"code":"FEATURE_DISABLED"}. Existing CRUD is completely untouched.
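
One way to picture the feature gate is as a wrapper around the new routes. The sketch below uses plain net/http rather than the project's actual router, and `requireVersioning` and `probe` are hypothetical helper names; only the 503 status and the FEATURE_DISABLED code come from the description above.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// requireVersioning short-circuits with 503 when the feature is off and
// otherwise hands the request to the wrapped handler untouched.
func requireVersioning(enabled bool, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if !enabled {
			w.Header().Set("Content-Type", "application/json")
			w.WriteHeader(http.StatusServiceUnavailable)
			fmt.Fprint(w, `{"code":"FEATURE_DISABLED"}`)
			return
		}
		next.ServeHTTP(w, r)
	})
}

// probe sends one in-memory GET through the gate and returns status and body.
func probe(enabled bool) (int, string) {
	list := http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
		fmt.Fprint(w, `[]`) // stand-in for the version list
	})
	rec := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodGet, "/api/v1/tag-rule/demo/versions", nil)
	requireVersioning(enabled, list).ServeHTTP(rec, req)
	return rec.Code, rec.Body.String()
}

func main() {
	fmt.Println(probe(false))
	fmt.Println(probe(true))
}
```

Because only the new routes are wrapped, the existing CRUD handlers never see the gate.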


6. Database

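Two new ledger tables, rule_version and rule_version_meta, are created via AutoMigrate on the existing GORM connection (MySQL / Postgres); a third table, rule_version_intent, backs the intent ledger (see §8). With store.type=memory, a pure-Go in-memory implementation is used instead, with no config changes needed.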

CREATE TABLE rule_version (
    id, rule_kind, mesh, resource_key, rule_name,
    version_no,            -- monotonic; not reused after trim
    content_hash,          -- sha256(canonical spec json)
    spec_json,
    source,                -- ADMIN | UPSTREAM | ROLLBACK | BOOTSTRAP
    operation,             -- CREATE | UPDATE | DELETE
    author, reason,
    rolled_back_from_id,
    created_at,
    UNIQUE (rule_kind, resource_key, version_no),
    INDEX  (rule_kind, resource_key, created_at DESC),
    INDEX  (rule_kind, content_hash)
);

CREATE TABLE rule_version_meta (
    rule_kind, resource_key,    -- PK
    current_version,            -- nullable; NULL when the rule is deleted
    last_version_no,            -- monotonic
    updated_at
);
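
The content_hash column is described as sha256 over canonical spec JSON. A minimal way to get a stable hash in Go, assuming the spec is already JSON, is to decode and re-encode it so that key order and whitespace stop mattering. `canonicalHash` is a hypothetical helper, not the PR's normalize.go, and decoding into generic values turns numbers into float64, which a real normalizer may need to treat more carefully.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"encoding/json"
	"fmt"
)

// canonicalHash decodes the spec into generic values and re-encodes it.
// encoding/json writes map keys in sorted order, so specs that differ only
// in key order or whitespace produce the same bytes and the same hash.
func canonicalHash(specJSON []byte) (string, error) {
	var v interface{}
	if err := json.Unmarshal(specJSON, &v); err != nil {
		return "", fmt.Errorf("invalid spec json: %w", err)
	}
	canonical, err := json.Marshal(v)
	if err != nil {
		return "", err
	}
	sum := sha256.Sum256(canonical)
	return hex.EncodeToString(sum[:]), nil
}

func main() {
	a, _ := canonicalHash([]byte(`{"key":"demo","enabled":true}`))
	b, _ := canonicalHash([]byte(`{ "enabled": true, "key": "demo" }`))
	fmt.Println(a == b) // same content, different layout
}
```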

Upgrade path: on first start, scan every existing rule and write one source=BOOTSTRAP baseline row. Idempotent — safe to re-run.


7. Verification

Automated tests

go test ./pkg/core/versioning/... \
        ./pkg/config/versioning/... \
        ./pkg/console/handler/... \
        ./pkg/console/service/... \
        ./pkg/store/... \
        ./pkg/core/discovery/subscriber/... \
        ./pkg/core/manager/...

All green. go vet ./pkg/... reports no new warnings.

Frontend build

cd ui-vue3 && npm install --legacy-peer-deps && npm run build

Passes. npm run type-check still reports the pre-existing repo-wide TypeScript debt (home/index.vue, AuthUtil.ts, GrafanaPage, etc.) — that count does not increase under this branch.

Manual smoke

A full bootstrap → admin edit → diff → rollback → optimistic-lock 409 → retention cap → upstream ZK push → cross-rule rollback sweep was run end-to-end against MySQL + ZooKeeper. Evidence (HTTP transcripts, JSON ledger dumps, UI screenshots) is in the PR comments.


8. Upgrade and rollback

  • Default — disabled. This release ships versioning.enabled: false. The feature creates two new tables via AutoMigrate and writes a BOOTSTRAP baseline row per existing rule on first start; making that the default for every upstream user is too noisy. Existing CRUD behavior is unchanged when the feature is off.
  • Enable — set versioning.enabled: true in admin.yml and restart. On first start AutoMigrate creates rule_version / rule_version_meta (+ rule_version_intent for the intent ledger), and the bootstrap scan writes a baseline row per existing rule. Idempotent — safe to re-run.
  • Disable — flip versioning.enabled back to false and restart. CRUD paths are unaffected; new endpoints return 503. The tables are left in place in case the feature is re-enabled later.
  • Full revert — reverting the two feat(versioning) commits is sufficient; the two preparatory commits at the bottom of the ladder are harmless to keep.
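
For orientation, the config block behind these switches might look roughly like the following. Only the keys versioning.enabled and versioning.maxVersionsPerRule, the disabled default, and the default of 5 come from this description; the `Config` type, its methods, and the "at least 1" validation rule are assumptions made for the sketch.

```go
package main

import (
	"errors"
	"fmt"
)

const defaultMaxVersionsPerRule = 5

// Config is a hypothetical shape for the versioning block in admin.yml.
type Config struct {
	Enabled            bool `yaml:"enabled"`
	MaxVersionsPerRule int  `yaml:"maxVersionsPerRule"`
}

// Default returns the shipped defaults: feature off, keep 5 versions.
func Default() Config {
	return Config{Enabled: false, MaxVersionsPerRule: defaultMaxVersionsPerRule}
}

// Sanitize fills in an unset retention cap.
func (c *Config) Sanitize() {
	if c.MaxVersionsPerRule == 0 {
		c.MaxVersionsPerRule = defaultMaxVersionsPerRule
	}
}

// Validate rejects caps that could not retain any version (assumed rule).
func (c *Config) Validate() error {
	if c.MaxVersionsPerRule < 1 {
		return errors.New("versioning.maxVersionsPerRule must be at least 1")
	}
	return nil
}

func main() {
	cfg := Config{Enabled: true} // user set only versioning.enabled: true
	cfg.Sanitize()
	fmt.Println(cfg.MaxVersionsPerRule, cfg.Validate())
}
```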

9. Test plan checklist

  • CI green
  • go test ./pkg/... passes locally
  • cd ui-vue3 && npm run build passes
  • Bootstrap → admin edit → rollback verified manually
  • Optimistic-lock 409 verified manually
  • Retention cap respected after exceeding maxVersionsPerRule

@mochengqian mochengqian force-pushed the feat/Support-version-history-and-rollback-for-traffic-rules branch from 6deb25e to 9596f7e Compare May 22, 2026 05:19
@mochengqian mochengqian changed the title Support version history and rollback for traffic rules feat: Support version history and rollback for traffic rules May 22, 2026
@mochengqian mochengqian marked this pull request as draft May 23, 2026 03:22
@mochengqian mochengqian force-pushed the feat/Support-version-history-and-rollback-for-traffic-rules branch from 2471544 to d2a6ddc Compare May 23, 2026 05:58
@mochengqian mochengqian marked this pull request as ready for review May 23, 2026 12:24
New pkg/core/versioning package + REST endpoints record an immutable
version row for every condition-route, tag-route, and dynamic-config
change. ADMIN writes go through an intent (Begin → Apply → Commit)
that the event-bus subscriber attaches to the matching upstream echo.
UPSTREAM/ROLLBACK/BOOTSTRAP sources land via the same subscriber path.

Default off (versioning.enabled=false). When enabled, GORM AutoMigrate
creates rule_version + rule_version_meta + rule_version_intent; a
bootstrap scan emits one source=BOOTSTRAP row per existing rule.
Retention defaults to 5 versions per rule, trimmed on insert.

Optimistic CAS via ?expectedVersionId=<id> on existing PUT/POST/DELETE;
mismatch returns 409 VERSION_CONFLICT. Open intents on a rule return
409 VERSION_LEDGER_PENDING with intentId, recoverable via
/rule-version-intents/:id/{repair,abandon}.

Shared _shared/{RuleHistoryDrawer,RuleHistoryPanel,RuleDiffEditor,ruleVersion}
components wired into routingRule, tagRule, and dynamicConfig pages.
Monaco diff against current or any historical version; rollback opens
a modal requiring a non-empty reason. Concurrent edits surface a sticky
409 notification with a Reload action; pending intents offer Repair /
Abandon. Existing edit forms now thread expectedVersionId through PUT/
POST/DELETE for optimistic concurrency. MSW mocks cover the new
versioning endpoints and the conflict / pending / disabled rule-name
conventions.
@mochengqian mochengqian force-pushed the feat/Support-version-history-and-rollback-for-traffic-rules branch from 990402d to b525c2b Compare May 24, 2026 01:02
@robocanic robocanic requested a review from Copilot May 24, 2026 10:52
Contributor

Copilot AI left a comment

Pull request overview

This PR introduces immutable version history, diff viewing, and rollback for governor-managed traffic rules (Condition Route, Tag Route, Dynamic Config) in Dubbo Admin, along with optimistic concurrency control to prevent silent overwrites.

Changes:

  • Backend: adds a versioning subsystem (stores, service, subscriber, bootstrap, intent workflow) plus REST endpoints for listing/getting/diffing/rollback and intent repair/abandon.
  • Frontend: adds shared history/diff/rollback UI components, wires them into rule pages, and threads expectedVersionId through mutations with 409 conflict handling.
  • Store/core plumbing: adds ListResources() and aligns empty-index ListByIndexes semantics across memory/GORM stores.

Reviewed changes

Copilot reviewed 62 out of 62 changed files in this pull request and generated 5 comments.

File Description
ui-vue3/src/views/traffic/tagRule/tabs/updateByYAMLView.vue Adds expectedVersionId concurrency + version-error notifications for YAML editing.
ui-vue3/src/views/traffic/tagRule/tabs/updateByFormView.vue Threads version precondition into form updates and handles version conflicts.
ui-vue3/src/views/traffic/tagRule/tabs/formView.vue Integrates history panel entry point and current version badge.
ui-vue3/src/views/traffic/tagRule/index.vue Uses current version id precondition on delete and conflict notifications.
ui-vue3/src/views/traffic/routingRule/tabs/updateByYAMLView.vue Adds expectedVersionId concurrency + version-error notifications for YAML editing.
ui-vue3/src/views/traffic/routingRule/tabs/updateByFormView.vue Threads version precondition into form updates and handles version conflicts.
ui-vue3/src/views/traffic/routingRule/tabs/formView.vue Integrates history panel entry point and current version badge.
ui-vue3/src/views/traffic/routingRule/index.vue Uses current version id precondition on delete and conflict notifications.
ui-vue3/src/views/traffic/dynamicConfig/tabs/YAMLView.vue Adds history panel + expectedVersionId concurrency for YAML-based configurator edits.
ui-vue3/src/views/traffic/dynamicConfig/tabs/formView.vue Adds history panel + expectedVersionId concurrency for form-based configurator edits.
ui-vue3/src/views/traffic/dynamicConfig/index.vue Uses current version id precondition on delete and conflict notifications.
ui-vue3/src/views/traffic/_shared/ruleVersion.ts Shared helpers for fetching current version state and displaying conflict/pending notifications.
ui-vue3/src/views/traffic/_shared/RuleHistoryPanel.vue New history panel orchestrating list/view/diff/rollback flows.
ui-vue3/src/views/traffic/_shared/RuleHistoryDrawer.vue Drawer UI for version timeline and actions.
ui-vue3/src/views/traffic/_shared/RuleDiffEditor.vue Monaco diff editor wrapper for version comparisons.
ui-vue3/src/mocks/handlers/tagRule.ts MSW: simulates version conflicts/pending and records admin writes for tag rules.
ui-vue3/src/mocks/handlers/routingRule.ts MSW: simulates version conflicts/pending and records admin writes for condition rules.
ui-vue3/src/mocks/handlers/dynamicConfig.ts MSW: simulates version conflicts/pending and records admin writes for configurators (with URL decoding).
ui-vue3/src/mocks/handlers/ruleVersion.ts MSW: full in-browser mock ledger + diff/rollback + intent repair/abandon flows.
ui-vue3/src/mocks/handlers.ts Registers ruleVersion MSW handlers.
ui-vue3/src/base/http/request.ts Suppresses generic error toasts for version conflict/pending so the dedicated notifications can be used.
ui-vue3/src/api/service/traffic.ts Adds versioning API surface + threads expectedVersionId into existing mutations.
pkg/store/memory/store.go Adds ListResources() with sorting and error propagation.
pkg/store/memory/store_test.go Tests ListResources() sorting and empty-index semantics.
pkg/store/dbcommon/gorm_store.go Adds ListResources() and aligns empty-index semantics with memory store.
pkg/store/dbcommon/gorm_store_test.go Tests empty-index behavior and ListResources() ordering.
pkg/core/versioning/types.go Defines version/meta/intent models, enums, and shared errors.
pkg/core/versioning/subscriber.go Records upstream changes and attaches events to matching admin intents.
pkg/core/versioning/store.go In-memory immutable ledger store + intent lifecycle + retention trimming + dedup.
pkg/core/versioning/store_gorm.go GORM-backed immutable ledger store + intent lifecycle + trimming + dedup.
pkg/core/versioning/store_gorm_test.go Tests GORM store migration, trimming, dedup, intents, and concurrency monotonicity.
pkg/core/versioning/service.go Versioning service API: list/get/diff, expected-version check, intents, repair helpers.
pkg/core/versioning/normalize.go Canonical JSON normalization and sha256 hashing for dedup and intent matching.
pkg/core/versioning/e2e_rollback_drill_test.go End-to-end drill covering bootstrap, admin edit, upstream push, rollback, and retention trim.
pkg/core/versioning/component.go Runtime component wiring: store selection, event subscriptions, startup repair/bootstrap scan.
pkg/core/store/store.go Extends ResourceStore interface with ListResources().
pkg/core/manager/manager.go Adds List(rk) to manager via store’s ListResources().
pkg/core/manager/manager_test.go Verifies manager List returns sorted resources.
pkg/core/events/eventbus.go Adds SourceRegistryContextKey and clarifies event context immutability expectations.
pkg/core/discovery/subscriber/zk_config.go Adds ZK delete nil-guard and emits source-registry context for version attribution.
pkg/core/discovery/subscriber/zk_config_test.go Tests delete path uses local old rule and missing-local-rule is a noop.
pkg/core/bootstrap/bootstrap.go Registers the versioning component as an optional bootstrap component.
pkg/console/service/tag_rule.go Wraps tag rule mutations with intent-based versioning and expectedVersionId checks.
pkg/console/service/configurator_rule.go Wraps configurator mutations with intent-based versioning and expectedVersionId checks.
pkg/console/service/condition_rule.go Wraps condition rule mutations with intent-based versioning and expectedVersionId checks.
pkg/console/service/rule_version.go Adds console-layer services for version list/get/diff/rollback and intent repair/abandon.
pkg/console/service/rule_version_test.go Covers conflict handling, pending intents, rollback paths, delete marker semantics, and intent recovery.
pkg/console/model/tag_rule.go Exposes force and priority fields in tag rule responses.
pkg/console/model/condition_rule.go Exposes force and priority fields in condition rule responses.
pkg/console/handler/tag_rule.go Adds mutation options parsing and maps versioning conflicts/pending to 409.
pkg/console/handler/configurator_rule.go Adds mutation options parsing and maps versioning conflicts/pending to 409.
pkg/console/handler/condition_rule.go Adds mutation options parsing and maps versioning conflicts/pending to 409.
pkg/console/handler/rule_version.go Implements versioning REST endpoints and error-to-HTTP mapping.
pkg/console/handler/rule_version_test.go Tests status/code mapping for InvalidArgument and pending intent id propagation.
pkg/console/router/router.go Registers versioning endpoints under the existing traffic rule routes + intent ops routes.
pkg/console/context/context.go Exposes RuleVersioning service from the runtime component.
pkg/console/component_test.go Ensures auth middleware blocks rollback endpoint without a session.
pkg/config/versioning/config.go Adds versioning config block, defaults, sanitize and validation.
pkg/config/versioning/config_test.go Tests defaults, sanitize, and validation for versioning config.
pkg/config/app/admin.go Adds versioning config to AdminConfig and ensures defaults for nil config blocks.
pkg/config/app/admin_test.go Tests AdminConfig defaulting behavior when Versioning config is missing.
Comments suppressed due to low confidence (1)

ui-vue3/src/views/traffic/dynamicConfig/tabs/YAMLView.vue:205

  • The catch block only calls notifyRuleVersionError(...) and then swallows the exception. This can hide non-versioning failures (notably YAML parse errors from yaml.load or other runtime exceptions) because the request interceptor won’t run for those cases. Consider rethrowing or showing a generic error toast when notifyRuleVersionError returns false.

Comment thread pkg/core/versioning/component.go
Comment thread ui-vue3/src/views/traffic/tagRule/tabs/updateByYAMLView.vue
Comment thread ui-vue3/src/views/traffic/routingRule/tabs/updateByYAMLView.vue
Comment thread pkg/console/handler/rule_version.go Outdated
Comment thread pkg/console/handler/rule_version.go Outdated

Development

Successfully merging this pull request may close these issues.

[Feature] Support version history and rollback for traffic rules

2 participants