
[refactor] Split CNI into watcher/handler under felix#774

Open
sknat wants to merge 2 commits into master from nsk-split-cni

@sknat sknat commented Sep 2, 2025

Collaborator

This patch splits the CNI watcher and handlers
into two pieces. Handling is done in the main
'felix' goroutine, while the watcher / gRPC server
lives under watchers/ and neither stores nor accesses
agent state.
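
As a rough illustration of that split, here is a minimal Go sketch. It is not the code from this PR: `cniRequest`, `cniWatcher`, `felixLoop`, `start` and `errUnknownPod` are hypothetical names, the gRPC layer is replaced by plain method calls, and the reply channel is only an assumption about how a result could travel back to the caller. The watcher side holds nothing but the queue, and all pod state lives in the single handling goroutine.

```go
package main

import (
	"errors"
	"fmt"
)

// errUnknownPod is a hypothetical error used only by this sketch.
var errUnknownPod = errors.New("unknown pod")

// cniRequest is a hypothetical message: one CNI call plus a channel
// on which the handling goroutine sends back the result.
type cniRequest struct {
	pod   string
	del   bool
	reply chan error
}

// cniWatcher stands in for the gRPC front end under watchers/.
// It holds no agent state, only the queue it forwards requests to.
type cniWatcher struct {
	queue chan<- cniRequest
}

// call forwards one request and blocks until the handler answers.
func (w *cniWatcher) call(pod string, del bool) error {
	req := cniRequest{pod: pod, del: del, reply: make(chan error, 1)}
	w.queue <- req
	return <-req.reply
}

func (w *cniWatcher) Add(pod string) error { return w.call(pod, false) }
func (w *cniWatcher) Del(pod string) error { return w.call(pod, true) }

// felixLoop stands in for the main 'felix' goroutine. The pods map is
// local to it, so no other goroutine can read or write that state.
func felixLoop(queue <-chan cniRequest, done chan<- int) {
	pods := map[string]struct{}{}
	for req := range queue {
		_, known := pods[req.pod]
		switch {
		case !req.del:
			pods[req.pod] = struct{}{} // real code would program VPP here
			req.reply <- nil
		case known:
			delete(pods, req.pod)
			req.reply <- nil
		default:
			req.reply <- errUnknownPod
		}
	}
	done <- len(pods) // report how many pods remain once the queue closes
}

// start wires a watcher to a running handler loop and returns a stop
// function that closes the queue and yields the remaining pod count.
func start() (*cniWatcher, func() int) {
	queue := make(chan cniRequest)
	done := make(chan int)
	go felixLoop(queue, done)
	stop := func() int {
		close(queue)
		return <-done
	}
	return &cniWatcher{queue: queue}, stop
}

func main() {
	w, stop := start()
	fmt.Println(w.Add("pod-a"), w.Del("pod-b"))
	fmt.Println("pods left:", stop())
}
```

In this sketch the watcher blocks per request, which serializes CNI calls behind the handler; whether the real server does the same is not stated here.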

The intent is to move away from a model in which multiple
servers replicate state and communicate over a pubsub. That
model is prone to race conditions and deadlocks, and it brings
few benefits: scale & asynchronicity are not a constraint on
nodes with a relatively small number of pods (~100), as is the
k8s default.

@sknat sknat self-assigned this Sep 2, 2025
@sknat sknat force-pushed the nsk-split-cni branch 2 times, most recently from 2e5ecdf to 0a5dc4b Compare October 21, 2025 13:35
@sknat sknat added this to the agent refactoring single thread milestone Nov 17, 2025
@sknat sknat changed the title from "Split CNI into watcher/handler under felix" to "[refactor] Split CNI into watcher/handler under felix" Jan 7, 2026
This patch splits the felix server in two pieces:
- a felix watcher placed under `agent/watchers/felix`
- a felix server placed under `agent/felix`

The former is only responsible for watching
and submitting events into a single event queue.
The latter receives the events in a single goroutine
and programs VPP from that single thread.
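
The watcher / server split described above can be sketched in Go roughly as follows. This is a minimal sketch under stated assumptions, not the code from this patch: `Event`, `watch`, `Server` and `runAgent` are hypothetical names, and updating a map stands in for programming VPP. Several watchers push into one queue, and a single goroutine drains it and owns all state.

```go
package main

import (
	"fmt"
	"sync"
)

// Event is a hypothetical message type; the patch does not show the real one.
type Event struct {
	Kind string // "add" or "del"
	Pod  string
}

// watch stands in for a watcher under agent/watchers/: it keeps no agent
// state and only forwards what it observes into the shared queue.
func watch(observed []Event, queue chan<- Event, wg *sync.WaitGroup) {
	defer wg.Done()
	for _, ev := range observed {
		queue <- ev
	}
}

// Server stands in for the handler side. Its fields are only touched from
// the goroutine that calls Run, so no mutex is needed.
type Server struct {
	Pods    map[string]bool
	Handled int
}

// Run drains the queue until it is closed, applying events one at a time.
func (s *Server) Run(queue <-chan Event) {
	for ev := range queue {
		switch ev.Kind {
		case "add":
			s.Pods[ev.Pod] = true // real code would program VPP here
		case "del":
			delete(s.Pods, ev.Pod)
		}
		s.Handled++
	}
}

// runAgent wires any number of watchers to one queue and one server.
func runAgent(sources ...[]Event) *Server {
	queue := make(chan Event, 16)
	var wg sync.WaitGroup
	for _, src := range sources {
		wg.Add(1)
		go watch(src, queue, &wg)
	}
	// Close the queue once every watcher has finished sending.
	go func() {
		wg.Wait()
		close(queue)
	}()

	s := &Server{Pods: map[string]bool{}}
	s.Run(queue) // handling happens here, in the calling goroutine only
	return s
}

func main() {
	s := runAgent(
		[]Event{{"add", "pod-a"}, {"del", "pod-a"}},
		[]Event{{"add", "pod-b"}},
	)
	fmt.Println(len(s.Pods), s.Handled)
}
```

Events from one watcher keep their order because a channel is FIFO per sender; ordering between different watchers is not defined in this sketch.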

The intent is to move away from a model in which multiple
servers replicate state and communicate over a pubsub. That
model is prone to race conditions and deadlocks, and it brings
few benefits: scale & asynchronicity are not a constraint on
nodes with a relatively small number of pods (~100), as is the
k8s default.

Signed-off-by: Nathan Skrzypczak <nathan.skrzypczak@gmail.com>
@aritrbas

Collaborator

Rebased on latest master to resolve merge conflicts and applied some fixes to comply with the latest Felix API updates in release/v3.31.0 and release/v3.32.0 as well as the NPOL and CNAT changes in VPP.

This patch splits the CNI watcher and handlers
into two pieces. Handling is done in the main
'felix' goroutine, while the watcher / gRPC server
lives under watchers/ and neither stores nor accesses
agent state.

The intent is to move away from a model in which multiple
servers replicate state and communicate over a pubsub. That
model is prone to race conditions and deadlocks, and it brings
few benefits: scale & asynchronicity are not a constraint on
nodes with a relatively small number of pods (~100), as is the
k8s default.

Signed-off-by: Nathan Skrzypczak <nathan.skrzypczak@gmail.com>
