lib: utils/fdt: Cache CPU intc phandle->hartid lookups#9

Merged
cp0613 merged 1 commit into
XUANTIE-RV:opensbi-v1.7-dev from
cp0613:opensbi-v1.7-dev
May 29, 2026

@cp0613 cp0613 commented May 29, 2026
Collaborator

CLINT/PLIC/PLMT/PLICSW probing walks interrupts-extended and, for each entry, resolves the CPU intc phandle to a hartid via fdt_node_offset_by_phandle() + fdt_parent_offset() + fdt_parse_hart_id(). The first two are libfdt helpers that each do an O(FDT_size) linear scan of the structure block, so on an N-hart system each driver pays O(N * FDT_size), and the same handful of intc phandles is re-resolved by multiple drivers (PLIC + MSWI + MTIMER + ...).

Build a small cache by walking /cpus once: for each cpu node, record its child intc node's phandle paired with the parsed hartid. Subsequent lookups become an O(harts) linear scan over the cache instead of two full FDT walks per entry. The cache is keyed on the FDT pointer so a new fdt invalidates it implicitly.
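For readers who want the shape of the cache without reading the patch, here is a minimal sketch. It is not the code from this PR: `struct cpu_node`, `struct intc_cache`, `cache_build()`, `intc_phandle_to_hartid()`, and `MAX_HARTS` are hypothetical names, and the `cpus` array stands in for the single walk over `/cpus` so the example runs without libfdt.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_HARTS 8 /* hypothetical capacity, not taken from the patch */

/* Hypothetical stand-in for what one walk over /cpus would yield per cpu node. */
struct cpu_node {
	uint32_t intc_phandle; /* phandle of the cpu's child intc node */
	uint32_t hartid;       /* hartid parsed from the cpu node */
};

struct intc_cache {
	const void *fdt; /* blob the cache was built from; NULL means empty */
	size_t count;
	struct cpu_node entries[MAX_HARTS];
};

static struct intc_cache cache;
static unsigned long cache_builds; /* counts rebuilds, for illustration only */

/* Fill the cache once per FDT; models the single /cpus walk. */
static void cache_build(const void *fdt, const struct cpu_node *cpus, size_t n)
{
	assert(n <= MAX_HARTS);
	for (size_t i = 0; i < n; i++)
		cache.entries[i] = cpus[i];
	cache.count = n;
	cache.fdt = fdt;
	cache_builds++;
}

/* Returns 0 and stores the hartid on a hit, -1 if the phandle is unknown. */
static int intc_phandle_to_hartid(const void *fdt,
				  const struct cpu_node *cpus, size_t n,
				  uint32_t phandle, uint32_t *hartid)
{
	/* A different FDT pointer implicitly invalidates the cache. */
	if (cache.fdt != fdt)
		cache_build(fdt, cpus, n);

	/* O(harts) scan over the cache instead of two full FDT walks. */
	for (size_t i = 0; i < cache.count; i++) {
		if (cache.entries[i].intc_phandle == phandle) {
			*hartid = cache.entries[i].hartid;
			return 0;
		}
	}
	return -1;
}
```

Repeated lookups against the same FDT pointer reuse the cached pairs, and passing a different pointer triggers exactly one rebuild.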

Also move the hwirq filter ahead of the phandle resolution at each callsite so non-matching interrupts-extended entries skip the lookup entirely.
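The reordering can be illustrated with a small sketch, assuming each interrupts-extended entry is a (phandle, hwirq) pair. `struct irq_entry`, `resolve_hartid()`, and `collect_harts()` are hypothetical names, and the resolver is a dummy that only counts calls; the interrupt numbers are the standard RISC-V machine and supervisor external interrupt codes.

```c
#include <stddef.h>
#include <stdint.h>

#define IRQ_S_EXT 9  /* supervisor external interrupt */
#define IRQ_M_EXT 11 /* machine external interrupt */

/* One (phandle, hwirq) pair from an interrupts-extended property. */
struct irq_entry {
	uint32_t phandle;
	uint32_t hwirq;
};

static unsigned long lookups; /* counts phandle resolutions, for illustration */

/* Dummy resolver (assumption): phandle 100 + k maps to hart k. */
static int resolve_hartid(uint32_t phandle, uint32_t *hartid)
{
	lookups++;
	if (phandle < 100)
		return -1;
	*hartid = phandle - 100;
	return 0;
}

/* Collect hartids of entries whose hwirq matches; returns how many were stored. */
static size_t collect_harts(const struct irq_entry *ents, size_t n,
			    uint32_t wanted_hwirq,
			    uint32_t *harts, size_t max)
{
	size_t found = 0;

	for (size_t i = 0; i < n && found < max; i++) {
		uint32_t hartid;

		/* Cheap filter first: non-matching entries never reach the lookup. */
		if (ents[i].hwirq != wanted_hwirq)
			continue;

		if (resolve_hartid(ents[i].phandle, &hartid))
			continue;

		harts[found++] = hartid;
	}
	return found;
}
```

With two entries per hart (one M-mode, one S-mode), filtering first halves the number of resolutions in this model.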

Measured on an 8-hart system (release build, mtime @ 25MHz, 1M ticks = 40 ms):

sbi_irqchip_init: 15.08M -> 3.64M (~4.15x; 603 -> 146 ms)
sbi_ipi_init: 14.75M -> 2.82M (~5.23x; 590 -> 113 ms)
sbi_timer_init: 14.86M -> 2.94M (~5.06x; 594 -> 118 ms)
combined: 44.69M -> 9.39M (~4.76x; 1788 -> 376 ms)
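The millisecond figures follow directly from the tick counts at 25 MHz. A small helper makes the conversion explicit; `ticks_to_ms()` and `MTIME_HZ` are names chosen for this sketch, and rounding to the nearest millisecond is an assumption about how the numbers above were produced.

```c
#include <stdint.h>

#define MTIME_HZ 25000000ULL /* mtime frequency stated in the measurement */

/* Convert mtime ticks to milliseconds, rounded to the nearest integer. */
static uint64_t ticks_to_ms(uint64_t ticks)
{
	return (ticks * 1000ULL + MTIME_HZ / 2) / MTIME_HZ;
}
```

For example, `ticks_to_ms(1000000)` gives 40, and `ticks_to_ms(15080000)` gives 603, matching the sbi_irqchip_init row.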

Signed-off-by: Chen Pei <cp0613@linux.alibaba.com>
@cp0613 cp0613 merged commit 64355e7 into XUANTIE-RV:opensbi-v1.7-dev May 29, 2026
2 of 3 checks passed