k_sem_give faults (cache access error) from IRAM ISR: z_sem_pop_waiter is flash-resident

## Summary

`k_sem_give()` is marked `K_ISR_SAFE` (i.e. `IRAM_ATTR`) so it can be called
from ESP-IDF interrupts that may run while the flash cache is disabled. However,
on its hot path it calls the static helper `z_sem_pop_waiter()`
(`components/zkernel/src/k_sem.c`), which is **not** `K_ISR_SAFE` and therefore
lands in flash-mapped (cached) text. When `k_sem_give()` is invoked from an IRAM
ISR during a window where the cache is disabled (e.g. a concurrent SPI
flash/NVS write or erase), fetching `z_sem_pop_waiter()` faults with a cache
access error and the chip panics.

## Why this matters

The inline comment on `k_sem_give()` documents the exact contract this breaks:
it is `K_ISR_SAFE` so that an interrupt allocated with `ESP_INTR_FLAG_IRAM` can
give a semaphore safely while the cache is off. A very common path that hits
this is an IRAM-registered GPIO interrupt whose handler calls `k_work_submit()`
-- work submission wakes the target work queue via `k_sem_give()`. The boreas
GPIO driver installs its ISR service with `ESP_INTR_FLAG_IRAM`
(`components/zdevice/src/gpio_dt.c`), so any such handler that submits work and
happens to fire during a flash operation will fault.

## Observed

The chip panics with `Guru Meditation Error: ... Cache error / Cache access error`.
Symbolized backtrace (innermost first):

```
z_sem_pop_waiter            k_sem.c            <-- faulted here (cache error)
k_sem_give                  k_sem.c
k_work_submit_internal      k_work.c
k_work_submit_to_queue      k_work.c
k_work_submit               k_work.c
<IRAM GPIO ISR handler> -> k_work_submit
gpio_esp32_isr              gpio_dt.c
<esp_driver_gpio ISR dispatch>
```

`MEPC` resolves into the flash-mapped region (`0x4200_0000+`) at
`z_sem_pop_waiter`; every other frame on the stack is in IRAM (`0x4080_0000+`).
The fault is intermittent by nature -- it requires the interrupt to land inside
a cache-disabled flash window.
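As a triage aid, the address check above can be sketched in C. This is a hedged illustration, not code from the repository: `classify_pc()` and `first_flash_frame()` are hypothetical names, and the sketch uses only the two region bases quoted in this report, with no upper bounds, so it is not a full memory map.

```c
#include <stdint.h>

/* Region bases quoted in this report for the esp32c5 target. Upper bounds
 * are deliberately omitted (assumption: anything at or above a base belongs
 * to that region), so treat this as a triage aid, not a memory map. */
#define FLASH_TEXT_BASE 0x42000000u
#define IRAM_BASE       0x40800000u

enum pc_region { PC_OTHER, PC_IRAM, PC_FLASH };

/* Hypothetical helper: classify a program counter from a backtrace frame. */
static enum pc_region classify_pc(uint32_t pc)
{
    if (pc >= FLASH_TEXT_BASE) {
        return PC_FLASH;
    }
    if (pc >= IRAM_BASE) {
        return PC_IRAM;
    }
    return PC_OTHER;
}

/* Return the index of the first flash-resident frame, or -1 if none.
 * Frames are expected innermost first, as in the backtrace above. */
static int first_flash_frame(const uint32_t *pcs, int count)
{
    for (int i = 0; i < count; i++) {
        if (classify_pc(pcs[i]) == PC_FLASH) {
            return i;
        }
    }
    return -1;
}
```

Fed the frame addresses of a backtrace taken in ISR context, a result other than `-1` points at the first frame that would need the cache to be enabled.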

## Root cause

`z_sem_pop_waiter()` (the wake-target popper called on `k_sem_give()`'s hot
path, and also by `k_sem_reset()`) lacks `K_ISR_SAFE`, so it is flash-resident
while its ISR-safe caller is in IRAM. Code that can run while the cache is
disabled may call only IRAM-resident code; this one link in the `k_sem_give`
call graph violates that.

## Proposed fix

Mark `z_sem_pop_waiter()` `K_ISR_SAFE`. The function is pure list-walking over
the caller-owned waiter list with no FreeRTOS calls, so it is safe to place in
IRAM. Its other caller, `k_sem_reset()`, is flash-resident, but flash code
calling an IRAM function is fine.

```c
static K_ISR_SAFE struct z_sem_waiter *z_sem_pop_waiter(struct k_sem *sem)
```
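For illustration only, the annotated helper could look roughly like the sketch below. The struct layouts and the head-of-list pop are assumptions made to keep the example self-contained; the real definitions in `k_sem.c` may differ (for example, in how the waiter is selected). `K_ISR_SAFE` is defined empty here purely so the sketch builds on a host.

```c
#include <stddef.h>

/* On target, K_ISR_SAFE maps to IRAM_ATTR per this report; it is defined
 * empty here only so the sketch builds on a host. */
#ifndef K_ISR_SAFE
#define K_ISR_SAFE
#endif

/* Hypothetical layouts: the real structs in k_sem.c may differ. */
struct z_sem_waiter {
    struct z_sem_waiter *next;
};

struct k_sem {
    struct z_sem_waiter *waiters; /* head of the waiter list */
};

/* Detach and return the waiter at the head of the list, or NULL if nobody
 * is waiting. Touches only caller-owned memory and calls no other
 * function, so it pulls no further helpers into the ISR call graph. */
static K_ISR_SAFE struct z_sem_waiter *z_sem_pop_waiter(struct k_sem *sem)
{
    struct z_sem_waiter *w = sem->waiters;

    if (w != NULL) {
        sem->waiters = w->next;
        w->next = NULL;
    }
    return w;
}
```

The point of the sketch is the shape rather than the details: a helper reachable from a `K_ISR_SAFE` entry point carries the same attribute and calls nothing that is not itself in IRAM.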

Verified on an esp32c5 target: the symbol relocates from the flash-mapped
region into IRAM, the cache-error panic no longer reproduces, the host test
suite is unaffected, and `clang-format` stays clean. IRAM cost is a single
small list-walk function (negligible).

## Suggested follow-up

Audit the rest of the ISR-reachable call graph from `K_ISR_SAFE` entry points
(`k_sem_give`, `k_sem_take` with `K_NO_WAIT`, the `k_work_submit` chain) for any
other static helpers that are flash-resident. The same class of bug -- an
`IRAM_ATTR` function calling a non-`IRAM_ATTR` helper -- would be latent
anywhere a helper was factored out without carrying the attribute.

