Missing `probe_type` in `probe.annotations` / `probes_info` when probe is built via `read_spikeglx` #424

### Summary
For a recording loaded with `spikeinterface.extractors.read_spikeglx(...)` (or a probe built directly via `probeinterface.read_spikeglx(meta_file)`), the probe-level annotations dict does **not** contain `probe_type`, while the IMRO-based path (`probeinterface.read_imro`) does set it. Any downstream code that does `recording.get_annotation("probes_info")[0]["probe_type"]` raises a `KeyError` after upgrading.

### Environment confirmed on
- `probeinterface == 0.3.2`
- `spikeinterface == 0.104.1`
- Probe: NP1.0 (`imDatPrb_pn = "PRB_1_4_0480_1_C"`)

### How it was discovered
A long-running notebook used to read the probe family via:

```python
probes_info = raw_rec.get_annotation("probes_info")
probe_type  = probes_info[0]["probe_type"]   # used to work; now KeyError
```

After upgrading, `probe_type` was gone. Printing the package versions, `probe.annotations`, and the `probes_info` annotation for the *same* recording on the versions above gives:

```
probeinterface: 0.3.2 ..spikeinterface\probeinterface\src\probeinterface\__init__.py
spikeinterface: 0.104.1 ..spikeinterface\spikeinterface\src\spikeinterface\__init__.py
annotations dict: {'model_name': 'PRB_1_4_0480_1_C', 'manufacturer': 'imec', 'description': 'Neuropixels 1.0 probe with cap', 'shank_tips': [[24.0, -220.0]], 'adc_bit_depth': 10, 'num_readout_channels': 384, 'ap_sample_frequency_hz': 30000.0, 'lf_sample_frequency_hz': 2500.0, 'adc_sampling_table': '(32,12)(0 1 24 25 48 49 72 73 96 97 120 121 144 145 168 169 192 193 216 217 240 241 264 265 288 289 312 313 336 337 360 361)(2 3 26 27 50 51 74 75 98 99 122 123 146 147 170 171 194 195 218 219 242 243 266 267 290 291 314 315 338 339 362 363)(4 5 28 29 52 53 76 77 100 101 124 125 148 149 172 173 196 197 220 221 244 245 268 269 292 293 316 317 340 341 364 365)(6 7 30 31 54 55 78 79 102 103 126 127 150 151 174 175 198 199 222 223 246 247 270 271 294 295 318 319 342 343 366 367)(8 9 32 33 56 57 80 81 104 105 128 129 152 153 176 177 200 201 224 225 248 249 272 273 296 297 320 321 344 345 368 369)(10 11 34 35 58 59 82 83 106 107 130 131 154 155 178 179 202 203 226 227 250 251 274 275 298 299 322 323 346 347 370 371)(12 13 36 37 60 61 84 85 108 109 132 133 156 157 180 181 204 205 228 229 252 253 276 277 300 301 324 325 348 349 372 373)(14 15 38 39 62 63 86 87 110 111 134 135 158 159 182 183 206 207 230 231 254 255 278 279 302 303 326 327 350 351 374 375)(16 17 40 41 64 65 88 89 112 113 136 137 160 161 184 185 208 209 232 233 256 257 280 281 304 305 328 329 352 353 376 377)(18 19 42 43 66 67 90 91 114 115 138 139 162 163 186 187 210 211 234 235 258 259 282 283 306 307 330 331 354 355 378 379)(20 21 44 45 68 69 92 93 116 117 140 141 164 165 188 189 212 213 236 237 260 261 284 285 308 309 332 333 356 357 380 381)(22 23 46 47 70 71 94 95 118 119 142 143 166 167 190 191 214 215 238 239 262 263 286 287 310 311 334 335 358 359 382 383)', 'num_adcs': 32, 'num_channels_per_adc': 12, 'serial_number': '19108303752', 'part_number': 'PRB_1_4_0480_1_C', 'port': '1', 'slot': '2'}
probes_info anno: [{'model_name': 'PRB_1_4_0480_1_C', 'manufacturer': 'imec', 'description': 'Neuropixels 1.0 probe with cap', 'shank_tips': [[24.0, -220.0]], 'adc_bit_depth': 10, 'num_readout_channels': 384, 'ap_sample_frequency_hz': 30000.0, 'lf_sample_frequency_hz': 2500.0, 'adc_sampling_table': '(32,12)(0 1 24 25 48 49 72 73 96 97 120 121 144 145 168 169 192 193 216 217 240 241 264 265 288 289 312 313 336 337 360 361)(2 3 26 27 50 51 74 75 98 99 122 123 146 147 170 171 194 195 218 219 242 243 266 267 290 291 314 315 338 339 362 363)(4 5 28 29 52 53 76 77 100 101 124 125 148 149 172 173 196 197 220 221 244 245 268 269 292 293 316 317 340 341 364 365)(6 7 30 31 54 55 78 79 102 103 126 127 150 151 174 175 198 199 222 223 246 247 270 271 294 295 318 319 342 343 366 367)(8 9 32 33 56 57 80 81 104 105 128 129 152 153 176 177 200 201 224 225 248 249 272 273 296 297 320 321 344 345 368 369)(10 11 34 35 58 59 82 83 106 107 130 131 154 155 178 179 202 203 226 227 250 251 274 275 298 299 322 323 346 347 370 371)(12 13 36 37 60 61 84 85 108 109 132 133 156 157 180 181 204 205 228 229 252 253 276 277 300 301 324 325 348 349 372 373)(14 15 38 39 62 63 86 87 110 111 134 135 158 159 182 183 206 207 230 231 254 255 278 279 302 303 326 327 350 351 374 375)(16 17 40 41 64 65 88 89 112 113 136 137 160 161 184 185 208 209 232 233 256 257 280 281 304 305 328 329 352 353 376 377)(18 19 42 43 66 67 90 91 114 115 138 139 162 163 186 187 210 211 234 235 258 259 282 283 306 307 330 331 354 355 378 379)(20 21 44 45 68 69 92 93 116 117 140 141 164 165 188 189 212 213 236 237 260 261 284 285 308 309 332 333 356 357 380 381)(22 23 46 47 70 71 94 95 118 119 142 143 166 167 190 191 214 215 238 239 262 263 286 287 310 311 334 335 358 359 382 383)', 'num_adcs': 32, 'num_channels_per_adc': 12, 'serial_number': '19108303752', 'part_number': 'PRB_1_4_0480_1_C', 'port': '1', 'slot': '2'}]
```

Note that `'probe_type'` is absent from both. Every other identifying field (`manufacturer`, `model_name`, `serial_number`, `part_number`, `port`, `slot`) is present, so the recording’s `set_probe` / `get_probe` round-trip is otherwise healthy — this is specifically a reader gap.
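
Because `model_name` and `part_number` are still present, downstream code can guard the lookup until the reader is fixed. The following is a minimal sketch of such a guard, not a recommended API: `get_probe_type` is a hypothetical helper, and `probes_info` here is a plain list of dicts trimmed from the output above rather than a live recording annotation.

```python
def get_probe_type(probes_info, index=0, default=None):
    """Return the probe_type annotation, or `default` when the reader omitted it."""
    return probes_info[index].get("probe_type", default)


# Trimmed copy of the annotation printed in this report (no 'probe_type' key).
probes_info = [{"model_name": "PRB_1_4_0480_1_C", "manufacturer": "imec"}]

probe_type = get_probe_type(probes_info)
if probe_type is None:
    # Fall back to an identifier that the reader still provides.
    probe_family = probes_info[0]["model_name"]
else:
    probe_family = probe_type
```

This only avoids the `KeyError`; the fallback value is a part number, not a probe type, so callers still need to handle both forms.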

I cannot pinpoint the exact version where `probe_type` went missing (the notebook has been in production for a while and the regression surfaced only recently). It is definitely absent on `spikeinterface 0.104.1` + `probeinterface 0.3.2`, and was present on some earlier release. Pointers on when the divergence happened would be welcome.

### Root cause
`probeinterface/neuropixels_tools.py::read_spikeglx` calls `build_neuropixels_probe` (which sets `model_name`, `manufacturer`, `description`, and several ADC/sampling annotations) and then adds `serial_number`, `part_number`, `port`, `slot` via `probe.annotate(...)` (around L864–L867). It never calls `probe.annotate(probe_type=...)`.

By contrast `read_imro` does (around L759–L760):

```python
probe_type = imro_str.strip().split(")")[0].split(",")[0][1:]
probe.annotate(probe_type=probe_type)
```
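
To make that expression concrete, here is a small standalone sketch of the same parsing step. The function name `probe_type_from_imro` and the sample string are illustrative assumptions; the sample only mimics the IMRO layout (a `(type,num_channels)` header followed by per-channel entries) and does not come from a real file.

```python
def probe_type_from_imro(imro_str: str) -> str:
    """Extract the probe type from the IMRO header, e.g. '(0,384)' -> '0'."""
    header = imro_str.strip().split(")")[0]  # '(0,384'
    return header.split(",")[0][1:]          # drop the leading '('


# Illustrative IMRO string: header plus two channel entries (values made up).
sample_imro = "(0,384)(0 0 0 500 250 1)(1 0 0 500 250 1)\n"
print(probe_type_from_imro(sample_imro))  # 0
```

The result is a string, so a `probe_type` written by `read_spikeglx` should presumably also be a string to stay comparable.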

SpikeInterface’s `BaseRecordingSnippets._set_probes` stores `probe.annotations` verbatim as the recording annotation `probes_info`, so whatever `read_spikeglx` omits is simply absent from the recording forever.

### Minimal repro
```python
import probeinterface
probe = probeinterface.read_spikeglx("/path/to/run_g0_t0.imec0.ap.meta")
print("probe_type" in probe.annotations)   # False
print(sorted(probe.annotations))
```

### Suggested fix (narrow)
In `probeinterface/neuropixels_tools.py::read_spikeglx`, in the "recording-specific annotations" block around L859–L867, add the same annotation that `read_imro` already does. The value is directly available in the meta and, as a fallback, via the existing `probe_part_number_to_probe_type` mapping:

```python
# Prefer the explicit meta field; fall back to the part-number mapping.
imDatPrb_type = (
    meta.get("imDatPrb_type")
    or probe_part_number_to_probe_type.get(imDatPrb_pn)
)
probe.annotate(probe_type=imDatPrb_type)
```

Two lines, symmetric with `read_imro`.
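
The fallback order can be exercised in isolation. This is a sketch under stated assumptions: `resolve_probe_type` is a hypothetical helper, `meta` is modelled as a plain dict, and `part_number_to_type` is a one-entry stand-in for the library's mapping, not its real contents. It tests `is None` rather than truthiness, so a falsy but valid value would not be discarded if the meta parser ever returned non-string types.

```python
def resolve_probe_type(meta, part_number_to_type):
    """Prefer the explicit meta field, else map the part number; None if neither resolves."""
    probe_type = meta.get("imDatPrb_type")
    if probe_type is None:
        probe_type = part_number_to_type.get(meta.get("imDatPrb_pn"))
    return probe_type


# Stand-in for the library's mapping; the single entry is illustrative only.
part_number_to_type = {"PRB_1_4_0480_1_C": "0"}

explicit = resolve_probe_type(
    {"imDatPrb_type": "0", "imDatPrb_pn": "PRB_1_4_0480_1_C"}, part_number_to_type
)
mapped = resolve_probe_type({"imDatPrb_pn": "PRB_1_4_0480_1_C"}, part_number_to_type)
unresolved = resolve_probe_type({"imDatPrb_pn": "not-in-mapping"}, part_number_to_type)
```

When neither source resolves, the helper returns `None`; whether the reader should then skip the annotation or store `None` is a choice for the maintainers.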

### Design question for maintainers
The underlying reason this regression is so easy to introduce is structural, not just a missing key:

- The probe-level `annotations` dict is **not** serialised by `Probe.to_numpy` / `ProbeGroup.to_numpy` (which is what `_set_probes` uses to persist the probe onto the recording). Annotations survive only through the parallel side-channel `recording.annotations["probes_info"]`, which is rehydrated in `baserecordingsnippets.py` around L264–L270.
- Therefore the set of keys available after `get_probe()` is *entirely* determined by whichever `probe.annotate(...)` calls happen to live in each reader. There is no documented minimum contract, so any reader refactor can silently drop keys that downstream users depend on — which is exactly what happened here.

It would help to pick, and document, one of:

1. **Reader contract.** `read_spikeglx`, `read_openephys`, `read_imro`, etc. must all guarantee a minimum set of annotation keys — proposed minimum: `probe_type`, `model_name`, `manufacturer`, `serial_number`, `part_number`. Enforced via a shared test?
2. **Single source of truth on `Probe`.** Make `probe.annotations` the authoritative store and have `Probe.to_numpy` / `from_numpy` round-trip scalar annotations, so `_set_probes` / `get_probe` no longer rely on a parallel `probes_info` annotation that can go out of sync with the probe itself.
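
For option 1, the shared test could reduce to a key check run against the output of every reader. A minimal sketch follows, using the proposed minimum key set from above; `missing_annotation_keys` is a hypothetical helper, and the sample dict is a trimmed copy of the annotations printed in this report rather than the output of a real reader call.

```python
REQUIRED_KEYS = ("probe_type", "model_name", "manufacturer", "serial_number", "part_number")


def missing_annotation_keys(annotations, required=REQUIRED_KEYS):
    """Return the required keys absent from a probe's annotations, in contract order."""
    return [key for key in required if key not in annotations]


# Trimmed copy of the annotations dict shown earlier in this report.
annotations = {
    "model_name": "PRB_1_4_0480_1_C",
    "manufacturer": "imec",
    "serial_number": "19108303752",
    "part_number": "PRB_1_4_0480_1_C",
    "port": "1",
    "slot": "2",
}
print(missing_annotation_keys(annotations))  # ['probe_type']
```

A per-reader test would then assert that this list is empty for a probe built from each supported file format.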


