Missing probe_type in probe.annotations / probes_info when probe is built via read_spikeglx
Summary
For a recording loaded with spikeinterface.extractors.read_spikeglx(...) (or a probe built directly via probeinterface.read_spikeglx(meta_file)), the probe-level annotations dict does not contain probe_type, while the IMRO-based path (probeinterface.read_imro) does set it. The key disappeared without notice, so any downstream code that does recording.get_annotation("probes_info")[0]["probe_type"] raises a KeyError after upgrading.
Environment confirmed on
- probeinterface == 0.3.2
- spikeinterface == 0.104.1
- Probe: NP1.0 (imDatPrb_pn = "PRB_1_4_0480_1_C")
How it was discovered
A long-running notebook used to read the probe family via:
probes_info = raw_rec.get_annotation("probes_info")
probe_type = probes_info[0]["probe_type"] # used to work; now KeyError
After upgrading, probe_type was gone. Running the diagnostic snippet against the same recording on the versions above prints:
probeinterface: 0.3.2 ..spikeinterface\probeinterface\src\probeinterface\__init__.py
spikeinterface: 0.104.1 ..spikeinterface\spikeinterface\src\spikeinterface\__init__.py
annotations dict: {'model_name': 'PRB_1_4_0480_1_C', 'manufacturer': 'imec', 'description': 'Neuropixels 1.0 probe with cap', 'shank_tips': [[24.0, -220.0]], 'adc_bit_depth': 10, 'num_readout_channels': 384, 'ap_sample_frequency_hz': 30000.0, 'lf_sample_frequency_hz': 2500.0, 'adc_sampling_table': '(32,12)(0 1 24 25 48 49 72 73 96 97 120 121 144 145 168 169 192 193 216 217 240 241 264 265 288 289 312 313 336 337 360 361)(2 3 26 27 50 51 74 75 98 99 122 123 146 147 170 171 194 195 218 219 242 243 266 267 290 291 314 315 338 339 362 363)(4 5 28 29 52 53 76 77 100 101 124 125 148 149 172 173 196 197 220 221 244 245 268 269 292 293 316 317 340 341 364 365)(6 7 30 31 54 55 78 79 102 103 126 127 150 151 174 175 198 199 222 223 246 247 270 271 294 295 318 319 342 343 366 367)(8 9 32 33 56 57 80 81 104 105 128 129 152 153 176 177 200 201 224 225 248 249 272 273 296 297 320 321 344 345 368 369)(10 11 34 35 58 59 82 83 106 107 130 131 154 155 178 179 202 203 226 227 250 251 274 275 298 299 322 323 346 347 370 371)(12 13 36 37 60 61 84 85 108 109 132 133 156 157 180 181 204 205 228 229 252 253 276 277 300 301 324 325 348 349 372 373)(14 15 38 39 62 63 86 87 110 111 134 135 158 159 182 183 206 207 230 231 254 255 278 279 302 303 326 327 350 351 374 375)(16 17 40 41 64 65 88 89 112 113 136 137 160 161 184 185 208 209 232 233 256 257 280 281 304 305 328 329 352 353 376 377)(18 19 42 43 66 67 90 91 114 115 138 139 162 163 186 187 210 211 234 235 258 259 282 283 306 307 330 331 354 355 378 379)(20 21 44 45 68 69 92 93 116 117 140 141 164 165 188 189 212 213 236 237 260 261 284 285 308 309 332 333 356 357 380 381)(22 23 46 47 70 71 94 95 118 119 142 143 166 167 190 191 214 215 238 239 262 263 286 287 310 311 334 335 358 359 382 383)', 'num_adcs': 32, 'num_channels_per_adc': 12, 'serial_number': '19108303752', 'part_number': 'PRB_1_4_0480_1_C', 'port': '1', 'slot': '2'}
probes_info anno: [{'model_name': 'PRB_1_4_0480_1_C', 'manufacturer': 'imec', 'description': 'Neuropixels 1.0 probe with cap', 'shank_tips': [[24.0, -220.0]], 'adc_bit_depth': 10, 'num_readout_channels': 384, 'ap_sample_frequency_hz': 30000.0, 'lf_sample_frequency_hz': 2500.0, 'adc_sampling_table': '(32,12)(0 1 24 25 48 49 72 73 96 97 120 121 144 145 168 169 192 193 216 217 240 241 264 265 288 289 312 313 336 337 360 361)(2 3 26 27 50 51 74 75 98 99 122 123 146 147 170 171 194 195 218 219 242 243 266 267 290 291 314 315 338 339 362 363)(4 5 28 29 52 53 76 77 100 101 124 125 148 149 172 173 196 197 220 221 244 245 268 269 292 293 316 317 340 341 364 365)(6 7 30 31 54 55 78 79 102 103 126 127 150 151 174 175 198 199 222 223 246 247 270 271 294 295 318 319 342 343 366 367)(8 9 32 33 56 57 80 81 104 105 128 129 152 153 176 177 200 201 224 225 248 249 272 273 296 297 320 321 344 345 368 369)(10 11 34 35 58 59 82 83 106 107 130 131 154 155 178 179 202 203 226 227 250 251 274 275 298 299 322 323 346 347 370 371)(12 13 36 37 60 61 84 85 108 109 132 133 156 157 180 181 204 205 228 229 252 253 276 277 300 301 324 325 348 349 372 373)(14 15 38 39 62 63 86 87 110 111 134 135 158 159 182 183 206 207 230 231 254 255 278 279 302 303 326 327 350 351 374 375)(16 17 40 41 64 65 88 89 112 113 136 137 160 161 184 185 208 209 232 233 256 257 280 281 304 305 328 329 352 353 376 377)(18 19 42 43 66 67 90 91 114 115 138 139 162 163 186 187 210 211 234 235 258 259 282 283 306 307 330 331 354 355 378 379)(20 21 44 45 68 69 92 93 116 117 140 141 164 165 188 189 212 213 236 237 260 261 284 285 308 309 332 333 356 357 380 381)(22 23 46 47 70 71 94 95 118 119 142 143 166 167 190 191 214 215 238 239 262 263 286 287 310 311 334 335 358 359 382 383)', 'num_adcs': 32, 'num_channels_per_adc': 12, 'serial_number': '19108303752', 'part_number': 'PRB_1_4_0480_1_C', 'port': '1', 'slot': '2'}]
Note that 'probe_type' is absent from both. Every other identifying field (manufacturer, model_name, serial_number, part_number, port, slot) is present, so the recording’s set_probe / get_probe round-trip is otherwise healthy — this is specifically a reader gap.
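Until the key is restored, downstream code can avoid the hard KeyError by reading the field defensively. The sketch below is only a workaround on the user side, not part of either library: probe_family_key is a hypothetical helper name, and falling back to part_number (which read_spikeglx does set) is an assumption about what a notebook would want.

```python
def probe_family_key(probes_info, probe_index=0):
    """Return a (field_name, value) pair identifying the probe family.

    Prefers 'probe_type'; falls back to 'part_number'. Returns (None, None)
    when the list is empty, the index is out of range, or neither field is set.
    """
    if not probes_info or probe_index >= len(probes_info):
        return (None, None)
    info = probes_info[probe_index]
    for field in ("probe_type", "part_number"):
        value = info.get(field)
        if value is not None:
            return (field, value)
    return (None, None)


# Abbreviated copy of the probes_info annotation printed above.
probes_info = [{"model_name": "PRB_1_4_0480_1_C", "part_number": "PRB_1_4_0480_1_C"}]
print(probe_family_key(probes_info))
```

With a real recording, the argument would be raw_rec.get_annotation("probes_info").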
I cannot pinpoint the exact version where probe_type went missing (the notebook has been in production for a while and the regression surfaced only recently). The key is definitely missing with spikeinterface 0.104.1 + probeinterface 0.3.2. Pointers on when the divergence happened would be welcome.
Root cause
probeinterface/neuropixels_tools.py::read_spikeglx calls build_neuropixels_probe (which sets model_name, manufacturer, description, and several ADC/sampling annotations) and then adds serial_number, part_number, port, slot via probe.annotate(...) (around L864–L867). It never calls probe.annotate(probe_type=...).
By contrast read_imro does (around L759–L760):
probe_type = imro_str.strip().split(")")[0].split(",")[0][1:]
probe.annotate(probe_type=probe_type)
SpikeInterface’s BaseRecordingSnippets._set_probes stores probe.annotations verbatim as the recording annotation probes_info, so whatever read_spikeglx omits is simply absent from the recording forever.
Minimal repro
import probeinterface
probe = probeinterface.read_spikeglx("/path/to/run_g0_t0.imec0.ap.meta")
print("probe_type" in probe.annotations) # False
print(sorted(probe.annotations))
Suggested fix (narrow)
In probeinterface/neuropixels_tools.py::read_spikeglx, in the "recording-specific annotations" block around L859–L867, add the same annotation that read_imro already does. The value is directly available in the meta and, as a fallback, via the existing probe_part_number_to_probe_type mapping:
imDatPrb_type = meta.get("imDatPrb_type") \
or probe_part_number_to_probe_type.get(imDatPrb_pn)
probe.annotate(probe_type=imDatPrb_type)
Two lines, symmetric with read_imro.
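The resolution order can also be sketched as a standalone function, which makes the edge cases easier to test. This is an illustration under assumptions, not the probeinterface implementation: resolve_probe_type is a hypothetical name, PART_NUMBER_TO_TYPE is a one-entry stand-in for probe_part_number_to_probe_type, and the "0" value for this part number is assumed for the example. It uses an explicit None check rather than `or`, so a falsy but valid value such as an integer 0 would not fall through to the mapping.

```python
from typing import Mapping, Optional

# Stand-in for probeinterface's probe_part_number_to_probe_type.
# The single entry is an assumption made for this example.
PART_NUMBER_TO_TYPE = {"PRB_1_4_0480_1_C": "0"}


def resolve_probe_type(
    meta: Mapping[str, object],
    part_number_to_type: Mapping[str, str] = PART_NUMBER_TO_TYPE,
) -> Optional[str]:
    """Prefer imDatPrb_type from the meta; fall back to the part-number map."""
    probe_type = meta.get("imDatPrb_type")
    if probe_type is None:
        # Key absent from the meta: look the part number up instead.
        probe_type = part_number_to_type.get(meta.get("imDatPrb_pn"))
    # Normalise to str, matching the string that read_imro parses out.
    return None if probe_type is None else str(probe_type)


print(resolve_probe_type({"imDatPrb_pn": "PRB_1_4_0480_1_C"}))
```

If the function returns None, the reader could skip the annotation rather than store a None value; which of the two is preferable is a maintainer call.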
Design question for maintainers
The underlying reason this regression is so easy to introduce is structural, not just a missing key:
- The probe-level annotations dict is not serialised by Probe.to_numpy / ProbeGroup.to_numpy (which is what _set_probes uses to persist the probe onto the recording). Annotations survive only through the parallel side-channel recording.annotations["probes_info"], which is rehydrated in baserecordingsnippets.py around L264–L270.
- Therefore the set of keys available after get_probe() is entirely determined by whichever probe.annotate(...) calls happen to live in each reader. There is no documented minimum contract, so any reader refactor can silently drop keys that downstream users depend on, which is exactly what happened here.
It would help to pick, and document, one of:
- Reader contract. read_spikeglx, read_openephys, read_imro, etc. must all guarantee a minimum set of annotation keys. Proposed minimum: probe_type, model_name, manufacturer, serial_number, part_number. Enforced via a shared test?
- Single source of truth on Probe. Make probe.annotations the authoritative store and have Probe.to_numpy / from_numpy round-trip scalar annotations, so _set_probes / get_probe no longer rely on a parallel probes_info annotation that can go out of sync with the probe itself.
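For the reader-contract option, the shared check could be as small as the sketch below. The key list is the proposed minimum from above; missing_annotation_keys is a hypothetical helper name, and the sample dict is an abbreviated copy of the annotations printed earlier in this issue, not the output of a live reader call.

```python
REQUIRED_ANNOTATION_KEYS = (
    "probe_type",
    "model_name",
    "manufacturer",
    "serial_number",
    "part_number",
)


def missing_annotation_keys(annotations, required=REQUIRED_ANNOTATION_KEYS):
    """Return the required keys that are absent, in contract order."""
    return [key for key in required if key not in annotations]


# Abbreviated annotations as currently produced via read_spikeglx.
observed = {
    "model_name": "PRB_1_4_0480_1_C",
    "manufacturer": "imec",
    "serial_number": "19108303752",
    "part_number": "PRB_1_4_0480_1_C",
    "port": "1",
    "slot": "2",
}
print(missing_annotation_keys(observed))
```

A shared test could then assert that this returns an empty list for the probe.annotations produced by each reader.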