Expose structural indexes by mitghi · Pull Request #449 · simd-lite/simd-json

mitghi · 2026-04-29T20:53:01Z

Hello 👋,

Thanks for this great library.

I would like to ask whether it would be suitable to expose the structural indexes from Stage 1.
Stage 1 already computes the byte offsets of every JSON structural character.
I've been writing a tool that wants to reuse those offsets rather than recompute them;
otherwise the work that simd-json already does has to be done again in order to build
the structural indexes. I find this useful for cases such as a deep scan over a byte buffer: given
a byte's position, the structural indexes can be used to build a map from byte ranges to the simd-json
Tape, and they enable other useful algorithms as well.
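To make concrete what kind of data is meant by "structural indexes", here is a minimal scalar sketch. It is only a stand-in for illustration, not simd-json's SIMD implementation, and the function name `scan_structurals` is hypothetical. It assumes the offsets mark `{ } [ ] : ,` outside strings plus the first byte of every string or scalar; the exact set simd-json records may differ.

```rust
/// Naive scalar stand-in for Stage 1 (hypothetical helper, not simd-json code).
/// Returns the byte offset of every structural character outside strings,
/// plus the first byte of each string or scalar (an assumption, see above).
fn scan_structurals(input: &[u8]) -> Vec<u32> {
    let mut out = Vec::new();
    let mut in_string = false; // currently between two unescaped quotes
    let mut escaped = false; // previous byte inside a string was a backslash
    let mut in_scalar = false; // currently inside a number / true / false / null

    for (i, &b) in input.iter().enumerate() {
        if in_string {
            if escaped {
                escaped = false;
            } else if b == b'\\' {
                escaped = true;
            } else if b == b'"' {
                in_string = false;
            }
            continue;
        }
        match b {
            b'"' => {
                // Opening quote: record it and skip the string body.
                out.push(i as u32);
                in_string = true;
                in_scalar = false;
            }
            b'{' | b'}' | b'[' | b']' | b':' | b',' => {
                out.push(i as u32);
                in_scalar = false;
            }
            b' ' | b'\t' | b'\n' | b'\r' => in_scalar = false,
            _ => {
                // Record only the first byte of a scalar such as 30 or true.
                if !in_scalar {
                    out.push(i as u32);
                    in_scalar = true;
                }
            }
        }
    }
    out
}
```

Recomputing this in a downstream tool is exactly the duplicated work the request is trying to avoid.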

This is an example of the use case: every value's byte span is derived correctly from
Buffers::structural_indexes() plus a tape walk, and given a byte index, the innermost object
containing that position can be found.

  Input (124 bytes):
    {"name":"alice","age":30,"tags":["x","y","z"],"profile":{"city":"NYC","zip":"10001"},"score":99.5,"active":true,"note":null}
       ^pos=3              ^pos=23                              ^pos=60           ^pos=78               ^pos=100            ^pos=120

  Derived byte spans for every value:
    [ 0] Object  bytes [  0..124) = {"name":"alice", ... ,"note":null}
    [ 1] String  key="name"    bytes [  8..15) = "alice"
    [ 2] Scalar  key="age"     bytes [ 22..24) = 30
    [ 3] Array   key="tags"    bytes [ 32..45) = ["x","y","z"]
    [ 4] String                bytes [ 33..36) = "x"
    [ 5] String                bytes [ 37..40) = "y"
    [ 6] String                bytes [ 41..44) = "z"
    [ 7] Object  key="profile" bytes [ 56..84) = {"city":"NYC","zip":"10001"}
    [ 8] String  key="city"    bytes [ 64..69) = "NYC"
    [ 9] String  key="zip"     bytes [ 76..83) = "10001"
    [10] Scalar  key="score"   bytes [ 93..97) = 99.5
    [11] Scalar  key="active"  bytes [107..111) = true
    [12] Scalar  key="note"    bytes [119..123) = null

  === Practical: zero-copy passthrough output ===
    raw value of "tags":    ["x","y","z"]              span [32, 45)  len 13
    raw value of "profile": {"city":"NYC","zip":"10001"} span [56, 84)  len 28


  byte   3 → Object (root) span [0..124) len 124        # inside "name"
  byte  11 → Object (root) span [0..124) len 124        # inside "alice"
  byte  23 → Object (root) span [0..124) len 124        # inside the 30
  byte  35 → Object (root) span [0..124) len 124        # inside ["x","y","z"] — array's parent is root
  byte  60 → Object key="profile" span [56..84) len 28  # inside profile.city
  byte  78 → Object key="profile" span [56..84) len 28  # inside profile.zip
  byte 100 → Object (root) span [0..124) len 124        # inside the key "active"
  byte 120 → Object (root) span [0..124) len 124        # inside note null
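The position lookup shown above can be sketched roughly as follows. This is an illustration under stated assumptions, not the tool's actual code: it takes the structural offsets as a plain `&[u32]` (however they were obtained), assumes the input is already valid JSON so brackets are balanced, and only tracks containers rather than walking the tape. The names `Kind`, `Span`, `container_spans` and `innermost_object` are hypothetical.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Kind {
    Object,
    Array,
}

/// Half-open byte range [start, end) of one container in the input.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Span {
    kind: Kind,
    start: usize,
    end: usize,
}

/// Pairs opening and closing brackets using only the structural offsets.
/// Offsets pointing at other bytes (quotes, colons, commas, scalars) are ignored.
/// Assumes valid JSON: a closer is matched with the most recent opener.
fn container_spans(input: &[u8], structural: &[u32]) -> Vec<Span> {
    let mut open: Vec<(Kind, usize)> = Vec::new();
    let mut spans = Vec::new();
    for &off in structural {
        let off = off as usize;
        match input.get(off) {
            Some(b'{') => open.push((Kind::Object, off)),
            Some(b'[') => open.push((Kind::Array, off)),
            Some(b'}') | Some(b']') => {
                if let Some((kind, start)) = open.pop() {
                    spans.push(Span { kind, start, end: off + 1 });
                }
            }
            _ => {}
        }
    }
    spans
}

/// Smallest object span that contains `pos`, if any.
/// Arrays are skipped, matching the lookup output above.
fn innermost_object(spans: &[Span], pos: usize) -> Option<Span> {
    spans
        .iter()
        .copied()
        .filter(|s| s.kind == Kind::Object && s.start <= pos && pos < s.end)
        .min_by_key(|s| s.end - s.start)
}
```

With the 124-byte input from the example, a lookup at byte 60 would select the `[56..84)` object and a lookup at byte 35 would fall through to the root. A linear filter is enough for a sketch; a real tool would more likely sort the spans or use an interval structure. The same spans also give the zero-copy passthrough: `&input[span.start..span.end]` borrows the raw bytes of a value without copying.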

I would appreciate it if you could give me feedback on whether this change makes sense.

Thanks

Stage 1 already computes the byte offsets of every JSON structural character. Today the result lives in a private field; downstream callers must either rerun Stage 1 or unsafe-transmute the buffer to read it.

This PR adds a public read-only accessor:

    pub fn structural_indexes(&self) -> &[u32]

Zero allocation, zero copy. The slice is valid until the next parse that reuses the same Buffers.
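The accessor pattern described here can be sketched with a stand-in type. `DemoBuffers` and its `fill` method are hypothetical and are not simd-json's real `Buffers` layout; `fill` only records `{` and `}` offsets (ignoring strings) so the example stays short. The sketch assumes the parse step takes the buffers by `&mut`, in which case the borrow checker enforces the "valid until the next parse" rule.

```rust
/// Hypothetical stand-in for a reusable parse buffer (not simd-json's type).
pub struct DemoBuffers {
    // Private field, like the one the description says is currently hidden.
    structural_indexes: Vec<u32>,
}

impl DemoBuffers {
    pub fn new() -> Self {
        Self { structural_indexes: Vec::new() }
    }

    /// Stand-in for Stage 1: clears and refills the vector, reusing its
    /// allocation. Records only `{` and `}` offsets for brevity.
    pub fn fill(&mut self, input: &[u8]) {
        self.structural_indexes.clear();
        for (i, &b) in input.iter().enumerate() {
            if b == b'{' || b == b'}' {
                self.structural_indexes.push(i as u32);
            }
        }
    }

    /// Read-only view with the signature proposed above: no allocation and
    /// no copy. The returned slice borrows from `self`, so `fill` cannot be
    /// called again while the slice is still in use.
    pub fn structural_indexes(&self) -> &[u32] {
        &self.structural_indexes
    }
}
```

Because the slice is tied to `&self`, holding it across a second `fill` call on the same value is rejected at compile time, so callers cannot observe stale offsets by accident.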

mitghi wants to merge 1 commit into simd-lite:main from mitghi:expose_structural_indexes