wip small strings by eightbitraptor · Pull Request #1 · eightbitraptor/ruby

eightbitraptor · 2026-02-17T17:07:30Z

No description provided.

Change slot sizes from {40,80,160,320,640} to {64,128,256,512,1024}. BASE_SLOT_SIZE is now 64 (2^6) and all pool sizes are powers of 2, enabling bit-shift slot indexing instead of magic number division. Replace slot_div_magics[] multiply-and-shift with simple right shift. Simplify heap_add_page() alignment to a single bitmask round-up. Pool 0 aligns to cache line boundaries and embeds more objects (strings: 39 vs 15 chars, arrays: 6 vs 3 elements, ivars: 6 vs 3).
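The arithmetic behind this change can be sketched roughly as follows. This is a Ruby model, not the C code in the GC. The method names are invented for illustration, and the layout constants (16-byte RBasic header, 8-byte string length field, 1-byte terminator, 8-byte VALUE) are assumptions about a 64-bit build.

```ruby
# Model of power-of-two slot sizing on a 64-bit build.
BASE_SLOT_SIZE_LOG2 = 6
BASE_SLOT_SIZE = 1 << BASE_SLOT_SIZE_LOG2 # 64
SLOT_SIZES = Array.new(5) { |heap_idx| BASE_SLOT_SIZE << heap_idx }

# Slot number of a byte offset within a page of the given heap.
# With power-of-two sizes, a right shift replaces the division.
def slot_index(byte_offset, heap_idx)
  byte_offset >> (BASE_SLOT_SIZE_LOG2 + heap_idx)
end

# Round an address up to the next multiple of a power-of-two slot size
# using a single bitmask.
def align_up(addr, slot_size)
  (addr + slot_size - 1) & ~(slot_size - 1)
end

# Assumed layout: 16-byte RBasic, 8-byte length, 1-byte NUL terminator.
def embedded_string_capa(slot_size)
  slot_size - 16 - 8 - 1
end

# Assumed layout: 16-byte RBasic followed by 8-byte VALUE elements
# (array elements or instance variables).
def embedded_value_capa(slot_size)
  (slot_size - 16) / 8
end
```

Under these assumptions, `embedded_string_capa(64)` is 39 and `embedded_string_capa(40)` is 15, while `embedded_value_capa` gives 6 and 3, which agrees with the figures quoted in the commit message.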

When RVALUE_OVERHEAD is large (debug builds with RACTOR_CHECK_MODE + GC_DEBUG), the smallest size pool's usable size can be less than sizeof(struct RBasic). The capacity calculation underflows:

    (8 - 16) / 8 → 0xFFFF (via size_t wraparound, truncated to uint16_t)

Since shape_grow_capa iterates capacities from index 0, the garbage 65535 at capacities[0] poisons all ivar capacity growth, causing a buffer overflow in the RUBY_DEBUG assertion that fills unused capacity with Qundef.
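The wraparound can be reproduced with a small model. Ruby integers do not overflow, so this sketch emulates a 64-bit size_t and a uint16_t with explicit masks. The method names are hypothetical, and the guarded variant is only one possible shape for the fix; its fallback value of 0 is an assumption.

```ruby
SIZE_T_MASK = (1 << 64) - 1 # emulates a 64-bit size_t
UINT16_MASK = 0xFFFF        # emulates truncation to uint16_t

# Capacity computed without checking that the slot can hold RBasic.
def unguarded_capacity(usable_size, rbasic_size, value_size)
  diff = (usable_size - rbasic_size) & SIZE_T_MASK # wraps when negative
  (diff / value_size) & UINT16_MASK
end

# Capacity computed with an explicit guard against the underflow.
def guarded_capacity(usable_size, rbasic_size, value_size)
  return 0 if usable_size <= rbasic_size
  (usable_size - rbasic_size) / value_size
end
```

With the values from the message, `unguarded_capacity(8, 16, 8)` evaluates to 65535 (0xFFFF), while `guarded_capacity(8, 16, 8)` returns 0.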

When RVALUE_OVERHEAD > 0 (GC_DEBUG, RACTOR_CHECK_MODE), heap[0]'s usable space can equal sizeof(struct RBasic), leaving zero bytes for instance variables. The capacity was incorrectly set to 1, allowing the shape system to embed an IV that overflows into the overhead area. Change the fallback capacity to 0 and switch shape_grow_capa to count-based iteration so that a zero capacity is not confused with the array sentinel.
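The interaction between a zero capacity and a sentinel-terminated array can be illustrated with a simplified model. This is not the code in shape.c: the method names are hypothetical, and returning `nil` stands in for "no embeddable capacity found, fall back to heap growth".

```ruby
# Sentinel-style scan: a 0 entry is treated as the end of the array,
# so a legitimate capacity of 0 at index 0 stops the scan immediately.
def grow_capa_sentinel(capacities, current_capa)
  capacities.each do |capa|
    break if capa == 0
    return capa if capa > current_capa
  end
  nil
end

# Count-based scan: the number of valid entries is passed explicitly,
# so a 0 entry is just a capacity that is never larger than current_capa.
def grow_capa_counted(capacities, count, current_capa)
  count.times do |i|
    capa = capacities[i]
    return capa if capa > current_capa
  end
  nil
end
```

With a hypothetical table `[0, 2, 6, 14, 0]` (four real entries plus a trailing sentinel), `grow_capa_sentinel(table, 0)` returns `nil` because it stops at the first entry, whereas `grow_capa_counted(table, 4, 0)` returns 2.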

rb_obj_embedded_size(0) returned sizeof(struct RBasic), which is too small on builds with RVALUE_OVERHEAD (GC_DEBUG) where heap[0] has no usable space beyond RBasic. The as.heap variant needs at least one VALUE of space for the external IV pointer. Clamp the minimum fields_count to 1 so T_OBJECT allocations always request enough space for the as union.
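A minimal model of the clamp, assuming a 16-byte RBasic header and 8-byte VALUEs on a 64-bit build. The method name `embedded_size` is illustrative and is not the C function itself.

```ruby
RBASIC_SIZE = 16 # assumed sizeof(struct RBasic) on 64-bit
VALUE_SIZE = 8   # assumed sizeof(VALUE) on 64-bit

# Bytes requested for a T_OBJECT with the given number of embedded fields.
# Clamping to at least one field leaves room for the external IV pointer.
def embedded_size(fields_count)
  fields_count = 1 if fields_count < 1
  RBASIC_SIZE + VALUE_SIZE * fields_count
end
```

In this model `embedded_size(0)` and `embedded_size(1)` both request 24 bytes, so an object with no instance variables still reserves one VALUE after the header.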

pm_parse_process initializes the index_lookup_table but nothing seems to use it after it has been allocated. However, pm_compile_scope_node will overwrite the index_lookup_table and cause it to leak memory.

This can be seen during bootup with the following memory leaks reported by ASAN:

    #0 0x60dba31b7af3 in malloc
    #1 0x60dba32e0718 in rb_gc_impl_malloc gc/default/default.c:8287:5
    #2 0x60dba32c7aa7 in ruby_xmalloc_body gc.c:5373:12
    #3 0x60dba32c4a54 in ruby_xmalloc gc.c:5355:34
    #4 0x60dba3260314 in pm_index_lookup_table_init_heap prism_compile.h:89:29
    #5 0x60dba3209388 in pm_parse_process prism_compile.c:11366:5

* ZJIT: Use shape id as cache key for object layout

Since ruby#17158, we can use the shape id as our cache key for determining object layout. This patch changes ZJIT to use shape id instead of testing all bits.

Given this program:

```ruby
class Foo
  def initialize; @bar = 123; end
  def read; @bar; end
  def add; @baz = 123; end
end

foo = Foo.new

3.times {
  foo = Foo.new
  foo.read
  foo.add
  foo.read
}
```

We end up with a polymorphic read in the `read` method. On master, the HIR looks like this:

```
Optimized HIR:
fn read@../test.rb:9:
bb1():
  EntryPoint interpreter
  v1:HeapBasicObject = LoadSelf
  Jump bb3(v1)
bb2():
  EntryPoint JIT(0)
  v4:HeapBasicObject = LoadArg :self@0
  Jump bb3(v4)
bb3(v6:HeapBasicObject):
  PatchPoint SingleRactorMode
  v12:CUInt64 = LoadField v6, :RBASIC_FLAGS@0x0
  v14:CUInt64[0xffffffff0000001f] = Const CUInt64(0xffffffff0000001f)
  v15:CPtr[CPtr(0x8000800000001)] = Const CPtr(0x8000800000001)
  v16 = RefineType v15, CUInt64
  v17:CInt64 = IntAnd v12, v14
  v18:CBool = IsBitEqual v17, v16
  CondBranch v18, bb5(), bb6()
bb5():
  v20:BasicObject = LoadField v6, :@bar@0x10
  Jump bb4(v20)
bb6():
  v22:CUInt64[0xffffffff0000001f] = Const CUInt64(0xffffffff0000001f)
  v23:CPtr[CPtr(0x8000900000001)] = Const CPtr(0x8000900000001)
  v24 = RefineType v23, CUInt64
  v25:CInt64 = IntAnd v12, v22
  v26:CBool = IsBitEqual v25, v24
  CondBranch v26, bb7(), bb8()
bb7():
  v28:BasicObject = LoadField v6, :@bar@0x10
  Jump bb4(v28)
bb8():
  v30:BasicObject = GetIvar v6, :@bar
  Jump bb4(v30)
bb4(v13:BasicObject):
  CheckInterrupts
  Return v13
```

On this branch, the HIR is like this:

```
Optimized HIR:
fn read@../test.rb:9:
bb1():
  EntryPoint interpreter
  v1:HeapBasicObject = LoadSelf
  Jump bb3(v1)
bb2():
  EntryPoint JIT(0)
  v4:HeapBasicObject = LoadArg :self@0
  Jump bb3(v4)
bb3(v6:HeapBasicObject):
  PatchPoint SingleRactorMode
  v13:CShape = LoadField v6, :shape_id@0x4
  v14:CShape[0x80008] = Const CShape(0x80008)
  v15:CBool = IsBitEqual v14, v13
  CondBranch v15, bb5(), bb6()
bb5():
  v17:BasicObject = LoadField v6, :@bar@0x10
  Jump bb4(v17)
bb6():
  v19:CShape = LoadField v6, :shape_id@0x4
  v20:CShape[0x80009] = Const CShape(0x80009)
  v21:CBool = IsBitEqual v20, v19
  CondBranch v21, bb7(), bb8()
bb7():
  v23:BasicObject = LoadField v6, :@bar@0x10
  Jump bb4(v23)
bb8():
  v25:BasicObject = GetIvar v6, :@bar
  Jump bb4(v25)
bb4(v12:BasicObject):
  CheckInterrupts
  Return v12
```

We avoid loading the full flags word and applying a mask before the comparison. The machine code for bb3 looks like this on master (I've removed the nop buffers for patch points):

```
# Insn: v12 LoadField v6, :RBASIC_FLAGS@0x0
# Load field id=RBASIC_FLAGS offset=0
0x122c50124: ldur x1, [x0]
# Insn: v14 Const CUInt64(0xffffffff0000001f)
# Insn: v15 Const CPtr(0x8000800000001)
# Insn: v16 RefineType v15, CUInt64
# Insn: v17 IntAnd v12, v14
0x122c50128: and x2, x1, #0xffffffff0000001f
# Insn: v18 IsBitEqual v17, v16
0x122c5012c: mov x3, #1
0x122c50130: movk x3, #0, lsl #16
0x122c50134: movk x3, #8, lsl #32
0x122c50138: movk x3, #8, lsl #48
0x122c5013c: cmp x2, x3
0x122c50140: mov x2, #1
0x122c50144: mov x3, #0
0x122c50148: csel x2, x2, x3, eq
0x122c5014c: tst x2, x2
0x122c50150: b.ne #0x122c501e8
```

On this branch it looks like this:

```
# Insn: v13 LoadField v6, :shape_id@0x4
# Load field id=shape_id offset=4
0x124dd0124: ldur w1, [x0, #4]
# Insn: v14 Const CShape(0x80008)
# Insn: v15 IsBitEqual v14, v13
0x124dd0128: mov x2, #8
0x124dd012c: movk x2, #8, lsl #16
0x124dd0130: cmp x2, x1
0x124dd0134: mov x1, #1
0x124dd0138: mov x2, #0
0x124dd013c: csel x1, x1, x2, eq
0x124dd0140: tst x1, x1
0x124dd0144: b.ne #0x124dd01d4
```

We've eliminated the `and` instruction and only need to do a 32-bit load for the shape.

* fix operand order
* Remove comments and whitespace
* remove stuttering
* fix variable name to be more clear
* Keep assert
* update snapshots
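The relationship between the two guards can be modeled in plain Ruby. This sketch assumes that the shape id occupies the upper 32 bits of the 64-bit flags word (consistent with the field offset 0x4 on a little-endian target) and that the low five bits hold the type tag. The helper names are hypothetical and none of this is ZJIT's actual code.

```ruby
T_MASK = 0x1f          # low five bits: object type tag
T_OBJECT = 0x01
SHAPE_SHIFT = 32       # assumed position of the shape id in the flags word
MASTER_MASK = 0xffffffff0000001f

# Build a flags word from a shape id, a type tag, and unrelated flag bits.
def flags_word(shape_id, type, other_bits = 0)
  (shape_id << SHAPE_SHIFT) | other_bits | type
end

# Guard as on master: mask the full 64-bit word, then compare.
def guard_on_master(flags, expected_word)
  (flags & MASTER_MASK) == expected_word
end

# Guard as on the branch: compare only the 32-bit shape id.
def guard_on_branch(flags, expected_shape_id)
  ((flags >> SHAPE_SHIFT) & 0xffffffff) == expected_shape_id
end
```

For a word built with `flags_word(0x80008, T_OBJECT, 1 << 5)`, both `guard_on_master(flags, 0x8000800000001)` and `guard_on_branch(flags, 0x80008)` hold. The branch guard never inspects the type bits in this model, which presumes the shape id alone identifies the layout.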

eightbitraptor force-pushed the mvh-small-strings branch from c87b081 to 8ca7cb0 on February 17, 2026, 21:09

eightbitraptor added 10 commits February 23, 2026 10:40

Round up slot size in time test

32f8fd0

Scale down slot sizes on 32-bit

600a099

On 32-bit, sizeof(VALUE) is 4 so objects are roughly half the size of 64-bit. Use BASE_SLOT_SIZE_LOG2=5 (32 bytes) instead of 6 (64) to keep slot sizes proportional to pointer width.
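A small model of how the base size might follow pointer width. The helper names are made up for illustration, and the count of five heaps is taken from the five slot sizes listed earlier in this pull request.

```ruby
# Pick the base slot size exponent from the pointer width in bytes.
def base_slot_size_log2(sizeof_value)
  sizeof_value == 4 ? 5 : 6
end

# Slot sizes for every heap: each heap doubles the previous slot size.
def slot_sizes(sizeof_value, heap_count = 5)
  base = 1 << base_slot_size_log2(sizeof_value)
  Array.new(heap_count) { |heap_idx| base << heap_idx }
end
```

In this model `slot_sizes(4)` yields sizes from 32 to 512 bytes and `slot_sizes(8)` yields sizes from 64 to 1024 bytes, so each heap holds roughly the same number of VALUE-sized fields on both platforms.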

Add a 32 byte size pool for small objects

066621c

RFLOAT mostly

Introduce RVALUE_SIZE to capture the size of most RVALUES

bcb4414

Because BASE_SLOT_SIZE is now 32 bytes, tests can no longer use it as the assumed size of most RVALUE objects, such as strings.

wip: experiment with a different shaped RString that makes most embedded strings (except shared ones) 32 bytes

30a605a

Fix up the tests and a bad merge

09a979c

eightbitraptor force-pushed the mvh-small-strings branch from 8ca7cb0 to 09a979c on February 23, 2026, 16:31

eightbitraptor wants to merge 10 commits into master from mvh-small-strings