Bug: Corrupted PrimaryKeyIndex header causes uncontrolled OOM on database open — should fail gracefully #403

@Mrhs121

Description

Ladybug version

No response

What operating system are you using?

No response

What happened?

When opening a database file, if the on-disk HashIndexHeaderOnDisk for a table's primary key index contains corrupted data (e.g., numEntries =
16544419524162413700), the PrimaryKeyIndex constructor allocates memory based on these garbage values, causing the process to consume 17+
GB of RAM and get killed by the OS OOM killer. There is no validation or sanity check on the deserialized header values before allocation.
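
To illustrate the kind of check I mean, here is a minimal sketch, not the actual Ladybug code. `HeaderSketch`, `validateHeader`, and the `minBytesPerEntry` parameter are hypothetical names; the real HashIndexHeaderOnDisk layout has more fields and may need different bounds. The idea is only that an entry count which cannot possibly fit in the file should be rejected before anything is allocated.

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for the deserialized header; only numEntries is
// taken from this report, the real struct has a different layout.
struct HeaderSketch {
    uint64_t numEntries;
};

// Rejects an entry count that cannot fit in the backing file.
// fileSizeBytes and minBytesPerEntry are assumed to be known to the caller.
inline void validateHeader(const HeaderSketch& header, uint64_t fileSizeBytes,
                           uint64_t minBytesPerEntry) {
    assert(minBytesPerEntry > 0); // caller invariant, avoids division by zero
    // Dividing instead of multiplying avoids overflow for huge garbage values.
    if (header.numEntries > fileSizeBytes / minBytesPerEntry) {
        throw std::runtime_error(
            "corrupted primary key index header: numEntries=" +
            std::to_string(header.numEntries) + " cannot fit in a file of " +
            std::to_string(fileSizeBytes) + " bytes");
    }
}
```

With a check like this, the value from my file (16544419524162413700) would fail immediately with a readable error instead of driving a multi-gigabyte allocation.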

Discussion: Prevention Strategies

This issue raises a broader question about data integrity guarantees. To prevent this class of problem, would it make sense to add protection on both sides?

- Write path: ensure critical metadata (like index headers) can't be left in a corrupted state after an interrupted write
- Read path: add basic sanity checks on deserialized values before allocating memory, so corruption surfaces as a clear error rather than an OOM crash

Ideally both should exist — write-path prevention to minimize corruption risk, and read-path validation as a safety net when corruption does happen
(disk errors, incomplete file copies, etc.).
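
One way the two sides could meet is a checksum stored next to the header: the write path computes it, and the read path verifies it before trusting any field. The sketch below is only an assumption about how that might look. `encodeHeader`, `decodeHeader`, and the 16-byte layout are made up for illustration, and FNV-1a is used only because it is short; a real implementation would more likely use something like CRC32C. Note that a checksum makes a torn or damaged header detectable, but it does not by itself prevent the interrupted write.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <stdexcept>

// FNV-1a 64-bit hash, used here purely as an illustrative checksum.
inline uint64_t fnv1a64(const uint8_t* data, std::size_t len) {
    uint64_t hash = 14695981039346656037ULL; // FNV offset basis
    for (std::size_t i = 0; i < len; ++i) {
        hash ^= data[i];
        hash *= 1099511628211ULL; // FNV prime
    }
    return hash;
}

// Hypothetical 16-byte layout: 8 bytes numEntries, then 8 bytes checksum.
using HeaderBytes = std::array<uint8_t, 16>;

// Write path: serialize the field and append a checksum over it.
inline HeaderBytes encodeHeader(uint64_t numEntries) {
    HeaderBytes out{};
    std::memcpy(out.data(), &numEntries, 8);
    const uint64_t checksum = fnv1a64(out.data(), 8);
    std::memcpy(out.data() + 8, &checksum, 8);
    return out;
}

// Read path: verify the checksum before trusting any deserialized value.
inline uint64_t decodeHeader(const HeaderBytes& in) {
    uint64_t numEntries = 0;
    uint64_t stored = 0;
    std::memcpy(&numEntries, in.data(), 8);
    std::memcpy(&stored, in.data() + 8, 8);
    if (stored != fnv1a64(in.data(), 8)) {
        throw std::runtime_error("primary key index header checksum mismatch");
    }
    return numEntries;
}
```

A header that was damaged by a partial write or a bad copy would then fail in `decodeHeader` with a clear message, before any size field is used for allocation.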

I'm not sure which scenario caused the corruption in my case (could be a killed process during write, or a file transfer issue), but either way the
current behavior of silently allocating 17 GB and getting OOM-killed makes diagnosis very difficult. A clear error message at load time would have
saved hours of debugging.

Are there known steps to reproduce?

No response

Metadata

Assignees

No one assigned

Labels

bug (Something isn't working)
