Add UTF-16 Strings#34
Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughAdds Unicode primitive type aliases ( ChangesUTF-16 String Module
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 7
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@engine/native/core/containers/string.cppm`:
- Around line 503-507: The WriteInt function has two issues: (1) negating
minimum signed integer values like INT_MIN causes undefined behavior, and (2)
GrowCapacity return values are not checked for failures, potentially leading to
out-of-bounds writes. In the signed value negation block (checking if val < 0),
use a safer approach such as converting to an unsigned type, using absolute
value safely, or checking for the minimum value before negating. Additionally,
wherever GrowCapacity is called within WriteInt, verify its return value
indicates success before continuing with the write operation; if allocation
fails, return an appropriate error code or handle the failure gracefully to
prevent buffer overruns.
- Around line 351-368: The `StringBuilder::Create` method has a stack-allocated
`fixed_allocator` variable that is used to initialize the `allocator` member,
but the allocator stores a pointer to this local variable which becomes invalid
after the function returns. To fix this, the `fixed_allocator` must be stored as
a member of the `StringBuilder` struct itself to ensure it persists for the
lifetime of the object, allowing subsequent growth and free operations to safely
access it without dereferencing dangling memory.
- Around line 437-446: The StringBuilder::Write methods call GrowCapacity with
only the incoming string length, not accounting for the existing buffer size.
Change the GrowCapacity calls from GrowCapacity(length) to GrowCapacity(size +
length) in both Write overloads (the utf16 version and the other overload
mentioned as also applying to lines 449-457) to ensure the total required
capacity is correctly calculated when appending to a non-empty builder.
- Around line 228-233: The Substr method incorrectly rejects valid substring
requests by returning EMPTY_STRING whenever the requested range touches or
exceeds the string boundary. Instead of returning EMPTY_STRING when pos + len >=
size, check only if pos >= size (completely out of bounds). When pos is valid
but the requested length extends beyond the string end, truncate len to size -
pos so the method returns the available substring. This truncation contract fix
will also resolve the false negatives in StartsWith and EndsWith methods that
depend on Substr.
- Around line 320-324: The `StringBuilder::Create` method has a unit mismatch
where the `capacity` parameter represents UTF-16 elements but is passed directly
to `allocator.alloc` as a byte size without conversion. Multiply `capacity` by
the size of a single UTF-16 element (sizeof(utf16)) when calling
`allocator.alloc` to correctly convert from element count to byte count. This
ensures the allocated storage is large enough to hold the requested number of
UTF-16 characters without under-allocating and risking buffer overflow.
- Around line 371-385: The Reserve method in StringBuilder has two issues: the
memcpy call copies size elements unconditionally even when newCapacity is
smaller, risking buffer overflow, and the oldDst.size field is set as an element
count rather than bytes, causing the allocator.free call to receive incorrect
size units. Fix the memcpy to copy only the minimum of size and newCapacity
elements (to avoid overflow), and set oldDst.size to capacity * sizeof(utf16)
instead of just capacity to ensure the allocator.free call receives the size in
bytes as expected.
- Around line 182-205: The StringCompare method has two critical bugs: the
min_size variable is initialized using itself as arguments to draco::math::min
instead of using the actual size members of both strings, which causes undefined
behavior; additionally, the null-pointer checks don't handle the case where both
string texts are null, causing two empty strings to incorrectly compare as
unequal. Fix the min_size initialization by comparing the size member of the
current string with the size member of the parameter string s, and add a check
at the beginning of the method to return 0 (equal) when both text pointers are
null, ensuring that EMPTY_STRING == EMPTY_STRING evaluates correctly.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: f9a636d9-6f20-4bfc-a1a3-f770f0cfaa40
📒 Files selected for processing (6)
engine/native/core/CMakeLists.txtengine/native/core/containers/containers.cppmengine/native/core/containers/string.cppmengine/native/core/definitions/definitions.cppmengine/native/core/definitions/stdtypes.cppmengine/native/core/math/functions.cppm
|
The one structural thing that was made obvious to me when I was making my PR for allocators was that @OldDev78 wants as much implementation to be in .cpp files and there are some functions that are amenable to having their implementation in .cpp files. |
The constexpr items and templates need to be in the module, I'll see what @OldDev78 says about the .cpp implementation to see if it would make sense to put the very few StringBuilder functions into a .cpp file |
If a function serves best as template and/or can be effectively constrexpr (can be constant evaluated), then it's okay to be implemented in the .cppm. Otherwise, the implementation should just be separated into the .cpp. Take note that my biggest concern is to expose dependency on headers as little as possible (so, when I can, also move the includes to the .cpp). |
|
The integral concept is a merge conflict for me :/ |
| return result; | ||
| } | ||
|
|
||
| bool constexpr String::operator==(String s) { |
There was a problem hiding this comment.
Especially when the function is this simple, I´d rather it be inlined in the declaration above.
| /// Initialize with preallocated buffer with initial capacity | ||
| /// | ||
| /// @param allocator Initalized allocator that allows for string builder to grow | ||
| /// @param buffer Preallocated buffer that was created with allocator | ||
| /// @param capacity The initial capacity that buffer has before it needs to be reallocated | ||
| memory::Error Create(Allocator allocator, utf16 *buffer, isize capacity); | ||
|
|
||
| /// Initialize with preallocated constant sized buffer, size in number of elements | ||
| /// | ||
| /// @param buffer Preallocated buffer with no access to allocations | ||
| /// @param bufferMaxCapacity Max Capacity of preallocated buffer | ||
| memory::Error Create(utf16 *buffer, isize bufferMaxCapacity); |
There was a problem hiding this comment.
These constructor functions aren't needed. A fixed allocator already provides this functionality by wrapping a fixed block of memory and can be abstracted to an Allocator type. It just returns the whole block of memory and aligns it for you. The abstraction is there to avoid duplicates like this.
| // Initializes with initial capacity | ||
| memory::Error StringBuilder::Create(Allocator allocator, isize capacity) { | ||
| // TODO(Jon) Assert capacity is greater than 0 | ||
| Slice dst; | ||
| memory::Error err = allocator.alloc(&dst, capacity, alignof(utf16)); |
There was a problem hiding this comment.
Could this initial allocation be defered to the first write? It would make this function error free.
There was a problem hiding this comment.
Yeah I can change the design to work with that. I was initially concerned over capacity if it starts out at 0 but that should be fine as all of the write functions ensure it adds a minimum amount, which I didn't know at the time as I wrote this function first
|
|
||
| memory::Error StringBuilder::Write(rune r) { | ||
| if (r > 0x10FFFF || (r >= 0xD800 && r <= 0xDFFF)) { | ||
| return Error::Other; |
There was a problem hiding this comment.
Maybe another error type unioned with the allocator error type would be a better approach. It would allow for more granularity in this interface.
| static constexpr utf16 hexadecimalNibbles[] = { | ||
| u'0', | ||
| u'1', | ||
| u'2', | ||
| u'3', | ||
| u'4', | ||
| u'5', | ||
| u'6', | ||
| u'7', | ||
| u'8', | ||
| u'9', | ||
| u'A', | ||
| u'B', | ||
| u'C', | ||
| u'D', | ||
| u'E', | ||
| u'F', | ||
| }; |
There was a problem hiding this comment.
This could be a base64 table and reused for all bases less than 64 (I wouldn't worry about padding equals signs here). Unfortunately, the ascii table isn't laid out so conveniently. It'd also give the last case the utility of all bases supported by the table.
There was a problem hiding this comment.
This one is probably the least important note here.
There was a problem hiding this comment.
Yeah this can be a later addition if we need it
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
engine/native/core/containers/string.cppm (2)
349-354:⚠️ Potential issue | 🔴 Critical | ⚡ Quick winInfinite loop: base-2 loop condition checks
valinstead ofunsignedVal.The loop condition
while (val != 0)checks the original (unchanged) parameter, but the loop body only modifiesunsignedVal. Sincevalis never updated, this creates an infinite loop for any non-zero input when using base 2.The base-16 (line 356) and base-10 (line 361) cases correctly use
unsignedVal != 0.Suggested fix
} else if (base == 2) { - while (val != 0) { + while (unsignedVal != 0) { std::memcpy(buff + index, binaryNibbles[unsignedVal & 0xF], sizeof(binaryNibbles[0])); index += 4; unsignedVal >>= 4; }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@engine/native/core/containers/string.cppm` around lines 349 - 354, The base-2 conversion loop condition checks the parameter `val` instead of `unsignedVal`, causing an infinite loop since `val` is never modified while only `unsignedVal` is updated in the loop body. Change the `while (val != 0)` condition in the `else if (base == 2)` block to `while (unsignedVal != 0)` to match the correct behavior used in the base-16 and base-10 cases.
194-197:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winPotential integer overflow when sizes differ significantly.
The cast
(int)(size - s.size)can overflow when the size difference exceedsINT_MAX(e.g., comparing a 3-billion-element string against an empty one). The result would be incorrect comparison semantics.Suggested fix
if (result == 0) { - return (int)(size - s.size); + if (size == s.size) return 0; + return size < s.size ? -1 : 1; } - return result; + return result < 0 ? -1 : 1;🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@engine/native/core/containers/string.cppm` around lines 194 - 197, The cast (int)(size - s.size) in the return statement can overflow when the size difference exceeds INT_MAX, causing incorrect comparison results for strings with significantly different lengths. Fix this by safely comparing the sizes without casting a potentially large size_t difference to int. Instead, check if size is greater than s.size and return 1 if true, otherwise return -1 when sizes differ, to maintain proper comparison semantics without overflow risk.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@engine/native/core/containers/string.cppm`:
- Around line 32-49: The comparison operators (operator==, operator!=,
operator>, operator>=, operator<, operator<=) do not modify the String object
and should be marked as const member functions. Add the const keyword after the
parameter list for each of these six operators to indicate they don't modify the
object state and to allow them to be called on const String instances.
Additionally, update the StringCompare method declaration (referenced on line
31) to also be const since it is called by these operators and should not
require a mutable String instance.
---
Outside diff comments:
In `@engine/native/core/containers/string.cppm`:
- Around line 349-354: The base-2 conversion loop condition checks the parameter
`val` instead of `unsignedVal`, causing an infinite loop since `val` is never
modified while only `unsignedVal` is updated in the loop body. Change the `while
(val != 0)` condition in the `else if (base == 2)` block to `while (unsignedVal
!= 0)` to match the correct behavior used in the base-16 and base-10 cases.
- Around line 194-197: The cast (int)(size - s.size) in the return statement can
overflow when the size difference exceeds INT_MAX, causing incorrect comparison
results for strings with significantly different lengths. Fix this by safely
comparing the sizes without casting a potentially large size_t difference to
int. Instead, check if size is greater than s.size and return 1 if true,
otherwise return -1 when sizes differ, to maintain proper comparison semantics
without overflow risk.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 99d10637-fa46-476e-b1ad-5f8fdfb9ad32
📒 Files selected for processing (2)
engine/native/core/containers/string.cppengine/native/core/containers/string.cppm
Add UTF-16 Strings
| return err; | ||
| } | ||
|
|
||
| memcpy(newDst.data, buffer, size * sizeof(utf16)); |
There was a problem hiding this comment.
Note for future issue: Make slice version of memcpy.
There was a problem hiding this comment.
Actionable comments posted: 6
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@Engine/cpp/runtime/core/containers/string.cpp`:
- Around line 79-82: The StringBuilder::Write method with the utf16 const *str
parameter calls StringLength(str) without first checking if the input pointer is
null, which will cause a null pointer dereference and crash. Add a null check at
the beginning of the StringBuilder::Write function (before calling StringLength)
and return an appropriate error code if str is null pointer, similar to how the
TODO comment suggests an assertion should be added.
- Line 96: In the string append operation at line 96 where memcpy is called to
copy str.text into the buffer, replace memcpy with memmove to safely handle the
case where str.text points into the builder's own buffer, preventing undefined
behavior and data corruption from overlapping memory regions.
- Around line 28-29: The capacity field is being assigned newDst.size which
represents byte count, but the write and grow logic throughout the code treats
capacity as the number of UTF-16 code units (elements). This mismatch causes a
critical overflow risk. Fix this by dividing newDst.size by sizeof(utf16) when
assigning to capacity at the line where capacity = newDst.size is set,
converting the byte size to the correct element count so that all subsequent
bounds checking and write operations use consistent units.
In `@Engine/cpp/runtime/core/containers/string.cppm`:
- Around line 358-363: The binary conversion loop in the base == 2 branch has a
non-terminating loop condition issue. The while loop checks `val != 0` but only
updates `unsignedVal` through right-shift operations, which means the loop
condition will never become false if val is non-zero, causing the loop to run
indefinitely and overflow the buffer. Change the loop condition from `while (val
!= 0)` to `while (unsignedVal != 0)` so that the loop terminates when
unsignedVal is completely processed to zero.
- Line 70: The StringIterator constructor performs pointer arithmetic on a
potentially null pointer when initializing the end member, which violates C++
standards. In the StringIterator constructor's member initializer list, add a
guard condition to check if s.text is null before performing the addition s.text
+ s.size. When s.text is nullptr, set end to nullptr instead of attempting the
pointer arithmetic, ensuring the constructor handles empty strings (like
EMPTY_STRING) safely without invoking undefined behavior.
- Around line 190-205: In the StringCompare function's runtime path (the else
branch following the consteval block), replace the memcmp call with
std::char_traits<char16_t>::compare to properly compare UTF-16 code units
instead of raw bytes, which will ensure correct ordering on all endianness
systems. Additionally, replace the cast from isize to int in the final return
statement that compares size and s.size with a ternary operator that returns -1
if size is less than s.size, 1 if greater, or 0 if equal, to avoid integer
overflow and maintain correct relational semantics for large size deltas.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 785a91cc-508b-4684-bbb1-58fa235dcf5a
📒 Files selected for processing (7)
Engine/cpp/runtime/core/CMakeLists.txtEngine/cpp/runtime/core/containers/containers.cppmEngine/cpp/runtime/core/containers/string.cppEngine/cpp/runtime/core/containers/string.cppmEngine/cpp/runtime/core/definitions/definitions.cppmEngine/cpp/runtime/core/definitions/stdtypes.cppmEngine/cpp/runtime/core/math/functions.cppm
✅ Files skipped from review due to trivial changes (1)
- Engine/cpp/runtime/core/containers/containers.cppm
| StringBuilder::Error StringBuilder::Write(utf16 const *str) { | ||
| // TODO(Jon) assert not null | ||
| isize length = StringLength(str); | ||
| Error err = GrowCapacity(size + length); |
There was a problem hiding this comment.
Missing null handling in Write(utf16 const* str) can dereference null.
At Line 81, StringLength(str) is called unconditionally. A null input will crash before returning an error.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@Engine/cpp/runtime/core/containers/string.cpp` around lines 79 - 82, The
StringBuilder::Write method with the utf16 const *str parameter calls
StringLength(str) without first checking if the input pointer is null, which
will cause a null pointer dereference and crash. Add a null check at the
beginning of the StringBuilder::Write function (before calling StringLength) and
return an appropriate error code if str is null pointer, similar to how the TODO
comment suggests an assertion should be added.
| if (err != Error::Okay) { | ||
| return err; | ||
| } | ||
| memcpy(buffer + size, str.text, str.size * sizeof(utf16)); |
There was a problem hiding this comment.
Use overlap-safe copy when appending from a String view.
At Line 96, memcpy is undefined for overlapping regions. If str.text points into the builder’s own buffer, this can corrupt output or crash. memmove is the safe primitive here.
Suggested fix
- memcpy(buffer + size, str.text, str.size * sizeof(utf16));
+ memmove(buffer + size, str.text, str.size * sizeof(utf16));📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| memcpy(buffer + size, str.text, str.size * sizeof(utf16)); | |
| memmove(buffer + size, str.text, str.size * sizeof(utf16)); |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@Engine/cpp/runtime/core/containers/string.cpp` at line 96, In the string
append operation at line 96 where memcpy is called to copy str.text into the
buffer, replace memcpy with memmove to safely handle the case where str.text
points into the builder's own buffer, preventing undefined behavior and data
corruption from overlapping memory regions.
| utf16 const *start; | ||
| utf16 const *end; | ||
|
|
||
| constexpr StringIterator(String s) : start{ s.text }, end{ s.text + s.size } {} |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
# First, let's find and examine the string.cppm file
find . -name "string.cppm" -type fRepository: Redot-Engine/DraconicEngine
Length of output: 120
🏁 Script executed:
# Read the relevant file to understand the String and StringIterator definitions
cat -n Engine/cpp/runtime/core/containers/string.cppm | head -100Repository: Redot-Engine/DraconicEngine
Length of output: 3480
🏁 Script executed:
# Search for uses of StringIterator with empty strings or potential nullptr scenarios
rg -n "StringIterator\s*\(" Engine/cpp/runtime/core/containers/string.cppm | head -20Repository: Redot-Engine/DraconicEngine
Length of output: 155
🏁 Script executed:
# Look for uses of EMPTY_STRING
rg -n "EMPTY_STRING" Engine/cpp/runtime/core/containers/Repository: Redot-Engine/DraconicEngine
Length of output: 249
🏁 Script executed:
# Check if there are operator[] or other accesses on StringIterator
rg -n "StringIterator" Engine/cpp/runtime/core/ -A 3 | head -50Repository: Redot-Engine/DraconicEngine
Length of output: 970
🏁 Script executed:
# Let's check if StringIterator is actually instantiated or used with String objects
rg -n "StringIterator" Engine/cpp/runtime/core/ --type cpp --type cppm -B 2 -A 2 | head -80Repository: Redot-Engine/DraconicEngine
Length of output: 100
🏁 Script executed:
# Check the line 211 context to see what returns EMPTY_STRING
sed -n '200,220p' Engine/cpp/runtime/core/containers/string.cppmRepository: Redot-Engine/DraconicEngine
Length of output: 533
🏁 Script executed:
# Search for StringIterator usage/instantiation
rg -n "StringIterator" Engine/cpp/runtime/core/ -B 1 -A 1Repository: Redot-Engine/DraconicEngine
Length of output: 739
🏁 Script executed:
# Check if there are any methods that create StringIterator objects
rg -n "return.*StringIterator\|StringIterator\s*\{" Engine/cpp/runtime/core/Repository: Redot-Engine/DraconicEngine
Length of output: 53
🏁 Script executed:
# Look at the full string.cppm file to see if StringIterator is actually used anywhere
wc -l Engine/cpp/runtime/core/containers/string.cppmRepository: Redot-Engine/DraconicEngine
Length of output: 122
🏁 Script executed:
# Search for any instantiation of StringIterator in the entire repository
rg -n "StringIterator\s*\(" --type-list | grep -i cppm || echo "No cppm type, trying broader search"Repository: Redot-Engine/DraconicEngine
Length of output: 515
🏁 Script executed:
# Broader search for StringIterator constructor calls
rg -n "StringIterator" --glob "**/*.cpp" --glob "**/*.cppm" --glob "**/*.h" --glob "**/*.hpp" -A 1 -B 1Repository: Redot-Engine/DraconicEngine
Length of output: 739
🏁 Script executed:
# Check if the Advance method is called anywhere, which would require a StringIterator
rg -n "\.Advance\(" Engine/cpp/runtime/core/ -B 3 -A 1Repository: Redot-Engine/DraconicEngine
Length of output: 53
🏁 Script executed:
# Let's understand the C++ standard on null pointer arithmetic by examining if there are comments or documentation
grep -n -i "nullptr\|null.*pointer\|undefined" Engine/cpp/runtime/core/containers/string.cppm | head -20Repository: Redot-Engine/DraconicEngine
Length of output: 328
Fix null-pointer arithmetic in StringIterator constructor.
The expression s.text + s.size at line 70 invokes undefined behavior when s.text == nullptr, which is a valid state for strings (see EMPTY_STRING at line 64). Even though the addition offset is zero, pointer arithmetic on null pointers violates the C++ standard. Apply the suggested guard:
Fix
-constexpr StringIterator(String s) : start{ s.text }, end{ s.text + s.size } {}
+constexpr StringIterator(String s)
+ : start{ s.text }, end{ s.text ? (s.text + s.size) : nullptr } {}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| constexpr StringIterator(String s) : start{ s.text }, end{ s.text + s.size } {} | |
| constexpr StringIterator(String s) | |
| : start{ s.text }, end{ s.text ? (s.text + s.size) : nullptr } {} |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@Engine/cpp/runtime/core/containers/string.cppm` at line 70, The
StringIterator constructor performs pointer arithmetic on a potentially null
pointer when initializing the end member, which violates C++ standards. In the
StringIterator constructor's member initializer list, add a guard condition to
check if s.text is null before performing the addition s.text + s.size. When
s.text is nullptr, set end to nullptr instead of attempting the pointer
arithmetic, ensuring the constructor handles empty strings (like EMPTY_STRING)
safely without invoking undefined behavior.
| isize min_size = draco::math::min(size, s.size); | ||
|
|
||
| int result = 0; | ||
| if consteval { | ||
| for (isize i = 0; i < min_size; ++i) { | ||
| if (text[i] != s.text[i]) { | ||
| return text[i] < s.text[i] ? -1 : 1; | ||
| } | ||
| } | ||
| } else { | ||
| result = memcmp(text, s.text, min_size * sizeof(utf16)); | ||
| } | ||
|
|
||
| if (result == 0) { | ||
| return (int)(size - s.size); | ||
| } |
There was a problem hiding this comment.
StringCompare runtime path has incorrect UTF-16 comparison and overflow risk.
Line 200 uses memcmp which compares bytes, not UTF-16 code units—on little-endian systems this produces wrong ordering for non-ASCII values. Line 204 casts the isize difference to int, which overflows for large size deltas and breaks relational semantics. The consteval path correctly compares code units.
Replace memcmp with std::char_traits<char16_t>::compare and use a ternary operator for the size comparison:
Suggested fix
} else {
- result = memcmp(text, s.text, min_size * sizeof(utf16));
+ result = std::char_traits<char16_t>::compare(text, s.text, static_cast<usize>(min_size));
}
if (result == 0) {
- return (int)(size - s.size);
+ return size < s.size ? -1 : (size > s.size ? 1 : 0);
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| isize min_size = draco::math::min(size, s.size); | |
| int result = 0; | |
| if consteval { | |
| for (isize i = 0; i < min_size; ++i) { | |
| if (text[i] != s.text[i]) { | |
| return text[i] < s.text[i] ? -1 : 1; | |
| } | |
| } | |
| } else { | |
| result = memcmp(text, s.text, min_size * sizeof(utf16)); | |
| } | |
| if (result == 0) { | |
| return (int)(size - s.size); | |
| } | |
| isize min_size = draco::math::min(size, s.size); | |
| int result = 0; | |
| if consteval { | |
| for (isize i = 0; i < min_size; ++i) { | |
| if (text[i] != s.text[i]) { | |
| return text[i] < s.text[i] ? -1 : 1; | |
| } | |
| } | |
| } else { | |
| result = std::char_traits<char16_t>::compare(text, s.text, static_cast<usize>(min_size)); | |
| } | |
| if (result == 0) { | |
| return size < s.size ? -1 : (size > s.size ? 1 : 0); | |
| } |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@Engine/cpp/runtime/core/containers/string.cppm` around lines 190 - 205, In
the StringCompare function's runtime path (the else branch following the
consteval block), replace the memcmp call with
std::char_traits<char16_t>::compare to properly compare UTF-16 code units
instead of raw bytes, which will ensure correct ordering on all endianness
systems. Additionally, replace the cast from isize to int in the final return
statement that compares size and s.size with a ternary operator that returns -1
if size is less than s.size, 1 if greater, or 0 if equal, to avoid integer
overflow and maintain correct relational semantics for large size deltas.
Add UTF-16 Strings
Summary by CodeRabbit