Fix UBSAN issue by switching to smart_str#83
Conversation
When a `smart_string` is never updated, it's `c` field remains NULL.
This is possible with a brotli library failure or with an empty output.
This is triggerable via e.g. `uncompress_add()`.
When that happens, a NULL pointer is passed to `zend_string_init()`
which is not allowed and will cause undefined behaviour in the
subsequent `memcpy()` call that `zend_string_init()` performs.
One way to fix this would be to guard the invocation of `RETVAL_STRINGL()`.
However, a cleaner way is to switch to the `smart_str` API which handles
everything already nicely for us.
As a bonus, `smart_str` will avoid reallocation of the string, so the new
code is not only cleaner but also faster.
Example UBSAN report:
```
/usr/local/include/php/Zend/zend_string.h:191:2: runtime error: null pointer passed as argument 2, which is declared to never be null
#0 0x7c8d9852a291 in zend_string_init /usr/local/include/php/Zend/zend_string.h:191
kjdev#1 0x7c8d98535adf in zif_brotli_uncompress_add /work/php-ext-brotli/brotli.c:1830
kjdev#2 0x623731f73e26 in zend_test_execute_internal /work/php-src/ext/zend_test/observer.c:306
kjdev#3 0x6237323a22bd in ZEND_DO_FCALL_SPEC_RETVAL_USED_HANDLER /work/php-src/Zend/zend_vm_execute.h:2154
kjdev#4 0x62373259894d in execute_ex /work/php-src/Zend/zend_vm_execute.h:116519
kjdev#5 0x6237325a9126 in zend_execute /work/php-src/Zend/zend_vm_execute.h:121962
kjdev#6 0x623732779b56 in zend_execute_script /work/php-src/Zend/zend.c:1980
kjdev#7 0x623731fa8cce in php_execute_script_ex /work/php-src/main/main.c:2645
kjdev#8 0x623731fa8fd8 in php_execute_script /work/php-src/main/main.c:2685
kjdev#9 0x623732791599 in main /work/php-src/sapi/cgi/cgi_main.c:2529
kjdev#10 0x7c8d9a8381c9 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
kjdev#11 0x7c8d9a83828a in __libc_start_main_impl ../csu/libc-start.c:360
kjdev#12 0x623730c09ad4 in _start (/work/php-src/build-dbg-ubsan/sapi/cgi/php-cgi+0x1409ad4) (BuildId: 4ad96c24d1e97297ff4db76de2af34c057b26433)
```
📝 WalkthroughWalkthroughThis PR modernizes the Brotli extension's buffer handling by replacing the legacy ChangesBuffer API Modernization
Possibly related PRs
Poem
🎯 2 (Simple) | ⏱️ ~12 minutes 🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
brotli.c (1)
1806-1822:⚠️ Potential issue | 🟠 Major | ⚡ Quick winHandle decoder error/finish state before returning output.
brotli_uncompress_add()returnssmart_str_extract(&out)regardless ofBrotliDecoderResult. If decompression ends withBROTLI_DECODER_RESULT_ERROR(ormode == BROTLI_OPERATION_FINISHbut notSUCCESS), this can return partial/empty data as a successful call.Proposed fix
BrotliDecoderResult result = BROTLI_DECODER_RESULT_NEEDS_MORE_OUTPUT; while (result == BROTLI_DECODER_RESULT_NEEDS_MORE_OUTPUT) { ctx->available_out = buffer_size; ctx->next_out = buffer; result = BrotliDecoderDecompressStream(ctx->decoder, &ctx->available_in, &ctx->next_in, &ctx->available_out, &ctx->next_out, 0); size_t buffer_used = buffer_size - ctx->available_out; if (buffer_used) { smart_str_appendl(&out, buffer, buffer_used); } } + if (result == BROTLI_DECODER_RESULT_ERROR || + (mode == BROTLI_OPERATION_FINISH && result != BROTLI_DECODER_RESULT_SUCCESS)) { + efree(buffer); + smart_str_free(&out); + php_error_docref(NULL, E_WARNING, "failed to incremental uncompress"); + RETURN_FALSE; + } + RETVAL_STR(smart_str_extract(&out)); efree(buffer);🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@brotli.c` around lines 1806 - 1822, The function brotli_uncompress_add currently returns RETVAL_STR(smart_str_extract(&out)) without checking the final BrotliDecoderResult; after the loop that calls BrotliDecoderDecompressStream, inspect the final result and handle non-success cases: if result != BROTLI_DECODER_RESULT_SUCCESS (including BROTLI_DECODER_RESULT_ERROR or incomplete/NEEDS_MORE_INPUT states), release/clear the smart_str out buffer and return an error/false (or raise an appropriate PHP error) instead of returning partial data; only call RETVAL_STR(smart_str_extract(&out)) when the result is BROTLI_DECODER_RESULT_SUCCESS. Ensure you reference ctx, out, BrotliDecoderDecompressStream, and RETVAL_STR in the fix so the check is implemented right after the decompression loop.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Outside diff comments:
In `@brotli.c`:
- Around line 1806-1822: The function brotli_uncompress_add currently returns
RETVAL_STR(smart_str_extract(&out)) without checking the final
BrotliDecoderResult; after the loop that calls BrotliDecoderDecompressStream,
inspect the final result and handle non-success cases: if result !=
BROTLI_DECODER_RESULT_SUCCESS (including BROTLI_DECODER_RESULT_ERROR or
incomplete/NEEDS_MORE_INPUT states), release/clear the smart_str out buffer and
return an error/false (or raise an appropriate PHP error) instead of returning
partial data; only call RETVAL_STR(smart_str_extract(&out)) when the result is
BROTLI_DECODER_RESULT_SUCCESS. Ensure you reference ctx, out,
BrotliDecoderDecompressStream, and RETVAL_STR in the fix so the check is
implemented right after the decompression loop.
|
Since version 7.0 is currently supported, this appears to be an error. |
When a
smart_stringis never updated, it'scfield remains NULL. This is possible with a brotli library failure or with an empty output. This is triggerable via e.g.uncompress_add().When that happens, a NULL pointer is passed to
zend_string_init()which is not allowed and will cause undefined behaviour in the subsequentmemcpy()call thatzend_string_init()performs.One way to fix this would be to guard the invocation of
RETVAL_STRINGL(). However, a cleaner way is to switch to thesmart_strAPI which handles everything already nicely for us.As a bonus,
smart_strwill avoid reallocation of the string, so the new code is not only cleaner but also faster.Example UBSAN report:
Note: this was found by a hybrid static-dynamic analyzer I'm developing.
Summary by CodeRabbit