Skip to content

Optimize checked_ilog and pow when base is a power of two#147250

Merged
rust-bors[bot] merged 5 commits into
rust-lang:mainfrom
Kmeakin:km/optimize-ilog-base-power-of-two
Jun 7, 2026
Merged

Optimize checked_ilog and pow when base is a power of two#147250
rust-bors[bot] merged 5 commits into
rust-lang:mainfrom
Kmeakin:km/optimize-ilog-base-power-of-two

Conversation

@Kmeakin

@Kmeakin Kmeakin commented Oct 2, 2025

Copy link
Copy Markdown
Contributor

View all comments

Optimize checked_ilog and pow when the base is a power of two

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-libs Relevant to the library team, which will review and decide on the PR/issue. labels Oct 2, 2025
@rustbot

rustbot commented Oct 2, 2025

Copy link
Copy Markdown
Collaborator

r? @scottmcm

rustbot has assigned @scottmcm.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

@Kmeakin Kmeakin force-pushed the km/optimize-ilog-base-power-of-two branch from 8f19ce6 to e5ca8cd Compare October 2, 2025 01:31
@rust-log-analyzer

This comment has been minimized.

@workingjubilee

workingjubilee commented Oct 2, 2025

Copy link
Copy Markdown
Member

does this affect the codegen in practical circumstances? I would expect even a fairly weak optimizer to perform this optimization, making us hand-coding it irrelevant and potentially even harmful because we introduce more conditional logic to chew through.

@Kmeakin Kmeakin changed the title Optimize checked_ilog when base is a power of two Optimize checked_ilog and pow when base is a power of two Oct 2, 2025
@Kmeakin

Kmeakin commented Oct 2, 2025

Copy link
Copy Markdown
Contributor Author

does this affect the codegen in practical circumstances?

It replaces a loop by some bit manipulations. Whether anyone actually calls either function with a compile-time known, power of 2 base is another question. But it can't hurt, since it is guarded by is_statically_known.

I would expect even a fairly weak optimizer to perform this optimization, making us hand-coding it irrelevant

No, in either function, LLVM is not able to see that the loop is performing a log/pow and apply the identities: https://godbolt.org/z/6vMsxc9Kh

and potentially even harmful because we introduce more conditional logic to chew through.

They're guarded by is_statically_known so will be ignored if the base is not known at compile-time

@Kmeakin Kmeakin force-pushed the km/optimize-ilog-base-power-of-two branch 2 times, most recently from 2faa397 to abb7f32 Compare October 2, 2025 02:38
@workingjubilee

Copy link
Copy Markdown
Member

No, in either function, LLVM is not able to see that the loop is performing a log/pow and apply the identities: https://godbolt.org/z/6vMsxc9Kh

...then I'm kinda surprised! Nice catch.

Comment thread library/core/src/num/uint_macros.rs Outdated
Comment thread library/core/src/num/uint_macros.rs Outdated
@scottmcm

scottmcm commented Oct 2, 2025

Copy link
Copy Markdown
Member

does this affect the codegen in practical circumstances?

Would be good to have codegen tests to demonstrate what this is doing -- especially since that way there's a way for people to try removing the special cases later if they think that LLVM no longer needs them.

@scottmcm scottmcm added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Oct 2, 2025
@Kmeakin Kmeakin force-pushed the km/optimize-ilog-base-power-of-two branch from abb7f32 to 9ab0f63 Compare October 2, 2025 23:13
Comment thread library/core/src/num/uint_macros.rs Outdated
@Kmeakin Kmeakin force-pushed the km/optimize-ilog-base-power-of-two branch from 9ab0f63 to 1d0ac82 Compare October 3, 2025 21:16
@workingjubilee

Copy link
Copy Markdown
Member

thanks for the codegen tests!

@Kmeakin Kmeakin force-pushed the km/optimize-ilog-base-power-of-two branch from 1d0ac82 to c84b99e Compare October 5, 2025 18:29
@Kmeakin

Kmeakin commented Oct 5, 2025

Copy link
Copy Markdown
Contributor Author

@rustbot ready

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Oct 5, 2025
@Kmeakin

Kmeakin commented Oct 10, 2025

Copy link
Copy Markdown
Contributor Author

@scottmcm ping?

Comment thread tests/codegen-llvm/ilog_known_base.rs

@scottmcm scottmcm left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code here is looking good, but can you make sure there's normal runtime behaviour tests for it too? Notably, after the base conversation (that's fixed in the code) it made me think that that's not currently covered by any tests, so we should have something -- maybe just add to the # Examples that it returns None for base zero or one? (They seem like perfectly reasonable and helpful examples, in addition to giving coverage for this stuff.)

And maybe add some should_panic tests to ensure that the overflow checking is still correct for pow when overflow checks are enabled?

View changes since this review

@rustbot rustbot added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Oct 14, 2025
@rust-bors

rust-bors Bot commented Jun 2, 2026

Copy link
Copy Markdown
Contributor

☀️ Try build successful (CI)
Build commit: 759376f (759376f9a4e9acc6447f3f4933a190e8a1bb0ba1, parent: 4f84d9fac456d973d592cf3fb48db958ecf22506)

@rust-timer

This comment has been minimized.

@rust-timer

Copy link
Copy Markdown
Collaborator

Finished benchmarking commit (759376f): comparison URL.

Overall result: ❌ regressions - please read:

Benchmarking means the PR may be perf-sensitive. It's automatically marked not fit for rolling up. Overriding is possible but disadvised: it risks changing compiler perf.

Next, please: If you can, justify the regressions found in this try perf run in writing along with @rustbot label: +perf-regression-triaged. If not, fix the regressions and do another perf run. Neutral or positive results will clear the label automatically.

@bors rollup=never
@rustbot label: -S-waiting-on-perf +perf-regression

Instruction count

Our most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.

mean range count
Regressions ❌
(primary)
0.6% [0.4%, 0.8%] 3
Regressions ❌
(secondary)
0.2% [0.2%, 0.2%] 1
Improvements ✅
(primary)
- - 0
Improvements ✅
(secondary)
- - 0
All ❌✅ (primary) 0.6% [0.4%, 0.8%] 3

Max RSS (memory usage)

Results (primary 3.0%, secondary -1.3%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

mean range count
Regressions ❌
(primary)
3.0% [3.0%, 3.0%] 1
Regressions ❌
(secondary)
2.8% [0.9%, 4.8%] 2
Improvements ✅
(primary)
- - 0
Improvements ✅
(secondary)
-2.2% [-3.0%, -0.7%] 9
All ❌✅ (primary) 3.0% [3.0%, 3.0%] 1

Cycles

Results (secondary -1.6%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

mean range count
Regressions ❌
(primary)
- - 0
Regressions ❌
(secondary)
- - 0
Improvements ✅
(primary)
- - 0
Improvements ✅
(secondary)
-1.6% [-1.6%, -1.6%] 1
All ❌✅ (primary) - - 0

Binary size

Results (primary -0.0%, secondary 0.3%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

mean range count
Regressions ❌
(primary)
0.3% [0.3%, 0.3%] 1
Regressions ❌
(secondary)
0.3% [0.3%, 0.3%] 1
Improvements ✅
(primary)
-0.4% [-0.4%, -0.4%] 1
Improvements ✅
(secondary)
- - 0
All ❌✅ (primary) -0.0% [-0.4%, 0.3%] 2

Bootstrap: 510.625s -> 510.701s (0.01%)
Artifact size: 400.75 MiB -> 400.68 MiB (-0.02%)

@rustbot rustbot removed the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Jun 2, 2026
@jhpratt

jhpratt commented Jun 2, 2026

Copy link
Copy Markdown
Member

Same crates and for roughly the same amount. I'm clueless. Maybe try posting on Zulip, as I'm not sure who to ping.

@jhpratt

jhpratt commented Jun 7, 2026

Copy link
Copy Markdown
Member

Despite the unexpected perf result, it was recommended on Zulip to merge this anyway.

@bors r+

@rust-bors

rust-bors Bot commented Jun 7, 2026

Copy link
Copy Markdown
Contributor

📌 Commit e845f15 has been approved by jhpratt

It is now in the queue for this repository.

@rust-bors rust-bors Bot added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jun 7, 2026
@JonathanBrouwer

Copy link
Copy Markdown
Contributor

@bors p=2
Putting this above the rollupable PRs to maximize throughput, since the queue is busy

@rust-bors

This comment has been minimized.

rust-bors Bot pushed a commit that referenced this pull request Jun 7, 2026
…=jhpratt

Optimize `checked_ilog` and `pow` when `base` is a power of two



Optimize `checked_ilog` and `pow` when the base is a power of two
@JonathanBrouwer

Copy link
Copy Markdown
Contributor

Stuck
@bors cancel

@rust-bors

rust-bors Bot commented Jun 7, 2026

Copy link
Copy Markdown
Contributor

Auto build was cancelled. Cancelled workflows:

The next pull request likely to be tested is #157548.

@Kmeakin

Kmeakin commented Jun 7, 2026

Copy link
Copy Markdown
Contributor Author

Despite the unexpected perf result, it was recommended on Zulip to merge this anyway.

@bors r+

Thanks. I was going to post to zulip but haven't had time this week. Can you link to the zulip discussion?

@jhpratt

jhpratt commented Jun 7, 2026

Copy link
Copy Markdown
Member

@rust-bors

This comment has been minimized.

@rust-bors rust-bors Bot added merged-by-bors This PR was explicitly merged by bors. and removed S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. labels Jun 7, 2026
@rust-bors

rust-bors Bot commented Jun 7, 2026

Copy link
Copy Markdown
Contributor

☀️ Test successful - CI
Approved by: jhpratt
Duration: 3h 12m 18s
Pushing f20a92e to main...

@rust-bors rust-bors Bot merged commit f20a92e into rust-lang:main Jun 7, 2026
13 checks passed
@rustbot rustbot added this to the 1.98.0 milestone Jun 7, 2026
@github-actions

github-actions Bot commented Jun 7, 2026

Copy link
Copy Markdown
Contributor
What is this? This is an experimental post-merge analysis report that shows differences in test outcomes between the merged PR and its parent PR.

Comparing 43a4909 (parent) -> f20a92e (this PR)

Test differences

Show 3966 test diffs

Stage 1

  • [codegen] tests/codegen-llvm/ilog_known_base.rs: [missing] -> pass (J1)
  • [codegen] tests/codegen-llvm/pow_known_base.rs: [missing] -> pass (J1)

Stage 2

  • [codegen] tests/codegen-llvm/ilog_known_base.rs: [missing] -> pass (J0)
  • [codegen] tests/codegen-llvm/pow_known_base.rs: [missing] -> pass (J0)

Additionally, 3962 doctest diffs were found. These are ignored, as they are noisy.

Job group index

Test dashboard

Run

cargo run --manifest-path src/ci/citool/Cargo.toml -- \
    test-dashboard f20a92ec01483dc5c58e90e246f266bdad822d86 --output-dir test-dashboard

And then open test-dashboard/index.html in your browser to see an overview of all executed tests.

Job duration changes

  1. x86_64-gnu-gcc-core-tests: 14m 24s -> 7m 58s (-44.7%)
  2. x86_64-gnu-llvm-21-1: 1h -> 40m 27s (-32.7%)
  3. dist-i586-gnu-i586-i686-musl: 1h 16m -> 1h 36m (+27.2%)
  4. x86_64-msvc-ext3: 1h 54m -> 1h 24m (-26.6%)
  5. dist-i686-mingw: 1h 40m -> 2h 7m (+26.5%)
  6. pr-check-1: 32m 46s -> 24m 55s (-24.0%)
  7. dist-aarch64-linux: 2h 31m -> 1h 55m (-23.6%)
  8. dist-x86_64-apple: 2h 10m -> 1h 41m (-22.4%)
  9. dist-apple-various: 1h 41m -> 1h 19m (-22.1%)
  10. dist-x86_64-linux-alt: 2h 31m -> 2h 1m (-19.7%)
How to interpret the job duration changes?

Job durations can vary a lot, based on the actual runner instance
that executed the job, system noise, invalidated caches, etc. The table above is provided
mostly for t-infra members, for simpler debugging of potential CI slow-downs.

@rust-timer

Copy link
Copy Markdown
Collaborator

Finished benchmarking commit (f20a92e): comparison URL.

Overall result: ❌ regressions - please read:

Our benchmarks found a performance regression caused by this PR.
This might be an actual regression, but it can also be just noise.

Next Steps:

  • If the regression was expected or you think it can be justified,
    please write a comment with sufficient written justification, and add
    @rustbot label: +perf-regression-triaged to it, to mark the regression as triaged.
  • If you think that you know of a way to resolve the regression, try to create
    a new PR with a fix for the regression.
  • If you do not understand the regression or you think that it is just noise,
    you can ask the @rust-lang/wg-compiler-performance working group for help (members of this group
    were already notified of this PR).

@rustbot label: +perf-regression
cc @rust-lang/wg-compiler-performance

Instruction count

Our most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.

mean range count
Regressions ❌
(primary)
0.6% [0.3%, 0.7%] 4
Regressions ❌
(secondary)
0.4% [0.4%, 0.4%] 1
Improvements ✅
(primary)
- - 0
Improvements ✅
(secondary)
- - 0
All ❌✅ (primary) 0.6% [0.3%, 0.7%] 4

Max RSS (memory usage)

Results (primary 5.0%, secondary -5.5%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

mean range count
Regressions ❌
(primary)
5.0% [5.0%, 5.0%] 1
Regressions ❌
(secondary)
- - 0
Improvements ✅
(primary)
- - 0
Improvements ✅
(secondary)
-5.5% [-5.5%, -5.5%] 1
All ❌✅ (primary) 5.0% [5.0%, 5.0%] 1

Cycles

This perf run didn't have relevant results for this metric.

Binary size

Results (primary -0.0%, secondary 0.2%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

mean range count
Regressions ❌
(primary)
0.1% [0.1%, 0.3%] 4
Regressions ❌
(secondary)
0.3% [0.3%, 0.3%] 1
Improvements ✅
(primary)
-0.1% [-0.4%, -0.0%] 6
Improvements ✅
(secondary)
-0.0% [-0.0%, -0.0%] 1
All ❌✅ (primary) -0.0% [-0.4%, 0.3%] 10

Bootstrap: 516.093s -> 516.04s (-0.01%)
Artifact size: 400.83 MiB -> 401.28 MiB (0.11%)

@jhpratt

jhpratt commented Jun 7, 2026

Copy link
Copy Markdown
Member

This was known before approving.

@rustbot label +perf-regression-triaged

@rustbot rustbot added the perf-regression-triaged The performance regression has been triaged. label Jun 7, 2026
@Kmeakin Kmeakin deleted the km/optimize-ilog-base-power-of-two branch June 8, 2026 13:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

merged-by-bors This PR was explicitly merged by bors. perf-regression Performance regression. perf-regression-triaged The performance regression has been triaged. T-libs Relevant to the library team, which will review and decide on the PR/issue.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

10 participants