Skip to content

In copy_nonoverlapping, use mul nuw nsw to compute the byte size#157560

Merged
rust-bors[bot] merged 1 commit into
rust-lang:mainfrom
scottmcm:mul_nuw_nsw_in_memcpy
Jun 8, 2026
Merged

In copy_nonoverlapping, use mul nuw nsw to compute the byte size#157560
rust-bors[bot] merged 1 commit into
rust-lang:mainfrom
scottmcm:mul_nuw_nsw_in_memcpy

Conversation

@scottmcm

@scottmcm scottmcm commented Jun 7, 2026

Copy link
Copy Markdown
Member

Seems like we might as well? Adding these flags means the optimizer can tell the limited range on the count of items -- like how we use these flags (#136575) when calculating size_of_val for a slice.

Today we use a wrapping multiplication, which mean that copy_nonoverlapping::<u32>(src, dst, 0x40000000_00000001) appears like 4 bytes -- a perfectly reasonable size! -- once it gets to the memcpy call.

If I'm understanding https://doc.rust-lang.org/std/ptr/fn.copy_nonoverlapping.html#safety properly, this is just exploiting existing UB, since src and dst must each be inside an allocation, and those allocations can be at most isize::MAX bytes. (Plus, fundamentally, to be non-overlapping there's not enough space in the address space to be bigger than isize::MAX.)

cc @RalfJung to make sure this is ok, as requested last he found out I was newly exploiting some UB in codegen 🙃
r? codegen

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Jun 7, 2026
@RalfJung

RalfJung commented Jun 7, 2026

Copy link
Copy Markdown
Member

Yeah, in fact both copy and copy_nonoverlapping could use this since we know the result must be at most isize::MAX else it cannot be inbounds.

@saethlin

saethlin commented Jun 7, 2026

Copy link
Copy Markdown
Member

@bors try @rust-timer queue

@rust-timer

This comment has been minimized.

@rust-bors

This comment has been minimized.

@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Jun 7, 2026
rust-bors Bot pushed a commit that referenced this pull request Jun 7, 2026
In `copy_nonoverlapping`, use `mul nuw nsw` to compute the byte size
@rust-bors

rust-bors Bot commented Jun 7, 2026

Copy link
Copy Markdown
Contributor

☀️ Try build successful (CI)
Build commit: 1fe6841 (1fe6841f2af59a318002c0c0f68aefac7e31098d, parent: 43a4909ee98ed4d006d9d773f5d94dc58e34f846)

@rust-timer

This comment has been minimized.

@rust-timer

Copy link
Copy Markdown
Collaborator

Finished benchmarking commit (1fe6841): comparison URL.

Overall result: no relevant changes - no action needed

Benchmarking means the PR may be perf-sensitive. Consider adding rollup=never if this change is not fit for rolling up.

@rustbot label: -S-waiting-on-perf -perf-regression

Instruction count

This perf run didn't have relevant results for this metric.

Max RSS (memory usage)

Results (primary 3.3%, secondary -5.6%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

mean range count
Regressions ❌
(primary)
3.3% [3.3%, 3.3%] 1
Regressions ❌
(secondary)
- - 0
Improvements ✅
(primary)
- - 0
Improvements ✅
(secondary)
-5.6% [-5.6%, -5.6%] 1
All ❌✅ (primary) 3.3% [3.3%, 3.3%] 1

Cycles

Results (secondary -3.5%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

mean range count
Regressions ❌
(primary)
- - 0
Regressions ❌
(secondary)
- - 0
Improvements ✅
(primary)
- - 0
Improvements ✅
(secondary)
-3.5% [-3.5%, -3.5%] 1
All ❌✅ (primary) - - 0

Binary size

This perf run didn't have relevant results for this metric.

Bootstrap: 516.093s -> 517.507s (0.27%)
Artifact size: 400.83 MiB -> 400.81 MiB (-0.00%)

@rustbot rustbot removed the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Jun 7, 2026
@saethlin

saethlin commented Jun 7, 2026

Copy link
Copy Markdown
Member

@bors r+ rollup (because clearly it doesn't impact perf... Right now)

@rust-bors

rust-bors Bot commented Jun 7, 2026

Copy link
Copy Markdown
Contributor

📌 Commit 4af2ca0 has been approved by saethlin

It is now in the queue for this repository.

@rust-bors rust-bors Bot added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jun 7, 2026
JonathanBrouwer added a commit to JonathanBrouwer/rust that referenced this pull request Jun 7, 2026
…saethlin

In `copy_nonoverlapping`, use `mul nuw nsw` to compute the byte size

Seems like we might as well?  Adding these flags means the optimizer can tell the limited range on the count of items -- like how we use these flags (rust-lang#136575) when calculating `size_of_val` for a slice.

Today we use a wrapping multiplication, which mean that `copy_nonoverlapping::<u32>(src, dst, 0x40000000_00000001)` appears like 4 bytes -- a perfectly reasonable size! -- once it gets to the `memcpy` call.

If I'm understanding <https://doc.rust-lang.org/std/ptr/fn.copy_nonoverlapping.html#safety> properly, this is just exploiting existing UB, since `src` and `dst` must each be inside an allocation, and those allocations can be at most `isize::MAX` bytes.  (Plus, fundamentally, to be non-overlapping there's not enough space in the address space to be bigger than `isize::MAX`.)

cc @RalfJung to make sure this is ok, as requested last he found out I was newly exploiting some UB in codegen 🙃
r? codegen
jhpratt added a commit to jhpratt/rust that referenced this pull request Jun 7, 2026
…saethlin

In `copy_nonoverlapping`, use `mul nuw nsw` to compute the byte size

Seems like we might as well?  Adding these flags means the optimizer can tell the limited range on the count of items -- like how we use these flags (rust-lang#136575) when calculating `size_of_val` for a slice.

Today we use a wrapping multiplication, which mean that `copy_nonoverlapping::<u32>(src, dst, 0x40000000_00000001)` appears like 4 bytes -- a perfectly reasonable size! -- once it gets to the `memcpy` call.

If I'm understanding <https://doc.rust-lang.org/std/ptr/fn.copy_nonoverlapping.html#safety> properly, this is just exploiting existing UB, since `src` and `dst` must each be inside an allocation, and those allocations can be at most `isize::MAX` bytes.  (Plus, fundamentally, to be non-overlapping there's not enough space in the address space to be bigger than `isize::MAX`.)

cc @RalfJung to make sure this is ok, as requested last he found out I was newly exploiting some UB in codegen 🙃
r? codegen
rust-bors Bot pushed a commit that referenced this pull request Jun 7, 2026
Rollup of 25 pull requests

Successful merges:

 - #157447 (Move cross crate tests into the appropriate folder)
 - #145108 (Resolver: Batched Import Resolution)
 - #156119 (Further optimize `SliceIndex<str>` impl for `Range<usize>`)
 - #157224 (Manually unroll loop in `str::floor_char_boundary`)
 - #157289 (Add infallible primitive type lookups to template arg resolver)
 - #157540 (Cleanup and optimize `render_impls`)
 - #157444 (Couple of work product cleanups)
 - #157543 (Reorganize `tests/ui/issues` [5/N])
 - #153513 (Syntactically reject equality predicates)
 - #155797 (LineWriter: cap write_vectored newline scan to avoid quadratic write_all_vectored)
 - #156155 (macros: report unbound metavariables directly)
 - #156188 (riscv: promote d, e, and f target_features to CfgStableToggleUnstable)
 - #156666 (Clarify meaning of ranges in pointer offset docs)
 - #157078 (Document equivalence of `highest_one` and `ilog2` methods on integers)
 - #157129 (ci: update download-artifact action to v8)
 - #157169 (triagebot: Update messages to direct changes to appropriate repositories)
 - #157323 (Document Repeat::last panic behavior)
 - #157370 (Clarify MaybeUninit::zeroed padding docs)
 - #157399 (Silence llbc's output by default to prevent rustc's linker output warning)
 - #157500 (Improve documentation of `align_of` and `Alignment`.)
 - #157545 (Suggest using comma to separate valid attribute list items)
 - #157559 (chore: Update annotate-snippets to 0.12.16)
 - #157560 (In `copy_nonoverlapping`, use `mul nuw nsw` to compute the byte size)
 - #157580 (Importing suggestion reported twice when reporting privacy error)
 - #157581 (Test fixup)
rust-bors Bot pushed a commit that referenced this pull request Jun 7, 2026
Rollup of 25 pull requests

Successful merges:

 - #157447 (Move cross crate tests into the appropriate folder)
 - #145108 (Resolver: Batched Import Resolution)
 - #156119 (Further optimize `SliceIndex<str>` impl for `Range<usize>`)
 - #157224 (Manually unroll loop in `str::floor_char_boundary`)
 - #157289 (Add infallible primitive type lookups to template arg resolver)
 - #157540 (Cleanup and optimize `render_impls`)
 - #157444 (Couple of work product cleanups)
 - #157543 (Reorganize `tests/ui/issues` [5/N])
 - #153513 (Syntactically reject equality predicates)
 - #155797 (LineWriter: cap write_vectored newline scan to avoid quadratic write_all_vectored)
 - #156155 (macros: report unbound metavariables directly)
 - #156188 (riscv: promote d, e, and f target_features to CfgStableToggleUnstable)
 - #156666 (Clarify meaning of ranges in pointer offset docs)
 - #157078 (Document equivalence of `highest_one` and `ilog2` methods on integers)
 - #157129 (ci: update download-artifact action to v8)
 - #157169 (triagebot: Update messages to direct changes to appropriate repositories)
 - #157323 (Document Repeat::last panic behavior)
 - #157370 (Clarify MaybeUninit::zeroed padding docs)
 - #157399 (Silence llbc's output by default to prevent rustc's linker output warning)
 - #157500 (Improve documentation of `align_of` and `Alignment`.)
 - #157545 (Suggest using comma to separate valid attribute list items)
 - #157559 (chore: Update annotate-snippets to 0.12.16)
 - #157560 (In `copy_nonoverlapping`, use `mul nuw nsw` to compute the byte size)
 - #157580 (Importing suggestion reported twice when reporting privacy error)
 - #157581 (Test fixup)
@scottmcm

scottmcm commented Jun 8, 2026

Copy link
Copy Markdown
Member Author

because clearly it doesn't impact perf... Right now

Yeah, it enables fewer optimizations than I'd hoped right now; see llvm/llvm-project#202240 and #157589

@rust-bors rust-bors Bot merged commit 753665a into rust-lang:main Jun 8, 2026
13 checks passed
@rustbot rustbot added this to the 1.98.0 milestone Jun 8, 2026
rust-timer added a commit that referenced this pull request Jun 8, 2026
Rollup merge of #157560 - scottmcm:mul_nuw_nsw_in_memcpy, r=saethlin

In `copy_nonoverlapping`, use `mul nuw nsw` to compute the byte size

Seems like we might as well?  Adding these flags means the optimizer can tell the limited range on the count of items -- like how we use these flags (#136575) when calculating `size_of_val` for a slice.

Today we use a wrapping multiplication, which mean that `copy_nonoverlapping::<u32>(src, dst, 0x40000000_00000001)` appears like 4 bytes -- a perfectly reasonable size! -- once it gets to the `memcpy` call.

If I'm understanding <https://doc.rust-lang.org/std/ptr/fn.copy_nonoverlapping.html#safety> properly, this is just exploiting existing UB, since `src` and `dst` must each be inside an allocation, and those allocations can be at most `isize::MAX` bytes.  (Plus, fundamentally, to be non-overlapping there's not enough space in the address space to be bigger than `isize::MAX`.)

cc @RalfJung to make sure this is ok, as requested last he found out I was newly exploiting some UB in codegen 🙃
r? codegen
@scottmcm scottmcm deleted the mul_nuw_nsw_in_memcpy branch June 8, 2026 01:13
github-actions Bot pushed a commit to rust-lang/stdarch that referenced this pull request Jun 8, 2026
Rollup of 25 pull requests

Successful merges:

 - rust-lang/rust#157447 (Move cross crate tests into the appropriate folder)
 - rust-lang/rust#145108 (Resolver: Batched Import Resolution)
 - rust-lang/rust#156119 (Further optimize `SliceIndex<str>` impl for `Range<usize>`)
 - rust-lang/rust#157224 (Manually unroll loop in `str::floor_char_boundary`)
 - rust-lang/rust#157289 (Add infallible primitive type lookups to template arg resolver)
 - rust-lang/rust#157540 (Cleanup and optimize `render_impls`)
 - rust-lang/rust#157444 (Couple of work product cleanups)
 - rust-lang/rust#157543 (Reorganize `tests/ui/issues` [5/N])
 - rust-lang/rust#153513 (Syntactically reject equality predicates)
 - rust-lang/rust#155797 (LineWriter: cap write_vectored newline scan to avoid quadratic write_all_vectored)
 - rust-lang/rust#156155 (macros: report unbound metavariables directly)
 - rust-lang/rust#156188 (riscv: promote d, e, and f target_features to CfgStableToggleUnstable)
 - rust-lang/rust#156666 (Clarify meaning of ranges in pointer offset docs)
 - rust-lang/rust#157078 (Document equivalence of `highest_one` and `ilog2` methods on integers)
 - rust-lang/rust#157129 (ci: update download-artifact action to v8)
 - rust-lang/rust#157169 (triagebot: Update messages to direct changes to appropriate repositories)
 - rust-lang/rust#157323 (Document Repeat::last panic behavior)
 - rust-lang/rust#157370 (Clarify MaybeUninit::zeroed padding docs)
 - rust-lang/rust#157399 (Silence llbc's output by default to prevent rustc's linker output warning)
 - rust-lang/rust#157500 (Improve documentation of `align_of` and `Alignment`.)
 - rust-lang/rust#157545 (Suggest using comma to separate valid attribute list items)
 - rust-lang/rust#157559 (chore: Update annotate-snippets to 0.12.16)
 - rust-lang/rust#157560 (In `copy_nonoverlapping`, use `mul nuw nsw` to compute the byte size)
 - rust-lang/rust#157580 (Importing suggestion reported twice when reporting privacy error)
 - rust-lang/rust#157581 (Test fixup)
asukaminato0721 pushed a commit to asukaminato0721/rust-analyzer that referenced this pull request Jun 8, 2026
Rollup of 25 pull requests

Successful merges:

 - rust-lang/rust#157447 (Move cross crate tests into the appropriate folder)
 - rust-lang/rust#145108 (Resolver: Batched Import Resolution)
 - rust-lang/rust#156119 (Further optimize `SliceIndex<str>` impl for `Range<usize>`)
 - rust-lang/rust#157224 (Manually unroll loop in `str::floor_char_boundary`)
 - rust-lang/rust#157289 (Add infallible primitive type lookups to template arg resolver)
 - rust-lang/rust#157540 (Cleanup and optimize `render_impls`)
 - rust-lang/rust#157444 (Couple of work product cleanups)
 - rust-lang/rust#157543 (Reorganize `tests/ui/issues` [5/N])
 - rust-lang/rust#153513 (Syntactically reject equality predicates)
 - rust-lang/rust#155797 (LineWriter: cap write_vectored newline scan to avoid quadratic write_all_vectored)
 - rust-lang/rust#156155 (macros: report unbound metavariables directly)
 - rust-lang/rust#156188 (riscv: promote d, e, and f target_features to CfgStableToggleUnstable)
 - rust-lang/rust#156666 (Clarify meaning of ranges in pointer offset docs)
 - rust-lang/rust#157078 (Document equivalence of `highest_one` and `ilog2` methods on integers)
 - rust-lang/rust#157129 (ci: update download-artifact action to v8)
 - rust-lang/rust#157169 (triagebot: Update messages to direct changes to appropriate repositories)
 - rust-lang/rust#157323 (Document Repeat::last panic behavior)
 - rust-lang/rust#157370 (Clarify MaybeUninit::zeroed padding docs)
 - rust-lang/rust#157399 (Silence llbc's output by default to prevent rustc's linker output warning)
 - rust-lang/rust#157500 (Improve documentation of `align_of` and `Alignment`.)
 - rust-lang/rust#157545 (Suggest using comma to separate valid attribute list items)
 - rust-lang/rust#157559 (chore: Update annotate-snippets to 0.12.16)
 - rust-lang/rust#157560 (In `copy_nonoverlapping`, use `mul nuw nsw` to compute the byte size)
 - rust-lang/rust#157580 (Importing suggestion reported twice when reporting privacy error)
 - rust-lang/rust#157581 (Test fixup)
JonathanBrouwer added a commit to JonathanBrouwer/rust that referenced this pull request Jun 8, 2026
Use `mul nuw nsw` in `intrinsics::copy`

Essentially the same as rust-lang#157560, just for `copy` instead of `copy_nonoverlapping`.

> Yeah, in fact both copy and copy_nonoverlapping could use this since we know the result must be at most isize::MAX else it cannot be inbounds.
> ~ rust-lang#157560 (comment)

r? saethlin
rust-timer added a commit that referenced this pull request Jun 8, 2026
Rollup merge of #157588 - scottmcm:copy-nsuw, r=saethlin

Use `mul nuw nsw` in `intrinsics::copy`

Essentially the same as #157560, just for `copy` instead of `copy_nonoverlapping`.

> Yeah, in fact both copy and copy_nonoverlapping could use this since we know the result must be at most isize::MAX else it cannot be inbounds.
> ~ #157560 (comment)

r? saethlin
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants