Introduce `MixedBitSet` #133891

nnethercote · 2024-12-05T05:03:19Z

ChunkedBitSet is good at avoiding excessive memory usage for programs with very large functions, where dataflow bitsets have very large domain sizes. But it's overly heavyweight for small bitsets, because any non-empty ChunkedBitSet takes up at least 256 bytes.

This PR introduces MixedBitSet, which is a simple bitset that uses BitSet for small/medium bitsets and ChunkedBitSet for large bitsets. It's a speed and memory usage win.
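To make the idea concrete, here is a minimal, self-contained sketch of a size-dispatching bitset. It is not the `rustc_index` implementation: `DenseSet`, `ChunkedSet`, `MixedSet`, and all method names are hypothetical stand-ins, the 64-bit word size is an assumption, and the 2048-bit cutoff is the figure quoted in this PR's commit messages.

```rust
// Illustrative sketch only; all type and method names here are hypothetical.
const WORD_BITS: usize = 64; // assumed word size
const CHUNK_BITS: usize = 2048; // cutoff quoted in this PR's commit messages
const CHUNK_WORDS: usize = CHUNK_BITS / WORD_BITS;

/// Dense bitset: one contiguous vector of words covering the whole domain.
struct DenseSet {
    domain_size: usize,
    words: Vec<u64>,
}

/// Chunked bitset: a chunk is allocated only once a bit inside it is set.
struct ChunkedSet {
    domain_size: usize,
    chunks: Vec<Option<Box<[u64; CHUNK_WORDS]>>>,
}

/// Picks a representation once, at construction, from the domain size.
enum MixedSet {
    Small(DenseSet),
    Large(ChunkedSet),
}

/// Splits a bit index into (word index, mask within that word).
fn word_and_mask(bit: usize) -> (usize, u64) {
    (bit / WORD_BITS, 1u64 << (bit % WORD_BITS))
}

impl DenseSet {
    fn new_empty(domain_size: usize) -> Self {
        let num_words = (domain_size + WORD_BITS - 1) / WORD_BITS;
        DenseSet { domain_size, words: vec![0; num_words] }
    }
    /// Returns true if the bit was newly set.
    fn insert(&mut self, i: usize) -> bool {
        assert!(i < self.domain_size);
        let (w, mask) = word_and_mask(i);
        let changed = self.words[w] & mask == 0;
        self.words[w] |= mask;
        changed
    }
    fn contains(&self, i: usize) -> bool {
        assert!(i < self.domain_size);
        let (w, mask) = word_and_mask(i);
        self.words[w] & mask != 0
    }
}

impl ChunkedSet {
    fn new_empty(domain_size: usize) -> Self {
        let num_chunks = (domain_size + CHUNK_BITS - 1) / CHUNK_BITS;
        ChunkedSet { domain_size, chunks: vec![None; num_chunks] }
    }
    /// Returns true if the bit was newly set; allocates the chunk on demand.
    fn insert(&mut self, i: usize) -> bool {
        assert!(i < self.domain_size);
        let chunk = self.chunks[i / CHUNK_BITS]
            .get_or_insert_with(|| Box::new([0; CHUNK_WORDS]));
        let (w, mask) = word_and_mask(i % CHUNK_BITS);
        let changed = chunk[w] & mask == 0;
        chunk[w] |= mask;
        changed
    }
    fn contains(&self, i: usize) -> bool {
        assert!(i < self.domain_size);
        match &self.chunks[i / CHUNK_BITS] {
            None => false, // an unallocated chunk holds no set bits
            Some(chunk) => {
                let (w, mask) = word_and_mask(i % CHUNK_BITS);
                chunk[w] & mask != 0
            }
        }
    }
}

impl MixedSet {
    fn new_empty(domain_size: usize) -> Self {
        if domain_size <= CHUNK_BITS {
            MixedSet::Small(DenseSet::new_empty(domain_size))
        } else {
            MixedSet::Large(ChunkedSet::new_empty(domain_size))
        }
    }
    fn insert(&mut self, i: usize) -> bool {
        match self {
            MixedSet::Small(set) => set.insert(i),
            MixedSet::Large(set) => set.insert(i),
        }
    }
    fn contains(&self, i: usize) -> bool {
        match self {
            MixedSet::Small(set) => set.contains(i),
            MixedSet::Large(set) => set.contains(i),
        }
    }
    fn is_small(&self) -> bool {
        matches!(self, MixedSet::Small(_))
    }
}
```

In this sketch the variant is chosen once and never changes, so every operation is a single `match` that forwards to the underlying set. That fixed choice is one plausible reading of why the PR describes `MixedBitSet` as simple.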

r? @Mark-Simulacrum

nnethercote · 2024-12-05T05:03:48Z

@bors try @rust-timer queue

Introduce `MixedBitSet` r? `@ghost`

bors · 2024-12-05T05:05:12Z

⌛ Trying commit 853e69c with merge c103766...

bors · 2024-12-05T06:48:23Z

☀️ Try build successful - checks-actions
Build commit: c103766 (c103766e3311d9facbec90ad63ac7cfb7dfc4ce8)

rust-timer · 2024-12-05T08:04:20Z

Finished benchmarking commit (c103766): comparison URL.

Overall result: ❌✅ regressions and improvements - please read the text below

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @rustbot label: +perf-regression-triaged along with sufficient written justification. If you cannot justify the regressions please fix the regressions and do another perf run. If the next run shows neutral or positive results, the label will be automatically removed.

@bors rollup=never
@rustbot label: -S-waiting-on-perf +perf-regression

Instruction count

This is the most reliable metric that we have; it was used to determine the overall result at the top of this comment. However, even this metric can sometimes exhibit noise.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	0.7%	[0.2%, 1.1%]	6
Improvements ✅ (primary)	-0.6%	[-1.3%, -0.1%]	70
Improvements ✅ (secondary)	-0.7%	[-2.3%, -0.1%]	29
All ❌✅ (primary)	-0.6%	[-1.3%, -0.1%]	70

Max RSS (memory usage)

Results (primary -1.6%, secondary -5.2%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-1.6%	[-1.9%, -1.4%]	2
Improvements ✅ (secondary)	-5.2%	[-5.4%, -5.1%]	2
All ❌✅ (primary)	-1.6%	[-1.9%, -1.4%]	2

Cycles

Results (primary -1.0%, secondary -2.4%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-1.0%	[-1.3%, -0.4%]	13
Improvements ✅ (secondary)	-2.4%	[-2.9%, -2.0%]	6
All ❌✅ (primary)	-1.0%	[-1.3%, -0.4%]	13

Binary size

This benchmark run did not return any relevant results for this metric.

Bootstrap: 769.678s -> 766.319s (-0.44%)
Artifact size: 330.86 MiB -> 330.88 MiB (0.01%)

rustbot · 2024-12-05T09:27:24Z

Some changes occurred to MIR optimizations

cc @rust-lang/wg-mir-opt

nnethercote · 2024-12-05T09:32:51Z

After #133431 removed HybridBitSet it might seem strange to immediately introduce a new bitset. The rationale is:

- it's a perf win; and
- MixedBitSet is a lot simpler than HybridBitSet.

Mark-Simulacrum

r=me with or without nits fixed

compiler/rustc_mir_transform/src/lint_tail_expr_drop_order.rs

compiler/rustc_index/src/bit_set.rs

A `ChunkedBitSet` has to be at least 2048 bits for it to outperform a `BitSet`, because that's the chunk size. The largest `SparseBitMatrix` encountered when compiling the compiler and the entire rustc-perf benchmark suite is less than 600 bits. This change is a tiny perf win, but the motivation is more about avoiding uses of `ChunkedBitSet` outside of `MixedBitSet`. The test change is necessary to avoid hitting the `<BitSet<T> as BitRelations<ChunkedBitSet<T>>>::subtract` method that has `unimplemented!` in its body and isn't otherwise used.
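The size comparison behind this reasoning can be checked with a little arithmetic. The helpers below are hypothetical and count only raw bit storage, assuming 64-bit words and ignoring any per-set or per-chunk bookkeeping.

```rust
const WORD_BITS: usize = 64; // assumed word size
const WORD_BYTES: usize = 8;
const CHUNK_BITS: usize = 2048; // chunk size stated in the text above

/// Bytes of bit storage a dense bitset needs for `domain_size` bits.
fn dense_bytes(domain_size: usize) -> usize {
    let num_words = (domain_size + WORD_BITS - 1) / WORD_BITS;
    num_words * WORD_BYTES
}

/// Bytes of bit storage for one fully allocated chunk.
fn chunk_bytes() -> usize {
    CHUNK_BITS / WORD_BITS * WORD_BYTES
}
```

Under these assumptions a 600-bit domain needs 10 words, or 80 bytes, of dense storage, while a single allocated chunk is 32 words, or 256 bytes. The two only break even at 2048 bits, which matches the cutoff described above.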

Just minimizing uses of `ChunkedBitSet`.

`ChunkedBitSet` is no longer used directly by dataflow analyses, with `MixedBitSet` replacing it in those contexts.

nnethercote · 2024-12-08T21:55:46Z

I addressed the comments.

@bors r=Mark-Simulacrum

bors · 2024-12-08T21:55:49Z

📌 Commit fa6ceba has been approved by Mark-Simulacrum

It is now in the queue for this repository.

bors · 2024-12-09T07:13:14Z

⌛ Testing commit fa6ceba with merge f6cb952...

bors · 2024-12-09T09:59:39Z

☀️ Test successful - checks-actions
Approved by: Mark-Simulacrum
Pushing f6cb952 to master...

rust-timer · 2024-12-09T11:17:16Z

Finished benchmarking commit (f6cb952): comparison URL.

Overall result: ❌✅ regressions and improvements - please read the text below

Our benchmarks found a performance regression caused by this PR.
This might be an actual regression, but it can also be just noise.

Next Steps:

- If the regression was expected or you think it can be justified, please write a comment with sufficient written justification, and add @rustbot label: +perf-regression-triaged to it, to mark the regression as triaged.
- If you think that you know of a way to resolve the regression, try to create a new PR with a fix for the regression.
- If you do not understand the regression or you think that it is just noise, you can ask the @rust-lang/wg-compiler-performance working group for help (members of this group were already notified of this PR).

@rustbot label: +perf-regression
cc @rust-lang/wg-compiler-performance

Instruction count

This is the most reliable metric that we have; it was used to determine the overall result at the top of this comment. However, even this metric can sometimes exhibit noise.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	0.6%	[0.2%, 1.0%]	8
Improvements ✅ (primary)	-0.6%	[-1.3%, -0.2%]	67
Improvements ✅ (secondary)	-0.7%	[-2.0%, -0.2%]	28
All ❌✅ (primary)	-0.6%	[-1.3%, -0.2%]	67

Max RSS (memory usage)

Results (primary 0.9%, secondary -1.0%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	2.1%	[1.2%, 3.1%]	2
Regressions ❌ (secondary)	1.3%	[1.3%, 1.3%]	1
Improvements ✅ (primary)	-1.5%	[-1.5%, -1.5%]	1
Improvements ✅ (secondary)	-1.6%	[-2.3%, -1.4%]	4
All ❌✅ (primary)	0.9%	[-1.5%, 3.1%]	3

Cycles

Results (primary -1.0%, secondary -2.9%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-1.0%	[-1.0%, -1.0%]	1
Improvements ✅ (secondary)	-2.9%	[-4.8%, -2.1%]	14
All ❌✅ (primary)	-1.0%	[-1.0%, -1.0%]	1

Binary size

Results (secondary 0.0%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	0.0%	[0.0%, 0.0%]	1
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	-	-	0

Bootstrap: 769.074s -> 767.969s (-0.14%)
Artifact size: 330.87 MiB -> 330.85 MiB (-0.01%)

Mark-Simulacrum · 2024-12-09T13:14:52Z

Overall positive: all primary benchmarks show improvement (or no change).

Move some BitSet code blocks to a better place.

dff5ce6

These blocks are currently interleaved with `ChunkedBitSet` blocks. It makes things hard to find and has annoyed me for a while.

rustbot added the S-waiting-on-review (Status: Awaiting review from the assignee but also interested parties.) and T-compiler (Relevant to the compiler team, which will review and decide on the PR/issue.) labels Dec 5, 2024

nnethercote changed the title ~~Mixed bit set~~ Introduce MixedBitSet Dec 5, 2024

rustbot added the S-waiting-on-perf (Status: Waiting on a perf run to be completed.) label Dec 5, 2024

bors added a commit to rust-lang-ci/rust that referenced this pull request Dec 5, 2024

Auto merge of rust-lang#133891 - nnethercote:MixedBitSet, r=<try>

c103766

Introduce `MixedBitSet` r? `@ghost`

rustbot added the perf-regression (Performance regression.) label and removed the S-waiting-on-perf (Status: Waiting on a perf run to be completed.) label Dec 5, 2024

nnethercote added 2 commits December 5, 2024 20:07

Introduce MixedBitSet.

6ee1a7a

It just uses `BitSet` for small/medium sizes (<= 2048 bits) and `ChunkedBitSet` for larger sizes. This is good because `ChunkedBitSet` is slow and memory-hungry at smaller sizes.

Change ChunkedBitSet<MovePathIndex>s to MixedBitSet.

a065475

It's a performance win because `MixedBitSet` is faster and uses less memory than `ChunkedBitSet`. Also reflow some overlong comment lines in `lint_tail_expr_drop_order.rs`.

nnethercote force-pushed the MixedBitSet branch from 853e69c to 45d8a1b Compare December 5, 2024 09:27

nnethercote marked this pull request as ready for review December 5, 2024 09:27

rustbot assigned Mark-Simulacrum Dec 5, 2024

This was referenced Dec 5, 2024

Remove HybridBitSet #133431

Merged

Rollup of 7 pull requests #133120

Merged

Mark-Simulacrum approved these changes Dec 8, 2024

compiler/rustc_mir_transform/src/lint_tail_expr_drop_order.rs

compiler/rustc_index/src/bit_set.rs

nnethercote added 3 commits December 9, 2024 08:53

Use MixedBitSet instead of ChunkedBitSet in fmt.rs.

34f45f0

Just minimizing uses of `ChunkedBitSet`.

Remove ChunkedBitSet impls that are no longer needed.

fa6ceba

`ChunkedBitSet` is no longer used directly by dataflow analyses, with `MixedBitSet` replacing it in those contexts.

nnethercote force-pushed the MixedBitSet branch from 45d8a1b to fa6ceba Compare December 8, 2024 21:55

bors added the S-waiting-on-bors (Status: Waiting on bors to run and complete tests. Bors will change the label on completion.) label and removed the S-waiting-on-review (Status: Awaiting review from the assignee but also interested parties.) label Dec 8, 2024

bors added the merged-by-bors (This PR was explicitly merged by bors.) label Dec 9, 2024

bors merged commit f6cb952 into rust-lang:master Dec 9, 2024
7 checks passed

rustbot added this to the 1.85.0 milestone Dec 9, 2024

bors mentioned this pull request Dec 9, 2024

rustc_mir_dataflow cleanups, including some renamings #133938

Merged

Mark-Simulacrum added the perf-regression-triaged (The performance regression has been triaged.) label Dec 9, 2024

nnethercote deleted the MixedBitSet branch December 10, 2024 00:26
