Unrolled build for rust-lang#135481

Rollup merge of rust-lang#135481 - Zalathar:node-flow, r=oli-obk coverage: Completely overhaul counter assignment, using node-flow graphs The existing code for choosing where to put physical counter-increments gets the job done, but is very ad-hoc and hard to modify without introducing tricky regressions. This PR replaces all of that with a more principled approach, based on the algorithm described in "Optimal measurement points for program frequency counts" (Knuth & Stevenson, 1973). --- We start by ensuring that our graph has “balanced flow”, i.e. each node's flow (execution count) is equal to the sum of all its in-edge flows, and equal to the sum of all its out-edge flows. That isn't naturally true of control-flow graphs, so we introduce a wrapper type `BalancedFlowGraph` to fix that by introducing synthetic nodes and edges as needed. Once our graph has balanced flow, the next step is to create another view of that graph in which each node's successors have all been merged into one “supernode”. Consequently, each node's out-edges can be coalesced into a single out-edge to one of those supernodes. Because of the balanced-flow property, the flow of that coalesced edge is equal to the flow of the original node. Having expressed all of our node flows as edge flows, we can then analyze node flows using techniques for analyzing edge flows. We incrementally build a spanning tree over the merged supernodes, such that each new edge in the spanning tree represents a node whose flow can be computed from that of other nodes. When this is done, we end up with a list of “counter terms” for each node, describing which nodes need physical counters, and how the remaining nodes can have their flow calculated by adding and subtracting those physical counters. --- The re-blessed coverage tests show that this results in modest or major improvements for our test programs. Some tests need fewer physical counters, some tests need fewer expression nodes for the same number of physical counters, and some tests show striking reductions in both.
rust-lang-ci · Jan 17, 2025 · 739b7ef · 739b7ef
2 parents 99db273 + 6eabf03
commit 739b7ef
Show file tree

Hide file tree

Showing 56 changed files with 2,040 additions and 1,967 deletions.
diff --git a/compiler/rustc_data_structures/src/graph/iterate/mod.rs b/compiler/rustc_data_structures/src/graph/iterate/mod.rs
@@ -125,6 +125,16 @@ where
     pub fn visited(&self, node: G::Node) -> bool {
         self.visited.contains(node)
     }
+
+    /// Returns a reference to the set of nodes that have been visited, with
+    /// the same caveats as [`Self::visited`].
+    ///
+    /// When incorporating the visited nodes into another bitset, using bulk
+    /// operations like `union` or `intersect` can be more efficient than
+    /// processing each node individually.
+    pub fn visited_set(&self) -> &DenseBitSet<G::Node> {
+        &self.visited
+    }
 }
 
 impl<G> std::fmt::Debug for DepthFirstSearch<G>

diff --git a/compiler/rustc_data_structures/src/graph/mod.rs b/compiler/rustc_data_structures/src/graph/mod.rs
@@ -4,6 +4,7 @@ pub mod dominators;
 pub mod implementation;
 pub mod iterate;
 mod reference;
+pub mod reversed;
 pub mod scc;
 pub mod vec_graph;
 

diff --git a/compiler/rustc_data_structures/src/graph/reversed.rs b/compiler/rustc_data_structures/src/graph/reversed.rs
@@ -0,0 +1,42 @@
+use crate::graph::{DirectedGraph, Predecessors, Successors};
+
+/// View that reverses the direction of edges in its underlying graph, so that
+/// successors become predecessors and vice-versa.
+///
+/// Because of `impl<G: Graph> Graph for &G`, the underlying graph can be
+/// wrapped by-reference instead of by-value if desired.
+#[derive(Clone, Copy, Debug)]
+pub struct ReversedGraph<G> {
+    pub inner: G,
+}
+
+impl<G> ReversedGraph<G> {
+    pub fn new(inner: G) -> Self {
+        Self { inner }
+    }
+}
+
+impl<G: DirectedGraph> DirectedGraph for ReversedGraph<G> {
+    type Node = G::Node;
+
+    fn num_nodes(&self) -> usize {
+        self.inner.num_nodes()
+    }
+}
+
+// Implementing `StartNode` is not possible in general, because the start node
+// of an underlying graph is instead an _end_ node in the reversed graph.
+// But would be possible to define another wrapper type that adds an explicit
+// start node to its underlying graph, if desired.
+
+impl<G: Predecessors> Successors for ReversedGraph<G> {
+    fn successors(&self, node: Self::Node) -> impl Iterator<Item = Self::Node> {
+        self.inner.predecessors(node)
+    }
+}
+
+impl<G: Successors> Predecessors for ReversedGraph<G> {
+    fn predecessors(&self, node: Self::Node) -> impl Iterator<Item = Self::Node> {
+        self.inner.successors(node)
+    }
+}
diff --git a/compiler/rustc_index/src/bit_set.rs b/compiler/rustc_index/src/bit_set.rs
@@ -281,6 +281,24 @@ impl<T: Idx> DenseBitSet<T> {
     }
 
     bit_relations_inherent_impls! {}
+
+    /// Sets `self = self | !other`.
+    ///
+    /// FIXME: Incorporate this into [`BitRelations`] and fill out
+    /// implementations for other bitset types, if needed.
+    pub fn union_not(&mut self, other: &DenseBitSet<T>) {
+        assert_eq!(self.domain_size, other.domain_size);
+
+        // FIXME(Zalathar): If we were to forcibly _set_ all excess bits before
+        // the bitwise update, and then clear them again afterwards, we could
+        // quickly and accurately detect whether the update changed anything.
+        // But that's only worth doing if there's an actual use-case.
+
+        bitwise(&mut self.words, &other.words, |a, b| a | !b);
+        // The bitwise update `a | !b` can result in the last word containing
+        // out-of-domain bits, so we need to clear them.
+        self.clear_excess_bits();
+    }
 }
 
 // dense REL dense
@@ -1087,6 +1105,18 @@ impl<T: Idx> fmt::Debug for ChunkedBitSet<T> {
     }
 }
 
+/// Sets `out_vec[i] = op(out_vec[i], in_vec[i])` for each index `i` in both
+/// slices. The slices must have the same length.
+///
+/// Returns true if at least one bit in `out_vec` was changed.
+///
+/// ## Warning
+/// Some bitwise operations (e.g. union-not, xor) can set output bits that were
+/// unset in in both inputs. If this happens in the last word/chunk of a bitset,
+/// it can cause the bitset to contain out-of-domain values, which need to
+/// be cleared with `clear_excess_bits_in_final_word`. This also makes the
+/// "changed" return value unreliable, because the change might have only
+/// affected excess bits.
 #[inline]
 fn bitwise<Op>(out_vec: &mut [Word], in_vec: &[Word], op: Op) -> bool
 where

diff --git a/compiler/rustc_index/src/bit_set/tests.rs b/compiler/rustc_index/src/bit_set/tests.rs
@@ -75,6 +75,32 @@ fn union_two_sets() {
     assert!(set1.contains(64));
 }
 
+#[test]
+fn union_not() {
+    let mut a = DenseBitSet::<usize>::new_empty(100);
+    let mut b = DenseBitSet::<usize>::new_empty(100);
+
+    a.insert(3);
+    a.insert(5);
+    a.insert(80);
+    a.insert(81);
+
+    b.insert(5); // Already in `a`.
+    b.insert(7);
+    b.insert(63);
+    b.insert(81); // Already in `a`.
+    b.insert(90);
+
+    a.union_not(&b);
+
+    // After union-not, `a` should contain all values in the domain, except for
+    // the ones that are in `b` and were _not_ already in `a`.
+    assert_eq!(
+        a.iter().collect::<Vec<_>>(),
+        (0usize..100).filter(|&x| !matches!(x, 7 | 63 | 90)).collect::<Vec<_>>(),
+    );
+}
+
 #[test]
 fn chunked_bitset() {
     let mut b0 = ChunkedBitSet::<usize>::new_empty(0);