More layout sanity checks #99983

RalfJung · 2022-07-31T13:29:57Z

eddyb · 2022-08-02T06:58:31Z

compiler/rustc_middle/src/ty/mod.rs

@@ -124,6 +124,7 @@ mod erase_regions;
 mod generics;
 mod impls_ty;
 mod instance;
+mod layout_sanity_check;


_{(this is most of a comment first written when I started a review on the original state of the PR but didn't submit it right away)}

You could deal with the large file by putting the checks in ty/layout_checks.rs or ty/layout/checks.rs (the latter would mean mod checks; in layout.rs and would be able to do use super::*; to improve everything including private items - I have mixed feelings but it's practical I think).

_{(thoughts unactionable in this PR follow below)}

Honestly we should consider to what crate to move the entire layout module, since it's mostly just providing queries, and it feels awkward where it is.

I also wish it was rustc_middle::ty::abi or rustc_middle::abi (could even go into mir but I doubt that makes as much sense) for consistency with rustc_target::abi but that's another mess.

I wish we used the term 'abi' less, not more, given how ambiguous it is.^^

eddyb · 2022-08-02T06:59:11Z

compiler/rustc_middle/src/ty/layout_sanity_check.rs

+        /// Yields non-1-ZST fields of the type
+        fn non_zst_fields<'tcx, 'a>(


Could this have 1zst/1_zst or align1zst/align1_zst in the name instead of zst?

It doesn't actually produce 1-ZST though, I had to change it to produce all ZST. Will adjust the comment.

FWIW this function seems useful to have elsewhere as well? Not sure where to put it though.

eddyb · 2022-08-06T04:24:32Z

Hmm, so you know how tools like pointee_info_at dig into a type to find primitive leaves?

It might be easier to make an abstraction like this, if only for the layout checks (note: "cases" below refers to enum Variants::Multiple and union fields):

// HACK(eddyb) used below to disambiguate between the two uses of `Size`,
// but we should probably have this as a completely separate type, with:
// `Size`: measured in bytes and/or bits; Add+Mul ops (Sub not needed)
// `Offset`: measured in bytes (no "bit offsets"); Add+Sub ops (Mul not needed)
type Offset = Size;

// HACK(eddyb) necessary method sadly missing from `std::ops::Range`.
impl<T> Range<T> {
    fn includes(&self, other: Range<T>) -> bool {
        self.start <= other.start && other.end <= self.end
    }
}

struct ScalarLeaf {
    /// Where in the parent layout this leaf is.
    ///
    /// Must be the right size, i.e. `extent.end - extent.start == primitive`.
    //
    // FIXME(eddyb) does the above condition make `extent.end` redundant?
    // (move to a `fn extent(&self, cx: ...) -> Range<Offset>` helper method?)
    extent: Range<Offset>,

    // FIXME(eddyb) instead of ignoring validity ranges, should they be merged
    // across e.g. `enum` variants?
    primitive: Primitive,
}

struct ScalarLeafWithinError {
    /// No leaves found in `search_extent`.
    OnlyPadding,

    /// A leaf intersects `search_extent` without being contained in it,
    /// i.e. the leaf is "straddling" `search_extent`'s start or end.
    PartialIntersection(ScalarLeaf),

    /// Two leaves found within `search_extent` that differ in their `extent`,
    /// `primitive`, or both.
    Mismatch(ScalarLeaf, ScalarLeaf),
}

/// Find the unique `Scalar` (but ignoring validity ranges, so... `Primitive`¹?)
/// leaf field wholly contained within `search_extent` in `layout`.
/// 
/// An `Ok(leaf)` result indicates a leaf field that is:
/// * "wholly contained within": `search_extent.includes(leaf.extent)`
/// * "unique": `layout` always contains (across all `enum`/`union` cases):
///   * `Scalar(leaf.primitive)` exactly filling `leaf.extent`
///     * some `enum`/`union` cases may contain padding at `leaf.extent` instead,
///       but all other `enum`/`union` cases must agree on the same `leaf`
///     * `Scalar` validity might vary, but the `Primitive` must match exactly
///   * only padding, in `search_extent` "around" `leaf.extent`,
///     i.e. neither of these extents contain any leaves of their own:
///     * `search_extent.start..leaf.extent.start` (before `leaf`)
///     * `leaf.extent.end..search_extent.end` (after `leaf`)
/// 
/// ¹ I wish `Primitive` was called `Scalar` but what would we call `Scalar`?
///   (maybe `Primitive` should be renamed to `ScalarType`/`ScalarKind`?)
fn scalar_leaf_within(
    layout: &TyAndLayout<'tcx>,
    search_extent: Range<Offset>,
) -> Result<ScalarLeaf, ScalarLeafWithinError> {
    fn try_for_each_leaf_intersecting<E>(
        base_offset: Offset,
        layout: &TyAndLayout<'tcx>,
        filter_extent: Range<Offset>,
        each: &mut impl FnMut(ScalarLeaf) -> Result<(), E>
    ) -> Result<(), E> {
        for (i, field_offset) in layout.fields.enumerate() {
            if field_offset < filter_extent.end {
                let field = layout.field(i);
                if filter_extent.start < field_offset + field.size {
                    try_for_each_leaf_intersecting(
                        base_offset + field_offset,
                        field,
                        filter_extent,
                        each,
                    )?;
                }
            }
        }
        for v in 0..layout.variants.len() {
            try_for_each_leaf_intersecting(base_offset, layout.for_variant(v), filter_extent, each)?;
        }

        if layout.fields.is_empty() && !layout.is_zst() {
            match layout.abi {
                Abi::Scalar(scalar) => each(ScalarLeaf {
                    extent: base_offset..base_offset+scalar.size(),
                    primitive: scalar.primitive,
                })?,

                // FIXME(eddyb) can we guarantee all non-ZST leaves are `Scalar`?
                _ => unreachable!(),
            }
        }

        Ok(())
    }

    let mut found = None;
    try_for_each_leaf_intersecting(
        Offset::ZERO,
        layout,
        search_extent,
        |candidate| match found {
            _ if !search_extent.includes(&candidate.extent) => {
                Err(ScalarLeafError::PartialIntersection(candidate))
            }
            Some(previous) if candidate != previous => {
                Err(ScalarLeafError::Mismatch(previous, candidate))
            }
            _ => {
                found = Some(candidate);
                Ok(())
            }
        }
    )?;
    Ok(found.ok_or(ScalarLeafWithinError::OnlyPadding))
}

With that abstraction in hand, we can define the conditions for:

Abi::Scalar(scalar):

// HACK(eddyb) needed to deal with `Offset` vs `Size` type differences.
let layout_extent = Offset::ZERO..Offset::ZERO+layout.size;

assert_eq!(
    scalar_leaf_within(layout, layout_extent),
    Ok(ScalarLeaf {
        extent: layout_extent,
        primitive: scalar.primitive,
    }),
);

Abi::ScalarPair(a, b):

// HACK(eddyb) needed to deal with `Offset` vs `Size` type differences.
let layout_extent = Offset::ZERO..Offset::ZERO+layout.size;
let a_extent = Offset::ZERO..Offset::ZERO+a.size;
// FIXME(eddyb) there should be a nicer way to make `start..start+len` ranges.
let b_extent = a_extent.end.align_to(b.align)..;
let b_extent = b_extent.start..b_extent.start+b.size;

assert_eq!(
    scalar_leaf_within(layout, layout_extent.start..b_extent.start),
    Ok(ScalarLeaf {
        extent: a_extent,
        primitive: a.primitive,
    }),
);
assert_eq!(
    scalar_leaf_within(layout, b_extent.start..layout_extent.end),
    Ok(ScalarLeaf {
        extent: b_extent,
        primitive: b.primitive,
    }),
);

This should cover any field hierarchy, but here are some extra thoughts:

I ~~wasted~~ spent a lot of time on designing scalar_leaf_within's interface,
but try_for_each_leaf_intersecting might be better to use directly?
- its filter_extent could be removed if we outright just want to iterate all
  leaves and for e.g. Abi::ScalarPair(a, b) we check whether every found leaf
  is either matching a or b (in extent and primitive)
this isn't trying to tackle Scalar validity ranges, but I don't think that's
possible without making scalar_leaf_within recursive (which I avoided above
to make it simpler, as I'm effectively flattening all the leaves together),
and having logic for merging across enum variants and union fields
- even then, you'd still have to handle e.g. NonZero* types and niches
we should probably have a separate Offset from Size no matter what
the whole "extent" terminology was an attempt at minimizing ambiguity/confusion
over ideas like "a subslice of a layout" (as in, the subset of a layout's
fields that occupy bytes exclusively in a subrange of 0..layout.size),
but I'm not sure it's optimal at all, ideas welcome

eddyb

r=me if you don't want to take up getting The Fun Layout Search pseudocode working (would still be good to have an issue tracking how good these checks are etc.)

RalfJung · 2022-08-07T12:41:59Z

I opened an issue at #100231, also linking specifically to your proposed checks.

@bors r=eddyb rollup=iffy

bors · 2022-08-07T12:42:01Z

📌 Commit 27013d2 has been approved by eddyb

It is now in the queue for this repository.

bors · 2022-08-07T13:17:56Z

⌛ Testing commit 27013d2 with merge 5a9c3a2...

bors · 2022-08-07T15:58:24Z

☀️ Test successful - checks-actions
Approved by: eddyb
Pushing 5a9c3a2 to master...

rust-timer · 2022-08-07T17:15:02Z

Finished benchmarking commit (5a9c3a2): comparison url.

Instruction count

This benchmark run did not return any relevant results for this metric.

Max RSS (memory usage)

Results

Primary benchmarks: no relevant changes found
Secondary benchmarks: 😿 relevant regression found

	mean¹	max	count²
Regressions 😿 (primary)	N/A	N/A	0
Regressions 😿 (secondary)	2.5%	2.5%	1
Improvements 🎉 (primary)	N/A	N/A	0
Improvements 🎉 (secondary)	N/A	N/A	0
All 😿🎉 (primary)	N/A	N/A	0

Cycles

Results

Primary benchmarks: no relevant changes found
Secondary benchmarks: 😿 relevant regression found

	mean¹	max	count²
Regressions 😿 (primary)	N/A	N/A	0
Regressions 😿 (secondary)	2.8%	2.8%	1
Improvements 🎉 (primary)	N/A	N/A	0
Improvements 🎉 (secondary)	N/A	N/A	0
All 😿🎉 (primary)	N/A	N/A	0

If you disagree with this performance assessment, please file an issue in rust-lang/rustc-perf.

@rustbot label: -perf-regression

the arithmetic mean of the percent change ↩ ↩²
number of relevant changes ↩ ↩²

rust-highfive assigned eddyb Jul 31, 2022

rustbot added the T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. label Jul 31, 2022

rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Jul 31, 2022

RalfJung force-pushed the more-layout-checks branch from fea513d to c618abe Compare July 31, 2022 13:32

This comment was marked as resolved.

Sign in to view

RalfJung added 3 commits August 1, 2022 09:28

check that Scalar layout is newtype around a suitable type

e0d5c19

tighter checks for (some) ScalarPair layouts

c2e4197

move layout sanity check to its own file

6e27150

RalfJung force-pushed the more-layout-checks branch from b180d53 to 6e27150 Compare August 1, 2022 13:32

This comment has been minimized.

Sign in to view

fix for unions with scalar layout

98e11b8

eddyb reviewed Aug 2, 2022

View reviewed changes

fix a comment

27013d2

eddyb approved these changes Aug 7, 2022

View reviewed changes

RalfJung mentioned this pull request Aug 7, 2022

Tracking issue for layout sanity checks #100231

Open

bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Aug 7, 2022

bors added the merged-by-bors This PR was explicitly merged by bors. label Aug 7, 2022

bors merged commit 5a9c3a2 into rust-lang:master Aug 7, 2022

rustbot added this to the 1.65.0 milestone Aug 7, 2022

bors mentioned this pull request Aug 7, 2022

Added randomized field padding to -Z randomize-layout #97861

Closed

RalfJung deleted the more-layout-checks branch August 8, 2022 13:03

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

More layout sanity checks #99983

More layout sanity checks #99983

RalfJung commented Jul 31, 2022

This comment was marked as resolved.

This comment has been minimized.

eddyb Aug 2, 2022

RalfJung Aug 2, 2022

eddyb Aug 2, 2022

RalfJung Aug 2, 2022

RalfJung Aug 2, 2022

eddyb commented Aug 6, 2022 •

edited

Loading

eddyb left a comment

RalfJung commented Aug 7, 2022

bors commented Aug 7, 2022

bors commented Aug 7, 2022

bors commented Aug 7, 2022

rust-timer commented Aug 7, 2022

		/// Yields non-1-ZST fields of the type
		fn non_zst_fields<'tcx, 'a>(

More layout sanity checks #99983

More layout sanity checks #99983

Conversation

RalfJung commented Jul 31, 2022

This comment was marked as resolved.

This comment has been minimized.

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

eddyb commented Aug 6, 2022 • edited Loading

eddyb left a comment

Choose a reason for hiding this comment

RalfJung commented Aug 7, 2022

bors commented Aug 7, 2022

bors commented Aug 7, 2022

bors commented Aug 7, 2022

rust-timer commented Aug 7, 2022

Footnotes

eddyb commented Aug 6, 2022 •

edited

Loading