Add `CodeExtent::Remainder` variant; pre-req for new scoping/drop rules. #21657

pnkfelix · 2015-01-26T15:17:42Z

Add CodeExtent::Remainder variant; pre-req for new scoping/drop rules.

This new enum variant introduces finer-grain code extents, i.e. we now track that a binding lives only for a suffix of a block, and (importantly) will be dropped when it goes out of scope before the bindings that occurred earlier in the block.

Both of these notions are neatly captured by marking the block (and each suffix) as an enclosing scope of the next suffix beneath it.

This is work that is part of the foundation for issue #8861.

(It actually has been seen in earlier posted pull requests, in particular #21022; I have just factored it out into its own PR to ease my own near-future rebasing, and also get people used to the new rules.)

These finer grained scopes do mean that some code is newly rejected by rustc; for example:

let mut map : HashMap<u8, &u8> = HashMap::new();
let tmp = Box::new(2);
map.insert(43, &*tmp);

This will now fail to compile with a message that *tmp does not live long enough, because the scope of tmp is now strictly smaller than
that of map, and the use of &u8 in map's type requires that the borrowed references are all to data that live at least as long as the map.

The usual fix for a case like this is to move the binding for tmp up above that of map; note that you can still leave the initialization in the original spot, like so:

let tmp;
let mut map : HashMap<u8, &u8> = HashMap::new();
tmp = box 2;
map.insert(43, &*tmp);

Similarly, one can encounter an analogous situation with Vec: one would need to rewrite:

let mut vec = Vec::new();
let tmp = 'c';
vec.push(&tmp);

as:

let tmp;
let mut vec = Vec::new();
tmp = 'c';
vec.push(&tmp);

In some corner cases, it does not suffice to reorder the bindings; in particular, when the types for both bindings need to reflect exactly the same code extent, and a parent/child relationship between them does not work.

In pnkfelix's experience this has arisen most often when mixing uses of cyclic data structures while also allowing a lifetime parameter 'a to flow into a type parameter context where the type is invariant with respect to the type parameter. An important instance of this is arena::TypedArena<T>, which is invariant with respect to T.

The reason that variance is relevant is this: if TypedArena were covariant with respect to its type parameter, then we could assign it the longer lifetime when it is initialized, and then convert it to a subtype (via covariance) with a shorter lifetime when necessary. But TypedArena is invariant with respect to its type parameter, and thus if S is a subtype of T (in particular, if S has a lifetime parameter that is shorter than that of T), then a TypedArena<S> is unrelated to TypedArena<T>). Note that the fact that TypedArena<T> is invariant with respect to T comes straight from the API of that trait; nikomatsakis and pnkfelix has discussed "arena-like" API's that would somehow be covariant with respect to T, but have not actually managed to develop such an API that is actually expressible in pure Rust without further linguistic extensions/changes.

Concretely, consider code like this:

struct Node<'a> { sibling: Option<&'a Node<'a>> }
struct Context<'a> {
    // because of this field, `Context<'a>` is invariant with respect to `'a`.
    arena: &'a TypedArena<Node<'a>>,
    ...
}
fn new_ctxt<'a>(arena: &'a TypedArena<Node<'a>>) -> Context<'a> { ... }
fn use_ctxt<'a>(fcx: &'a Context<'a>) { ... }

let arena = TypedArena::new();
let ctxt = new_ctxt(&arena);

use_ctxt(&ctxt);

In these situations, if you try to introduce two bindings via two distinct let statements, each is (with this commit) assigned a distinct extent, and the region inference system cannot find a single region to assign to the lifetime 'a that works for both of the bindings. So you get an error that ctxt does not live long enough; but moving its binding up above that of arena just shifts the error so now the compiler complains that arena does not live long enough.

SO: What to do? The easiest fix in this case is to ensure that the two bindings do get assigned the same static extent, by stuffing both
bindings into the same let statement, like so:

let (arena, ctxt): (TypedArena, Context);
arena = TypedArena::new();
ctxt = new_ctxt(&arena);

use_ctxt(&ctxt);

Due to the new code restrictions outlined above, this is a ...

[breaking-change]

rust-highfive · 2015-01-26T15:17:54Z

r? @huonw

(rust_highfive has picked a reviewer for you, use r? to override)

pnkfelix · 2015-01-26T15:19:57Z

@bors: r=nikomatsakis ff5e236

pnkfelix · 2015-01-26T15:20:57Z

(one can see niko's review notes on #21022 )

huonw · 2015-01-27T07:35:24Z

(Needs a rebase.)

This new variant introduces finer-grain code extents, i.e. we now track that a binding lives only for a suffix of a block, and (importantly) will be dropped when it goes out of scope *before* the bindings that occurred earlier in the block. Both of these notions are neatly captured by marking the block (and each suffix) as an enclosing scope of the next suffix beneath it. This is work that is part of the foundation for issue rust-lang#8861. (It actually has been seen in earlier posted pull requests; I have just factored it out into its own PR to ease my own rebasing.) ---- These finer grained scopes do mean that some code is newly rejected by `rustc`; for example: ```rust let mut map : HashMap<u8, &u8> = HashMap::new(); let tmp = Box::new(2); map.insert(43, &*tmp); ``` This will now fail to compile with a message that `*tmp` does not live long enough, because the scope of `tmp` is now strictly smaller than that of `map`, and the use of `&u8` in map's type requires that the borrowed references are all to data that live at least as long as the map. The usual fix for a case like this is to move the binding for `tmp` up above that of `map`; note that you can still leave the initialization in the original spot, like so: ```rust let tmp; let mut map : HashMap<u8, &u8> = HashMap::new(); tmp = box 2; map.insert(43, &*tmp); ``` Similarly, one can encounter an analogous situation with `Vec`: one would need to rewrite: ```rust let mut vec = Vec::new(); let tmp = 'c'; vec.push(&tmp); ``` as: ``` let tmp; let mut vec = Vec::new(); tmp = 'c'; vec.push(&tmp); ``` ---- In some corner cases, it does not suffice to reorder the bindings; in particular, when the types for both bindings need to reflect exactly the *same* code extent, and a parent/child relationship between them does not work. In pnkfelix's experience this has arisen most often when mixing uses of cyclic data structures while also allowing a lifetime parameter `'a` to flow into a type parameter context where the type is *invariant* with respect to the type parameter. An important instance of this is `arena::TypedArena<T>`, which is invariant with respect to `T`. (The reason that variance is relevant is this: *if* `TypedArena` were covariant with respect to its type parameter, then we could assign it the longer lifetime when it is initialized, and then convert it to a subtype (via covariance) with a shorter lifetime when necessary. But `TypedArena` is invariant with respect to its type parameter, and thus if `S` is a subtype of `T` (in particular, if `S` has a lifetime parameter that is shorter than that of `T`), then a `TypedArena<S>` is unrelated to `TypedArena<T>`.) Concretely, consider code like this: ```rust struct Node<'a> { sibling: Option<&'a Node<'a>> } struct Context<'a> { // because of this field, `Context<'a>` is invariant with respect to `'a`. arena: &'a TypedArena<Node<'a>>, ... } fn new_ctxt<'a>(arena: &'a TypedArena<Node<'a>>) -> Context<'a> { ... } fn use_ctxt<'a>(fcx: &'a Context<'a>) { ... } let arena = TypedArena::new(); let ctxt = new_ctxt(&arena); use_ctxt(&ctxt); ``` In these situations, if you try to introduce two bindings via two distinct `let` statements, each is (with this commit) assigned a distinct extent, and the region inference system cannot find a single region to assign to the lifetime `'a` that works for both of the bindings. So you get an error that `ctxt` does not live long enough; but moving its binding up above that of `arena` just shifts the error so now the compiler complains that `arena` does not live long enough. SO: What to do? The easiest fix in this case is to ensure that the two bindings *do* get assigned the same static extent, by stuffing both bindings into the same let statement, like so: ```rust let (arena, ctxt): (TypedArena, Context); arena = TypedArena::new(); ctxt = new_ctxt(&arena); use_ctxt(&ctxt); ``` Due to the new code rejections outlined above, this is a ... [breaking-change]

pnkfelix · 2015-01-27T09:27:46Z

@bors r=nikomatsakis d6bf04a

bors · 2015-01-27T11:07:27Z

⌛ Testing commit d6bf04a with merge d77f6d5...

Add `CodeExtent::Remainder` variant; pre-req for new scoping/drop rules. This new enum variant introduces finer-grain code extents, i.e. we now track that a binding lives only for a suffix of a block, and (importantly) will be dropped when it goes out of scope *before* the bindings that occurred earlier in the block. Both of these notions are neatly captured by marking the block (and each suffix) as an enclosing scope of the next suffix beneath it. This is work that is part of the foundation for issue #8861. (It actually has been seen in earlier posted pull requests, in particular #21022; I have just factored it out into its own PR to ease my own near-future rebasing, and also get people used to the new rules.) ---- These finer grained scopes do mean that some code is newly rejected by `rustc`; for example: ```rust let mut map : HashMap<u8, &u8> = HashMap::new(); let tmp = Box::new(2); map.insert(43, &*tmp); ``` This will now fail to compile with a message that `*tmp` does not live long enough, because the scope of `tmp` is now strictly smaller than that of `map`, and the use of `&u8` in map's type requires that the borrowed references are all to data that live at least as long as the map. The usual fix for a case like this is to move the binding for `tmp` up above that of `map`; note that you can still leave the initialization in the original spot, like so: ```rust let tmp; let mut map : HashMap<u8, &u8> = HashMap::new(); tmp = box 2; map.insert(43, &*tmp); ``` Similarly, one can encounter an analogous situation with `Vec`: one would need to rewrite: ```rust let mut vec = Vec::new(); let tmp = 'c'; vec.push(&tmp); ``` as: ```rust let tmp; let mut vec = Vec::new(); tmp = 'c'; vec.push(&tmp); ``` ---- In some corner cases, it does not suffice to reorder the bindings; in particular, when the types for both bindings need to reflect exactly the *same* code extent, and a parent/child relationship between them does not work. In pnkfelix's experience this has arisen most often when mixing uses of cyclic data structures while also allowing a lifetime parameter `'a` to flow into a type parameter context where the type is *invariant* with respect to the type parameter. An important instance of this is `arena::TypedArena<T>`, which is invariant with respect to `T`. (The reason that variance is relevant is this: *if* `TypedArena` were covariant with respect to its type parameter, then we could assign it the longer lifetime when it is initialized, and then convert it to a subtype (via covariance) with a shorter lifetime when necessary. But `TypedArena` is invariant with respect to its type parameter, and thus if `S` is a subtype of `T` (in particular, if `S` has a lifetime parameter that is shorter than that of `T`), then a `TypedArena<S>` is unrelated to `TypedArena<T>`.) Concretely, consider code like this: ```rust struct Node<'a> { sibling: Option<&'a Node<'a>> } struct Context<'a> { // because of this field, `Context<'a>` is invariant with respect to `'a`. arena: &'a TypedArena<Node<'a>>, ... } fn new_ctxt<'a>(arena: &'a TypedArena<Node<'a>>) -> Context<'a> { ... } fn use_ctxt<'a>(fcx: &'a Context<'a>) { ... } let arena = TypedArena::new(); let ctxt = new_ctxt(&arena); use_ctxt(&ctxt); ``` In these situations, if you try to introduce two bindings via two distinct `let` statements, each is (with this commit) assigned a distinct extent, and the region inference system cannot find a single region to assign to the lifetime `'a` that works for both of the bindings. So you get an error that `ctxt` does not live long enough; but moving its binding up above that of `arena` just shifts the error so now the compiler complains that `arena` does not live long enough. * SO: What to do? The easiest fix in this case is to ensure that the two bindings *do* get assigned the same static extent, by stuffing both bindings into the same let statement, like so: ```rust let (arena, ctxt): (TypedArena, Context); arena = TypedArena::new(); ctxt = new_ctxt(&arena); use_ctxt(&ctxt); ``` ---- Due to the new code restrictions outlined above, this is a ... [breaking-change]

bors · 2015-01-27T13:37:51Z

☀️ Test successful - auto-linux-32-nopt-t, auto-linux-32-opt, auto-linux-64-nopt-t, auto-linux-64-opt, auto-linux-64-x-android-t, auto-mac-32-opt, auto-mac-64-nopt-t, auto-mac-64-opt, auto-win-32-nopt-t, auto-win-32-opt, auto-win-64-nopt-t, auto-win-64-opt

andrewrk · 2015-01-29T19:45:18Z

It's rather frustrating when the compiler rejects safe code and I have to introduce let statements without assignments to work around the issue. This is empty complaining here, but boy, it sure would be nice if we could avoid some of these new requirements. It would be a big deal for developer happiness if rustc could avoid these false positives, or at least as many of them as possible.

richard-uk1 · 2015-01-30T15:06:14Z

Maybe in the future rust could be cleaver and try different extents until it finds one that works: so for example:

let mut map : HashMap<u8, &u8> = HashMap::new();
let tmp = Box::new(2);
map.insert(43, &*tmp);

The compiler would first try

let mut map: HashMap<u8, &u8> = HashMap::new();
{
    let tmp = Box::new(2);
    {
        map.insert(43, &*tmp);
    } // <- fail here 
}

and given there is a failure try

let tmp = Box::new(2);
{
    let mut map: HashMap<u8, &u8> = HashMap::new();
    {
        map.insert(43, &*tmp);
    }
}
// no fails this time

and use the second option.

The problem with this is whether the result is well defined - so this would either have to emit a warning, or give an error but give the second example (or it's compressed form) as a possible fix.

pzol · 2015-01-30T19:55:43Z

Totatlly agree with @andrewrk

pnkfelix · 2015-01-31T19:17:18Z

@derekdreery both Box and HashMap have destructors. So in that particular example, reordering their bindings is not something that its really sound for the compiler to do, since it changes the destruction order.

(Admittedly, we could, and maybe will, extend the language with some way to say that the destructor associated with a type is in some sense "pure", or at least "local", in that it only touches data that is owned by the thing being destroyed, and never accesses borrows from elsewhere -- such "pure" destructors would probably be somewhat safer to reorder automatically. But even then, you're talking about automatically changing the order of interactions with the low-level system allocator, which seems like a fairly scary thing to do.)

(maybe these examples of surprises and/or unsoundness were what you were alluding to in your final sentence. But in any case, I would not object to having the compiler suggest such fixes ... )

richard-uk1 · 2015-02-07T15:19:58Z

@pnkfelix it is what I was alluding to :). I think if the compiler does suggest that as a fix, it needs to be something like "doing x will make the code compile (but is not necessarily what you want)" rather than "doing x will fix your code".

rust-highfive assigned huonw Jan 26, 2015

pnkfelix assigned pnkfelix and unassigned huonw Jan 26, 2015

nikomatsakis mentioned this pull request Jan 26, 2015

New destructor semantics #8861

Closed

7 tasks

pnkfelix added 6 commits January 27, 2015 10:26

accommodate new scoping rules in rustc and rustdoc source.

86bde93

accommodate new scoping rules in test/run-pass.

9fe8d86

accommodate new scoping rules in test/compile-fail.

70192ab

accommodate new scoping rules in libstd unit tests.

2c2f0f1

accommodate new scoping rules in librustc_driver unit tests.

43d08f8

pnkfelix force-pushed the block-remainder-extents branch from ff5e236 to d6bf04a Compare January 27, 2015 09:27

bors merged commit d6bf04a into rust-lang:master Jan 27, 2015

pnkfelix mentioned this pull request Jan 31, 2015

(It turned out to be Un-)sound Generic Drop rust-lang/rfcs#769

Merged

jimblandy mentioned this pull request Feb 11, 2015

Improve lifetime error messages re: block suffixes and dropck #21875

Closed

This was referenced Feb 14, 2015

fallout from Sound Generic Drop to be addressed later #22321

Closed

dropck: track order of destruction for r-value temporaries #22323

Closed

pnkfelix mentioned this pull request Mar 6, 2015

Order of code affects lifetime resolution #23108

Closed

pnkfelix mentioned this pull request Mar 17, 2015

RefCell::borrow does not pass borrow-check without a seemingly no-op let binding #23338

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add `CodeExtent::Remainder` variant; pre-req for new scoping/drop rules. #21657

Add `CodeExtent::Remainder` variant; pre-req for new scoping/drop rules. #21657

pnkfelix commented Jan 26, 2015

rust-highfive commented Jan 26, 2015

pnkfelix commented Jan 26, 2015

pnkfelix commented Jan 26, 2015

huonw commented Jan 27, 2015

pnkfelix commented Jan 27, 2015

bors commented Jan 27, 2015

bors commented Jan 27, 2015

andrewrk commented Jan 29, 2015

richard-uk1 commented Jan 30, 2015

pzol commented Jan 30, 2015

pnkfelix commented Jan 31, 2015

richard-uk1 commented Feb 7, 2015

Add CodeExtent::Remainder variant; pre-req for new scoping/drop rules. #21657

Add CodeExtent::Remainder variant; pre-req for new scoping/drop rules. #21657

Conversation

pnkfelix commented Jan 26, 2015

rust-highfive commented Jan 26, 2015

pnkfelix commented Jan 26, 2015

pnkfelix commented Jan 26, 2015

huonw commented Jan 27, 2015

pnkfelix commented Jan 27, 2015

bors commented Jan 27, 2015

bors commented Jan 27, 2015

andrewrk commented Jan 29, 2015

richard-uk1 commented Jan 30, 2015

pzol commented Jan 30, 2015

pnkfelix commented Jan 31, 2015

richard-uk1 commented Feb 7, 2015

Add `CodeExtent::Remainder` variant; pre-req for new scoping/drop rules. #21657

Add `CodeExtent::Remainder` variant; pre-req for new scoping/drop rules. #21657