-
Notifications
You must be signed in to change notification settings - Fork 12.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Consider reverting the merge of collections into alloc #43112
Comments
@brson I'm not sure I agree with the reasons you laid out. In my mind liballoc provides the ability to depend on dynamic memory allocation, but nothing else. On top of that one assumption you can build up the entire crate (pointers + collections) with no more assumptions than libcore already has. In that sense I don't see this as a violation of layering but rather just giving you one layer and then everything else that it implies.
Is this a problem? Linkers gc'ing sections should take care of this? Isn't this equivalent to libcore bringing utf-8 formatting when you otherwise don't need it?
Do you have an example of this? |
I'm in favor of undoing the merge. In general, I'd prefer not to reduce the number of crates in the std facade. I'm In the particular case of alloc / collections it feels odd to have a full suite Personally, I'd prefer if |
If we're going to separate this crate then I think we need to be super clear about why. For example the previous incarnation of
Yes, and to me this is not a reason to separate the crates. This is already a "problem" in the standard library where the "fix" is otherwise too unergonomic to work with. |
If cargo had the ability to run with multiple different forks of std see this RFC, this would be much less of an issue. |
It is a problem. I do not think depending on linker to throw away unwanted code is a reasonable excuse to make unsatisfactory architectural decisions. If this were an out-of-tree project where people have to spend CPU time compiling code, then telling people to just merge your abstraction layers, compile it all, and throw most of it away later, would not be acceptable.
Yes. The entire motivation for this merge is to follow it up with this PR, which is seemingly not possible without the collections being literally in the alloc crate. No other collections crate can achieve this.
This was indeed a weakness of the previous alloc crate. I'd love to pull the smart pointers out if possible, and would love to not make the conflation worse.
I don't think this follows. Each layer in the facade adds new capabilities. Alloc is a crucial capability that separates different classes of Rust software. |
So merging alloc and collections happened to be able to |
If the goal is to minimize library code bloat, should we move to |
@brson thanks for reopening discussion, and @japaric thanks for the ping.
Exactly. I have a few reasons why I think combining facade crates relating to allocation is bad, but I want to start positively with why I think breaking facde crates further apart in general is good. Writing this with @joshlf's help. Small, quickly compiled binaries are nice, but more important is the efficiency of the development process. We want a large, diverse, and flexible no-std ecosystem, because ultimately we are targeting a large diverse space of projects. Small libraries are key to this both because they reduce the cognitive load of everyone developing those libraries, and allow more developers to participate. For cognitive load, it's important both to be utterly ignorant of the code that doesn't matter---as in, don't know what you don't know---and incrementally learn the code that does. For distributed development and diverse goals, the benefits are probably more obvious. Lighter coupling means less coordination so more people can scratch their own itches in isolation. But it also allows complexity to be managed by splitting the ecosystem into small, modular components - this allows people to take only what they need, and thus only opt-in to as much complexity as their application requires. All this begs the question---how far do we go down the road of splitting coarse crates into fine crates? I think quite far. It is my hope that as crates behind the facade and the language mature, more crates will be able to be written in stable (or otherwise trustworthy) code, and be moved outside rust-lang/rust into their own repo. Likewise, std should be able to (immediately and transitively) depend on third party crates just fine---the lockfile keeps this safe. Rustbuild, and my RFC #1133 are our friends here. To put it another way, there should be as little magic in rust-lang/rust as possible because magic cannot be replicated outside of rust-lang/rust. By splitting crates, we decrease the risk that a given piece of near-stable code will be bundled with code that cannot be stabilized, thus preventing the former from ever becoming stabilized and thus "de-magicked" in the sense of becoming eligible for being moved outside of rust-lang/rust. This runs counter to the arguments in favor of the collections/alloc merge. In concrete terms, I recall there being talk of incorporating Back to allocation in particular, @joshlf's been doing some great work with object allocation (type-parametric allocators that only allocate objects of a particular type and cache initialized objects for performance), and it would be nice to use the default collections with that: the fact is, most collections only need to allocate a few types of objects, and so could work with object allocators just fine. Now if we were to combine alloc and collections and the object allocator traits in one crate, that would be a very unwieldy crate playing not just double but triple duty.
Besides the technical reasons, as @japaric and I mentioned elsewhere, including anything allocation related in core, even something as harmless as a trait that need not be used, will scare away developers of real-time systems. OTOH, I like the idea of allocator types being usable without Also, there's an analogous problematic cycle to avoid when defining the static allocator even if it is initialized statically: For crates which implement global allocators it's incorrect for them to use CC @eternaleye |
Given @Ericson2314 's comment, we (again, jointly) would like to make a proposal for how all of these various components might be structured. We have the following goals:
Thus, we propose the following:
|
I don't think that the arguments provided suggest that that should happen. Keeping all of the collections together doesn't increase the difficulty of understanding how things work because, for the most part, collections do not depend on one another, and do not share magic under the hood. From a usability perspective, they're logically related, so it would not be surprising to a developer to find them together in the same crate. The arguments we and others have presented do suggest splitting collections into its own thing - separate from, e.g., |
@joshlf ^ seems to imply to me that it would suggest that level of granularity. |
Ah, we definitely didn't mean to imply that. @Ericson2314 can weigh in when he gets a chance, but speaking for myself, I would interpret that as "quite far within reason." I don't think that our reasoning provides a good argument for splitting collections up into separate crates, although maybe @Ericson2314 will disagree and will think we should even go that far. |
Well I... wouldn't fight that level of granularity if many thers want it :). Trying to think of a principle of why it's more important to separate alloc from collections than each collection from each other, I arrived at a sort of tick-tock model where one crate (tick) adds some new capability, and the next (tok) builds a bunch of with the capabilities added so far (it's "marginally pure"). Crates like alloc or a kernel bindings (e.g |
@brson: Just a minor correction here: The referenced PR doesn't require collections to be within the alloc crate. It only requires An out-of-tree collections crate would be able to make the same impls, as it could have both |
Presumably the visibility requirement exists because |
@joshlf: Yes, that's correct. |
In that case, that'd be my suggestion - to either make a separate |
I don't think I entirely agree with this. The whole premise of this crate is that we're shipping it in binary form so the compilation time doesn't matter too much. We absolutely rely on gc-sections for so many other features I think the ship has long since sailed on making an optional feature of the linker that we invoke. I think it's also important to keep this all in perspective and extra concrete. On 2017-06-01 liballoc took 0.3s to compile in release mode and libcollections took 3s. Today (2017-07-11) liballoc takes 3.2s to compile in release mode. This is practically nothing compared to crates in the ecosystem.
I think this is too quick an interpretation, though. As @murarth pointed out above we're not empowering std collections with extra abilities. Any collection outside std can have these impls.
I think my main point is that we should not automatically go back to what we were doing before. I believe the separation before didn't make sense, and I believe the current separate makes a more sense. If there's a desire to separate the concept of allocation from the default collections that's fine by me, but I don't think we should blindly attempt to preserve what existed previously which wasn't really all that well thought out (the alloc crate looked basically exactly the same as when I first made it ever so long ago) I'd personally find it quite useful if we stick to concrete suggestions here. The facade is full of subtle intricacies that make seemingly plausible proposals impossible to implement today and only marginally possible in the future. One alternative is the title of this issue, "Consider reverting the merge of collections into alloc". I have previously stated why I disagree with this. Another alternative by @joshlf above relies on defining collection types that don't know about I also further more disagree with rationale that keeps |
I wasn't thinking that you'd add default type parameters after-the-fact, but rather re-export as a newtype. E.g., in collections: pub struct Vec<T, A: Alloc> { ... } And then in std: use collections::vec;
use heap::Heap;
pub type Vec<T, A: Alloc = Heap> = vec::Vec<T, A>;
That's fine - as we mentioned, keeping |
no-std devs will be recompiling it.
How much longer does it take to build a final binary that depends only on alloc not collections? I suspect that will tell a different story?
Huh? The issue is Vec, Arc, and RC need to live in the same crate, but that create need not contain the allocator traits. I'd say we do indeed have a problem and while moving all 3 of those to collections is a good step, there's still a problem because anyone else writing there own vec-like thing runs into the same issue.
I think there is some consensus beside you that the first step could be making a smaller alloc than before: with no Rc or Arc, and maybe not Box either? Heap and the Alloc traits would stay in alloc, and then as a second step either the traits would move to core, or heap would move to its own crate.
@joshlf lf beat me to using an alias (or if they fails newtype). CC @gankro cause HashMap and the hasher is 100% analogous. |
So can I, but nobody was saying just put it in libstd! libcore <- liballoc <- {libcollections | liballoc-system} <- libstd: I can see any sub-graph (including libcore) being useful.
|
Yes I understand, and I'm re-emphasizing that this does not work today. Whether it will ever work is still up in the air. As a result it's not a viable proposal at this time.
No, I highly doubt it. Everything in
This is missing the point. @brson originally though that by moving
Can you articulate precisely what you think this problem is?
I disagree with two things here. I don't think it makes sense to couple
Again, this is not possible. I was the one that put |
Let's consider this from another point of view. I see the current crate hierarchy as follows:
As a no-std/embedded developer, I do not see any practical use in having what's currently in liballoc split into any number of crates. It is permeated by infallibility on every level, from The savings in terms of code size do not exist because the majority of embedded software is compiled with LTO and at least opt-level 1 even for debugging; without LTO, libcore alone would blow through the entire storage budget, and without opt-level 1, the generated code is all of: too large, too slow, and too hard to read at that special moment you have to read assembly listings. It seems straightforward and obviously good to put the |
If you, @whitequark, as an embedded developer, don't mind putting the trait in libcore, then I think your opinion overrides mine and @Ericson2314's since we aren't embedded devs :) |
@joshlf Mostly it's that I don't really understand the argument for keeping it out. The argument for having separate libcore and liballoc goes: libcore's only dependency in terms of linking is libcompiler_builtins, and it introduces no global entities, whereas liballoc introduces linking dependencies to the A trait that isn't used has no cost. |
Right; my apologies; I forgot about that.
I thought it was a coherence issue. If it's a name-reachability issue with
Ah there is some naming confusing here because @joshlf used a type alias. A wrapper struct is what I consider a newtype, and that would work, right?. Wrapping every inherent method or using a one-off trait is annoying, but works. |
This is perhaps getting a bit off topic, but what I really want is |
@tarcieri the best solution to that is custom preludes. |
@Ericson2314 custom preludes don't really solve the problem I'd like to solve, which is allowing crates to leverage The whole point of automatically adding them to a core/alloc prelude would be to remove this explicit dependency. Then use of |
@tarcieri I'm not sure what to tell you, sorry. Some insta-stable trick to expose items through abnormal means based on whether anything implements allocator anywhere is....unwise in my book. I'd say stabilize the crates quicker, but @whitequark bring up a good point that our current story around handling allocation failure in everything but the allocator trait itself is atrocious: unergonomic and unsafe. I'm loath to stabilize the "beneath the facade" interfaces until that is fixed. |
What? That's the exact opposite of reality. It is safe (because crashes are safe), and it's ergonomic, because explicicly handling allocation failures in typical server, desktop, or even hosted embedded software has a high cost/benefit ratio. Consider this. With mandatory explicit representation of allocation failure, almost every function that returns or mutates a
Also, let's say you have a public function that returns The only remotely workable solution I see is extending the API of |
To add to this, the majority of small embedded devices whose memory is truly scarce cope in one of the two ways:
As such, I feel like the primary structure providing fallible allocations would be some sort of memory pool. This can be easily handled outside the alloc crate. |
As a former libs team member, I'm not opposed to adding try_push, try_reserve, etc to the stdlib at this point in time. Someone just needs to put in the legwork to design the API, which I think was partially blocked on landing allocator APIs -- to provide guidance on what allocation error types are like -- when this first came up. I believe the gecko team broadly wants these functions, as there are some allocations (often user-controlled, like images) which are relatively easy and useful to make fallible. |
@whitequark Sorry, I meant handling allocation in the client is unergonomic/unsafe. Everyone agrees that what we do today is both safe and ergonomic, but not flexible in that regard. |
Then this sounds like the best solution to me, yeah. |
This thread has gone in a lot of different directions, and I'm having I'm afraid I framed this incorrectly by presenting a defense of the This was a major breaking architectural decision, done out of process. I suggest we take the following actions:
|
My gut feeling is there are some complex interrelationships between these types which are difficult to model given the previous alloc/collections split. I think that was the motivation for the merge in the first place. As an example, at the error-chain meeting we discussed moving This means any crates that want to work with I guess my question is: is there still a path forward for unlocking |
@tarcieri with or without the revert, it can go in alloc. If @brson Ah, I wasn't actually sure what the process is for changing crates whose very existence is unstable. Thanks for clarifying. |
My apologies if that's all orthogonal to the revert. If that's the case, I have no objections. |
A couple of process points:
From the libs team meeting, it sounded like @alexcrichton was quite confident that the original crate organization was buying us nothing, and that the problems people are interested in solving are better addressed in some other way. I think it'd be good for @alexcrichton and @brson to sync up, and then summarize their joint perspective on thread, before we make more changes here. |
There doesn't seem to have been any progress on this in the last 3 weeks; and PR #42565 is blocked on this resolving one way or the other. What steps do we need to take to unstick this? "Watch me hit this beehive with a big ol' stick", he said, pressing Comment |
#42565 was the sort of thing I was alluding to earlier. Is there a path forward for that if the merger were to be reverted? |
Yes -- move just |
I've re-read this entire thread, and to me the key points are: Clarity of layering. I think @whitequark put it best:
though we are working toward supporting fallible allocation in Crates providing features that aren't used. It's true that the crates.io ecosystem in general skews toward small crates, but these core library crates are a very important exception to that. The thread has made clear that bloat isn't an issue (due to linkers), nor is compilation time (which is trivial here, due to generics). Special treatment of core crates. Grouping type and/or trait definitions together can enable impls that are not possible externally. However, (1) the very same issues apply to any choice of breaking up crates and (2) the standard library already makes substantial and vital use of its ability to provide impls locally. Separating collections from the global heap assumption. There is not currently a plausible path to do this, but with some language additions there may eventually be. But by the same token, this kind of question also seems amenable to a feature flag treatment. Personally, I find the new organization has a much more clear rationale than the current one, and is simpler as well. Stakeholders from impacted communities (the @rfcbot fcp close |
Team member @aturon has proposed to close this. The next step is review by the rest of the tagged teams: No concerns currently listed. Once these reviewers reach consensus, this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up! See this document for info about what commands tagged team members can give me. |
Bummer. Then I suppose the next fronts for pushing for modularity are:
|
🔔 This is now entering its final comment period, as per the review above. 🔔 |
Please don't revert it. This change has been out in the wild for a long time, and many projects have updated to the new organization of |
I'm going to close this issue now that the full libs team has signed off. (@jackpot51, to be clear, that means: we will not revert.) |
This PR merges the collections crate into the alloc crate, with the intent of enabling this PR.
Here are some reasons against the merge:
It is a violation of layering / seperation of responsibilities. There is no conceptual reason for collections and allocation to be in the same crate. It seems to have been done to solve a language limitation, for the enablement of a fairly marginal feature. The tradeoff does not seem worth it to me.
It forces any no_std projects that want allocation to also take collections with it. There are presumably use cases for wanting allocator support without the Rust collections design (we know the collections are insufficient for some use cases).
It gives the std collections crate special capabilities that other collections may not implement themselves - no other collections will be able to achieve feature parity with the conversion this merger is meant to enable.
Personally I value the decomposition of the standard library into individual reusable components and think the merge is moving in the wrong direction.
I am curious to know what embedded and os developers think of this merge cc @japaric @jackpot51 .
The text was updated successfully, but these errors were encountered: