Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Suggestion to add Array#partition as well #2

Closed
MaxGraey opened this issue Oct 24, 2019 · 36 comments
Closed

Suggestion to add Array#partition as well #2

MaxGraey opened this issue Oct 24, 2019 · 36 comments

Comments

@MaxGraey
Copy link

MaxGraey commented Oct 24, 2019

Examples

Basics:

const a = 1, b = 2, c = 3, d = 4
const [ab, cd] = [a,b,c,d].partition((_, index) => index >= 2)
const [ac, bd] = [a,b,c,d].partition(value => value % 2 === 0)

Immutable (out of place) Quick Sort:

function quickSort([p, ...tail]) {
  if (!tail.length) return [p, ...tail]
  const [l, r] = tail.partition(x => x > p)
  return [...quickSort(l), p, ...quickSort(r)]
}

quickSort([5, 2, 3, -1])

Efficient and correct implementation usually not trivial:

// not trivial
function partition(arr, callback) {
  return arr.reduce((result, value, index, array) => {
    result[callback(value, index, array) ? 1 : 0].push(value);
    return result;
  }, [[], []]);
}

or

// not efficient (twice linear scan)
function partition(arr, callback) {
  return [
    arr.filter((...args) => !callback(...args)),
    arr.filter(callback)
  ];
}

or based this proposal:

function partition(arr, callback) {
  return [
    arr.filterOut(callback),
    arr.filter(callback)
  ];
}

Related

OCaml: List.partition
Haskell: Data.List.partition
Lodash: _.partition
RxJS: RxJS/partition
Kotlin: Array.partition

@chicoxyzzy
Copy link
Member

chicoxyzzy commented Oct 24, 2019

For the note, underscore _.partition and lodash _.partition are also exist (16,309 downloads weekly for lodash version)

@MaxGraey
Copy link
Author

Thanks added to Related

@MaxGraey MaxGraey changed the title Suggestion adding Array#partition as well Suggestion to add Array#partition as well Oct 24, 2019
@MaxGraey
Copy link
Author

Also RxJS

@jridgewell
Copy link
Member

Hi @MaxGraey, I think partition needs to be its own proposal. It's certainly related to select/reject, but it's different enough that we would have to argue over semantics delaying the whole process.

@jridgewell
Copy link
Member

Reopening to discuss. @MaxGraey, do you know of any libraries/projects/etc that are using partition?

/cc @michaelficarra

@jridgewell jridgewell reopened this Dec 6, 2019
@michaelficarra
Copy link
Member

michaelficarra commented Dec 6, 2019

edit: Sorry, I didn't see that these were mentioned in earlier comments.

@jridgewell
Copy link
Member

@michaelficarra yup. Are you aware of projects using them, though?

@ckknight
Copy link

ckknight commented Dec 6, 2019

@jridgewell we use partition at my work rather extensively, typically around usage of https://github.com/graphql/dataloader, which can often provide arrays of the type (T | Error)[]. Being able to partition errors separately from non-errors is a typical pattern for us.

The source is not available for this usage, but here are some real-life examples that I've copied:

const [rejectedReviews, approvedReviews] = partition(allReviews, (review) => review.rejectionReasonCode != null);
// Type-safety of this function ensures well-typed `errors` and `items` below
function isError(candidate: unknown): candidate is Error {
  return candidate instanceof Error;
}
const itemsAndErrors = await dataLoader.loadMany(ids);
const [errors, items] = partition(itemsAndErrors, isError);

@jridgewell
Copy link
Member

@ckknight: Thanks!

I added _.partition data from the HTTP Archive and Github Archive, see #4. Usage is at ~1.5% of _.filter usage (which is a little less than _.select).

@MaxGraey
Copy link
Author

MaxGraey commented Dec 7, 2019

btw lodash.partition has33k downloads/week

@MaxGraey
Copy link
Author

MaxGraey commented Dec 7, 2019

I guess query exactly_.partition in db isn't fully represent all cases like:

const partition = require('lodash.partition');
const { partition } = require('lodash');

and etc

@schmod
Copy link

schmod commented Jan 23, 2020

As a radical proposal, what about implementing Array#partition instead of Array#filterOut? Users who only desire the falsy partition can destructure the output, and only consume the second value.

const [, cd] = [a,b,c,d].partition((_, index) => index >= 2)
// is equivalent to...
const cd = [a,b,c,d].filterOut((_, index) => index >= 2)

A sufficiently-clever implementation should ideally be able to recognize this situation, and internally fall back to a more-efficient filter or filterOut algorithm to inhibit any major performance penalties over a standalone filterOut method.

@MaxGraey
Copy link
Author

MaxGraey commented Jan 23, 2020

@schmod Good point!
Btw I propose reversed order result as [filterOutResult, filterResult] which mean it will even more simple:

const [ab] = [a,b,c,d].partition((_, index) => index >= 2);
// which equivalent to
const ab = [a,b,c,d].filterOut((_, index) => index >= 2);

@jridgewell
Copy link
Member

As a radical proposal, what about implementing Array#partition instead of Array#filterOut? Users who only desire the falsy partition can destructure the output, and only consume the second value.

I find partition to be particularly clunky. Having to grab the either the first or second array, depending on which you care about, seems much less ergonomic than just doing filter or filterOut to get the result.

Then, is it [falsey, truthy], or [truthy, falsey], because it's not self explanatory. Lodash has a similar (but more generic) form with _.groupBy, which would return a object with keys returned by the cb and arrays as values. That seems more useful than partition.

Then there's also the query data. Use of partition is relatively small at ~2%, while reject 10-25%, and filter at 70-80%.

While partition is still an option for the proposal, I don't think it's the best one.

@michaelficarra
Copy link
Member

I think groupBy is a great alternative. A partition implementation that returns a { true, false } record is simply a special case that calls ToBoolean on the result. I think either one would satisfactorily solve the problem this proposal is trying to address.

@MaxGraey
Copy link
Author

MaxGraey commented Jan 25, 2020

That's true. partition is a specialized case of groupBy. But filterOut also specialized case of filter. Let's compare.
This:

const lessThan2 = arr.filter(val => !(val >= 2));
// or const [lessThan2] = arr.groupBy(val => Number(val >= 2));

much less expressive than

const lessThan2 = arr.filterOut(val => val >= 2);

same as:

const [lessThan2, greaterOrEqThan2] = arr.groupBy(val => Number(val >= 2));

less expressive than

const [lessThan2, greaterOrEqThan2] = arr.partition(val => val >= 2);

@ghost
Copy link

ghost commented Jan 6, 2021

You have to sort of intricately "know" what order the tuple returns the values in, it might be better to return a struct:

{
    in,
    out
}

ex:

const {
    in: greaterThanOrEq2,
    out: lessThan2
} = arr.partition(val => val >= 2);

Although, thinking about it, in and out could be ambiguous...

@ckknight
Copy link

ckknight commented Jan 6, 2021

Thinking about it, in and out could be ambiguous...

Oh, then how about included and excluded?

@MaxGraey
Copy link
Author

MaxGraey commented Jan 6, 2021

Thinking about it, in and out could be ambiguous...

it also contradicts the generally accepted conventions already in JS:
Array#entries, Map#entries, Set#entries. All this return array like tuples.

Main advantages of array tuple you could always skip unnecessary part:

const [, cd] = [a,b,c,d].partition((_, index) => index >= 2)
const [ab] = [a,b,c,d].partition((_, index) => index < 2)

but also you could this:

const { 1: cd } = [a,b,c,d].partition((_, index) => index >= 2)
const { 0: ab } = [a,b,c,d].partition((_, index) => index < 2)

with objects:

const { out: cd } = [a,b,c,d].partition((_, index) => index >= 2)
const { in: ab } = [a,b,c,d].partition((_, index) => index < 2)

@ghost
Copy link

ghost commented Jan 6, 2021

Main advantages of array tuple you could always skip unnecessary part:

You can do the same with the struct, just don't use the other property from the struct.

const {
    in: greaterOrEqThan2,
    out: lessThan2
} = arr.partition(val => val >= 2);

vs

const { out: lessThan2 } = arr.partition(val => val >= 2);

Array#entries, Map#entries, Set#entries. All this return array like tuples.

They return arrays of key-value pairs, in contrast, here would you want to say that there is an association between what was filtered in and out? I would say "no," because we are intentionally trying to separate them into two totally separate categories, therefore you wouldn't want to pair them.

But actually, returning a tuple seems more like the opposite of Array#flat, and it shows that the array has been partitioned into two separate sections, while still being an array. So I guess it can be seen either way.

Regardless, I personally would support Array#partition over Array#filterOut, it is more reusable and offers a utility that neither Array#filterOut, nor Array#filter provide. I have made a function that explicitly looped over arrays a few times in order to obtain the behavior of the Array#partiton before.

How about getting a TypedArray#partition too?

@MaxGraey
Copy link
Author

MaxGraey commented Feb 20, 2021

I think partition could be generalize more and include slide, stride and chunk features. See this talk:
https://www.youtube.com/watch?v=-_lqZJK2vjI

@jridgewell
Copy link
Member

Well, I finally needed a partition today: ampproject/amphtml@dc3db0e#diff-d06fcad39d9e58f050ba123f5f506cbc1a8865fde8a6016c4dc32bbfc71d06c9R256-R264

@dead-claudia
Copy link

I've several times done stuff like this, so it may be worth doublechecking for that kind of pattern:

const foos = items.filter(item => item.type === "foo")
const bars = items.filter(item => item.type !== "foo")

And of course there's several ways this could be expressed, each of which I've seen personally in the wild:

  • filter(f) + reject(f)
  • filter(f) + filter(i => !f(i))
  • filter(x => x.type === "a") + filter(x => x.type === "b") where the two are mutually exclusive

@ghost
Copy link

ghost commented Mar 28, 2021

@isiahmeadows I'm not sure what you're trying to point out, can you clarify?

Your code example is equivalent to the following:

const partitioner = item => item.type === "foo";

const foos = items.filter(partitioner);
const bars = items.filter(item => !partitioner(item));

Which is the current problem that we would like to fix.

filter(f) + reject(f)

I don't get this right away, can you give an example?

where the two are mutually exclusive

As long as they are mutually exclusive, the original proposal should work just fine.

As for your example, would you want a method that takes two comparer functions?

const [
    foos,
    bars
] = items.partition(
    item => item.type === "foo",
    item => item.type === "bar"
);

I feel like that could easily be extended to accept N functions, and return an array of N, storing all values that the function returned a truthy value for, but really seems like it should be a user-implemented library solution.

@jridgewell
Copy link
Member

I don't get this right away, can you give an example?

@isiahmeadows is demonstrating that they needed both the "filtered out" and the "filtered in" at the same time, which partition solves. It's a +1 that partition is useful.

I feel like that could easily be extended to accept N functions

This seems a bit more complicated than a generic groupBy. The ordering of the functions would affect the output, where groupBy could easily handle the cases.

@zloirock
Copy link
Contributor

@jridgewell hey, maybe already makes sense to add it to the proposal? Seems it's really required and the community wanna it.

@ckknight
Copy link

Seems it's really required and the community wanna it.

Absolutely not. It overcomplicates the algorithm, the understanding of purpose, and overall documentation of this hypothetical method.

partition, like the other Array.prototype methods, should be as simple as possible, but no simpler.

Please, let's not overcomplicate things. It will be difficult to know when to teach, when to use, and whether it's efficient enough to bother with. If you want to worry about those things, then use an NPM package. I don't want to see that canonicalized within ECMAScript.

@zloirock
Copy link
Contributor

zloirock commented Mar 30, 2021

@ckknight for the previous 5 months .partition logic was required for me at least 5-10 times. NPM packages for everything that should be in the standard library is bullshit - I don't wanna depend on a separate non-standard script for a simple action (hi, left-pad hell).

@MaxGraey
Copy link
Author

@ckknight I remember how this approach is "useful" on "leftpad" example

@ckknight
Copy link

@ckknight I remember how this approach is "useful" on "leftpad" example

I am in favor of having a simple, useful partition method, on the official Array.prototype within the ECMAScript standard. I am opposed to overcomplicating the algorithm and signature for it.

@zloirock
Copy link
Contributor

@ckknight so why do you think that I mean something else?

@ghost
Copy link

ghost commented Mar 30, 2021

@ckknight The algorithm would require about 3 more lines of ES spec text, compared to filter/filterOut, one line to call an abstract operation to get the array species constructor, an extra variable declaration using that constructor, the line with "let k be k + 1" can be removed, as it is now always pushing to an array, and the creation of the structure to return.

Regardless, the current Array#filter discards useful data that developers intend to use. Array#filter feels broken, there's no question about it.

@jridgewell
Copy link
Member

I also needed a groupBy recently: https://github.com/babel/babel/pull/13021/files#diff-3c2e3e5a9cc9bdb04de11117cfaedee2a2da03910480e9186c252a62fff7ce27R31-R42

hey, maybe already makes sense to add it to the proposal?

Yah, I can bring up both at the next meeting.

@bakkot
Copy link

bakkot commented Mar 30, 2021

I also prefer groupBy to partition, even in cases where I just want to split into two buckets, for the reason @jridgewell gives above: groupBy means not having to remember which entry is which. Given that the genesis of this whole proposal is that people get the behavior for filter backwards, it seems better to go with the API without this issue, especially given that it's strictly more powerful.

@gibson042
Copy link

The spec text currently includes groupBy even though the README doesn't mention it, is it in or out?

@jridgewell
Copy link
Member

jridgewell commented Jul 15, 2021

groupBy was welcomed by the committee, and is being moved to https://github.com/tc39/proposal-array-grouping.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

10 participants