Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: getNodeIdsConnected should remove duplicate values #8054

Merged
merged 3 commits into from
May 6, 2022

Conversation

ahaoboy
Copy link
Contributor

@ahaoboy ahaoboy commented May 4, 2022

↪️ Pull Request

getNodeIdsConnectedFrom and getNodeIdsConnectedTo use Array save nodeId, and we should remove duplicate values

  let graph = new Graph();
  let a = graph.addNode(1);
  let b = graph.addNode(2);
  let c = graph.addNode(3);
  graph.addEdge(a, b);
  graph.addEdge(a, c);
  graph.addEdge(a, b, 2);
  // [ 1, 2, 1 ]
  console.log(graph.getNodeIdsConnectedFrom(a, -1))
  // [ 0, 0 ]
  console.log(graph.getNodeIdsConnectedTo(b, -1))

🚨 Test instructions

By using set to remove duplicate values

  let graph = new AdjacencyList();
  let a = graph.addNode();
  let b = graph.addNode();
  let c = graph.addNode();
  graph.addEdge(a, b);
  graph.addEdge(a, c);
  graph.addEdge(a, b, 2);
  assert.deepEqual(graph.getNodeIdsConnectedFrom(a, -1), [b, c]);
  assert.deepEqual(graph.getNodeIdsConnectedTo(b, -1), [a]);

✔️ PR Todo

  • Added/updated unit tests for this change
  • Filled out test instructions (In case there aren't any unit tests)
  • Included links to related issues/PRs

@wbinnssmith wbinnssmith requested a review from lettertwo May 4, 2022 17:13
Copy link
Contributor

@lettertwo lettertwo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice catch, thanks for the PR!

getNodesConnected{From, To} can be called a lot during bundling. So, it might be worth avoiding copying the nodes set to an array.

One idea is to use the set to track seen edges instead (see suggestions).

I still want to get some real world measurements, but a benchmark I put together that emulates a Parcel build shows:

┌─────────┬────────┬─────┬──────────┬────────────────────┬────────────────────┐
│ (index) │ count  │ min │   max    │        mean        │       stddev       │
├─────────┼────────┼─────┼──────────┼────────────────────┼────────────────────┤
│   v2    │ 106743 │ 203 │  743423  │ 456.43437977197567 │ 2708.5082103128275 │
│   set   │ 106743 │ 309 │ 51019775 │ 1138.6295869518376 │ 156157.29457667368 │
│  seen   │ 106743 │ 285 │  564223  │  599.489961871036  │ 3242.0693190347592 │
└─────────┴────────┴─────┴──────────┴────────────────────┴────────────────────┘

rows:

  • v2 is the current behavior
  • set is copying the set to an array
  • seen is using a set to track seen edges.

columns:

  • count is the number of calls to getNodeIdsConnectedFrom that occurred for one of the graphs during the build
  • the min, max, mean, stddev are measures of how long the function took to execute

I'll give this a whirl in a build and see if there are any significant slowdowns and report back.

packages/core/graph/src/AdjacencyList.js Outdated Show resolved Hide resolved
packages/core/graph/src/AdjacencyList.js Outdated Show resolved Hide resolved
packages/core/graph/src/AdjacencyList.js Outdated Show resolved Hide resolved
@ahaoboy
Copy link
Contributor Author

ahaoboy commented May 6, 2022

It's a pity that js iterators do not support filter/map/unique, return type is Iterable is better

@lettertwo lettertwo self-requested a review May 6, 2022 18:28
Copy link
Contributor

@lettertwo lettertwo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for making that change!

I ran a large Parcel build a few times with both approaches, and saw (best of 3 runs for each):

v2:   ✨  Done in 171.62s.
set:  ✨  Done in 181.97s.
seen: ✨  Done in 176.45s.

An iterable API might be better here (if most uses of these methods turn out to be partial iterations like find()), but would require a fair amount of cascading changes across core.

@lettertwo lettertwo merged commit 504e546 into parcel-bundler:v2 May 6, 2022
gorakong pushed a commit that referenced this pull request Nov 3, 2022
* upstream/v2:
  fix: getNodeIdsConnected should remove duplicate values (#8054)
  chore: should log unsupported type not zero and toLocaleString's option typo (#8002)
  Allow animated images (#8018)
  Add "key" and "update_url" to webextension manifest schema (#8043)
  Allow to define the "compilerOptions" of the VueTransformer (#8031)
  fix(image transformer): Update supported formats (#8028)
  support for `oauth2` field in mv3 (#8037)
  Update @parcel/css to 1.8.2 (#8044)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants