Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[perf] Improve ExportMap.for performance on larger codebases #2756

Merged
merged 1 commit into from
Apr 12, 2023

Commits on Apr 12, 2023

  1. [perf] ExportMap: Improve ExportMap.for performance on larger codeb…

    …ases
    
    While looking at a larger code base https://gitlab.com/gitlab-org/gitlab,
    `ExportMap.for` seems to take a significant amount of time. More than
    half of it is spend on calculating hashes with `hashObject`.
    
    Digging a little deeper, it seems like we are calling it around 500
    thousand times. Each iteration calculates the hash of a context object.
    This context object is created inside of the `childContext` function and
    consists of four parts:
    - `settings` -> an Object itself
    - `parserOptions` -> an Object itself
    - `parserPath` -> a String
    - `path` -> a String.
    
    Interestingly `settings`, `parserOptions` and `parserPath` rarely do
    change for us, so calculating their hashes on every iteration seems
    unnecessary. `hashObject` recursively calculates the hashes of each
    key / value pair.
    
    So instead of doing:
    
    ```js
    cacheKey = hashObject({settings, parserOptions, parserPath, path})
    ```
    We could also do:
    ```js
    cacheKey = parserPath +
      hashObject(parserOptions) +
      hashObject(settings) +
      path
    ```
    
    This would be just as stable as before, although resulting in longer
    cache keys. `parserPath` and `path` would not need to be hashed,
    because they are strings and a single character change in them would
    result in a different cache key.
    
    Furthermore we can memoize the hashes of `parserOptions` and
    `settings`, in case they didn't change compared. The equality is
    checked with a simple `JSON.stringify`.
    
    We move this `cacheKey` calculation to `childContext`, adding the
    cache key to the `context` object. This way, we can fall back to the
    old calculation inside of `ExportMap.for`, as it is a public
    interface which consumers might be using.
    
    In our code base the results speak for itself:
    - 51.59s spent in `ExportMap.for`, 0ms spent in `childContext`.
        - 16.89s is spent in node:crypto/hash `update` (overall)
    - 41.02s spent in `ExportMap.for, 1.91s spent in `childContext`.
        - Almost no time spent in `hashObject`, actually all calls in
          our flame graph come from other code paths
        - 7.86s is spent in node:crypto/hash `update` (overall)
    
    So on this machine, and project, we are cutting the execution time
    of `ExportMap.for` in half. On machines, which are hashing slower,
    the effect might be more pronounced. Similarly machines with less
    memory, as the `hashObject` function creates a lot of tiny strings.
    
    One side-effect here could be, that the memoization is in-efficient
    if the `settings` or `parserOptions` change often. (I cannot think
    of such a scenario, but I am not that versed in the use cases of
    this plugin.) But even then, the overhead should mainly be the
    `JSON.stringify`.
    leipert authored and ljharb committed Apr 12, 2023
    Configuration menu
    Copy the full SHA
    0ae35c0 View commit details
    Browse the repository at this point in the history