perf: add experimental support for using mimalloc allocator #404

Open
wants to merge 7 commits into
base: main

Commits on Aug 13, 2024

  1. fix: add missing stub for scanner benchmark

    Fixes:
    
    ```
    luajit: ...and-t/bin/benchmarks/../../lua/wincent/commandt/init.lua:199: attempt to call field 'nvim_buf_is_valid' (a nil value)
    ```
    wincent committed Aug 13, 2024
    3407f81
  2. fix: fix another scanner benchmark error

    Fixes:
    
    ```
    luajit: ...and-t/bin/benchmarks/../../lua/wincent/commandt/init.lua:244: attempt to index field 'scanners' (a nil value)
    ```
    wincent committed Aug 13, 2024
    2af6a2d
  3. 6be9611
  4. perf: add experimental support for using mimalloc allocator

    Vendoring from:
    
    - https://github.com/microsoft/mimalloc
    
    and specifically:
    
    - https://github.com/microsoft/mimalloc/releases/tag/v2.0.6
    
    I added a script to pull down the release archive and dump it into a
    directory, because I don't want to use a submodule for this (people
    installing a Vim plugin from a Git repo shouldn't have to know/worry
    about whether it needs or uses submodules). Space on disk for this set
    of files (some of which are obviously redundant in our context) is:
    
        du -sh lua/wincent/commandt/lib/vendor/github/microsoft
        4.8M    lua/wincent/commandt/lib/vendor/github/microsoft
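    
    A minimal sketch of what such a vendoring script might look like follows.
    It is not the script from this PR: the function names are hypothetical,
    and it assumes `curl` and `tar` are available and that the release is
    fetched from GitHub's tag archive URL.
    
    ```shell
    # Hypothetical helpers; the actual script in this PR may differ.
    
    # Print the GitHub tarball URL for a given mimalloc release version.
    mimalloc_archive_url() {
      printf 'https://github.com/microsoft/mimalloc/archive/refs/tags/v%s.tar.gz\n' "$1"
    }
    
    # Unpack a release tarball into a fresh directory, dropping the single
    # top-level "mimalloc-<version>" directory that the archive contains.
    unpack_archive() {
      archive=$1
      dest=$2
      rm -rf "$dest" && mkdir -p "$dest" &&
        tar -xzf "$archive" -C "$dest" --strip-components=1
    }
    
    # Download a release and replace the vendored copy with its contents.
    vendor_mimalloc() {
      version=$1
      dest=$2
      tmp=$(mktemp -d) || return 1
      if curl -fsSL "$(mimalloc_archive_url "$version")" -o "$tmp/mimalloc.tar.gz"; then
        unpack_archive "$tmp/mimalloc.tar.gz" "$dest"
        status=$?
      else
        status=1
      fi
      rm -rf "$tmp"
      return "$status"
    }
    ```
    
    Usage would be something like
    `vendor_mimalloc 2.0.6 lua/wincent/commandt/lib/vendor/github/microsoft/mimalloc`,
    where the trailing `mimalloc` directory name is an assumption.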
    
    Because it is not yet clear whether this is worthwhile, it only takes
    effect if you call `make` with `USE_MIMALLOC` set. You can verify that
    it actually _is_ overriding the standard `malloc()` etc. calls by
    running a command with `MIMALLOC_VERBOSE` set, which makes it print
    some extra info:
    
        env MIMALLOC_VERBOSE=1 TIMES=1 bin/benchmarks/scanner.lua
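    
    The verification step above can be sketched as a small check. The
    function name is hypothetical, and the `make USE_MIMALLOC=1` spelling in
    the comment is an assumption (the text only says `USE_MIMALLOC` must be
    set). The check relies on mimalloc prefixing its verbose diagnostics with
    `mimalloc: `.
    
    ```shell
    # Assumed build step, run beforehand: make USE_MIMALLOC=1
    
    # Hypothetical helper: succeed if the given command prints mimalloc's
    # verbose diagnostics (lines starting with "mimalloc: ") when run with
    # MIMALLOC_VERBOSE=1. Diagnostics go to stderr, so merge it into stdout.
    mimalloc_is_active() {
      env MIMALLOC_VERBOSE=1 "$@" 2>&1 | grep -q '^mimalloc: '
    }
    ```
    
    For example, `mimalloc_is_active env TIMES=1 bin/benchmarks/scanner.lua`
    should succeed only when the override is in effect.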
    
    Impact (unfortunately, a bit inconclusive) on scanner and matcher
    benchmarks follows. For each machine, the first table is the scanner
    benchmark and the second is the matcher benchmark. Note that numbers
    shouldn't be compared across machines because they were produced at
    different times (for example, the M3 numbers are from a different
    version of the OS, and the branch was rebased, compared with the other
    machines).
    
    Mid-2015 MacBook Pro
    ====================
    
    These numbers are all over the map due to thermal throttling.
    
                   best    avg      sd     +/-      p     (best)    (avg)      (sd)     +/-     p
          buffer 0.04094 0.04178 0.00278 [-0.6%]        (0.04100) (0.04186) (0.00287) [-0.6%]
            file 0.30707 0.31436 0.02486 [-1.0%]   0.05 (0.30735) (0.31473) (0.02499) [-1.0%]  0.05
            find 0.05827 0.06678 0.01162 [+1.5%]   0.05 (0.92013) (0.93752) (0.04453) [-1.0%] 0.025
             git 0.05163 0.06000 0.01115 [+3.3%] 0.0005 (1.00993) (1.02469) (0.04072) [-0.7%] 0.025
              rg 0.06419 0.07229 0.01203 [+3.8%]  0.005 (1.61018) (1.66326) (0.08803) [+0.3%]
        watchman 0.01095 0.01121 0.00068 [+0.2%]        (1.16830) (1.17605) (0.01835) [+0.6%] 0.005
           total 0.54387 0.56643 0.04391 [+0.4%]        (5.09873) (5.15811) (0.15328) [-0.1%]
    
                            best      avg      sd      +/-     p     (best)    (avg)      (sd)      +/-     p
             pathological  0.44648  0.48275 0.19826 [-10.0%]  0.01 (0.44705) (0.48350) (0.19793) [-10.0%]  0.01
                command-t  0.41205  0.44292 0.21658  [+3.8%] 0.005 (0.41255) (0.44364) (0.21681)  [+3.8%] 0.005
        chromium (subset)  2.75724  2.99017 0.47925  [-1.3%]       (0.51232) (0.55960) (0.17228)  [-1.5%]
         chromium (whole)  3.18933  3.63241 0.64392  [-0.7%]       (0.41821) (0.49571) (0.14853)  [-0.3%]  0.05
               big (400k)  4.90155  5.51271 1.20748  [-1.0%]       (0.65297) (0.74723) (0.23045)  [-4.5%]  0.05
                    total 11.74815 13.06097 2.16866  [-1.2%]       (2.47007) (2.72968) (0.54795)  [-2.8%] 0.025
    
    M1 MacBook Pro
    ==============
    
                   best    avg      sd     +/-     p     (best)    (avg)      (sd)     +/-     p
          buffer 0.04407 0.05368 0.01123 [-1.4%] 0.025 (0.04433) (0.05413) (0.01150) [-1.6%] 0.025
            file 0.20902 0.21428 0.01060 [+1.0%]  0.01 (0.20902) (0.21511) (0.01219) [+1.1%] 0.005
            find 0.02687 0.03006 0.01015 [+3.9%]  0.05 (0.63141) (0.64156) (0.03483) [+0.7%]  0.05
             git 0.02693 0.02995 0.00980 [+2.2%]       (0.71734) (0.72825) (0.04266) [-0.4%]
              rg 0.02916 0.03318 0.01136 [+2.9%]       (0.90193) (0.91710) (0.07157) [+1.4%] 0.005
        watchman 0.01100 0.01156 0.00165 [-0.7%]       (1.18802) (1.21274) (0.13422) [+1.5%] 0.005
           total 0.36119 0.37272 0.03632 [+1.1%]       (3.71713) (3.76889) (0.18577) [+0.9%] 0.005
    
                            best    avg      sd     +/-     p     (best)    (avg)      (sd)     +/-     p
             pathological 0.28526 0.29636 0.08356 [-4.0%] 0.025 (0.28527) (0.29647) (0.08343) [-4.0%] 0.025
                command-t 0.23759 0.24616 0.07356 [+1.6%]       (0.23760) (0.24618) (0.07354) [+1.6%]
        chromium (subset) 1.56761 1.58469 0.03655 [-0.3%]       (0.41376) (0.42040) (0.02032) [-0.4%]
         chromium (whole) 1.87180 1.88726 0.06174 [-0.4%] 0.025 (0.31695) (0.32809) (0.03497) [+0.4%]
               big (400k) 2.90455 2.92204 0.07185 [-0.2%]       (0.48384) (0.50533) (0.07608) [-0.0%]
                    total 6.88851 6.93650 0.15002 [-0.4%] 0.025 (1.74550) (1.79647) (0.14517) [-0.5%]
    
    M3 MacBook Pro
    ==============
    
                   best    avg      sd      +/-      p     (best)    (avg)      (sd)      +/-      p
          buffer 0.01255 0.01400 0.00409  [+2.0%]        (0.01260) (0.01447) (0.00635)  [-3.3%]
            file 0.14749 0.15026 0.00629 [+38.1%] 0.0005 (0.14843) (0.15115) (0.00626) [+37.9%] 0.0005
            find 0.20783 0.27306 0.12796 [+15.8%] 0.0005 (1.13360) (1.38588) (0.55490) [+15.3%] 0.0005
             git 0.21748 0.25155 0.10398 [+13.0%] 0.0005 (1.17693) (1.40937) (0.54965)  [+9.1%] 0.0005
              rg 0.20640 0.26983 0.12977 [+12.2%] 0.0005 (1.55310) (1.78037) (0.55921)  [+6.9%] 0.0005
        watchman 0.01813 0.01980 0.00287  [+6.1%] 0.0005 (1.19740) (1.21007) (0.02198)  [-0.2%]
           total 0.81542 0.97850 0.33560 [+17.1%] 0.0005 (5.23262) (5.95132) (1.66475)  [+8.7%] 0.0005
    
                            best    avg      sd     +/-      p     (best)    (avg)      (sd)     +/-     p
             pathological 0.21079 0.22604 0.10943 [+4.8%]  0.025 (0.21107) (0.22640) (0.10972) [+4.7%] 0.025
                command-t 0.16694 0.17164 0.04923 [-0.6%]        (0.16716) (0.17228) (0.05253) [-0.5%]
        chromium (subset) 1.35310 1.36239 0.02010 [+0.1%]        (0.28797) (0.29255) (0.01108) [+0.3%]
         chromium (whole) 1.11148 1.11599 0.01258 [+0.3%]   0.01 (0.12167) (0.12478) (0.00828) [-0.2%]
               big (400k) 1.67454 1.68249 0.05630 [+0.6%] 0.0005 (0.18195) (0.18487) (0.00876) [+0.0%]
                    total 4.52863 4.55855 0.15573 [+0.5%]   0.01 (0.97644) (1.00087) (0.12712) [+1.0%]
    
    Ryzen 5950X Arch Linux
    ======================
    
                   best    avg      sd     +/-   p   (best)    (avg)      (sd)      +/-     p
          buffer 0.02465 0.02544 0.01098 [-0.4%]   (0.02467) (0.02546) (0.01099)  [-0.5%]
            file 0.09906 0.09948 0.00124 [-0.1%]   (0.09943) (0.09995) (0.00130)  [-0.2%]
            find 0.01852 0.01885 0.00084 [+0.5%]   (0.25137) (0.25430) (0.00762)  [+0.1%]
             git 0.01718 0.01811 0.00210 [+0.6%]   (0.22095) (0.22468) (0.01156)  [-0.6%]
              rg 0.01748 0.01792 0.00105 [+0.5%]   (0.60575) (0.61077) (0.01562)  [-0.1%]
        watchman 0.00178 0.00186 0.00033 [-5.6%]   (0.02282) (0.02717) (0.02826) [-11.5%]
           total 0.17975 0.18165 0.01018 [-0.0%]   (1.23025) (1.24233) (0.04061)  [-0.4%] 0.05
    
                            best    avg      sd     +/-      p     (best)    (avg)      (sd)      +/-      p
             pathological 0.26186 0.27703 0.10940 [-4.4%] 0.0005 (0.26196) (0.27715) (0.10946)  [-4.4%] 0.0005
                command-t 0.19271 0.20058 0.05044 [-3.0%] 0.0005 (0.19279) (0.20065) (0.05047)  [-3.0%] 0.0005
        chromium (subset) 1.83627 1.89158 0.25631 [-3.8%]   0.01 (0.45977) (0.49985) (0.21028) [-15.7%]  0.005
         chromium (whole) 1.36877 1.38916 0.06031 [+2.6%] 0.0005 (0.12129) (0.12530) (0.01659)  [-0.4%]
               big (400k) 2.39053 2.43636 0.11813 [+1.8%] 0.0005 (0.19600) (0.20396) (0.02644)  [-0.1%]
                    total 6.09256 6.19472 0.33431 [-0.2%]        (1.24139) (1.30690) (0.25114)  [-7.5%]  0.005
    wincent committed Aug 13, 2024
    e1b73e4
  5. style: stop newly vendored files from being formatted

    The .prettierignore change is because there are a couple of things in
    the Markdown files that Prettier doesn't like.
    
    The clang-format change comes from a tip here:
    
    - https://stackoverflow.com/a/57272592/2103996
    
    This should prevent CI failures like this one:
    
    - https://github.com/wincent/command-t/actions/runs/2979207632
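    
    As an illustration only, one common way to keep both formatters away
    from a vendored directory is sketched below. The helper name is
    hypothetical, and the exact file contents are assumptions that may not
    match what this commit actually adds: a gitignore-style entry in
    `.prettierignore`, plus a `.clang-format` file with `DisableFormat: true`
    placed inside the vendored directory.
    
    ```shell
    # Hypothetical helper: exclude a vendored directory from Prettier and
    # clang-format. $1 = repository root, $2 = vendored dir relative to root.
    exclude_from_formatters() {
      root=$1
      vendor=$2
      # .prettierignore takes gitignore-style patterns, one per line.
      printf '%s/\n' "$vendor" >> "$root/.prettierignore"
      # clang-format uses the nearest .clang-format above each file, so this
      # one applies to everything beneath the vendored directory.
      printf 'DisableFormat: true\nSortIncludes: false\n' \
        > "$root/$vendor/.clang-format"
    }
    ```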
    wincent committed Aug 13, 2024
    625cd25
  6. fix(gcc): add -fPIC to mimalloc compilation

    Wasn't needed on clang, but is needed with gcc:
    
        /usr/bin/ld: mimalloc-override.o: relocation R_X86_64_TPOFF32
        against `recurse' can not be used when making a shared object;
        recompile with -fPIC
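    
    For context, `R_X86_64_TPOFF32` is a local-exec thread-local storage
    relocation, which the linker cannot place in a shared object; compiling
    with `-fPIC` avoids it. A sketch of a compile command that includes the
    flag follows. The helper name, the `-O2` level, and the use of
    mimalloc's single-file `src/static.c` with `MI_MALLOC_OVERRIDE` defined
    are assumptions; the plugin's real build rule is not shown here.
    
    ```shell
    # Hypothetical helper: print a compile command for the override object.
    # $1 = compiler, $2 = path to the vendored mimalloc sources.
    mimalloc_compile_command() {
      printf '%s -fPIC -O2 -DMI_MALLOC_OVERRIDE -I%s/include -c %s/src/static.c -o mimalloc-override.o\n' \
        "$1" "$2" "$2"
    }
    ```
    
    Printing the command rather than running it keeps the sketch easy to
    inspect before adapting it to a real Makefile rule.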
    wincent committed Aug 13, 2024
    6f337ee
  7. chore: update vendored mimalloc from 2.0.6 to 2.1.7

    I can't see a changelog or release notes in the repo, so here is the
    diff:
    
    - microsoft/mimalloc@v2.0.6...v2.1.7
    wincent committed Aug 13, 2024
    836698d