Add specialized copy method #143

joaquimg · 2020-07-11T00:50:14Z

This follows the approach of jump-dev/Clp.jl#94.

This adds a batch copy. For GLPK, though, the batch copy does not seem to be much better than adding everything one-by-one.

The results below come from running the runbench.jl file in the perf folder.

Before this PR we had:

 ──────────────────────────────────────────────────────────────────
                           Time                   Allocations
                   ──────────────────────   ───────────────────────
 Tot / % measured:       143s / 100%            30.4GiB / 100%

 Section   ncalls     time   %tot     avg     alloc   %tot      avg
 ──────────────────────────────────────────────────────────────────
 bcs + v      100    32.9s  23.0%   329ms   7.23GiB  23.8%  74.0MiB
   build      100    18.8s  13.1%   188ms   7.22GiB  23.7%  73.9MiB
   opt        100    13.8s  9.66%   138ms   7.64MiB  0.02%  78.2KiB
 cs           100    29.0s  20.3%   290ms   5.28GiB  17.4%  54.1MiB
   build      100    15.0s  10.5%   150ms   5.28GiB  17.3%  54.0MiB
   opt        100    13.7s  9.61%   137ms   7.64MiB  0.02%  78.2KiB
 bc + s       100    27.8s  19.4%   278ms   5.83GiB  19.2%  59.7MiB
   opt        100    13.6s  9.53%   136ms     0.00B  0.00%    0.00B
   copy       100    10.8s  7.52%   108ms   3.81GiB  12.5%  39.1MiB
   build      100    3.12s  2.19%  31.2ms   2.01GiB  6.62%  20.6MiB
 bcs          100    26.5s  18.6%   265ms   5.28GiB  17.4%  54.1MiB
   opt        100    13.8s  9.63%   138ms   7.64MiB  0.02%  78.2KiB
   build      100    12.5s  8.72%   125ms   5.28GiB  17.3%  54.0MiB
 c + s        100    24.8s  17.3%   248ms   5.27GiB  17.3%  53.9MiB
   opt        100    13.6s  9.50%   136ms     0.00B  0.00%    0.00B
   copy       100    9.07s  6.34%  90.7ms   3.25GiB  10.7%  33.3MiB
   build      100    1.88s  1.32%  18.8ms   2.01GiB  6.62%  20.6MiB
 data         100    2.00s  1.40%  20.0ms   1.54GiB  5.06%  15.8MiB
 ──────────────────────────────────────────────────────────────────

After:

 ──────────────────────────────────────────────────────────────────
                           Time                   Allocations
                   ──────────────────────   ───────────────────────
 Tot / % measured:       138s / 100%            28.7GiB / 100%

 Section   ncalls     time   %tot     avg     alloc   %tot      avg
 ──────────────────────────────────────────────────────────────────
 bcs + v      100    31.0s  22.4%   310ms   7.23GiB  25.2%  74.0MiB
   build      100    16.9s  12.3%   169ms   7.22GiB  25.2%  73.9MiB
   opt        100    13.7s  9.94%   137ms   7.64MiB  0.03%  78.2KiB
 cs           100    30.6s  22.2%   306ms   5.28GiB  18.4%  54.1MiB
   build      100    16.6s  12.0%   166ms   5.28GiB  18.4%  54.0MiB
   opt        100    13.7s  9.91%   137ms   7.64MiB  0.03%  78.2KiB
 bcs          100    26.4s  19.1%   264ms   5.28GiB  18.4%  54.1MiB
   opt        100    13.7s  9.93%   137ms   7.64MiB  0.03%  78.2KiB
   build      100    12.3s  8.94%   123ms   5.28GiB  18.4%  54.0MiB
 bc + s       100    24.8s  18.0%   248ms   4.82GiB  16.8%  49.3MiB
   opt        100    13.1s  9.51%   131ms     0.00B  0.00%    0.00B
   copy       100    9.08s  6.58%  90.8ms   2.80GiB  9.78%  28.7MiB
   build      100    2.34s  1.69%  23.4ms   2.01GiB  7.03%  20.6MiB
 c + s        100    22.7s  16.5%   227ms   4.50GiB  15.7%  46.1MiB
   opt        100    13.1s  9.51%   131ms     0.00B  0.00%    0.00B
   copy       100    7.29s  5.28%  72.9ms   2.49GiB  8.69%  25.5MiB
   build      100    2.07s  1.50%  20.7ms   2.01GiB  7.03%  20.6MiB
 data         100    2.58s  1.87%  25.8ms   1.54GiB  5.37%  15.8MiB
 ──────────────────────────────────────────────────────────────────

I haven't used the profiler yet, though, so there might be further gains.
Moreover, jump-dev/MathOptInterface.jl#1122 could help here.

joaquimg · 2020-07-11T01:00:39Z

If we skip MOIU.canonical, we get:

 ──────────────────────────────────────────────────────────────────
                           Time                   Allocations
                   ──────────────────────   ───────────────────────
 Tot / % measured:       137s / 100%            26.9GiB / 100%

 Section   ncalls     time   %tot     avg     alloc   %tot      avg
 ──────────────────────────────────────────────────────────────────
 bcs + v      100    31.5s  23.1%   315ms   7.23GiB  26.9%  74.0MiB
   build      100    17.5s  12.8%   175ms   7.22GiB  26.9%  73.9MiB
   opt        100    13.7s  10.0%   137ms   7.64MiB  0.03%  78.2KiB
 cs           100    28.5s  20.9%   285ms   5.28GiB  19.6%  54.1MiB
   build      100    14.6s  10.7%   146ms   5.28GiB  19.6%  54.0MiB
   opt        100    13.6s  10.0%   136ms   7.64MiB  0.03%  78.2KiB
 bcs          100    26.4s  19.3%   264ms   5.28GiB  19.6%  54.1MiB
   opt        100    13.7s  10.0%   137ms   7.64MiB  0.03%  78.2KiB
   build      100    12.4s  9.07%   124ms   5.28GiB  19.6%  54.0MiB
 bc + s       100    26.2s  19.2%   262ms   3.93GiB  14.6%  40.3MiB
   opt        100    13.1s  9.61%   131ms     0.00B  0.00%    0.00B
   copy       100    8.48s  6.21%  84.8ms   1.92GiB  7.13%  19.6MiB
   build      100    4.30s  3.15%  43.0ms   2.01GiB  7.49%  20.6MiB
 c + s        100    21.9s  16.0%   219ms   3.62GiB  13.5%  37.1MiB
   opt        100    13.1s  9.59%   131ms     0.00B  0.00%    0.00B
   copy       100    6.70s  4.90%  67.0ms   1.61GiB  5.97%  16.4MiB
   build      100    1.80s  1.32%  18.0ms   2.01GiB  7.49%  20.6MiB
 data         100    2.17s  1.59%  21.7ms   1.54GiB  5.73%  15.8MiB
 ──────────────────────────────────────────────────────────────────

codecov-commenter · 2020-07-11T01:13:36Z

Codecov Report

Merging #143 into master will increase coverage by 1.32%.
The diff coverage is 95.09%.

@@            Coverage Diff             @@
##           master     #143      +/-   ##
==========================================
+ Coverage   84.07%   85.39%   +1.32%     
==========================================
  Files           7        8       +1     
  Lines        1237     1390     +153     
==========================================
+ Hits         1040     1187     +147     
- Misses        197      203       +6

Impacted Files                    Coverage Δ
src/GLPK.jl                       100.00% <ø> (ø)
src/MOI_wrapper/MOI_wrapper.jl    89.62% <89.47%> (+0.10%) ⬆️
src/MOI_wrapper/MOI_copy.jl       95.83% <95.83%> (ø)

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 57e08b7...ce678aa.

odow

Is it worth adding this for the minor speedup then?

joaquimg · 2020-07-11T02:59:26Z

So there are a few caveats.
As it stands, this is not super useful.
However, taking the contiguous indexing into account might speed this up a lot.
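To illustrate why contiguous indexing can matter, here is a minimal Python sketch (not the Julia code from this PR, and not the actual CleverDict implementation). It assumes a hypothetical `ContiguousMap` that stores values in a plain list while integer keys arrive as 1, 2, 3, ... and only falls back to a hash table once that pattern breaks. In the dense case, lookups are plain array indexing with no hashing.

```python
class ContiguousMap:
    """Hypothetical map from integer keys to values.

    Stays dense (list-backed) while keys are exactly 1..n, added in order.
    Falls back to a dict as soon as a non-contiguous key is inserted.
    """

    def __init__(self):
        self._dense = []     # used while keys arrive as 1, 2, 3, ...
        self._sparse = None  # dict fallback once the pattern breaks

    def __setitem__(self, key, value):
        if self._sparse is None:
            if key == len(self._dense) + 1:
                # Next contiguous key: append without hashing.
                self._dense.append(value)
                return
            if 1 <= key <= len(self._dense):
                # Overwrite an existing dense entry.
                self._dense[key - 1] = value
                return
            # Pattern broke: migrate the dense entries into a dict.
            self._sparse = {i + 1: v for i, v in enumerate(self._dense)}
            self._dense = []
        self._sparse[key] = value

    def __getitem__(self, key):
        if self._sparse is None:
            if 1 <= key <= len(self._dense):
                return self._dense[key - 1]
            raise KeyError(key)
        return self._sparse[key]

    def is_dense(self):
        return self._sparse is None


rows = ContiguousMap()
for i in range(1, 4):
    rows[i] = f"row {i}"
print(rows.is_dense(), rows[2])
```

A batch copy that creates all rows in order never leaves the dense path in this sketch, which is the situation where such a structure would be expected to pay off.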

odow · 2020-07-11T03:02:16Z

One difference is that this is allocating way more than the Clp version. Do we know why?

joaquimg · 2020-07-11T03:07:31Z

Clp does not have the CleverDict to keep track of data later; that's my current bet.

joaquimg · 2020-07-11T17:04:52Z

I modified the comparison scripts.
Before

 ──────────────────────────────────────────────────────────────────
                           Time                   Allocations
                   ──────────────────────   ───────────────────────
 Tot / % measured:      36.9s / 49.4%           5.79GiB / 100%

 Section   ncalls     time   %tot     avg     alloc   %tot      avg
 ──────────────────────────────────────────────────────────────────
 bcs + v       20    4.96s  27.2%   248ms   1.44GiB  25.0%  74.0MiB
   build       20    3.98s  21.9%   199ms   1.44GiB  24.9%  73.9MiB
   opt         20    916ms  5.03%  45.8ms   1.53MiB  0.03%  78.2KiB
 bcs           20    3.53s  19.4%   176ms   1.06GiB  18.2%  54.1MiB
   build       20    2.54s  14.0%   127ms   1.05GiB  18.2%  54.0MiB
   opt         20    914ms  5.02%  45.7ms   1.53MiB  0.03%  78.2KiB
 cs            20    3.44s  18.9%   172ms   1.06GiB  18.2%  54.1MiB
   build       20    2.46s  13.5%   123ms   1.05GiB  18.2%  54.0MiB
   opt         20    912ms  5.01%  45.6ms   1.53MiB  0.03%  78.2KiB
 bc + s        20    3.19s  17.5%   159ms   1.17GiB  20.1%  59.7MiB
   copy        20    2.10s  11.5%   105ms    781MiB  13.2%  39.0MiB
   opt         20    843ms  4.63%  42.1ms     0.00B  0.00%    0.00B
   build       20    183ms  1.01%  9.17ms    412MiB  6.95%  20.6MiB
 c + s         20    3.08s  16.9%   154ms   1.05GiB  18.2%  53.9MiB
   copy        20    1.92s  10.6%  96.1ms    666MiB  11.2%  33.3MiB
   opt         20    917ms  5.03%  45.8ms     0.00B  0.00%    0.00B
   build       20    185ms  1.02%  9.25ms    412MiB  6.95%  20.6MiB
 data           1   13.4ms  0.07%  13.4ms   15.8MiB  0.27%  15.8MiB
 ──────────────────────────────────────────────────────────────────

After

 ──────────────────────────────────────────────────────────────────
                           Time                   Allocations
                   ──────────────────────   ───────────────────────
 Tot / % measured:      36.2s / 47.3%           5.45GiB / 100%

 Section   ncalls     time   %tot     avg     alloc   %tot      avg
 ──────────────────────────────────────────────────────────────────
 bcs + v       20    5.08s  29.6%   254ms   1.44GiB  26.5%  74.0MiB
   build       20    4.07s  23.8%   203ms   1.44GiB  26.5%  73.9MiB
   opt         20    933ms  5.45%  46.7ms   1.53MiB  0.03%  78.2KiB
 cs            20    3.59s  21.0%   179ms   1.06GiB  19.4%  54.1MiB
   build       20    2.58s  15.1%   129ms   1.05GiB  19.4%  54.0MiB
   opt         20    946ms  5.52%  47.3ms   1.53MiB  0.03%  78.2KiB
 bcs           20    3.52s  20.6%   176ms   1.06GiB  19.4%  54.1MiB
   build       20    2.51s  14.6%   125ms   1.05GiB  19.4%  54.0MiB
   opt         20    948ms  5.53%  47.4ms   1.53MiB  0.03%  78.2KiB
 bc + s        20    2.71s  15.8%   136ms   0.97GiB  17.8%  49.7MiB
   copy        20    1.64s  9.57%  81.9ms    582MiB  10.4%  29.1MiB
   opt         20    818ms  4.77%  40.9ms     0.00B  0.00%    0.00B
   build       20    194ms  1.13%  9.71ms    412MiB  7.39%  20.6MiB
 c + s         20    2.21s  12.9%   110ms    924MiB  16.6%  46.2MiB
   copy        20    1.15s  6.73%  57.6ms    512MiB  9.18%  25.6MiB
   opt         20    794ms  4.64%  39.7ms     0.00B  0.00%    0.00B
   build       20    201ms  1.17%  10.0ms    412MiB  7.39%  20.6MiB
 data           1   13.6ms  0.08%  13.6ms   15.8MiB  0.28%  15.8MiB
 ──────────────────────────────────────────────────────────────────

Timings are better because I forced GC between tests.
The main comparison here is the copy lines.
When the solver is integrated with the Cache, it is not using the new copy_to; it is probably passing directly...

joaquimg · 2020-07-11T17:35:15Z

After the PR we have the following.

Rough estimates:
43% glp_load_matrix
12% canonical (jump-dev/MathOptInterface.jl#1118)
20% adding constraint index to conmap (similar to jump-dev/MathOptInterface.jl#1122)
9% pass constraint attributes (note that there is nothing to pass in this test) (jump-dev/MathOptInterface.jl#1121)

Basically, there is still about 40% of the time where we are doing badly.
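As a quick check of the arithmetic, the short Python sketch below sums the rough estimates listed above. It assumes that the "40%" figure refers to the three items other than glp_load_matrix; the dictionary keys are shorthand labels, not identifiers from the code.

```python
# Rough profile shares (percent of copy time) as listed in the comment.
profile = {
    "glp_load_matrix": 43,
    "canonical": 12,
    "conmap": 20,
    "pass constraint attributes": 9,
}

accounted = sum(profile.values())                  # share covered by the list
overhead = accounted - profile["glp_load_matrix"]  # everything except the solver call
unaccounted = 100 - accounted                      # share not attributed to any item

print(f"accounted: {accounted}%")
print(f"non-solver overhead: {overhead}%")
print(f"unaccounted: {unaccounted}%")
```

Under that assumption the three non-solver items add up to 41%, which matches the rough 40% quoted, and 16% of the time is not attributed to any of the listed items.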

joaquimg · 2020-07-12T00:09:41Z

Master:

 ──────────────────────────────────────────────────────────────────
                           Time                   Allocations
                   ──────────────────────   ───────────────────────
 Tot / % measured:      27.6s / 46.0%           5.79GiB / 100%

 Section   ncalls     time   %tot     avg     alloc   %tot      avg
 ──────────────────────────────────────────────────────────────────
 bcs + v       20    3.48s  27.4%   174ms   1.44GiB  25.0%  74.0MiB
   build       20    2.74s  21.6%   137ms   1.44GiB  24.9%  73.9MiB
   opt         20    674ms  5.31%  33.7ms   1.53MiB  0.03%  78.2KiB
 bcs           20    2.38s  18.8%   119ms   1.06GiB  18.2%  54.1MiB
   build       20    1.65s  13.0%  82.4ms   1.05GiB  18.2%  54.0MiB
   opt         20    676ms  5.32%  33.8ms   1.53MiB  0.03%  78.2KiB
 cs            20    2.36s  18.6%   118ms   1.06GiB  18.2%  54.1MiB
   build       20    1.65s  13.0%  82.4ms   1.05GiB  18.2%  54.0MiB
   opt         20    654ms  5.15%  32.7ms   1.53MiB  0.03%  78.2KiB
 bc + s        20    2.30s  18.1%   115ms   1.17GiB  20.1%  59.7MiB
   copy        20    1.46s  11.5%  73.1ms    781MiB  13.2%  39.0MiB
   opt         20    625ms  4.92%  31.3ms     0.00B  0.00%    0.00B
   build       20    160ms  1.26%  8.02ms    412MiB  6.95%  20.6MiB
 c + s         20    2.16s  17.0%   108ms   1.05GiB  18.2%  53.9MiB
   copy        20    1.33s  10.5%  66.5ms    666MiB  11.2%  33.3MiB
   opt         20    631ms  4.97%  31.6ms     0.00B  0.00%    0.00B
   build       20    145ms  1.15%  7.27ms    412MiB  6.95%  20.6MiB
 data           1   12.5ms  0.10%  12.5ms   15.8MiB  0.27%  15.8MiB
 ──────────────────────────────────────────────────────────────────

After last commit:

 ──────────────────────────────────────────────────────────────────
                           Time                   Allocations
                   ──────────────────────   ───────────────────────
 Tot / % measured:      25.8s / 44.0%           4.98GiB / 100%

 Section   ncalls     time   %tot     avg     alloc   %tot      avg
 ──────────────────────────────────────────────────────────────────
 bcs + v       20    3.45s  30.4%   173ms   1.44GiB  29.0%  74.0MiB
   build       20    2.73s  24.0%   136ms   1.44GiB  29.0%  73.9MiB
   opt         20    670ms  5.89%  33.5ms   1.53MiB  0.03%  78.2KiB
 cs            20    2.39s  21.0%   119ms   1.06GiB  21.2%  54.1MiB
   build       20    1.66s  14.6%  83.1ms   1.05GiB  21.2%  54.0MiB
   opt         20    665ms  5.86%  33.3ms   1.53MiB  0.03%  78.2KiB
 bcs           20    2.34s  20.6%   117ms   1.06GiB  21.2%  54.1MiB
   build       20    1.64s  14.4%  81.8ms   1.05GiB  21.2%  54.0MiB
   opt         20    644ms  5.67%  32.2ms   1.53MiB  0.03%  78.2KiB
 bc + s        20    1.79s  15.7%  89.3ms    754MiB  14.8%  37.7MiB
   copy        20    1.05s  9.25%  52.5ms    341MiB  6.70%  17.1MiB
   opt         20    519ms  4.57%  26.0ms     0.00B  0.00%    0.00B
   build       20    162ms  1.43%  8.10ms    412MiB  8.09%  20.6MiB
 c + s         20    1.38s  12.1%  68.8ms    684MiB  13.4%  34.2MiB
   copy        20    655ms  5.77%  32.8ms    272MiB  5.33%  13.6MiB
   opt         20    525ms  4.62%  26.3ms     0.00B  0.00%    0.00B
   build       20    142ms  1.25%  7.09ms    412MiB  8.09%  20.6MiB
 data           1   13.2ms  0.12%  13.2ms   15.8MiB  0.31%  15.8MiB
 ──────────────────────────────────────────────────────────────────

So we can double the speed in the c + s case. I will look into the other cases and PR the DoubleDict to MOI.
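The speedups can be read off the two tables above. The Python sketch below uses only the total and copy timings from the "Master" and "After last commit" tables (in seconds, for 20 calls); the variable and function names are illustrative.

```python
# Timings in seconds, copied from the "Master" and "After last commit" tables.
master = {
    "c + s": {"total": 2.16, "copy": 1.33},
    "bc + s": {"total": 2.30, "copy": 1.46},
}
after = {
    "c + s": {"total": 1.38, "copy": 0.655},
    "bc + s": {"total": 1.79, "copy": 1.05},
}


def speedup(before, now):
    """Return how many times faster `now` is than `before`."""
    return before / now


speedups = {
    section: {
        phase: speedup(master[section][phase], after[section][phase])
        for phase in master[section]
    }
    for section in master
}

for section, phases in speedups.items():
    for phase, factor in phases.items():
        print(f"{section:7s} {phase:5s} {factor:.2f}x")
```

On these numbers, the factor of roughly two applies to the copy step of the c + s case (1.33s to 655ms); the c + s section as a whole, including build and opt, improves by a smaller factor.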

joaquimg · 2020-07-14T00:13:17Z

depends on jump-dev/MathOptInterface.jl#1126

odow reviewed Jul 11, 2020

mlubin mentioned this pull request Jul 18, 2020

p-median Julia benchmarks jump-dev/MOIPaperBenchmarks#1

Merged

joaquimg mentioned this pull request Jul 26, 2020

GLPK performance jump-dev/MOIPaperBenchmarks#3

Merged

joaquimg and others added 7 commits September 11, 2020 14:54

add specialize copy methods

4d3aa7b

add contiguous indexing and array resizing

707e977

experiment DoubleDict

1fc0ffd

start cleanup (requires double dicts)

7b2f6a7

cleanup and resort on dense dict

d82d949

dont look for certification when locally infeasible

13eba43

Tidy and fix MOI lower bound

83430b5

odow force-pushed the jg/perf branch from a9d591a to 83430b5 Compare September 11, 2020 03:01

odow closed this Sep 13, 2020

odow reopened this Sep 13, 2020

odow added 2 commits September 14, 2020 12:09

Use in tests

256d37f

Add tests for MOI.copy_to

ce678aa

odow merged commit f3a3d0e into master Sep 14, 2020

odow deleted the jg/perf branch September 14, 2020 07:53
