Optimize the E-Graph Pattern Matcher #6
The pattern matcher now uses shared buffers between recursive calls.
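To illustrate what sharing a buffer between recursive calls can look like, here is a minimal sketch on plain Julia `Expr` terms rather than on an e-graph. It is not the code used in Metatheory: the names `PVar`, `match!`, `undo!` and `trymatch!` are hypothetical, and the trail-based undo is just one possible way to avoid copying the substitution on every call.

```julia
# Hypothetical pattern variable wrapper (not a Metatheory type).
struct PVar
    name::Symbol
end

# `sub` is one Dict shared by every recursive call. `trail` records which
# variables were bound, so a failed attempt can be undone in place instead
# of allocating a fresh substitution per call.
function match!(sub::Dict{Symbol,Any}, trail::Vector{Symbol}, pat, term)
    if pat isa PVar
        # An already bound variable must agree with its earlier binding.
        haskey(sub, pat.name) && return sub[pat.name] == term
        sub[pat.name] = term
        push!(trail, pat.name)
        return true
    elseif pat isa Expr && term isa Expr
        pat.head == term.head || return false
        length(pat.args) == length(term.args) || return false
        for (p, t) in zip(pat.args, term.args)
            match!(sub, trail, p, t) || return false
        end
        return true
    else
        return pat == term
    end
end

# Remove every binding recorded after position `mark` of the trail.
function undo!(sub::Dict{Symbol,Any}, trail::Vector{Symbol}, mark::Int)
    while length(trail) > mark
        delete!(sub, pop!(trail))
    end
end

# Try a match; on failure, restore the shared buffers to their previous state.
function trymatch!(sub, trail, pat, term)
    mark = length(trail)
    ok = match!(sub, trail, pat, term)
    ok || undo!(sub, trail, mark)
    return ok
end

sub = Dict{Symbol,Any}()
trail = Symbol[]
pat = Expr(:call, :*, PVar(:x), PVar(:x))   # the pattern x * x

failed = trymatch!(sub, trail, pat, :(a * b))     # x cannot be both a and b
clean = isempty(sub) && isempty(trail)            # the failed attempt left no bindings
succeeded = trymatch!(sub, trail, pat, :(a * a))  # binds x to a
```

The same two containers are reused across both attempts, so the only allocations come from growing the `Dict` and the trail, not from creating a new substitution for each composite pattern.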
An update on the issue. The pattern matcher works fairly well, but it still has some pitfalls that cause bottlenecks and need to be solved. E-matching is still the big bottleneck in the equality saturation process, and the main bottleneck in the pattern matcher is the ematchlist function. Profiling allocations shows that the code location where most bytes are allocated is inside this function, where a fresh buffer of substitutions is created each time the system tries to match against a composite pattern. One can look at this flamegraph to confirm this.
Performance degrades badly as the size of the patterns grows. Let's consider this example:
using Metatheory
@metatheory_init ()
using Metatheory.EGraphs
using Metatheory.Library
using Metatheory.Util
using Metatheory.EGraphs.Schedulers
Metatheory.options.printiter = true
Metatheory.options.verbose = true
# Build a left-nested application of `op` over n copies of x,
# i.e. (x op x) op x for n = 3.
function rep(x, op, n::Int)
    foldl((x, y) -> :(($op)($x, $y)), repeat([x], n))
end
rep(:a, :*, 3)
# Rules involving the :ε element.
Mid = @theory begin
    a * :ε => :ε
    :ε * a => :ε
end
# Associativity of *, in both directions.
Massoc = @theory begin
    a * (b * c) => (a * b) * c
    (a * b) * c => a * (b * c)
end
# Rules that rewrite repeated words over the literals :a, :b, :B to :ε.
T = [
    @rule :b*:B => :ε
    RewriteRule(Pattern(rep(:(:a), :*, 2)), Pattern(:(:ε)))
    RewriteRule(Pattern(rep(:(:b), :*, 3)), Pattern(:(:ε)))
    RewriteRule(Pattern(rep(:(:a*:b), :*, 7)), Pattern(:(:ε)))
    RewriteRule(Pattern(rep(:(:a*:b*:a*:B), :*, 5)), Pattern(:(:ε)))
]
G = Mid∪Massoc∪T
expr = :(a*b*a*a*a*b*b*b*a*B*B*B*B*a)
g = EGraph(expr)
params = SaturationParams(timeout=5)
saturate!(g, G, params)
ex = extract!(g, astsize)
rewrite(ex, Mid)
Even though the e-graph grows to only 37 e-classes and 202 nodes, matching against it takes 125 seconds in total! See the equality saturation report from this test run.
How to attack this problem?
I've tried attacking this problem in many ways. Some ideas:
New pattern matcher architecture started in branch
Merged into master
The current pattern matcher is an inefficient version of the pattern matcher described in
https://www.hpl.hp.com/techreports/2003/HPL-2003-148.pdf
It is adapted from https://github.com/philzook58/EGraphs.jl/
Currently, the pattern matcher uses channels as generators.
This architecture should be reconsidered to allow efficient parallelization.
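For readers unfamiliar with the idiom, here is a minimal sketch of a Julia `Channel` used as a generator of substitutions. The toy e-class (a plain vector of `Expr` nodes) and the function name `matches` are assumptions made for illustration, not Metatheory's data structures.

```julia
# Lazily yield one substitution per node of the form `op(a, b)` in a toy
# e-class. The producer runs in its own task and hands over each result
# with put!, so the consumer can iterate as if over a generator.
function matches(eclass::Vector{Expr}, op::Symbol)
    Channel{Dict{Symbol,Any}}() do ch
        for node in eclass
            if node.head == :call && length(node.args) == 3 && node.args[1] == op
                put!(ch, Dict{Symbol,Any}(:a => node.args[2], :b => node.args[3]))
            end
        end
    end
end

eclass = [:(x * y), :(x + y), :(y * x)]
subs = collect(matches(eclass, :*))   # consume the generator
```

With an unbuffered channel, every `put!` waits for the consumer to take the value, so each yielded substitution involves a hand-off between tasks. That per-item cost is one reason this design may be worth revisiting.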
Another pattern matcher architecture, based on a small virtual machine, is described in
http://leodemoura.github.io/files/ematching.pdf
If this solution is adopted, the abstract virtual machine could be implemented at as low a level as possible.
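To make the virtual machine idea more concrete, here is a heavily simplified sketch. It assumes a toy e-graph stored as a `Dict` from class id to e-nodes, and a pattern that has already been compiled by hand into a linear program. The instruction names `Bind`, `Compare` and `Yield` and the function `run!` are illustrative only, and are not claimed to match the instruction set of the paper or any Metatheory API.

```julia
# Toy e-node: an operator and the ids of its child e-classes.
const ENode = Tuple{Symbol,Vector{Int}}

abstract type Instr end
# For each node `op(c1, ..., cn)` in class regs[i], store the children in
# registers out, out+1, ... and continue with the next instruction.
struct Bind <: Instr
    i::Int
    op::Symbol
    arity::Int
    out::Int
end
# Continue only if registers i and j hold the same e-class id.
struct Compare <: Instr
    i::Int
    j::Int
end
# Report the e-class ids held in the listed registers as one match.
struct Yield <: Instr
    regs::Vector{Int}
end

# Interpreter: a single register file `regs` is reused for the whole search.
function run!(results, egraph, prog::Vector{Instr}, regs::Vector{Int}, pc::Int=1)
    instr = prog[pc]
    if instr isa Bind
        for (op, children) in egraph[regs[instr.i]]
            (op == instr.op && length(children) == instr.arity) || continue
            for (k, c) in enumerate(children)
                regs[instr.out + k - 1] = c
            end
            run!(results, egraph, prog, regs, pc + 1)
        end
    elseif instr isa Compare
        regs[instr.i] == regs[instr.j] && run!(results, egraph, prog, regs, pc + 1)
    elseif instr isa Yield
        push!(results, [regs[r] for r in instr.regs])
    end
    return results
end

# Class 1 contains f(2, 3) and f(4, 3); class 3 contains g(2).
egraph = Dict{Int,Vector{ENode}}(
    1 => [(:f, [2, 3]), (:f, [4, 3])],
    2 => [(:a, Int[])],
    3 => [(:g, [2])],
    4 => [(:b, Int[])],
)

# Hand-compiled program for the pattern f(x, g(x)), matched at the class in register 1.
prog = Instr[Bind(1, :f, 2, 2), Bind(3, :g, 1, 4), Compare(2, 4), Yield([2])]
regs = zeros(Int, 4)
regs[1] = 1
results = run!(Vector{Int}[], egraph, prog, regs)
```

Only `f(2, 3)` matches, because the pattern requires both occurrences of `x` to be the same e-class. The appeal of this design is that the pattern is analyzed once at compile time, and matching then allocates nothing besides the yielded results.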