Optimize activation of bundles with no inter-bundle path overlap #7144

sqyang94 · 2024-10-30T19:28:08Z

What is the underlying problem you're trying to solve?

In our production environment, we have noticed that OPA bundle activation would take up to 20 minutes with only 5 bundles (2 data bundles with data.json only and 3 policy bundles with Rego files only), causing a delay whenever we update our bundles. We have also observed high CPU usage on our servers during bundle activation (which would happen regularly because of polling).

Profiling result has shown that the majority of bundle activation time was spent in the call chain of bundle.activateBundles -> ... -> ast.CheckPathConflicts -> ... -> txn.Read -> ... -> deepcopy.DeepCopy (recursive calls). On further inspection, this code is to perform conflict checks on the AST to make sure there is no intra- or inter-bundle path conflicts, as described here in the OPA documentation.

It turns out in our case, OPA was performing multiple times of path conflict check, which would try to apply the "removal" and "update" operations of the entire data bundle on every loaded documents (Rego + JSON). This involves deep-copying the entire bundle and checking every rule in a policy against a bundle, causing a very long delay of bundle activation.

In our use case, however, all of our data bundles have a manifest that defines a roots field. When building these bundles, OPA has already validated that all Rego packages and JSON paths don't leak outside the defined roots. Additionally, upon bundle activation, OPA already checks if the roots would overlap between bundles (code). So when activating the bundles, if two bundles don't have overlapped roots, then I don't think that it is necessary to check path conflicts between them.

Describe the ideal solution

After consulting OPA maintainers via Slack, they clarified that the reason why both root overlap check and path conflict check exists is that a bundle may contain both data and policies, in which they may conflict with each other.

Therefore, I think the ideal solution is that in path conflict checks, if a bundle has defined a root in its manifest, instead of checking the entire ensemble of all bundles, OPA can just check the path conflicts within the defined roots of the bundle to prevent these unnecessary checks. For example, if a data bundle has a root of foo.bar.baz, then the check should only check the subtree from data.foo.bar.baz.

We have tested this approach on an internal OPA fork. With path conflict checks from a child AST node instead of from the root, we confirmed that the activation time went down from ~20 minutes to almost always under a second. We also witnessed our CPU utilization going down by a significant amount after the fix.

The change we applied to our internal fork can be seen here. If what's described in this section is reasonable, I'm happy to open a PR :)

Describe a "Good Enough" solution

N/A. The above solution seems pretty straight-forward and is minimal effort.

Additional Context

Example Case

In the following example, we have one data bundle with only data.json and two policy bundles with only Rego policy files. All of them have a manifest in its tarball.

$ tar tzf data-bundle.tar.gz
/.manifest             <- "roots": ["foo/bar"]
/data.json

$ tar tzf policy-bundle.tar.gz
/.manifest             <- "roots": ["foo/authz", "foo/baz"]
/foo
/foo/authz
/foo/authz/authz.rego
/foo/baz
/foo/baz/baz.rego

Then, when activating data-bundle, OPA should only check path conflicts with the sub-AST starting from child node data/foo/bar/baz, instead from the root node data. This way, it would prevent deep-copying the AST from policy-bundle and also prevent checking every variable from that policy, thus making bundle activation more efficient.

However, if there is no manifest or no root defined in the manifest file, then OPA still has to perform the check from the root AST node because it cannot guarantee there is no root overlap between bundles.

The text was updated successfully, but these errors were encountered:

ashutosh-narkar · 2024-10-30T20:36:26Z

Logically this makes sense for a multi-bundle scenario where the roots definitely have to be defined. More detailed testing in the draft would be great. Thanks for looking into this. This issue seems to have the same goal.

sqyang94 added the feature-request label Oct 30, 2024

sqyang94 linked a pull request Nov 4, 2024 that will close this issue

perf: optimize activation of bundles with no inter-bundle path overlap #7155

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Optimize activation of bundles with no inter-bundle path overlap #7144

Optimize activation of bundles with no inter-bundle path overlap #7144

sqyang94 commented Oct 30, 2024

ashutosh-narkar commented Oct 30, 2024

Optimize activation of bundles with no inter-bundle path overlap #7144

Optimize activation of bundles with no inter-bundle path overlap #7144

Comments

sqyang94 commented Oct 30, 2024

What is the underlying problem you're trying to solve?

Describe the ideal solution

Describe a "Good Enough" solution

Additional Context

Example Case

ashutosh-narkar commented Oct 30, 2024