Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize activation of bundles with no inter-bundle path overlap #7144

Open
sqyang94 opened this issue Oct 30, 2024 · 1 comment · May be fixed by #7155
Open

Optimize activation of bundles with no inter-bundle path overlap #7144

sqyang94 opened this issue Oct 30, 2024 · 1 comment · May be fixed by #7155

Comments

@sqyang94
Copy link

What is the underlying problem you're trying to solve?

In our production environment, we have noticed that OPA bundle activation would take up to 20 minutes with only 5 bundles (2 data bundles with data.json only and 3 policy bundles with Rego files only), causing a delay whenever we update our bundles. We have also observed high CPU usage on our servers during bundle activation (which would happen regularly because of polling).

Profiling result has shown that the majority of bundle activation time was spent in the call chain of bundle.activateBundles -> ... -> ast.CheckPathConflicts -> ... -> txn.Read -> ... -> deepcopy.DeepCopy (recursive calls). On further inspection, this code is to perform conflict checks on the AST to make sure there is no intra- or inter-bundle path conflicts, as described here in the OPA documentation.

It turns out in our case, OPA was performing multiple times of path conflict check, which would try to apply the "removal" and "update" operations of the entire data bundle on every loaded documents (Rego + JSON). This involves deep-copying the entire bundle and checking every rule in a policy against a bundle, causing a very long delay of bundle activation.

In our use case, however, all of our data bundles have a manifest that defines a roots field. When building these bundles, OPA has already validated that all Rego packages and JSON paths don't leak outside the defined roots. Additionally, upon bundle activation, OPA already checks if the roots would overlap between bundles (code). So when activating the bundles, if two bundles don't have overlapped roots, then I don't think that it is necessary to check path conflicts between them.

Describe the ideal solution

After consulting OPA maintainers via Slack, they clarified that the reason why both root overlap check and path conflict check exists is that a bundle may contain both data and policies, in which they may conflict with each other.

Therefore, I think the ideal solution is that in path conflict checks, if a bundle has defined a root in its manifest, instead of checking the entire ensemble of all bundles, OPA can just check the path conflicts within the defined roots of the bundle to prevent these unnecessary checks. For example, if a data bundle has a root of foo.bar.baz, then the check should only check the subtree from data.foo.bar.baz.

We have tested this approach on an internal OPA fork. With path conflict checks from a child AST node instead of from the root, we confirmed that the activation time went down from ~20 minutes to almost always under a second. We also witnessed our CPU utilization going down by a significant amount after the fix.

The change we applied to our internal fork can be seen here. If what's described in this section is reasonable, I'm happy to open a PR :)

Describe a "Good Enough" solution

N/A. The above solution seems pretty straight-forward and is minimal effort.

Additional Context

Example Case

In the following example, we have one data bundle with only data.json and two policy bundles with only Rego policy files. All of them have a manifest in its tarball.

$ tar tzf data-bundle.tar.gz
/.manifest             <- "roots": ["foo/bar"]
/data.json

$ tar tzf policy-bundle.tar.gz
/.manifest             <- "roots": ["foo/authz", "foo/baz"]
/foo
/foo/authz
/foo/authz/authz.rego
/foo/baz
/foo/baz/baz.rego

Then, when activating data-bundle, OPA should only check path conflicts with the sub-AST starting from child node data/foo/bar/baz, instead from the root node data. This way, it would prevent deep-copying the AST from policy-bundle and also prevent checking every variable from that policy, thus making bundle activation more efficient.

However, if there is no manifest or no root defined in the manifest file, then OPA still has to perform the check from the root AST node because it cannot guarantee there is no root overlap between bundles.

@ashutosh-narkar
Copy link
Member

Logically this makes sense for a multi-bundle scenario where the roots definitely have to be defined. More detailed testing in the draft would be great. Thanks for looking into this. This issue seems to have the same goal.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants