You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
What is the underlying problem you're trying to solve?
In our production environment, we have noticed that OPA bundle activation would take up to 20 minutes with only 5 bundles (2 data bundles with data.json only and 3 policy bundles with Rego files only), causing a delay whenever we update our bundles. We have also observed high CPU usage on our servers during bundle activation (which would happen regularly because of polling).
Profiling result has shown that the majority of bundle activation time was spent in the call chain of bundle.activateBundles -> ... -> ast.CheckPathConflicts -> ... -> txn.Read -> ... -> deepcopy.DeepCopy (recursive calls). On further inspection, this code is to perform conflict checks on the AST to make sure there is no intra- or inter-bundle path conflicts, as described here in the OPA documentation.
It turns out in our case, OPA was performing multiple times of path conflict check, which would try to apply the "removal" and "update" operations of the entire data bundle on every loaded documents (Rego + JSON). This involves deep-copying the entire bundle and checking every rule in a policy against a bundle, causing a very long delay of bundle activation.
In our use case, however, all of our data bundles have a manifest that defines a roots field. When building these bundles, OPA has already validated that all Rego packages and JSON paths don't leak outside the defined roots. Additionally, upon bundle activation, OPA already checks if the roots would overlap between bundles (code). So when activating the bundles, if two bundles don't have overlapped roots, then I don't think that it is necessary to check path conflicts between them.
Describe the ideal solution
After consulting OPA maintainers via Slack, they clarified that the reason why both root overlap check and path conflict check exists is that a bundle may contain both data and policies, in which they may conflict with each other.
Therefore, I think the ideal solution is that in path conflict checks, if a bundle has defined a root in its manifest, instead of checking the entire ensemble of all bundles, OPA can just check the path conflicts within the defined roots of the bundle to prevent these unnecessary checks. For example, if a data bundle has a root of foo.bar.baz, then the check should only check the subtree from data.foo.bar.baz.
We have tested this approach on an internal OPA fork. With path conflict checks from a child AST node instead of from the root, we confirmed that the activation time went down from ~20 minutes to almost always under a second. We also witnessed our CPU utilization going down by a significant amount after the fix.
The change we applied to our internal fork can be seen here. If what's described in this section is reasonable, I'm happy to open a PR :)
Describe a "Good Enough" solution
N/A. The above solution seems pretty straight-forward and is minimal effort.
Additional Context
Example Case
In the following example, we have one data bundle with only data.json and two policy bundles with only Rego policy files. All of them have a manifest in its tarball.
Then, when activating data-bundle, OPA should only check path conflicts with the sub-AST starting from child node data/foo/bar/baz, instead from the root node data. This way, it would prevent deep-copying the AST from policy-bundle and also prevent checking every variable from that policy, thus making bundle activation more efficient.
However, if there is no manifest or no root defined in the manifest file, then OPA still has to perform the check from the root AST node because it cannot guarantee there is no root overlap between bundles.
The text was updated successfully, but these errors were encountered:
Logically this makes sense for a multi-bundle scenario where the roots definitely have to be defined. More detailed testing in the draft would be great. Thanks for looking into this. This issue seems to have the same goal.
What is the underlying problem you're trying to solve?
In our production environment, we have noticed that OPA bundle activation would take up to 20 minutes with only 5 bundles (2 data bundles with
data.json
only and 3 policy bundles with Rego files only), causing a delay whenever we update our bundles. We have also observed high CPU usage on our servers during bundle activation (which would happen regularly because of polling).Profiling result has shown that the majority of bundle activation time was spent in the call chain of
bundle.activateBundles
-> ... ->ast.CheckPathConflicts
-> ... ->txn.Read
-> ... ->deepcopy.DeepCopy
(recursive calls). On further inspection, this code is to perform conflict checks on the AST to make sure there is no intra- or inter-bundle path conflicts, as described here in the OPA documentation.It turns out in our case, OPA was performing multiple times of path conflict check, which would try to apply the "removal" and "update" operations of the entire data bundle on every loaded documents (Rego + JSON). This involves deep-copying the entire bundle and checking every rule in a policy against a bundle, causing a very long delay of bundle activation.
In our use case, however, all of our data bundles have a manifest that defines a
roots
field. When building these bundles, OPA has already validated that all Rego packages and JSON paths don't leak outside the defined roots. Additionally, upon bundle activation, OPA already checks if the roots would overlap between bundles (code). So when activating the bundles, if two bundles don't have overlapped roots, then I don't think that it is necessary to check path conflicts between them.Describe the ideal solution
After consulting OPA maintainers via Slack, they clarified that the reason why both root overlap check and path conflict check exists is that a bundle may contain both data and policies, in which they may conflict with each other.
Therefore, I think the ideal solution is that in path conflict checks, if a bundle has defined a
root
in its manifest, instead of checking the entire ensemble of all bundles, OPA can just check the path conflicts within the defined roots of the bundle to prevent these unnecessary checks. For example, if a data bundle has a root offoo.bar.baz
, then the check should only check the subtree fromdata.foo.bar.baz
.We have tested this approach on an internal OPA fork. With path conflict checks from a child AST node instead of from the root, we confirmed that the activation time went down from ~20 minutes to almost always under a second. We also witnessed our CPU utilization going down by a significant amount after the fix.
The change we applied to our internal fork can be seen here. If what's described in this section is reasonable, I'm happy to open a PR :)
Describe a "Good Enough" solution
N/A. The above solution seems pretty straight-forward and is minimal effort.
Additional Context
Example Case
In the following example, we have one data bundle with only
data.json
and two policy bundles with only Rego policy files. All of them have a manifest in its tarball.Then, when activating
data-bundle
, OPA should only check path conflicts with the sub-AST starting from child nodedata/foo/bar/baz
, instead from the root nodedata
. This way, it would prevent deep-copying the AST frompolicy-bundle
and also prevent checking every variable from that policy, thus making bundle activation more efficient.However, if there is no manifest or no root defined in the manifest file, then OPA still has to perform the check from the root AST node because it cannot guarantee there is no root overlap between bundles.
The text was updated successfully, but these errors were encountered: