Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(compiler): allow explicit lift qualifications of preflight objects #5935

Merged
merged 42 commits into from
Mar 19, 2024

Conversation

yoav-steinberg
Copy link
Contributor

@yoav-steinberg yoav-steinberg commented Mar 13, 2024

Fixes: #76
Creates a lift builtin function that can be used in inflight code to explicitly add lift qualifications to a method:

bring cloud;

let bucket = new cloud.Bucket();
bucket.addObject("k", "value");

let some_ops = ["put", "list"]; // We can define a list of ops in preflight code to be used when explicitly qualifying a lift

class Foo {
  pub inflight mehtod() {
    lift(bucket, some_ops); // Explicitly add some permissions to `bucket` using a preflight expression
    lift(bucket, ["delete"]); // Add more permissions to bucket using a literal
    log(bucket.get("k")); // Good old implicit qualification adds `get` permissions
    let b = bucket; // We can now use an inflight variable `b` to reference a preflight object `bucket`
    b.put("k2", "value2"); // We don't get a compiler error here, because explicit lifts are being used in the method disabling compiler qualification errors
    for k in b.list() { // `list` works on `bucket` because of explicit qualification and `b` references `bucket`
      log(k);
    }
    b.delete("k2"); // `delete` also works because of explicit qualification
    assert(bucket.tryGet("k2") == nil); `yay!`
  }
}

let foo = new Foo();

test "a test" {
  foo.mehtod();
}

Checklist

  • Title matches Winglang's style guide
  • Description explains motivation and solution
  • Tests added (always)
  • Docs updated (only required for features)
  • Added pr/e2e-full label if this feature requires end-to-end testing

By submitting this pull request, I confirm that my contribution is made under the terms of the Wing Cloud Contribution License.

Copy link

Thanks for opening this pull request! 🎉
Please consult the contributing guidelines for details on how to contribute to this project.
If you need any assistence, don't hesitate to ping the relevant owner over Slack.

Topic Owner
Wing SDK and utility APIs @chriscbr
Wing Console @ainvoner, @skyrpex, @polamoros
JSON, structs, primitives and collections @hasanaburayyan
Platforms and plugins @hasanaburayyan
Frontend resources (website, react, etc) @tsuf239
Language design @eladb
VSCode extension and language server @markmcculloh
Compiler architecture, inflights, lifting @yoav-steinberg
Wing Testing Framework @tsuf239
Wing CLI @markmcculloh
Build system, dev environment, releases @markmcculloh
Library Ecosystem @chriscbr
Documentation @hasanaburayyan
SDK test suite @tsuf239
Examples @hasanaburayyan
Wing Playground @eladcon

@monadabot
Copy link
Contributor

monadabot commented Mar 13, 2024

Console preview environment is available at https://wing-console-pr-5935.fly.dev 🚀

Last Updated (UTC) 2024-03-17 11:24

@monadabot
Copy link
Contributor

monadabot commented Mar 13, 2024

Benchmarks

Comparison to Baseline 🟥⬜🟥🟥⬜⬜⬜⬜⬜⬜🟩⬜🟩
Benchmark Before After Change
version 58ms±0.66 61ms±0.71 +4ms (+6.3%)🟥
empty.test.w -t sim 500ms±22.26 504ms±5.27 +5ms (+0.9%)⬜
empty.test.w -t tf-aws 699ms±5.54 732ms±8.58 +33ms (+4.78%)🟥
hello_world.test.w -t sim 520ms±7.86 540ms±6.44 +20ms (+3.78%)🟥
hello_world.test.w -t tf-aws 1657ms±15.07 1677ms±16.62 +20ms (+1.21%)⬜
functions_10.test.w -t sim 581ms±8.48 596ms±9.4 +14ms (+2.45%)⬜
functions_10.test.w -t tf-aws 2227ms±9.26 2239ms±15.07 +11ms (+0.51%)⬜
jsii_small.test.w -t sim 490ms±4.16 493ms±3.52 +3ms (+0.7%)⬜
jsii_small.test.w -t tf-aws 714ms±5.46 715ms±5.69 +0ms (+0.05%)⬜
jsii_big.test.w -t sim 2928ms±17.67 2903ms±10.63 -25ms (-0.86%)⬜
jsii_big.test.w -t tf-aws 3205ms±19.16 3121ms±12.31 -84ms (-2.62%)🟩
functions_1.test.w -t sim 533ms±7.09 519ms±6.02 -14ms (-2.63%)⬜
functions_1.test.w -t tf-aws 978ms±5.48 948ms±5.68 -29ms (-3.02%)🟩

⬜ Within 1.5 standard deviations
🟩 Faster, Above 1.5 standard deviations
🟥 Slower, Above 1.5 standard deviations

Benchmarks may vary outside of normal expectations, especially when running in GitHub Actions CI.

Results
name mean min max moe sd
version 61ms 60ms 63ms 1ms 1ms
empty.test.w -t sim 504ms 491ms 512ms 5ms 7ms
empty.test.w -t tf-aws 732ms 710ms 752ms 9ms 12ms
hello_world.test.w -t sim 540ms 526ms 556ms 6ms 9ms
hello_world.test.w -t tf-aws 1677ms 1649ms 1723ms 17ms 23ms
functions_10.test.w -t sim 596ms 570ms 613ms 9ms 13ms
functions_10.test.w -t tf-aws 2239ms 2219ms 2284ms 15ms 21ms
jsii_small.test.w -t sim 493ms 485ms 500ms 4ms 5ms
jsii_small.test.w -t tf-aws 715ms 699ms 726ms 6ms 8ms
jsii_big.test.w -t sim 2903ms 2882ms 2935ms 11ms 15ms
jsii_big.test.w -t tf-aws 3121ms 3089ms 3141ms 12ms 17ms
functions_1.test.w -t sim 519ms 507ms 532ms 6ms 8ms
functions_1.test.w -t tf-aws 948ms 932ms 959ms 6ms 8ms
Last Updated (UTC) 2024-03-17 11:30

@monadabot monadabot added the ⚠️ pr/review-mutation PR has been mutated and will not auto-merge. Clear this label if the changes look good! label Mar 13, 2024
Copy link
Contributor

@eladb eladb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very excited about this!

libs/wingc/src/ast.rs Outdated Show resolved Hide resolved
libs/wingc/src/lifting.rs Outdated Show resolved Hide resolved
libs/wingc/src/type_check.rs Outdated Show resolved Hide resolved
libs/wingc/src/type_check.rs Show resolved Hide resolved
libs/wingc/src/lifting.rs Outdated Show resolved Hide resolved
@Chriscbr
Copy link
Contributor

Chriscbr commented Mar 13, 2024

I'm playing around with the changes a bit on my local machine -- to be honest, I'm not sure I'm (personally) in love with the syntax.

I feel like it's potentially confusing that lift is a function that you call within an inflight function body (the body being where you put code that runs on the cloud), but instead it's really being used to provide declarative information about the enclosing function. For example, take a look at this code:

bring cloud;

let b1 = new cloud.Bucket() as "b1";
let b2 = new cloud.Bucket() as "b2";

let selectBucket = inflight (i: num) => {
  if i % 2 == 0 {
    return b1;
  } else {
    return b2;
  }
};

new cloud.Function(inflight () => {
  if false {
    lift(b1, ["put"]); // <-- calling "lift" inside of some control flow
    lift(b2, ["put"]);
  }
  let bucket = selectBucket(1);
  bucket.put("hello.txt", "world");
});

I think this code snippet could be pretty confusing to someone learning Wing for the first time, since many a user would rightfully assume the code inside the if false branch is going to be ignored. But as implemented by this PR, the statements are executed and the "put" permissions are added to the cloud.Function.

Likewise, when I read this code I'd expect that since lift looks like a function call, I could just refactor them into a separate function:

let addLifts = () => {
  lift(b1, ["put"]);
  lift(b2, ["put"]);
};

new cloud.Function(inflight () => {
  addLifts();
  let bucket = selectBucket(1);
  bucket.put("hello.txt", "world");
});

Unfortunately this doesn't work (it's a compile error). Even if we change it to "lift!()", it would still looks like a function call is being made.

I feel like the mechanism (providing some way to declare sets of preflight objects and operations that should be added to the internal "lift map") seems like right direction, but my preference would be if we could iterate on the syntax a bit more before merging this feature. I see @eladb has already approved this so I don't know if this is open to discussion, but I figured I'd shoot my shot sharing the feedback.

@eladb
Copy link
Contributor

eladb commented Mar 13, 2024

I agree with you @Chriscbr that this solution could be potentially confusing for the reasons you mentioned. These reasons came up when we discussed this last week.

There are many things we can do but I suspect most of them will require dedicated syntax to be designed.

What I like about this proposal (and the reason I approved the PR) is that it's a small incremental step which allows us to experiment with this idea of explicit lifting without introducing new dedicated syntax just yet. At this stage of the project, I am not sure the investment required to design a dedicated syntax for lifting is the right thing to focus on (especially because it has the potential to suck us all into this wonderful conversation).

I suggest is to start with this to unblock the use case, and continue the discussion for dedicated syntax asynchronously. I'd love to see some proposals and as usual I have some ideas ;-)

@yoav-steinberg
Copy link
Contributor Author

yoav-steinberg commented Mar 13, 2024

@Chriscbr I see what you mean and you're right.
Per your example, Just mentioning that I think:

Unfortunately this doesn't work

Is wrong. You're just missing an inflight annotation to the addLifts definition and it should work, I think.

But more generally I think there are three different approaches here:

  1. Something along the line of this PR, where you can fairly easily without learning any special syntax or library features add these explicit qualifications right next to where they make sense:
inflight () => {
  lift(default_resource, ["do"]);
  lift(backup_resource, ["do"]);
  let var b = default_resource;
  if something {
     b = backup_resource;
  }
  b.do();
}

You explained nicely the disadvantage of this approach and it's also somewhat limiting compared to the library option.
2. Add some compiler directive for specifying these things outside of but next to the method. Again limiting and also requires additional grammar.
3. Make this a library feature where we define lift qualifications and then assign them to a closure or method. We then use these library features preflight to control the lift map. This is probably the cleanest way except that we currently don't have a way to reference a method of a type and that it isn't as straightforward to use as the approach in this PR.

I'm totally open to discussion here.
For now I'm continuing work on this PR.
Ping me if you want to brainstorm on this.

@eladb
Copy link
Contributor

eladb commented Mar 13, 2024

One small tweak we can do (we discussed this as well) is to require that lift() will be used as a statement (i.e return type void) and that all lifts will be placed at the top of the closure definition.

@yoav-steinberg yoav-steinberg added the 🚧 pr/do-not-merge PRs with this label will not be automatically merged by mergify. label Mar 14, 2024
@yoav-steinberg yoav-steinberg marked this pull request as ready for review March 14, 2024 11:32
@yoav-steinberg yoav-steinberg requested a review from a team as a code owner March 14, 2024 11:32
@yoav-steinberg
Copy link
Contributor Author

One small tweak we can do (we discussed this as well) is to require that lift() will be used as a statement (i.e return type void) and that all lifts will be placed at the top of the closure definition.

Here's where I stand after a lengthy talk with @Chriscbr, either add this ☝️ and merge this PR or take a look at this new rfc: #5951 and continue from there (or both?).

@eladb
Copy link
Contributor

eladb commented Mar 17, 2024

The RFC will take time to stabilize.

I vote for merging this pull request (with the requirements that all lifts are at the top).

This is a stable incremental step forward, it will unblock this use case and will allow us to continue the syntax discussion over the RFC without blocking.

@yoav-steinberg yoav-steinberg removed the ⚠️ pr/review-mutation PR has been mutated and will not auto-merge. Clear this label if the changes look good! label Mar 17, 2024
@yoav-steinberg
Copy link
Contributor Author

@Chriscbr @MarkMcCulloh this is basically ready, please read #5935 (comment) and if you don't object, I'll merge this.

@yoav-steinberg yoav-steinberg removed the 🚧 pr/do-not-merge PRs with this label will not be automatically merged by mergify. label Mar 19, 2024
Copy link
Contributor

mergify bot commented Mar 19, 2024

Thanks for contributing, @yoav-steinberg! This PR will now be added to the merge queue, or immediately merged if yoav/explicit_lifting is up-to-date with main and the queue is empty.

mergify bot added a commit that referenced this pull request Mar 19, 2024
mergify bot added a commit that referenced this pull request Mar 19, 2024
@mergify mergify bot merged commit d55ea52 into main Mar 19, 2024
15 checks passed
@mergify mergify bot deleted the yoav/explicit_lifting branch March 19, 2024 07:44
@monadabot
Copy link
Contributor

Congrats! 🚀 This was released in Wing 0.61.17.

mergify bot pushed a commit that referenced this pull request Apr 2, 2024
Fixes #6002

To support the needs of cloud applications with many resources, Wing's compiler performs static analysis to identify all of the infrastructure resources that are used by different bits of runtime code. (In Wing parlance these are usually called preflight objects, and inflight functions, respectively). By identifying these relationships, the compiler can automatically generate applications whose resources have least-privileged access. For example, database read/write permissions only need to be added to the compute resources that perform queries on it.

This analysis is sufficient for many programs. But in general, obtaining the minimal set of resources needed for arbitrary inflight code is, in the worst case, as hard as the [halting problem](https://en.wikipedia.org/wiki/Halting_problem). To illustrate, the inflight code below only needs access to the preflight object `buckets[0]`, but it's impossible for any reasonable compiler to deduce this.

```js
bring cloud;

let buckets = [
  new cloud.Bucket() as "b1",
  new cloud.Bucket() as "b2"
];

new cloud.Function(inflight () => {
  let var i = 10;
  while i > 0 {
    i -= 1;
  }
  buckets.at(i).put("key", "value");
});
```

Even if Wing's analyzer did provide a [covering set](https://en.wikipedia.org/wiki/Set_cover_problem) of the resources and permissions needed by inflight code (in other words, a set of permissions that might be broader than necessary), there are cases when users will want or need to scope down permissions.

#5935 addresses some of these needs by letting the user annotate functions with additional lifting information:

```js
new cloud.Function(inflight () => {
  lift(buckets.at(1), ["put"]);
  let var i = 10;
  while i > 0 {
    i -= 1;
  }
  buckets.at(i).put("key", "value");
});
```

This was an important change to unblock many use cases. Unfortunately, a side effect is that we've lost the guarantee that every program that compiles and works in the Wing simulator will also work in the cloud. That's because the simulator doesn't simulate permissions -- so even if bad `lift()` annotations are provided by the user, a program will still work in the simulator. Ouch!

As a rule of thumb, we want to maximize the extent to which Wing programs working on the simulator result in the same programs working on the cloud. Even though we can't catch these errors at compile time, they can be detected at runtime. To that end, this PR implements a basic permissions model within the Wing simulator. Whenever a resource tries calling a method on another resource, the API call will first be checked against a list of permissions (generated during preflight) to see if the request should go through. If not, an error will be raised to the caller. The end result is that as long as you're able to test the code path with a unit test, it should still work locally.

Here's an example of the error that will be surfaced to the user during `wing test` or in the Wing Console:

```
Error: Resource "root/Default/Function" does not have permission to perform operation "put" on resource "root/Default/b1".
   --> bucket.main.w:14:3
   | while i > 0 {
   |   i -= 1;
   | }
14 | buckets.at(i).put("key", "value");
   | ^
at /Users/chrisr/dev/wing-test/bucket.main.w:14:3
```

-----

### Implementation notes

**The good:** The main implementation is built around a new resource named `sim.Policy`. It stores a "principal" (a resource the policy applies to) and a list of statements (a list of resources and method names). It's probably not the only way to model this, but it felt like a decent starting point given that permissions are also represented with resources within CloudFormation and Terraform on most major cloud providers.

Policy checking is also straightforward. All requests between resources currently go through the simulator's internal HTTP server. We've added an additional piece of information that identifies who is the resource that's performing the method call, which is sent to the server.

**The mid:** How to store the information from `sim.Policy` in the simulator is an open question. This PR models `sim.Policy` like a resource (why not?). Except, nobody communicates with the policy directly. Instead, when the simulator is initializing or shutting down resources, if it sees a resource is a policy, it saves the resolved policy information into a policy registry. When we're checking if an API call should be allowed or not, it performs a lookup in the policy registry.

But does that mean `sim.Policy` has no effect on other resources? Not quite. First, an aside...

> Within the simulator, a resource can be identified in two ways. The first is with their construct path (a string like `root/MyApp/Storage/Bucket`), and the second is with a simulation-unique ID aka "handle" (a string like `sim-42`), which serves as an analog to physical resource names on cloud providers.

> To make this concrete, imagine you have a Wing app with a `sim.Container`, and you modify the container's image. When the simulator is updating to the new version of your app, the construct path for the resource will stay the same, but the resource itself needs to be destroyed and recreated, which results in a fresh simulation ID.

> Today this difference between these IDs isn't strictly necessary - it's possible we could use construct paths everywhere in the simulator. But using these IDs helps us detect dependency relationships in the resource graph, and if we wanted to model other kinds of application updates in the simulator (for example, Terraform's [`create_before_destroy` property](https://developer.hashicorp.com/terraform/language/meta-arguments/lifecycle#create_before_destroy)), then this distinction suddenly becomes very important. So for now I'm trying to keep it.

Each `sim.Policy` refers to resources via their simulation handles, as we do elsewhere in the simulator. But this means a `sim.Policy` resource can't be created until all of the resources it refers to have been created. This introduces some interesting edge cases (for example, how does a `cloud.Service` call other resources during start-up when its `sim.Policy` can't be created until `cloud.Service` has finished starting up??) I managed to solve most of these cases with auxiliary resources that run after the initial resource's creation.

**The bad:** In order for resources to make requests to the simulator server with their own handles, they need to know their own handles. Unfortunately, to expose this information in `ISimulatorContext`, I needed to refactor inflight simulation classes so that simulator context is passed to `init()` instead of to the JavaScript constructor. This results in some uglier code, but I don't think the distinction between constructors and `init()` will be relevant once more sim resources are implemented in Wing.

## Checklist

- [ ] Title matches [Winglang's style guide](https://www.winglang.io/contributing/start-here/pull_requests#how-are-pull-request-titles-formatted)
- [ ] Description explains motivation and solution
- [ ] Tests added (always)
- [ ] Docs updated (only required for features)
- [ ] Added `pr/e2e-full` label if this feature requires end-to-end testing

*By submitting this pull request, I confirm that my contribution is made under the terms of the [Wing Cloud Contribution License](https://github.com/winglang/wing/blob/main/CONTRIBUTION_LICENSE.md)*.
mergify bot pushed a commit that referenced this pull request Apr 24, 2024
## Checklist

[rendered version](https://github.com/winglang/wing/blob/yoav/rfc-explicit_lift_qualification/docs/contributing/999-rfcs/2024-03-14-explicit-lift-qualification.md)

Related to #76, #5935


- [x] Title matches [Winglang's style guide](https://www.winglang.io/contributing/start-here/pull_requests#how-are-pull-request-titles-formatted)
- [x] Description explains motivation and solution
- [ ] Tests added (always)
- [ ] Docs updated (only required for features)
- [ ] Added `pr/e2e-full` label if this feature requires end-to-end testing

*By submitting this pull request, I confirm that my contribution is made under the terms of the [Wing Cloud Contribution License](https://github.com/winglang/wing/blob/main/CONTRIBUTION_LICENSE.md)*.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Mechanism for explicitly qualifying lifting
4 participants