Mechanism for explicitly qualifying lifting #76
An idea for the explicit capture syntax modified from https://github.com/monadahq/rfcs/blob/main/0044-winglang-requirements.md: let bucket_per_region = {
US: Bucket(region: 'us-east-1'),
EU: Bucket(region: 'eu-west-2'),
};
struct MyEvent {
region: str; // read by the handler below via event.region
data: str;
}
let handler = (event: MyEvent) ~> {
let bucket = bucket_per_region.get(event.region);
await bucket.upload('boom', event.data);
} captures [
// user can provide very granular information
{ obj: bucket_per_region.US, calls: [{ method: "upload", arguments: ["boom", Type.Str] }] },
{ obj: bucket_per_region.EU, calls: [{ method: "upload", arguments: ["boom", Type.Str] }] },
// OR user can just provide list of methods
{ obj: bucket_per_region.NA, methods: ["upload"] },
{ obj: bucket_per_region.NA, methods: ["upload"] },
]; I think for MVP it might be okay to start with just the list of method names - I feel like this is still enough to get useful feedback on the concept of "compiler automatically generates least privilege permissions between resources" from users. |
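To make the two declaration styles above concrete, here is a minimal Python sketch (not Wing, and not an actual compiler API) that reduces both the granular `calls` form and the simpler `methods` form to a set of method names per captured object. `CaptureDecl` and `required_methods` are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class CaptureDecl:
    """Hypothetical model of one entry in a `captures [...]` list."""
    obj: str                                     # name of the captured resource
    calls: list = field(default_factory=list)    # granular form: [{"method": ..., "arguments": [...]}]
    methods: list = field(default_factory=list)  # MVP form: just method names


def required_methods(decls):
    """Collapse both declaration styles into {resource name: set of method names}."""
    result = {}
    for decl in decls:
        names = set(decl.methods) | {call["method"] for call in decl.calls}
        result.setdefault(decl.obj, set()).update(names)
    return result


decls = [
    CaptureDecl(obj="US", calls=[{"method": "upload", "arguments": ["boom", "str"]}]),
    CaptureDecl(obj="EU", methods=["upload"]),
    CaptureDecl(obj="EU", methods=["download"]),
]
summary = required_methods(decls)
```

Under this model the MVP form loses only the argument information, which is why starting with method names alone still seems enough to drive permission generation.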
Imagine we have 2 programmers on the team, Bob and Alice. Alice develops a function // this is wing code
function f_alice upload_file(bucket: cloud.Bucket){
bucket.upload(foo, bla)
} Bob wants to use this function as a lambda function // this is wing code
import f_alice
bucket = cloud.Bucket()
queue = cloud.Queue()
inflight function f_bob(){
f_alice(bucket)
}
queue.add_consumer(cloud.Function(f_bob)) The first thing that comes to (my) mind is that f_bob's code is where the capture happens, but it is Alice's responsibility to define which permissions are needed from the bucket that is sent to // this is wing code
function f_alice upload_file(bucket: cloud.Bucket requires "upload"){
bucket.upload(foo, bla)
} I think we first need to agree whose responsibility it is to define the function's binding requirements. I believe that in this example it is Alice's |
A more relevant example is when Alice's code is not written in Wing. In this case, Alice is writing code in Python: import wing
def f_alice(bucket):
    bucket.upload(foo, bar)
# a non-succinct option to export a Python function to Wing
wing.export(f_alice).withArg(0, wing.Bucket, wing.bind("upload")) |
@ekeren I think your example is interesting because it's possible that both functions may want to define information about their capturing. // this is wing code
function f_alice(bucket: cloud.Bucket){
bucket.upload(foo, bla)
} // this is wing code
import f_alice
bucket = cloud.Bucket()
queue = cloud.Queue()
inflight function f_bob(){
f_alice(bucket)
bucket.download(bar, bloop)
}
queue.add_consumer(cloud.Function(f_bob)) f_alice needs others to know that it needs to be able to call "upload" to the bucket passed to it, and f_bob needs others to know that it needs to be able to call both "upload" and "download" on the bucket it references. (The capture only becomes "real" or "live" when |
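As a rough illustration of how the two sets of requirements could combine, the Python sketch below unions the methods a caller invokes directly with those its callees declare. The `REQUIREMENTS` table and `caller_requirements` function are hypothetical; nothing here reflects an existing Wing API.

```python
# Hypothetical per-function requirements, keyed by parameter name.
REQUIREMENTS = {"f_alice": {"bucket": {"upload"}}}


def caller_requirements(direct_calls, delegated_calls):
    """Union of methods a caller invokes directly and those needed by its callees.

    direct_calls: {resource: set of methods called directly}
    delegated_calls: list of (callee name, {callee parameter: caller resource})
    """
    needs = {res: set(methods) for res, methods in direct_calls.items()}
    for callee, binding in delegated_calls:
        for param, resource in binding.items():
            callee_needs = REQUIREMENTS[callee].get(param, set())
            needs.setdefault(resource, set()).update(callee_needs)
    return needs


# f_bob calls bucket.download(...) itself and passes bucket to f_alice.
f_bob = caller_requirements(
    direct_calls={"bucket": {"download"}},
    delegated_calls=[("f_alice", {"bucket": "bucket"})],
)
```

Here `f_bob` ends up needing both "upload" and "download" on `bucket`, matching the description above.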
More syntax ideas...
Another twist could be if you could specify the permissions you want to grant on the resource, rather than the methods you need to call. For example, we could have a convention in the language where all permission APIs must start with
The compiler could then automatically add additional statements to whatever resources use
On second thought, I'm not sure designing this around permissions would really be a simpler experience for users. Plus, making permissions part of the type system feels like a weird mixing of responsibilities. Not sure though 🤷♂️ |
I think we don't and probably never will know exactly what information we'll need to pass to the capture mechanism in the SDK for it to figure out exactly how to capture. We tend to assume these are permissions because that'll probably cover most use cases but it can be other things like credentials, protocol specifications... I think eventually we'll need a syntax where we pass the captured object ( |
I agree that permissions are just one use case and we can't really be opinionated about what info we pass down to the resources when they are connected together. In a sense they need a way to reflect on the AST of the capture site. Maybe our language needs something like that. (This reminds me of the idea to render a step functions workflow by reflecting on the AST). |
One use case we've been discussing is:
Another use case we might want to consider is:
There are some ways to achieve this with the existing language spec: resource ReadonlyBucket extends cloud.Bucket {
public override ~put() {
throw new Error("Cannot put to a readonly bucket");
}
} But this isn't type safe, since you can pass a Instead, maybe there's a way to specify restrictions at compile time. Pseudocode example: constraint Readonly {
resource: cloud.Bucket;
methods: ["get", "list"]; // not allowing "put"
}
let handler = (bucket: @Readonly) ~> {
print(bucket.get("data.json")); // OK
print(bucket.put("data.json", stuff)); // compile error: `put` cannot be invoked on a @Readonly resource
}; The above syntax is totally half baked. But I wonder if there might be value in including capture-like constraints as part of our type system somehow. Does anyone have ideas for use cases for this? |
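One way to picture the check a compiler might perform for such a constraint is sketched below in Python. The `READONLY` dictionary mirrors the pseudocode `constraint Readonly`; `check_constraint` and `ConstraintError` are hypothetical names, and a real implementation would operate on the AST rather than on a list of method names.

```python
class ConstraintError(Exception):
    """Raised when a method outside the constraint is used."""


READONLY = {"resource": "cloud.Bucket", "methods": {"get", "list"}}


def check_constraint(constraint, used_methods):
    """Return normally if every used method is allowed; otherwise raise."""
    disallowed = sorted(set(used_methods) - constraint["methods"])
    if disallowed:
        raise ConstraintError(
            f"{', '.join(disallowed)} cannot be invoked on a constrained {constraint['resource']}"
        )


check_constraint(READONLY, ["get"])  # allowed: no exception

try:
    check_constraint(READONLY, ["get", "put"])
    rejected = False
except ConstraintError as err:
    rejected = True
    message = str(err)
```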
This is getting interesting. Before we dive into syntax, I think it would be useful to design the data model; then we can come up with the best syntax for it if we want to include it in our "header files" (currently it seems like these are simply going to be JSII manifests with some extensions), in SDK code (maybe some docstring annotations) and in Wing code (e.g. to define explicit requirements and constraints). So basically, we are talking about a way to describe how a resource is used within an inflight function. And when we say how it's used, in the object-oriented world, it basically comes down to which methods are being called on it, and which arguments are passed to these methods (thinking of properties as sugar for This description can have two reasons:
I am starting to think that maybe the right mental model here is "policy", and then maybe it makes sense to get inspiration from existing policy models such as AWS IAM or Kubernetes RBAC or [OPE]. For example, in AWS IAM policies, each statement includes: resource (ARN), actions (a set of method names on the resource), principal (consuming role), effect (allow/deny) and conditions. I think it is safe to ignore principal here because the principal is basically the consuming resource (e.g. the Let's take an example: let greet = (bucket: cloud.Bucket, msg: str) ~> {
bucket.put("greeting.txt", msg);
};
let my_bucket = new cloud.Bucket();
let handler = () ~> {
let s = "hello, world";
greet(my_bucket, s);
assert(my_bucket.get("greeting.txt") == s);
my_bucket.put("hello.txt", "world");
};
new cloud.Function(handler); If I work backwards from the ideal desired result here, I expect:
So if I try to express this through the mental model of a policy, I'd say: When the wing compiler compiles the bucket:
- method: put
args:
object_key: "greeting.txt"
body: *
msg:
- method: to_str
Let's say it is possible to access this policy through
assert(greet.policy["bucket"][0]["method"] == "put")
assert(greet.policy["bucket"][0]["args"]["body"] == "*") Now, when the compiler compiles the my_bucket:
# from greet.policy.bucket
- method: put
args:
object_key: "greeting.txt"
body: *
# v2 of wing will know this is
# "hello, world", so it can
# be even more specific
# from "handler"
- method: get
args:
object_key: "greeting.txt"
# from "handler"
- method: put
args:
object_key: "hello.txt"
body: "world"
s:
# from greet.policy.msg
- method: to_str Now, when new cloud.Function(this, "Function", {
captures: {
my_bucket: {
obj: my_bucket,
policy: [{
method: "put",
args: {
"object_key": "greeting.txt",
}
}, {
method: "get",
args: {
"object_key": "greeting.txt",
}
}, {
method: "put",
args: {
"object_key": "hello.txt",
"body": "world"
}
}]
}
}
}); Then, for c in Object.values(props.captures) {
c.obj.capture(this /* cloud.Function */, c.policy);
} Now, the More thoughts:
|
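The policy synthesis walked through in the `greet`/`handler` example can be sketched in Python as follows. This is only an illustration under assumed data shapes (plain dictionaries and lists); `synthesize` is a hypothetical helper, not part of the Wing compiler or SDK.

```python
def synthesize(own_statements, calls, callee_policies):
    """Build a caller's policy from its own statements plus callee policies.

    own_statements: {object name: [statement, ...]} for direct method calls
    calls: list of (callee name, {callee parameter: caller object})
    callee_policies: {callee name: {parameter: [statement, ...]}}
    """
    policy = {name: list(stmts) for name, stmts in own_statements.items()}
    for callee, binding in calls:
        for param, stmts in callee_policies[callee].items():
            target = binding[param]  # rename the callee's parameter to the caller's object
            policy.setdefault(target, []).extend(stmts)
    return policy


greet_policy = {
    "bucket": [{"method": "put", "args": {"object_key": "greeting.txt", "body": "*"}}],
    "msg": [{"method": "to_str"}],
}

handler_policy = synthesize(
    own_statements={
        "my_bucket": [
            {"method": "get", "args": {"object_key": "greeting.txt"}},
            {"method": "put", "args": {"object_key": "hello.txt", "body": "world"}},
        ]
    },
    calls=[("greet", {"bucket": "my_bucket", "msg": "s"})],
    callee_policies={"greet": greet_policy},
)
```

The key step is the renaming: statements recorded against `greet`'s parameter `bucket` are re-attached to `my_bucket` in the caller's namespace.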
@Chriscbr , regarding
When I was brainstorming about this with @yoav-steinberg, I went back to what @3p3r has said in the beginning about anonymous types inflight. |
@eladb lets consider the following example and try to see if this fits your proposed data model:
primary_bucket:
- method: get
args:
object_key: "maintenance" Note that we don't say anything about
west_b:
# from get_bucket.policy.primary_bucket
- method: get
args:
object_key: "maintenance"
# from "handler" - How can we figure this out!!!??
- method: put
args:
object_key: "hello.txt"
body: "world"
east_b:
# from "handler" - How can we figure this out!!!??
- method: put
args:
object_key: "hello.txt"
body: "world" |
@yoav-steinberg wrote:
Yes, this won't work without some control-flow analysis or some explicit information from the user. As you indicated, our current algorithm will synthesize the following policy for west_b:
- method: get
args:
object_key: "maintenance" For completeness, here's how this will happen: The compiler needs to synthesize a requirements policy for So we are now in a situation where this code will fail (in production) because the handler will try to call P.S. I am warming up to the term "requirements", as in |
Iterate on all inputs (explicit or implicit=captures) and for each input:
Alternative algorithm:
In the case of east_b:
- method: put
args:
object_key: "hello.txt"
body: "world"
b:
- method: put
args:
object_key: "hello.txt"
body: "world" So now, we have a set of statements in the namespace of We are left with |
If we could reliably calculate the set of reaching definitions for |
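For readers unfamiliar with reaching definitions, here is a toy Python sketch of the idea on a tiny statement language with assignments and if/else. The program shape is an assumption made for illustration (the variable names mirror the policy listings above), and a real analysis would also have to handle loops, function calls and aliasing.

```python
def reaching_definitions(block, var, incoming=frozenset()):
    """Return the set of values that may be bound to `var` after `block`.

    A block is a list of statements. Each statement is either
    ("assign", name, value) or ("if", then_block, else_block).
    """
    defs = set(incoming)
    for stmt in block:
        if stmt[0] == "assign":
            _, name, value = stmt
            if name == var:
                defs = {value}  # an assignment kills earlier definitions
        elif stmt[0] == "if":
            _, then_block, else_block = stmt
            # Either branch may run, so the results are merged.
            defs = (reaching_definitions(then_block, var, defs)
                    | reaching_definitions(else_block, var, defs))
    return defs


# Assumed program: b starts as west_b and may be reassigned to east_b.
program = [
    ("assign", "b", "west_b"),
    ("if", [("assign", "b", "east_b")], []),
]
candidates = reaching_definitions(program, "b")
```

With the candidate set in hand, a statement recorded against `b` could conservatively be applied to every bucket in the set.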
Definitely something to consider as the compiler evolves, but as we discussed, since we need a way for users to resolve those ambiguities explicitly, doing this type of static analysis is not in the critical path. |
I have to voice my concern over the direction this model is taking. I understand that the purpose of doing this, is to calculate minimum perms required to make a captured code work in cloud runtime. But I have concerns about how to scale this. Every cloud has well over 200 resources. We are starting with a few, which are abstractions themselves. And everything seems to be designed around those few abstractions. From simulator to now permissions. I am not seeing discussions or concerns on how to scale this eventually to hit all resources. It is beginning to feel like design is going from top to bottom instead of bottom to top. We cannot maintain this polycon and permission model for that many resources without some sort of automation. Example: using iamfast to generate policies for AWS. |
If all Wing can do is to generate minimum permissions required for its own set of limited polycons, we really haven’t solved anything in the current landscape. We made new problems and solved those new problems ourselves. There is no value proposition in it. The problem of “how do I generate minimum permissions for polycons” does not exist. The problem is “how do I generate minimum permissions for cdk-tf L1 constructs”. Wing syntax is abstraction of what’s currently out there. We shouldn’t add new problems to it. |
Another design concern is that: cloud SDK API is now being merged into Constructs API under polycons and their clients. Past a few tech demos, how does this scale? How do I consume the AWS SDK in my Lambda in Wing? If I can’t, then we introduced new problems, so what’s the point? If I can, then this permission model doesn’t make sense. Majority of cloud runtime logic is through access to cloud SDKs. We are abstracting both the SDK and CDK with no discussion on how this scales to cover all clouds and resources again. |
A solution that comes to my mind that addresses this at a fundamental level is JSII-ifying major cloud SDKs through automation and instrumentation. Through this process we can inject permission metadata into the manifest and have the compiler simply query for those permissions during compilation. That way permissions are not tightly coupled with polycons anymore. Polycons can exist in their own world, permissions can also live on their own. Sounds terribly hard, but it’s not supposed to be easy to solve. That’s the value proposition. Anyone can make abstractions. But not many can address this at infrastructure level. |
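A minimal Python sketch of the "query the manifest for permissions" idea follows. The `MANIFEST_PERMISSIONS` table is a hypothetical stand-in for metadata injected into a manifest, and its entries are illustrative rather than an authoritative IAM mapping.

```python
# Hypothetical permission metadata, as it might be injected into a manifest.
# The entries are illustrative, not an authoritative IAM mapping.
MANIFEST_PERMISSIONS = {
    ("S3Client", "CopyObjectCommand"): ["s3:GetObject", "s3:PutObject"],
    ("S3Client", "GetObjectCommand"): ["s3:GetObject"],
}


def permissions_for(sdk_calls):
    """Collect sorted, de-duplicated actions for a list of (client, command) calls."""
    actions = set()
    unknown = []
    for call in sdk_calls:
        if call in MANIFEST_PERMISSIONS:
            actions.update(MANIFEST_PERMISSIONS[call])
        else:
            unknown.append(call)  # a compiler would have to report these
    return sorted(actions), unknown


actions, unknown = permissions_for([
    ("S3Client", "CopyObjectCommand"),
    ("S3Client", "DeleteObjectCommand"),
])
```

The `unknown` list hints at the cost of this approach: every SDK call without metadata needs either automation to fill the table or an explicit answer from the user.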
I get what you're saying @3p3r about solving problems we introduce and not ones users have now and about the scalability of our solution. |
What problem do Polycons address besides abstraction of multi cloud resources and respective SDK calls for Wing? Here’s Wing’s original pitch: we made a language that eases the use of cloud, CDK, and constructs. Note that even in this pitch, we added something new, which is the syntax. But that’s okay because we solved an existing problem. We are designing everything around Polycons and introducing hard dependency on Polycons in everything else:
What’s Wing doing here again? We’re building a language but are still stuck thinking in the JavaScript domain. Emphasis is on making Polycons work both in and out of Wing more than how Wing internals should properly address the existing issue we are trying to solve. It is not appetizing at all if, as a user, you tell me I need to write everything in a certain way (Polycon) inside a new syntax. I now have two problems to deal with. This is what top to bottom design is. It’s not engineering. We’re just cruising to get a tech demo working. We’re designing ourselves into a corner so when the time comes to scale, none of this is usable. Wing internals should be devoid of Polycons entirely. They should focus on CDK TF L1 constructs, what Wing actually compiles to. Not the abstractions we introduced. |
Here's a simplified version of what I am asking, if it helps get my point across: What's the endgame vision with this Polycon design, and how does it hit every resource in CDK-TF without manual intervention? How does it help solve the original problem with CDK? Forget about the product and marketing aspects, just look at it from a pure engineering POV. If we can't answer this as a team right now, we won't be able to answer it 6 months down the road either when we have accumulated so much code. |
@3p3r I think you raise some valid concerns. To share one perspective, right now I think we're just using polycons as an "extension" of That said, I do feel like there's good reason for having some kind of policy abstraction to deliver on the "all clouds are equal" promise. For example, let's take this hypothetical code snippet where we're trying to use Wing's inflight capabilities with ordinary (preflight-only) constructs: bring s3 from "@cdktf/provider-aws";
bring { S3Client, CopyObjectCommand } from "@aws-sdk/client-s3";
// ordinary constructs, not polycons
let srcBucket = s3.S3Bucket();
let targetBucket = s3.S3Bucket();
let handler = () ~> {
let client = new S3Client();
await client.send(new CopyObjectCommand({
// let's assume the arn's aren't tokens for now
Bucket: targetBucket.arn,
CopySource: "${srcBucket.arn}/hello.txt",
Key: "hello.txt",
}));
};
let handlerFn = cloud.Function(handler); My hope is something like this could be supported in Wing! But, because SDKs for resources are not standardized across clouds or providers [1], we can't reliably infer a list of operations to later be associated with the function signature. We could dispatch to an external library like iamfast to produce a list of needed IAM permissions, but this would be placing a significant part of our value on a cloud specific tool. I think we should aim for a more cloud-agnostic architecture, even if it requires a little more work from library authors. Going back to the example above, the only values that get captured from outside are resource ARNs, so the user would need to manually add permissions for this code to work: handlerFn.addPolicyStatements([{
effect: ...
action: ...
resource: ...
}]); This is all good and fine. But how do we provide the "automatic minimal policies" experience in userland - i.e. for resources outside of the Wing SDK? One option could be to let the user model a resource with an inflight method: // extending an ordinary construct (this is still not a polycon)
resource MyBucket extends s3.S3Bucket {
~copyFrom(source: s3.S3Bucket, key: str) {
let client = new S3Client();
await client.send(new CopyObjectCommand({
Bucket: this.arn,
CopySource: "${source.arn}/${key}",
Key: key,
}));
}
} Then when let handler = () ~> {
await targetBucket.copyFrom(srcBucket, "hello.txt");
}; ... the compiler can see "ok, we are capturing an inflight method named "copyFrom" on a Wing class, so I can annotate this handler's signature with a policy associating The logic for automatically turning policies into actual resource permissions must also be part of a construct's API, but it's a small amount of boilerplate: resource MyBucket extends s3.S3Bucket {
...
// this API still doesn't feel right to me
bind(compute: IInflightRunner, policy: cloud.Policy) {
if !(compute instanceof aws.Function) {
throw "unsupported - only supporting being captured by AWS lambdas for now";
}
if policy.methods.includes("copyFrom") {
compute.addPolicyStatements(...);
}
}
} A lot of this code is imagined up, so I'm curious what parts resonate and which don't. 🙂 [1] For example, some terraform resources might not have dedicated JavaScript SDKs, and could instead require you to hit HTTP endpoints perhaps? |
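The `bind` idea above can be mimicked in plain Python to show the control flow: the resource inspects which inflight methods were captured and adds statements to the compute resource. `Function`, `MyBucket` and `add_policy_statements` are stand-ins defined here for illustration, and the single statement emitted is illustrative only.

```python
class Function:
    """Stand-in for a compute resource that can receive policy statements."""

    def __init__(self):
        self.statements = []

    def add_policy_statements(self, statements):
        self.statements.extend(statements)


class MyBucket:
    """Stand-in for a resource with an inflight `copyFrom` method."""

    def __init__(self, arn):
        self.arn = arn

    def bind(self, compute, methods):
        if not isinstance(compute, Function):
            raise TypeError("unsupported compute target")
        if "copyFrom" in methods:
            # Illustrative statement only; the real actions depend on the provider.
            compute.add_policy_statements([
                {"effect": "Allow", "action": "s3:PutObject", "resource": f"{self.arn}/*"},
            ])


fn = Function()
MyBucket("arn:aws:s3:::target").bind(fn, ["copyFrom"])
```

The design point is that the mapping from method names to provider permissions lives with the resource author, while the compiler only supplies the list of captured methods.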
Your first example actually perfectly explains why I am confused.
But it does contain everything the compiler needs to know about perms, though. The compiler sees that a "send" operation is being called on a client of type "S3Client" which is uploading to a construct of type "S3Bucket" with a known ARN. Those are all the pieces needed to generate permissions right there. That's where my disconnect comes in. I see all the data in your basic example without Polycons and I am wondering why there needs to be another layer of abstraction on top of that, when we can just see underneath the abstraction with the compiler as well. BTW I am not saying we should use iamfast. I am saying we should create software like iamfast that does this for all other clouds. We can take inspiration from iamfast, but I am not particularly fond of iamfast's design either. |
Following up on #1682 to identify cases where we cannot infer which operations are performed on a captured resource. See positive/negative tests for examples. Rewrite the algorithm which analyzes the expressions captured by inflight methods so that it is able to identify more cases and emit errors when captures cannot be qualified (i.e. a resource is captured but we cannot determine which operations are performed on it without static analysis). For each method, we identify all expressions that start with `this.xxx` and break them down into parts (using nested references). Then, we traverse the list of parts and split the expression into *preflight* and *inflight*. The preflight part is what we are capturing and the first inflight component qualifies which operations are performed on the captured object. Reorganized capture tests into `resource_captures` (both under valid and invalid). This does not address #76 but it explicitly identifies these cases. We will follow up at some point with a way to allow users to explicitly qualify the reference. *By submitting this pull request, I confirm that my contribution is made under the terms of the [Monada Contribution License](https://docs.winglang.io/terms-and-policies/contribution-license.html)*.
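The preflight/inflight split described in this PR can be illustrated with a short Python sketch. It assumes each reference has already been broken into `(name, phase)` parts; `split_capture` is a hypothetical helper, not the compiler's actual implementation.

```python
def split_capture(parts):
    """Split reference parts into (preflight path, qualifying operation).

    parts: list of (name, phase) tuples, phase being "preflight" or "inflight".
    Returns (captured path, first inflight component or None).
    """
    preflight = []
    for name, phase in parts:
        if phase == "inflight":
            return ".".join(preflight), name
        preflight.append(name)
    # No inflight component: the capture cannot be qualified.
    return ".".join(preflight), None


# this.bucket.put(...)  ->  capture `this.bucket`, qualified by `put`
qualified = split_capture([("this", "preflight"), ("bucket", "preflight"), ("put", "inflight")])
# let b = this.bucket   ->  captured, but no operation is known
unqualified = split_capture([("this", "preflight"), ("bucket", "preflight")])
```

The second case is the one the PR turns into a compiler error: a resource is captured, but nothing tells the compiler which operations are performed on it.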
…ts (#5935)

Fixes: #76

Creates a `lift` builtin function that can be used in inflight code to explicitly add lift qualifications to a method:

```wing
bring cloud;

let bucket = new cloud.Bucket();
bucket.addObject("k", "value");

let some_ops = ["put", "list"]; // We can define a list of ops in preflight code to be used when explicitly qualifying a lift

class Foo {
  pub inflight method() {
    lift(bucket, some_ops); // Explicitly add some permissions to `bucket` using a preflight expression
    lift(bucket, ["delete"]); // Add more permissions to bucket using a literal
    log(bucket.get("k")); // Good old implicit qualification adds `get` permissions
    let b = bucket; // We can now use an inflight variable `b` to reference a preflight object `bucket`
    b.put("k2", "value2"); // We don't get a compiler error here, because explicit lifts are being used in the method disabling compiler qualification errors
    for k in b.list() { // `list` works on `bucket` because of explicit qualification and `b` references `bucket`
      log(k);
    }
    b.delete("k2"); // `delete` also works because of explicit qualification
    assert(bucket.tryGet("k2") == nil); // yay!
  }
}

let foo = new Foo();

test "a test" {
  foo.method();
}
```

## Checklist

- [x] Title matches [Winglang's style guide](https://www.winglang.io/contributing/start-here/pull_requests#how-are-pull-request-titles-formatted)
- [x] Description explains motivation and solution
- [x] Tests added (always)
- [x] Docs updated (only required for features)
- [ ] Added `pr/e2e-full` label if this feature requires end-to-end testing

*By submitting this pull request, I confirm that my contribution is made under the terms of the [Wing Cloud Contribution License](https://github.com/winglang/wing/blob/main/CONTRIBUTION_LICENSE.md)*.
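The interplay between implicit qualification and explicit `lift` calls described in this PR can be modelled roughly as below. This Python sketch is a simplification based only on the PR text; `qualify` and its parameters are hypothetical and do not correspond to compiler internals.

```python
def qualify(implicit_ops, explicit_lifts, unqualified_refs):
    """Combine implicit and explicit qualifications for one inflight method.

    implicit_ops: {object: set of ops inferred from direct calls}
    explicit_lifts: list of (object, [ops]) taken from `lift(...)` calls
    unqualified_refs: objects referenced in ways the compiler cannot qualify
    """
    ops = {obj: set(found) for obj, found in implicit_ops.items()}
    for obj, extra in explicit_lifts:
        ops.setdefault(obj, set()).update(extra)
    # Per the PR description, explicit lifts in a method disable qualification
    # errors; without them, unqualified references are reported.
    errors = [] if explicit_lifts else sorted(unqualified_refs)
    return ops, errors


ops, errors = qualify(
    implicit_ops={"bucket": {"get", "tryGet"}},
    explicit_lifts=[("bucket", ["put", "list"]), ("bucket", ["delete"])],
    unqualified_refs={"bucket"},
)
_, errors_without_lift = qualify({}, [], {"bucket"})
```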
Congrats! 🚀 This was released in Wing 0.61.17. |
## Checklist

[rendered version](https://github.com/winglang/wing/blob/yoav/rfc-explicit_lift_qualification/docs/contributing/999-rfcs/2024-03-14-explicit-lift-qualification.md)

Related to #76, #5935

- [x] Title matches [Winglang's style guide](https://www.winglang.io/contributing/start-here/pull_requests#how-are-pull-request-titles-formatted)
- [x] Description explains motivation and solution
- [ ] Tests added (always)
- [ ] Docs updated (only required for features)
- [ ] Added `pr/e2e-full` label if this feature requires end-to-end testing

*By submitting this pull request, I confirm that my contribution is made under the terms of the [Wing Cloud Contribution License](https://github.com/winglang/wing/blob/main/CONTRIBUTION_LICENSE.md)*.
Workaround
Since it is possible to lift inflight closures (because a closure's "operation" is always "call me"), then, it is possible to work around the limitation we currently have by wrapping the operation in a closure and lifting this closure.
In the following example, `selectBucket` returns a `cloud.Bucket` and then the `.put()` operation cannot be qualified.
To overcome this limitation, we can change `selectBucket` to actually perform the operation:
We need a syntax and implementation for the user to explicitly define rules describing how inflight code is going to use a resource.
The rules can be passed to the resource's capture implementation.
The most basic rule is obviously what inflight methods of the resource are being used. This can be used to set up permissions, for example. But we should also keep in mind that there might be information we want to pass describing possible ranges of arguments to these methods or regexes for valid argument values etc.
Example:
In the future we should also have a way to automatically generate these rules based on the code.
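As one possible shape for such rules, the Python sketch below pairs each method name with a regular expression per argument. The `RULES` format and `is_allowed` helper are assumptions made for illustration; the issue does not specify a concrete format.

```python
import re

# Hypothetical rules: a method name plus a regex each constrained argument must fully match.
RULES = [
    {"method": "put", "args": {"object_key": r"greeting\.txt", "body": r".*"}},
    {"method": "get", "args": {"object_key": r".*\.json"}},
]


def is_allowed(method, args):
    """True if some rule names this method and every constrained argument matches."""
    for rule in RULES:
        if rule["method"] != method:
            continue
        if all(re.fullmatch(pattern, str(args.get(name, "")), re.DOTALL)
               for name, pattern in rule["args"].items()):
            return True
    return False
```

For example, `is_allowed("put", {"object_key": "greeting.txt", "body": "hi"})` passes, while a `put` to any other key or a call to an unlisted method such as `delete` is rejected. A resource's capture implementation could translate rules like these into permissions scoped to specific keys.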