Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Java-style @annotations for user-guided optimization #45

Merged
merged 6 commits into from
Aug 10, 2023
Merged

Java-style @annotations for user-guided optimization #45

merged 6 commits into from
Aug 10, 2023

Conversation

WardBrian
Copy link
Member

Rendered version here: https://github.com/WardBrian/design-docs/blob/annotations/designs/0033-declaration-annotations.md

This is an idea @rok-cesnovar and I have been discussing on and off for a while. The basic summary is allowing (completely optional) annotations like

functions {
  @foo void bar(int x){...}
}
parameters {
  @baz matrix[3,3] A;
}

Individual interfaces/settings could then define specific annotations they would like to use, such as a @gpu annotation used for OpenCL optimizations. The key idea is that unknown/unused annotations are safely ignored, so a model which uses them could still be compiled by a backend or mode which doesn't use them.

@betanalpha
Copy link

I sent these comments earlier this month but there was some kind of awkward conflict between gmail and github preventing it from being received.

Two comments:

There were long discussions about annotations/decorators in the past and one aspect about which I am still a bit squeamish is the ultimate scope of annotations. Most of the annotations presented as examples concern how a Stan program is transpiled into a C++ program and finally compiled into an executable, all of which is within a common scope. I have no problems there. The @silent annotation, however, influences how the Stan program can be used by a Stan interface which breaks the current compartmentalization. Right now a Stan program simply defines necessary input data and a target distribution and its gradient that can be queried on demand, with the interfaces entirely responsible for how that Stan program is used. In other words at the moment I can fully understand how a Stan program will be used by looking at how the stan call is configured in the interface, but with annotations like @silent I would have to also consult the Stan program to understand what will happen.

I also worry if the annotation syntax might be too easy for users to confuse with types. I understand that the proposed style is becoming conventional but I wonder if a style with opening and closing escapes, for example @foo@ instead of @foo would be more easily interpretable because they superficially mimic inline comments. Syntax highlighters would also be able to color these annotations differently to visually separate them out from the type definition (although I guess syntax highlighters would be smart enough to highlight the annotations regardless of escape).

@rok-cesnovar
Copy link
Member

I agree that the @silent definitely bears additional discussion and it might be overstepping into the interface's territory a bit. I don't want this discussion to go in the direction of whether or not @silent is something we want to do. I also think it is not something anyone is actively pursuing or working on (at least to the best of my knowledge). Maybe we remove it if it

I think the main first use case for annotations is to serve as a way to not drag more implementation details into the "core"-language. For now, I think the @static and @gpu (or whatever we will call them in the end) are the main candidates that would benefit from this at this time.

It would be a solution for #32 as I think it makes more sense to have the implementation detail be a different part of the language, not just a regular function. I think that can be even more confusing to the end-user. Obviously, all the caveats of "we need to accompany this with a lot of good docs" and "we will never be able to stop users that want to do wrong things do them" apply.

Both the "varmat" and GPU projects are huge and their implementations in the backend are done or mature enough that they can be useful for the general public in the language, not just those that know their way around C++. But I think we are more or less sure now that seamless integration of both of these projects into the language, where they can be really useful or have a huge impact with only turning on a compile-time flag, is probably not going to yield as big of a return on investment as we hoped. We can do some "magic" in the compiler and we already do it and it yields good results, but we know it can only address a subset of the problems/models it could be useful for. @betanalpha's issue I linked above is a very good example of that.

@WardBrian
Copy link
Member Author

I agree @betanalpha - I mentioned @silent mainly as I stumbled across the idea in the old working repository for the first versions of stanc3, but it is a different kind than the others proposed. I am not sure how it would need to be implemented, but if it is as you say where there is no simple way of avoiding it without changing the semantics, then I agree it is not a good fit for this proposal, or would at least warrant a design doc of its own right before being implemented.

I also agree with @rok-cesnovar - this proposal is not saying we will implement all of the the possible options the system allows. The core idea is to allow backend-specific, user-directed program optimization without dragging those details into the core language, which would be a win for end users and for the language maintainers.

As an aside, I am struggling to use terminology for "user directed optimization" which cannot be confused for optimization, the inference technique. Suggestions welcome

@WardBrian
Copy link
Member Author

During today's language meeting @bob-carpenter made the point that something like static / const is probably better off being a core feature of the language, so that we can enforce it in the typechecker and encourage a more functional style (especially if we implement comprehensions and lambdas). He agreed this was the right solution for something like @gpu

Also discussed was the use of @likely and @unlikely on integer declarations which directly translate to the C++ equivalent when used in conditionals. This could be useful for models like those used in brms/rstanarm which use data integers as flags for different computation

@betanalpha
Copy link

Given how hard it will be to separate out the general design of annotations from the particular effects of certain annotations, in particular ensuring compatible effects across multiple annotations, I wonder if it would be most productive to

  1. Agree upon the general annotation style (there don't seem to be any complaints against the style suggested in the RFC).
  2. Pick out one annotation to start with, implementing it both in the language and the compiler, and then rolling out documentation and updated based on user feedback.
  3. Adding new annotations one at at time, with each new annotation considering potential conflicts with the existing annotations.

The sequence of annotations considered will of course be arbitrary but it seems like there is some low hanging fruit (static or const for example) that could be prioritized to get useful features in reasonably quickly while also providing a test bed for the implementation of other annotations.

@bob-carpenter
Copy link
Collaborator

@betanalpha: That's the plan. This spec isn't about any particular annotation, just the general scheme. I suggested it'd be helpful to have some examples in a bit more detail to help understand the general spec, but this spec isn't about those examples.

@WardBrian
Copy link
Member Author

WardBrian commented May 3, 2022

Just to echo, that is the proposed way forward. I think a good choice for the first would be @gpu, since I know @rok-cesnovar has been wanting to work more on the OpenCL optimizations.

@yizhang-yiz
Copy link

Agreed. This is a general way to support context-specific demands. Another example came to mind is augmented output of ODE solution process.

@betanalpha
Copy link

betanalpha commented May 4, 2022 via email

@WardBrian
Copy link
Member Author

@betanalpha I've added a version of your wording to the design doc. Thank you for the suggestion

@WardBrian WardBrian requested a review from bob-carpenter May 12, 2022 18:39
@spinkney
Copy link
Contributor

I want to add that this could open up our constraint transforms to have methods. Not sure what the implementation would look like exactly but here's a possible example

unit_vector[N] uv1 @hyperspherical;
unit_vector[N] uv2 @muller;  // or whatever we call the current implementation from https://dl.acm.org/doi/10.1145/377939.377946
unit_vector[N] uv3; // this uses the "default"

@WardBrian
Copy link
Member Author

I think something like that would deserve it’s own language feature

@betanalpha
Copy link

betanalpha commented Jun 1, 2022 via email

@WardBrian
Copy link
Member Author

I am bumping this to see if there are further comments and if someone would like to serve as the reviewer for this topic.

@rok-cesnovar and I have recently been discussing some alternatives to the currently fairly-brittle external C++ features exposed by cmdstan, and the @extern annotation on functions mentioned as an example here would serve this purpose very well.

@bob-carpenter
Copy link
Collaborator

the currently fairly-brittle external C++ features exposed by cmdstan

I'm unclear on what it means to have external C++ features exposed by CmdStan. You mean the makefiles and their config? Otherwise, it feels like a command-line interface through files without any C++ to be seen. Also, how does this relate to annotations, which is what this pull request is about. I don't get the connection to @extern.

So at least as of now, I don't think I could review this. Instead, I'd like to sit down and talk to @rok-cesnovar and @WardBrian about the intended scope, as it's very unclear to me from the examples being discussed in this GitHub conversation.

@WardBrian
Copy link
Member Author

I'm unclear on what it means to have external C++ features exposed by CmdStan.

CmdStan supports the user supplying a function definition as a C++ header file: https://mc-stan.org/docs/cmdstan-guide/using-external-cpp-code.html

But, the way this is currently implemented in CmdStan means that it breaks every time we change how the signatures are generated (see stan-dev/cmdstan#1119), and has other annoyances like the user's code being outside the model namespace.

If we had the ability to annotate an individual function spec with @external("foo.hpp"), for example, it would allow us to place the contents of foo.hpp inside the model namespace where that function would normally have it's code generated. This is only tangentially related to the proposal (and accepting this proposal would not mean we 100% were going to do that annotation), I only brought it up as motivation for why I am reviving this thread.

@bob-carpenter
Copy link
Collaborator

Yikes---I forgot about that feature that got ported over from RStan. That is a real maintenance nightmare, especially given how much the math library coding conventions keep changing.

@WardBrian
Copy link
Member Author

Yes, it is difficult, especially since the current state of things mandates that you use the require_* template arguments, since we generate a C++ declaration for the function anyway. By labeling it as external we could skip generating anything for that function, and then you as the user should be able to supply any C++ function which can be called with the required arguments, rather than one which fits exactly the spec we use ourselves.

That is all probably best left for a separate discussion, though! I really just meant it as a motivating reason and possibly an early use of this concept

@andrjohns
Copy link
Contributor

Completely unrelated to the design-doc discussion here, but could the external c++ issue be solved by not generating the "placeholder" c++ for undefined Stan functions (rather than requiring the --allow-undefined flag and generating the empty signature)?

Completely off-topic, so feel to move this question to a different issue/discussion

@WardBrian
Copy link
Member Author

Just to quickly say: yes, that’s possible, though it requires either marking during typechecking which declarations got definitions eventually, or re-checking that during code gen (which turns into one of those “accidentally O(n^2)” situations if you’re not careful). An annotation or anything else which gives us that information locally/explicitly avoids that

Maybe someone is relying on the current behavior though, so the annotation is also nice because it is an opt-in

@andrjohns
Copy link
Contributor

Just to quickly say: yes, that’s possible, though it requires either marking during typechecking which declarations got definitions eventually, or re-checking that during code gen (which turns into one of those “accidentally O(n^2)” situations if you’re not careful). An annotation or anything else which gives us that information locally/explicitly avoids that

Maybe someone is relying on the current behavior though, so the annotation is also nice because it is an opt-in

Ahh makes sense, thanks for clarifying!

Copy link
Member

@mitzimorris mitzimorris left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I approve this PR.

@mitzimorris mitzimorris merged commit c0053a1 into stan-dev:master Aug 10, 2023
@WardBrian WardBrian deleted the annotations branch October 1, 2024 20:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants