-
Notifications
You must be signed in to change notification settings - Fork 393
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
FR: Generalized hook support #3577
Comments
cc @arxanas for his opinion on hooks, as we discussed them quite heavily for Im very much on board on providing client integrations and having a pub-sub like event system in the daemon but I don't want to have it in the cli. Having external languages and binaries would fix some re-occurring ideas in relation to the template language #3262 and this recent Discord discussion. |
Could you elaborate on what you mean by "don't want to have it in the CLI"? My line of thought, as I said above, was that:
This would mean that when you run a CLI command such as What was your vision of how hooks would work? |
That's all fine, if you can take the performance penalty and latency from the external binary.
It has no place there, as it severely overlaps with
I'd rather just offload the whole hook system to the daemon and it is responsible for everything, the CLI only emits events which you can react upon and nothing more. |
In order to be sufficiently generic, any hook is necessarily going to have to be an external binary. I'm not sure I get what you're saying. Otherwise, rather than a generic hook, I'd have a fixed set of hooks that come packaged with jj that I could use.
Let's use a more concrete example. Suppose I wanted to make a post-describe hook that validated that the commit had a bug associated with it in the commit description. What would the workflow look like? My imagination was:
I think necessarily any time a hook runs, the CLI is going to have to wait for that hook to finish so it can check whether the hook succeeded or failed. That being said, step 3 could be skipped if we just have a permanent API server running in the daemon.
I think the workflow I'd imagined is impossible with what you're proposing. What user workflow do you imagine? Could you take me through what happens both from a user perspective and a code perspective with your idea. |
One reason I've avoided hooks so far is that we sometimes want solutions that can validate the commits after the operation is done anyway. One example of that is when you rewrite commits on the server (e.g. via some GitHub UI). To better understand the use cases, what does your pre-upload Gerrit hook do? |
I think that both pre-* and post-* hooks are reasonable, and complementary. I don't think, however, that we can say that the existence of some way to do post-validation means that we shouldn't do pre-validation.
TLDR: It first finds the binary for the pre-upload hook, then it runs that binary, passing in a list of git commit shas. ChromeOS is a multi-repo setup. You can think of it like git submodules except we don't use submodules - we use a tool called repo instead. We have one repo called repohooks which contains all of the hooks. When we run Pre-upload.py does various things such as:
Note that it does this on every commit in the stack, not just the top commit. Unfortunately, just running |
If we are going to have hooks, then I think upload is a good case where they would be useful, but it might be too late to do it after uploading (you might have already accidentally uploaded your password file then). I think the hooks you've mentioned above only require readonly access to the data. That can mostly be done after committing the transaction instead. That's not true if we also want to support hooks that can rewrite content, descriptions, branches, etc. It might be good to compile a list of use cases for that too. |
Upload hooksPre-upload hooksNot too much to say. Pre-upload hooks should be able to mutate the commits in the stack (eg. running formatters). Post-upload hooksI can't really think of a use case for this. Maybe you could have a hook to trigger some kind of github action, for example, but it seems like that should be set up on the server side? I'd love to hear other people's opinion on whether this was useful. Describe hooksI think these are relatively un-contentious.
fn describe(commit: Commit) -> {
let mut msg = match pre_describe_hook(commit) {
Ok(m) => m,
Err(e) => {
// If the initial description was invalid, we still have to be able to edit that description, so it doesn't really make sense to have these fail. So any pre-describe hook failure is considered an internal error to the hook.
warning(e);
msg
}
};
while (true) {
msg = run_editor(prefilled);
match post_describe_hook(Commit{msg=msg, ..commit}) {
Ok(edited) => return edited,
// Rerun the editor
Err(e) => print(e),
}
}
} Commit hooksI've saved this for last because these are by far the hardest.
Pre-commit hooks are hard & confusing because every time you run any
I don't think post-commit hooks make a lot of sense. A post-commit hook can only really do validation, and you can just as easily run that validation in pre-commit. Because of all the difficulties described above in detecting when a "commit" intention occurs, I'm honestly not a big fan of pre-commit hooks. Instead, I'd personally prefer pre-upload hooks. I can see some small justifications for why pre-commit is better than pre-upload, but I think that being able to manually run my pre-upload hooks would be the way to go, since it's just so hard to detect when you actually "commit" in jj. Other Git hooks
|
In the context of a GUI, would it make sense that post-commit could be a trigger to update the display? |
Since transactions are cheap (atleast for the Git backend, don't know if that counts for Piper), we can just rewrite the actual objects with a new transaction instead of providing hooks, which would fail in the CLI. This would match the behavior of a hook and since all of this happens in the daemon, it is something you can easily opt out.
I still don't think that such a thing belongs into the CLI. I implore you to go through to the I hope I made my idea clear, that keeping hooks out of band of the CLI is long term better. And RE: Pre/Postsubmit hooks, we'll first need some definition of forge in library for it to be useful. |
Note for anyone reading this thread in the future: #1869 is the tracking bug of
I'm not neccesarily opposed to this.
To me, the most important thing is the user experience, and implementation details are secondary. If we can have a great user experience without putting it in the CLI, I'm certainly not opposed to that, but it's really hard to evaluate whether Suppose a repo foo has a rule that the first line of a commit message must not exceed 80 characters, and the user wants to validate that when they write a description, it adheres to that specification. The requirements I would like to impose upon a user journey are:
Based on reading the things you asked me to read, I'm not 100% certain, but it seems that you disagree with the user journey itself, and disagree with the fundamental need for a explicit implementation of a "hook". So my questions for you are:
From what I can tell from piecing together bits and pieces from what you're saying about doing this out of band, with a pub/sub message, and reading between the lines a lot, what would happen under the hood would look like this:
I hope you understand why I'm confused, because in the final step there, there is no way to report the failed validation to the end-user. I'm sure your idea works, but unfortunately I don't understand what you're trying to say here, so I'd appreciate it if you could elaborate on precisely what you want. Because you've communicated vague ideas rather than a precise set of steps, it's hard for me to tell what you're trying to say. |
I've been considering this further, and also chatted to @mlcui-corp offline. My conclusions are:
I think what we should do is:
I also think it's important to distinguish hooks from some sort of pub-sub system for messages. @joyously raises a good point that it's useful to have a post-commit hook so that the IDE can know when to update. However, the differences between hooks and a pub-sub system (from my perspective, at least) are:
I think that an IDE would be better off subscribing to a pub-sub system, so that it simply gets notified when events occur. This would mean that the user doesn't have to configure the hook for the IDE themselves, and it also means we can provide events for things that really shouldn't need to be a hook (eg. the user runs |
@mlcui-corp also mentioned to me that pre-describe hooks could be made more difficult by I don't think this changes my position on the matter - I think pre-upload hooks only make a good start. |
Does this refer to I spent a few years in the internals of WordPress, which uses a hook system in PHP (synchronous). It is amazingly flexible. One of the parameters is a priority, so that the code is run in priority order. There are two types of hook: one is a "filter" which can modify the first parameter (and must return it, so that filters can be chained), and the other is an "action" which can do whatever but no return value (more like pub-sub). Some big plugins also invoke the hook functions for their own hooks. |
Currently, pre-upload hooks refer only to My intention is that any jj command which supports uploading would end up supporting this. This wouldn't be limited to the git backend - for example, internally at google we would be able to use this for a piper backend.
IIUC, what you're trying to ask here is "can we make sure that pre-upload hooks are able to be run without actually pushing anywhere?". My intention was something similar to Alternatively, we could create a specific command to run your pre-upload hooks. Your idea about the two types of hooks seems interesting, but I'm very hesitant to implement something like that. A pub-sub event has the rather nice property that it is guaranteed not to mutate the repository, so I'm happy to sprinkle them all over the place wherever someone might want to know about something. However, a regular hook can make no such guarantee. Allowing a user, for example, to make a hook when a branch is created means that we can no longer guarantee that a branch creation won't modify commits. I'm no expert on this, but I imagine that loosening these constraints could be a potential problem. |
Actually, I was thinking more in terms of needing hooks in other places besides pre-upload since those scenarios I mentioned wouldn't be uploading. At the same time, if the hooks are on push, they would run even when I push from one folder to another on my computer? I agree that |
Ah, I get you now, thanks for clarifying.
For contributors stuck offline for a while, I imagine I'm thinking of it less as a pre-upload hook, and more of a pre-publish hook (where you usually publish via upload, but users could choose to do whatever they want. |
Also, @PhilipMetzger, I'd like to hear your thoughts on my updated plans, since you seemed to have issue with hooks before. I understand that you don't want hooks in the CLI, and after rethinking, I'm ok with that for most use cases - I'm no longer advocating for a describe hook. However, I strongly believe that we need pre-upload hooks. For example, consider my workflow when I contribute to
With
This is clearly superior, but IMO is still insufficient. If I do it this way, there are so many ways to upload an invalid commit:
I see the purpose of a pre-upload hook as being complementary to
I'd like to combine these. For example, I might create a file #!/bin/bash -eu
jj run -r 'mutable()::@' 'cargo fmt && cargo clippy && cargo test' Then my workflow would be:
Note that I should clarify that I don't really care about the specific implementation (the bash file above was an example). I only care that from the perspective of a user of jj, I can make sure that jj doesn't let me upload a bad commit. Right now I'm trying to get buy-in on the concept of a pre-upload hook, rather than any specific implementation of it. |
Here's a simple proposal for an example implementation. I think it's quite nice, but as I said before, I'm open to other ideas. Step 1: Multi-layered config filesWe start off by making
This would allow repositories to create their own configurations, but allow you to override them if you wanted to (actual filename up for debate). Step 2:
|
Just a quick note that reading config from the repo has security problems. One solution to that would be to allowlist certain keys. I think Git and Mercurial don't allow reading any config from repos. I don't know if they did it that way for simplicity or if there are problems even with allowlisted keys that I'm missing. |
That's a good point. For security reasons, git can't run hooks unless you manually copy them to the |
I was thinking of having some way of storing "suggested config" in the repos that the user would have to explicitly import. It would probably be restricted to some allowlisted keys as well. The application I had in mind is to suggest a value for The problem here is that the exact UI and minor decisions here have security implications and I don't have a mental model of simple conditions that are "sufficient" for security. |
Actually, there was some "config-based hooks" effort in Git a while ago (a quick googling led to this). I don't know if that series landed (@nasamuffin surely knows), but it's a good idea to at least understand what that patch series was about regardless. |
I know the |
No, it did not land, but not for lack of interest or viability - just for lack of developer time. Google has been carrying the rest of that series downstream for years; we haven't had any problems with it. I'd absolutely recommend that approach for jj over the historical Git approach. |
Sorry for the late response here, I took some time off after that particular conversation. @matts1 and I had a private conversation off-thread.
|
Looking at the provided technical docs, this feature really looks like the pub/sub system in this FR with the concept of "Events" and all. So if we can provide a better implementation on that front I'm happy. And that is a really nice design, which sadly hasn't landed in Git yet 😦 . |
My point here was that by the point the validation had failed, the CLI command has already exited. If that's the case, then the daemon re-opening editor means that it can't simply re-open it in the same terminal. It would have to spin up a new terminal window containing a new instance of
I quite like the idea I proposed where we just use the existing jj configuration file. This could mean that we could just alias |
Ack, that makes sense. I wouldn't consider it a major blocker though, as there surely is a way to make it friendlier.
The proposal is fine, although I'd make some different choices. A bunch of nitpicks:
There's also the question of virtual aliases, so the And my idea with the protobuf configuration would be for the forge anyway, as having a stable representation there is helpful. Footnotes
|
I couldn't find any issues for fix when I looked. Could you link to where this was discussed? I'm a little surprised, since
My assumption was that with my virtual pre-upload step I defined,
This seems difficult to do with a regular alias. But maybe it's not on the table for run at all, in which case I guess a regular alias would work just fine.
You've probably already considered this, but having write as the default may leave performance on the table, since if I run `jj run -r 'immutable_heads()..@' 'cargo lint', then since I know it's readonly, I should be able to run it in parallel on every revision at once.
Could you clarify what specifically you needed sets for. IMO, sets can be mostly replaced with lists, and if we really wanted a set type
To the best of my knowledge, it's currently unsupported to have multiple jj commands in a single alias, but I do think it'd be helpful. I think something like the following would work relatively well:
This seems like it's introducing yet another configuration file for the forge. Plenty of these exist already. It seems easier to just have the repo define the configuration in a version-controlled file (#3684), and have the forge just invoke jj aliases (eg. |
There never was a issue, as it was decided on here and in this Discord discussion (see #3647 for the actual difference to
I get that you're surprised, but since
It never was on the table that
I'm happy to inform you that this is
I just understood your suggested syntax (looked like a python dict at first) as that, OTOH I don't think that TOML tables and arrays scale that well (cargo has a good reason to generate the lock file).
👍
Fair enough. |
After thinking about it, I see two possible nice solutions for pre-upload hooks. I quite like the second option: Option 1: Look for the alias pre-upload[aliases]
pre-upload = ["run", "-r", "@", "pre-commit run"] When you run
However, doing so would mean that if we want to ensure that pre-upload hooks are kept in the daemon (and can thus run even if you invoke upload from a tool such as an IDE), we would need to put CLI parsing inside the daemon, which I'm not a huge fan of. Option 2: Specific pre-upload config section[pre-upload]
command = "pre-commit run"
# JJ would provide an extra revset here containing the set of revisions that we are planning on uploading.
revset = "mutable()::uploaded_changes()"
# One of rebase, reparent, and readonly
mode = "readonly" Here, the pre-upload configuration would roughly equate to the command-line options for Comments on
|
I have a great deal written up against pre-commit hooks (as @PhilipMetzger alluded to), but it sounds like pre-commit hooks are not being discussed anymore, so no point in expounding further. It sounds like there's a great deal of complexity:
This is primarily to solve an interface problem: that the user can accidentally run some commands without satisfying some invariants. Suppose that we just allowed the user to override the built-in commands with their own aliases? It seems like it would be substantially easier, and more flexible anyways. This should still solve the per-repo configuration issue by providing a one-time setup script to run. For the case of large organizations, it seems they successfully get their developers to use blessed wrapper tools for things like builds, tests, code review, etc., so I don't think the hook thing is as big of a problem. jj is also designed so that individuals or organizations can provide their own custom build by plugging into jj-lib (which is what Google does right now?), so that seems like a natural place to insert any organizational checks, without having to work out a general solution for all the hooks that might ever exist. I think the For what it's worth, I use
It's become apparent that one of the primary reasons that I didn't used to run more tests is simply that it was inconvenient, and being able to test/fix in parallel, with caching, without conflicts, improves the situation anyways. If there are real pushes that need to be prevented (leaking credentials, uploading massive files), then they need to be on the server-side, rather than the client-side anyways. |
I agree. Outside of large organisations - like open source GitHub repos - most (from n=3) repos I've seen on tend to encourage running tests ( I haven't seen any repository which uses something like https://pre-commit.com, or a wrapper which does checks before uploading, other than large Google git repositories (Android / ChromiumOS with On the other hand, this might be a symptom of the tools we're used to. Maybe we're too used to the status quo of "uploading doesn't run checks" because existing tools make it hard to do so, and jj could make it a nicer experience? Does the existence of git wrapper tools like My personal opinion: I don't mind aliases to do this for most users, as aliases are very transparent to users and allows for fine-grained customisation. My current Chromium workflow uses I would expect most wrapper tools would have similar behaviour to |
I don't think this is particularly accurate. These were all being discussed as potential ways to solve the problem, but my most recent proposal was extremely simple. It required
I'd estimate that this could be written in ~100 lines of code without too much difficulty.
I think that's true, but I believe that there is almost always a set of things you want to run before uploading that are sufficiently quick and reliable to put in a pre-upload check. I think that 99% of the time I'm going to want to run
An alias could solve things for most use cases, but will prevent proper integration with tools such as IDEs. Also, I was told a while back that we were explicitly choosing not to be able to override builtin commands (as doing so would break any tooling that invoked them).
I don't think this is a solve for the problem. Sure, it mitigates it, but the way it is currently I need a mental model of which project I'm working on, so I know which wrapper tool to run. If I'm working on rust code in ChromeOS, for example, I need to run Also, when I start contributing on a new project, I don't want to have to spend a few minutes trying to work out how to upload the change correctly (I've done this before). Sometimes it's not clear how to run the formatter (with bazel, for example, they tend to not provide a binary to run the formatter, and expect you to just know to run
This doesn't really work. I work at google, and so I use google's jj, but I also work on a variety of open source projects hosted by a variety of companies. It's also a lot easier to convince a team of developers who primarily use git to just add a few lines of config to their repo to support jj developers than it is to convince them to literally maintain a whole rust project for that one jj developer who works on their repo.
I strongly believe that this is the case. I've been exposed to
I'm not a fan of aliases. I use them because there's not a better solution, but I don't think it's the best solution. I think that uploading to Chrome ( |
I am still not sure about how I feel about pre-upload hooks. For me, this is a tradeoff, and possibly a philosophical question of how jj should work - should jj commands be minimal and focused, or be "batteries included" (to be more specific - "have space for batteries (hooks) for users, if they want to plug some in")? I think there is elegance in supplying a minimal tool which does a single job well (just uploading to remotes), akin to the Unix philosophy. Maintenance of a minimal interface becomes easy, and tools can be very focused. Composition (like using aliases) would be encouraged. On the other hand, having hooks built in would definitely reduce the friction to have an "all-in-one" upload command, apart from initial hook setup, compared to the alternatives:
Adding pre-upload hooks would also encourage a standard of sharing common commands of "lint / fix / test a revset", which would make it much easier to have the "exact same command" behaviour across repos. We may need to be careful about how much of I think the "exact same command" benefit is a bit unrelated to the question of whether we want pre-upload hooks in the first place. A user can get the "exact same command" behaviour by creating some script which runs the things they want before upload, in any repository, and use My only concern about doing adding pre-upload hooks would be the extra cost to jj:
If this cost isn't too great, I'd be for adding in pre-upload hooks - even if it is just for encouraging a standard of sharing lint / fix / test commands. |
I miss being able to use https://github.com/cachix/git-hooks.nix |
My opinion on pre-upload hooks is very neutral, as they're primarily for interop and not a concept I'd want to keep long-term, we should rather fix it in a server implementation when we get there.
While the interface is generally simple and powerful, better forge integration has been a major point for a long time (#485). And providing something like pre-upload hooks are probably required to migrate off of Git, as a bunch of tooling sadly depends on it (like Nix via Cachix or Chromium's
I strongly agree too, but think we should fix it with our server implementation, when we get there. But as interop is important for
I think it's a major benefit if its done via
I don't think exposing another interface than
Agreed, they're the best solution until we have a better design for a native future. |
I came here looking for git-hook support as well, specifically Looks like I've got a lot of reading above to checkout. |
In case it saves you or someone else some time, there's PR #2845 for adding a "jj gerrit send" command. |
A case that I don't think I've seen mentioned yet in this issue, but that I'd find useful would be a hook for "Jujutsu has changed a file on disk". As a practical example, I have a frontend project built using Node and NPM. NPM has a lock file ( What I want is that when I switch between two branches/commits with different dependencies (i.e. where the lock file is different between the two checkouts), This approach could also be useful for Python (i.e. keeping a venv up-to-date), and probably other languages which have a package manager that installs dependencies locally when it runs (Ruby, PHP, potentially others). Alternatives:
This is mostly a quality-of-life feature, and it's possible to get by without it. But it could be very useful, particularly given how easy |
@MrJohz That's interesting. I'm curious to know:
There might be some overlap with the design for |
The way I see it, |
If I were doing this workflow (and maybe I'll try it out now, since I have some similar workflows 👀), I would be relying on the strict ordering between my [aside] There might be even more general infrastructure that would be helpful for various tooling, like a method of querying jj for the files that have changed in the working copy since a previous query's timestamp. Then my update command could inspect the list and more intelligently decide what work to do. This is a general problem that many organizations build tools to solve, for which it would be nice if source control were a little more helpful, but outside the scope of "hooks". |
Yes, if you rerun it multiple times, it will always produce the same really (again modulo network and cache).
Yes, generally. In my case, running
Potentially, yes - in theory, I only need to run That said, in my experience, even if I only intend to read the code, it's often still more useful to run the command anyway, because it's not unusual in that situation to later want to run the code as well, and running the command automatically saves me from forgetting what's going on.
Potentially, although my impression is that those are principally for running a command across multiple revisions (with rebases if changes are made). In this case, it would only ever be useful to run
That could be a good solution. The only things I would be concerned about would be:
|
I'm sure not every user of git hooks has the same model as me, but also I think jj is different enough that it makes sense to build stories about what people achieve with their hooks, and how we can model that (and even improve on it!) with the different way of looking at things that jj gives us. The way I'd like git's pre-commit hook to translate to jj is more akin to a "post-update-change-with-new-commit" hook -- and I'd suggest it would be reasonable to treat it in a similar way to the very nice way that jj's automatic rebasing "never fails", by adding a flag to the change if the hook is unsuccessful or modifies files. I'd be quite happy if the hook ran asynchronously, maybe after a delay (or fully lazily) in case I make another change, but the important (to me) thing is that (as with conflicts) I can see that I'm not yet finished with a change if the hook isn't happy, even if I'm not necessarily going to do anything about that just yet. That would be so much nicer than what git hooks give me today, especially when rebasing. |
Wanting post-commit for ci/cd or auto-push is why I landed here |
Is your feature request related to a problem? Please describe.
I was looking at #2845 and realized that it didn't meet my needs because it didn't support pre-upload hooks. I investigated further into hooks, and the closest I found was #405, which rather than generalized hooks, specifically is trying to solve pre-commit hooks.
Describe the solution you'd like
I was thinking of treating hooks like a pub-sub system. jj would publish events, which would consist of some metadata. Your hooks would simply be subscriptions to some subset of the events, matching specific metadata. For example:
We would then, similar to my #3575, run all hook binaries that have subscribed to that event in the global config.toml. Similar to #3575, these hook binaries would be passed a file descriptor with a connection to the jj grpc server. For example, my global
config.toml
might look like:We can also implement per-repo hooks in the same way that git has them by allowing additional subscribers from a
hooks/jj.toml
or similar, though I'd hold off on this in the short term due to security concerns.:For example, a pre-upload hook that forces your commit to have an associated bug might look like:
Though similar to my suggestion in #3575, we could provide preludes for common languages, and simply make the hook:
We would thus provide a method in jj along the lines of:
Describe alternatives you've considered
Didn't really consider much else, this is the first idea that came to mind. I thought it was a nice starting point. Other ideas are welcome.
Additional context
Add any other context or screenshots about the feature request here.
The text was updated successfully, but these errors were encountered: