FR: Generalized hook support #3577

matts1 · 2024-04-26T02:21:21Z

Is your feature request related to a problem? Please describe.
I was looking at #2845 and realized that it didn't meet my needs because it didn't support pre-upload hooks. I investigated further into hooks, and the closest I found was #405, which rather than generalized hooks, specifically is trying to solve pre-commit hooks.

Describe the solution you'd like
I was thinking of treating hooks like a pub-sub system. jj would publish events, which would consist of some metadata. Your hooks would simply be subscriptions to some subset of the events, matching specific metadata. For example:

message CommitDescription {
   string commit = 1;
   string description = 2;
}

message HookEvent {
    string workspace = 1;
    oneof hook_type {
        CommitDescriptionHook commit_description = 2;
        PreUploadHook pre_upload = 3;
    }
    ...
}

We would then, similar to my #3575, run all hook binaries that have subscribed to that event in the global config.toml. Similar to #3575, these hook binaries would be passed a file descriptor with a connection to the jj grpc server. For example, my global config.toml might look like:

[hooks]
subscribe_to_every_event = { binary = "/path/to/foo" }
pre_upload = { binary = "/path/to/bar", type = "pre_upload" }
specific_workspace = { binary = "/path/to/baz", workspace = "/path/to/workspace" }

We can also implement per-repo hooks in the same way that git has them by allowing additional subscribers from a hooks/jj.toml or similar, though I'd hold off on this in the short term due to security concerns.:

For example, a pre-upload hook that forces your commit to have an associated bug might look like:

conn = grpc.connect(file_descriptor(os.environ["JJ_SERVER"]))
main = importlib.import("/path/to/foo_ext.py", "main")
event = conn.get_event()

if "BUG=" not in event.commit_description.description:
    bug = input()
    description = f"{event.commit_description.description}\nBUG={bug}"
    conn.set_description(commit_id=event.commit_description.commit_id, description=description)

Though similar to my suggestion in #3575, we could provide preludes for common languages, and simply make the hook:

def main(conn, event):
    if "BUG=" not in event.commit_description.description:
        bug = input()
        description = f"{event.commit_description.description}\nBUG={bug}"
        conn.set_description(commit_id=event.commit_description.commit_id, description=description)

We would thus provide a method in jj along the lines of:

fn publish_event(event: &HookEvent) -> anyhow::Result {
    Vec<&Hook> hooks = filter_hooks(event);
    // Just a regular grpc server, but get_event() will always return the hook object.
    GrpcServer server = hook_grpc_server(event);
    for (hook in hooks) {
        let return_code = hook.run(grpc_server);
        if (return_code != 0) {
          bail!("Hook failed");
        }
    }
    Ok(())
}

Describe alternatives you've considered
Didn't really consider much else, this is the first idea that came to mind. I thought it was a nice starting point. Other ideas are welcome.

Additional context
Add any other context or screenshots about the feature request here.

The text was updated successfully, but these errors were encountered:

PhilipMetzger · 2024-04-26T16:04:41Z

cc @arxanas for his opinion on hooks, as we discussed them quite heavily for jj run (#405/#1869).

Im very much on board on providing client integrations and having a pub-sub like event system in the daemon but I don't want to have it in the cli. Having external languages and binaries would fix some re-occurring ideas in relation to the template language #3262 and this recent Discord discussion.

matts1 · 2024-04-28T23:32:08Z

Could you elaborate on what you mean by "don't want to have it in the CLI"?

My line of thought, as I said above, was that:

The hooks would be external binaries
The config file would specify the hooks
The pub/sub aspect of them would be done within jj-lib

This would mean that when you run a CLI command such as jj describe, it would run a pre-describe hook. I can't see any way of doing this cleanly other than that when you run the jj command you get hooks, so I'm rather confused by your statement of "don't want to have it in the CLI", both in terms of what you mean and why you don't want it.

What was your vision of how hooks would work?

PhilipMetzger · 2024-04-29T16:12:01Z

My line of thought, as I said above, was that:

The hooks would be external binaries

The config file would specify the hooks

The pub/sub aspect of them would be done within jj-lib

That's all fine, if you can take the performance penalty and latency from the external binary.

Could you elaborate on what you mean by "don't want to have it in the CLI"?

It has no place there, as it severely overlaps with run and forcing a pre-upload or presubmit hook in there makes the user experience rather unpleasant. If we can make the CLI emit just a proto message through IPC/vsock and the daemon handles the whole hook invocation, the user experience is kept smooth.

What was your vision of how hooks would work?

I'd rather just offload the whole hook system to the daemon and it is responsible for everything, the CLI only emits events which you can react upon and nothing more.

matts1 · 2024-04-29T22:41:46Z

That's all fine, if you can take the performance penalty and latency from the external binary.

In order to be sufficiently generic, any hook is necessarily going to have to be an external binary. I'm not sure I get what you're saying. Otherwise, rather than a generic hook, I'd have a fixed set of hooks that come packaged with jj that I could use.

I'd rather just offload the whole hook system to the daemon and it is responsible for everything, the CLI only emits events which you can react upon and nothing more.

Let's use a more concrete example. Suppose I wanted to make a post-describe hook that validated that the commit had a bug associated with it in the commit description. What would the workflow look like? My imagination was:

jj describe
Type a commit message without a bug
Start up an API server
Hook runs, connecting to that API server
Commit message validated via the hook, fails validation
jj describe fails with an exit status

I think necessarily any time a hook runs, the CLI is going to have to wait for that hook to finish so it can check whether the hook succeeded or failed. That being said, step 3 could be skipped if we just have a permanent API server running in the daemon.

It has no place there, as it severely overlaps with run and forcing a pre-upload or presubmit hook in there makes the user experience rather unpleasant. If we can make the CLI emit just a proto message through IPC/vsock and the daemon handles the whole hook invocation, the user experience is kept smooth.

I think the workflow I'd imagined is impossible with what you're proposing. What user workflow do you imagine? Could you take me through what happens both from a user perspective and a code perspective with your idea.

martinvonz · 2024-04-30T17:30:43Z

One reason I've avoided hooks so far is that we sometimes want solutions that can validate the commits after the operation is done anyway. One example of that is when you rewrite commits on the server (e.g. via some GitHub UI).

To better understand the use cases, what does your pre-upload Gerrit hook do?

matts1 · 2024-05-01T00:00:50Z

One reason I've avoided hooks so far is that we sometimes want solutions that can validate the commits after the operation is done anyway

I think that both pre-* and post-* hooks are reasonable, and complementary. I don't think, however, that we can say that the existence of some way to do post-validation means that we shouldn't do pre-validation.

To better understand the use cases, what does your pre-upload Gerrit hook do?

TLDR: It first finds the binary for the pre-upload hook, then it runs that binary, passing in a list of git commit shas.

ChromeOS is a multi-repo setup. You can think of it like git submodules except we don't use submodules - we use a tool called repo instead. We have one repo called repohooks which contains all of the hooks.

When we run repo upload, it first runs ${CHROMEOS_ROOT}/src/repohooks/pre-upload.py. For example, I usually work in the repository ~/chromiumos/src/bazel, so I have a script that finds the root directory based on the existence of a .repo directory (similar to how jj does it for .jj), and runs pre-upload.py <commit1> <commit2> before doing a git push.

Pre-upload.py does various things such as:

Validate that our formatter doesn't need to do anything
Check that there's correct license headers at the top of every file.
Check that your commit description has a footer containing TEST=<something>
Check that your BUG=<something> is in the correct format

Note that it does this on every commit in the stack, not just the top commit.

Unfortunately, just running repo upload isn't an option, because it works on the current branch, and doesn't work with detached head.

martinvonz · 2024-05-01T00:17:31Z

I think that both pre-* and post-* hooks are reasonable, and complementary. I don't think, however, that we can say that the existence of some way to do post-validation means that we shouldn't do pre-validation.

If we are going to have hooks, then I think upload is a good case where they would be useful, but it might be too late to do it after uploading (you might have already accidentally uploaded your password file then).

I think the hooks you've mentioned above only require readonly access to the data. That can mostly be done after committing the transaction instead. That's not true if we also want to support hooks that can rewrite content, descriptions, branches, etc. It might be good to compile a list of use cases for that too.

matts1 · 2024-05-01T01:19:52Z

Upload hooks

Pre-upload hooks

Not too much to say. Pre-upload hooks should be able to mutate the commits in the stack (eg. running formatters).
I imagine that they would get access to the jj API, and could use that to run something roughly equivalent to jj run 'immutable_heads()..commit_to_be_pushed' clang-format "$(jj files @)", or do whatever else they want.

Post-upload hooks

I can't really think of a use case for this. Maybe you could have a hook to trigger some kind of github action, for example, but it seems like that should be set up on the server side? I'd love to hear other people's opinion on whether this was useful.

Describe hooks

I think these are relatively un-contentious.

A pre-describe hook would pre-fill the commit message
- A hook failure would be a warning, and would leave the message unchanged
A post-describe message would both validate the commit message and perform potential fixups on it
- Failure of this would cause the editor to rerun

fn describe(commit: Commit) ->  {
  let mut msg = match pre_describe_hook(commit) {
    Ok(m) => m,
    Err(e) => {
      // If the initial description was invalid, we still have to be able to edit that description, so it doesn't really make sense to have these fail. So any pre-describe hook failure is considered an internal error to the hook.
      warning(e);
      msg
    }
  };
  while (true) {
    msg = run_editor(prefilled);
    match post_describe_hook(Commit{msg=msg, ..commit}) {
      Ok(edited) => return edited,
      // Rerun the editor
      Err(e) =>  print(e),
    }
  }
}

Commit hooks

I've saved this for last because these are by far the hardest.

Pre-commit hooks
- Can edit the tree (eg. running formatters)
- Can validate the commit (eg. running linters)
- Can block a commit

Pre-commit hooks are hard & confusing because every time you run any jj command, it technically does a commit. It seems to me that pre-commit hooks should run when the user has the intention to say "I'm happy with this piece of work, let's commit", but that's extremely difficult to detect, because:

jj commit and jj amend clearly show an intention to do so
- But amend is just an alias for squash, which is usually not used for the commit intention (probably?)
jj new may be used to commit, or it may be used as a "checkout" command
- We could special-case when you new on top of the current commit, but it seems hard trying to explain all the nuances of the pre-commits to a user
I personally mostly use jj split when I intend to "commit" something (since it works like hg commit --interactive), but I also use it to just run a regular split.
- Similarly to new, we could special-case this for when we're splitting @, but I suspect other people use jj split for other purposes.
I use jj describe to "commit" when it's the final commit in the stack (since I use the jj edit workflow)

I don't think post-commit hooks make a lot of sense. A post-commit hook can only really do validation, and you can just as easily run that validation in pre-commit.

Because of all the difficulties described above in detecting when a "commit" intention occurs, I'm honestly not a big fan of pre-commit hooks. Instead, I'd personally prefer pre-upload hooks. I can see some small justifications for why pre-commit is better than pre-upload, but I think that being able to manually run my pre-upload hooks would be the way to go, since it's just so hard to detect when you actually "commit" in jj.

Other Git hooks

pre-rebase
- The documentation suggests "you can use this hook to disallow rebasing any commits that have already been pushed". However, since we have immutable commits, I don't think we really need this, and I can't think of any other use case.
post-checkout
- Documentation: "This may mean moving in large binary files that you don’t want source controlled, auto-generating documentation, or something along those lines"
- I can see some appeal to this, but IMO something like this should just be run explicitly from the user. It's also, similar to commit, quite hard to detect a "checkout". Interested to hear other opinions though
post-merge
- I don't think this makes much sense in jj. Merge isn't a special operation for us, so I don't think it makes sens to special-case it
pre-auto-gc
- This just seems wierd. It juts exposes internals that shouldn't be exposed?

joyously · 2024-05-01T01:45:43Z

I don't think post-commit hooks make a lot of sense. A post-commit hook can only really do validation

In the context of a GUI, would it make sense that post-commit could be a trigger to update the display?

PhilipMetzger · 2024-05-01T17:04:32Z

It has no place there, as it severely overlaps with run and forcing a pre-upload or presubmit hook in there makes the user experience rather unpleasant. If we can make the CLI emit just a proto message through IPC/vsock and the daemon handles the whole hook invocation, the user experience is kept smooth.

I think the workflow I'd imagined is impossible with what you're proposing. What user workflow do you imagine? Could you take me through what happens both from a user perspective and a code perspective with your idea.

Since transactions are cheap (atleast for the Git backend, don't know if that counts for Piper), we can just rewrite the actual objects with a new transaction instead of providing hooks, which would fail in the CLI. This would match the behavior of a hook and since all of this happens in the daemon, it is something you can easily opt out.

Describe hooks

I think these are relatively un-contentious.

A pre-describe hook would pre-fill the commit message
- A hook failure would be a warning, and would leave the message unchanged
A post-describe message would both validate the commit message and perform potential fixups on it
- Failure of this would cause the editor to rerun

fn describe(commit: Commit) ->  {
  let mut msg = match pre_describe_hook(commit) {
    Ok(m) => m,
    Err(e) => {
      // If the initial description was invalid, we still have to be able to edit that description, so it doesn't really make sense to have these fail. So any pre-describe hook failure is considered an internal error to the hook.
      warning(e);
      msg
    }
  };
  while (true) {
    msg = run_editor(prefilled);
    match post_describe_hook(Commit{msg=msg, ..commit}) {
      Ok(edited) => return edited,
      // Rerun the editor
      Err(e) =>  print(e),
    }
  }
}

I still don't think that such a thing belongs into the CLI. I implore you to go through to the jj run Design Doc with all it's versions (the one in the repo and the Google Doc one) and the related Discord discussion, as it contains valuable conversations in relation to hooks.

I hope I made my idea clear, that keeping hooks out of band of the CLI is long term better.

And RE: Pre/Postsubmit hooks, we'll first need some definition of forge in library for it to be useful.

matts1 · 2024-05-02T00:50:44Z

Note for anyone reading this thread in the future: #1869 is the tracking bug of jj run

we can just rewrite the actual objects with a new transaction instead of providing hooks

I'm not neccesarily opposed to this.

I still don't think that such a thing belongs into the CLI

To me, the most important thing is the user experience, and implementation details are secondary. If we can have a great user experience without putting it in the CLI, I'm certainly not opposed to that, but it's really hard to evaluate whether jj run is a good alternative to hooks in the CLI (even after having read the document you mentioned), because you haven't yet proposed a specific user journey.

Suppose a repo foo has a rule that the first line of a commit message must not exceed 80 characters, and the user wants to validate that when they write a description, it adheres to that specification.

The requirements I would like to impose upon a user journey are:

A user must be able to use standard jj commands
- jj describe, jj commit, jj spllit, or any other command that sets the description for a change must run this hook-like thing
  - Adding an aliasjj describe-with-bug as a jj alias is insufficient
  - Similarly for alias describe="jj run pre-fill && jj describe" as a shell alias
- This should allow for a consistent user experience
  - It would be a pain if for one repo you needed to run jj run hooks/validate_description and for another repo you had to run jj run hooks/post-describe

Based on reading the things you asked me to read, I'm not 100% certain, but it seems that you disagree with the user journey itself, and disagree with the fundamental need for a explicit implementation of a "hook". So my questions for you are:

Can jj run meet those requirements (I suspect not, given the current design of jj)?
- If not:
  - What user journey would you propose?
    - What would the user need to do to configure it (eg. .~/.config/jj/Config.toml)?
    - What would the user need to do to ensure it runs when they run a command?
  - Why do you want it if it can't meet those requirements?
    - Do you think the requirements are bad?
      - If so, why? What doesn't make sense about them?
    - Do you think that the requirements are good, but you're willing to make that sacrifice? If so, why?

From what I can tell from piecing together bits and pieces from what you're saying about doing this out of band, with a pub/sub message, and reading between the lines a lot, what would happen under the hood would look like this:

jj describe -> write commit message -> submit it
jj-cli publishes an event saying that a commit description has been updated
(this could happen anytime from now to the very end) jj-cli returns, having succeeded
jj daemon subscribes to that event and reads hooks/jj-config.toml (or some config file), and now knows that we have a subscriber to the post-describe hook that will run binary jj run hooks/post-describe
jj daemon runs
jj daemon runs the subscriber via something roughly equivalent to jj run hooks/post-describe (or something similar)
The post-describe binary runs, and the commit message fails validation

I hope you understand why I'm confused, because in the final step there, there is no way to report the failed validation to the end-user. I'm sure your idea works, but unfortunately I don't understand what you're trying to say here, so I'd appreciate it if you could elaborate on precisely what you want. Because you've communicated vague ideas rather than a precise set of steps, it's hard for me to tell what you're trying to say.

matts1 · 2024-05-13T00:24:09Z

I've been considering this further, and also chatted to @mlcui-corp offline.

My conclusions are:

Pre-upload hooks are an absolute must
- While the user could achieve them with jj run, the fact that they can just run something like jj git push allows them to entirely skip those hooks, and users will forget.
pre / post-commit hooks are both unneccesary (jj run mostly makes these redundant) and difficult (it's very difficult to determine a commit intent in jj).
- I'm no longer 100% on board with post-describe hooks. I use jj describe to write temporary names for things, similar to a git stash, and having those have to follow the conventions required for a commit that's going to be submitted seems like overkill, when we can just as easily make it a pre-upload hook.
- I think there's a definite use case for using pre-describe hooks to pre-fill the commit description. However, the value proposition for it seems limited. Maybe setting the JJ_CHANGE_ID environment variable when running $EDITOR would allow users to configure their editor to do something similar (eg. inherit bugs from parent commits, add "Change-Id" to footer). My main concern here is that different repositories may have different requirements (eg. one that pushes to gerrit may require change-id in the footer, different remotes may have different formats for specifying bugs).

I think what we should do is:

Only start by implementing pre-upload hooks
- But do so by creating a generic mechanism for hooks, so we can easily add them later elsewhere if required
Table pre/post-describe hooks, and see what (if any) requests for things related to this come up in the issue tracker.
Plan to never implement commit hooks.

I also think it's important to distinguish hooks from some sort of pub-sub system for messages. @joyously raises a good point that it's useful to have a post-commit hook so that the IDE can know when to update. However, the differences between hooks and a pub-sub system (from my perspective, at least) are:

Hooks can fail (and a failure generally results in the failure of the underlying command)
Hooks are synchronous (since you need to know whether the hook succeeded or not)
Hooks are configured by the user (or the repository)

I think that an IDE would be better off subscribing to a pub-sub system, so that it simply gets notified when events occur. This would mean that the user doesn't have to configure the hook for the IDE themselves, and it also means we can provide events for things that really shouldn't need to be a hook (eg. the user runs jj branch create foo - clearly, we don't need any type of hook for that, but the IDE should ideally be able to get notified).

matts1 · 2024-05-13T00:30:25Z

@mlcui-corp also mentioned to me that pre-describe hooks could be made more difficult by jj describe -m foo, or anything where you specify the description in the command-line.

I don't think this changes my position on the matter - I think pre-upload hooks only make a good start.

joyously · 2024-05-13T00:55:54Z

we can just as easily make it a pre-upload hook.

Does this refer to jj git push or is there another command that falls under this label?
Does your view take into account solo projects, not pushed to a remote? Or contributors that are stuck offline for awhile, and can't push? (just curious)

I spent a few years in the internals of WordPress, which uses a hook system in PHP (synchronous). It is amazingly flexible. One of the parameters is a priority, so that the code is run in priority order. There are two types of hook: one is a "filter" which can modify the first parameter (and must return it, so that filters can be chained), and the other is an "action" which can do whatever but no return value (more like pub-sub). Some big plugins also invoke the hook functions for their own hooks.

matts1 · 2024-05-13T01:31:19Z

Currently, pre-upload hooks refer only to jj git push in the public build. However, jj gerrit send (#2845) is currently in progress, and would presumably also use this same mechanism.

My intention is that any jj command which supports uploading would end up supporting this. This wouldn't be limited to the git backend - for example, internally at google we would be able to use this for a piper backend.

Does your view take into account solo projects, not pushed to a remote? Or contributors that are stuck offline for awhile, and can't push? (just curious)

IIUC, what you're trying to ask here is "can we make sure that pre-upload hooks are able to be run without actually pushing anywhere?". My intention was something similar to jj git push --dry-run except it also runs hooks (we could allow dry-run to run pre-upload hooks, but having --dry-run potentially modify commits may go against what users expect).

Alternatively, we could create a specific command to run your pre-upload hooks.

Your idea about the two types of hooks seems interesting, but I'm very hesitant to implement something like that. A pub-sub event has the rather nice property that it is guaranteed not to mutate the repository, so I'm happy to sprinkle them all over the place wherever someone might want to know about something. However, a regular hook can make no such guarantee. Allowing a user, for example, to make a hook when a branch is created means that we can no longer guarantee that a branch creation won't modify commits. I'm no expert on this, but I imagine that loosening these constraints could be a potential problem.

joyously · 2024-05-13T03:00:13Z

what you're trying to ask here is "can we make sure that pre-upload hooks are able to be run without actually pushing anywhere?".

Actually, I was thinking more in terms of needing hooks in other places besides pre-upload since those scenarios I mentioned wouldn't be uploading. At the same time, if the hooks are on push, they would run even when I push from one folder to another on my computer?

I agree that --dry-run should not modify anything. (Would it make a snapshot if needed?)

matts1 · 2024-05-13T03:39:38Z

Ah, I get you now, thanks for clarifying.

Does your view take into account solo projects, not pushed to a remote? Or contributors that are stuck offline for awhile, and can't push? (just curious)

For contributors stuck offline for a while, I imagine jj git push --dry-run --run-hooks would work.
For someone working on an offline project, the same command could work, but a command like jj run-hooks pre-upload might be more appropriate (and would also solve the offline use case).

I'm thinking of it less as a pre-upload hook, and more of a pre-publish hook (where you usually publish via upload, but users could choose to do whatever they want.

matts1 · 2024-05-13T04:11:35Z

Also, @PhilipMetzger, I'd like to hear your thoughts on my updated plans, since you seemed to have issue with hooks before.

I understand that you don't want hooks in the CLI, and after rethinking, I'm ok with that for most use cases - I'm no longer advocating for a describe hook. However, I strongly believe that we need pre-upload hooks.

For example, consider my workflow when I contribute to jj.

Write code until cargo test passes
On each commit, run cargo fmt && cargo clippy && cargo test
jj git push

With jj run, this can be reduced to:

Write code until cargo test passes
Run jj run -r 'mutable()::@' 'cargo fmt && cargo clippy && cargo test (or something along these lines)
jj git push

This is clearly superior, but IMO is still insufficient. If I do it this way, there are so many ways to upload an invalid commit:

I sometimes forget to run jj run
I work on several different repositories
- I need to both learn and remember precisely which jj run command to run for each repository
I either need to remember the precise jj run invocation (which I often get wrong) or I need to look through my bash history for it (which may have many other random jj run invocations in it)
Sometimes I run something and it passes the checks. I then make a change that I think won't affect things, so I don't re-run those commands, but then it turns out it has affected things (yes, I really am my own worst enemy)

I see the purpose of a pre-upload hook as being complementary to jj run, rather than redundant.

A pre-upload hook is designed to protect you from uploading bad commits
jj run allows you to easily run things on multiple commits at once

I'd like to combine these. For example, I might create a file hooks/pre-upload containing:

#!/bin/bash -eu

jj run -r 'mutable()::@' 'cargo fmt && cargo clippy && cargo test'

Then my workflow would be:

Write code until cargo test passes
jj git push
jj runs the hook on the specific commit that needs to be uploaded, which then invokes jj run to validate and fix the whole stack of changes
It then actually uploads it once the commit is validated.

Note that I should clarify that I don't really care about the specific implementation (the bash file above was an example). I only care that from the perspective of a user of jj, I can make sure that jj doesn't let me upload a bad commit. Right now I'm trying to get buy-in on the concept of a pre-upload hook, rather than any specific implementation of it.

matts1 · 2024-05-13T05:12:00Z

Here's a simple proposal for an example implementation. I think it's quite nice, but as I said before, I'm open to other ideas.

Step 1: Multi-layered config files

We start off by making config.toml have more layers. Currently, jj basically does jj --config_toml=$HOME/.config/jj/config.toml.
I propose changing it to:

jj \
  --config_toml=<repo>/jj.toml \
  --config_toml=$HOME/.config/jj/config.toml \
  --config_toml=<repo>/.jj/config.toml

This would allow repositories to create their own configurations, but allow you to override them if you wanted to (actual filename up for debate).

Step 2: `jj run` aliases

Add an entry in cargo.toml config files which allows you to create jj run configurations. For example:

<repo>/jj.toml would contain:

[run-aliases]
# People might not expect pre-upload hooks modifying commits.
pre-upload = {"aliases": ["lint", "test"]}
fmt = {write: "true", "cmd": ["cargo fmt"]}
lint = {"cmd": ["cargo clippy"]}
test = {"cmd": ["cargo test"]}

It could then be merged with a user-specific config for that specific repo (eg. <repo>/.jj/config.toml)

[run-aliases]
# I like my pre-upload hooks modifying commits
pre-upload = {"aliases": ["fmt", "lint", "test"]}

Step 3: The hook

A user could then run jj run --alias fmt to run their formatter, for example.

When you run jj git push, jj would automatically run jj run -r <all revisions being uploaded> --alias pre-upload, if an alias named pre-upload was present.

martinvonz · 2024-05-13T05:16:23Z

Just a quick note that reading config from the repo has security problems. One solution to that would be to allowlist certain keys. I think Git and Mercurial don't allow reading any config from repos. I don't know if they did it that way for simplicity or if there are problems even with allowlisted keys that I'm missing.

matts1 · 2024-05-13T05:24:03Z

That's a good point.

For security reasons, git can't run hooks unless you manually copy them to the .git/hooks directory (docs). Could we consider something similar where we require a user to "install" the repo-specific config by copying it to the .jj directory, or are there additional security concerns that may not be relevant for git?

ilyagr · 2024-05-13T05:24:21Z

Just a quick note that reading config from the repo has security problems.

I was thinking of having some way of storing "suggested config" in the repos that the user would have to explicitly import. It would probably be restricted to some allowlisted keys as well.

The application I had in mind is to suggest a value for immutable_heads(), but suggested hooks make sense too.

The problem here is that the exact UI and minor decisions here have security implications and I don't have a mental model of simple conditions that are "sufficient" for security.

martinvonz · 2024-05-13T05:33:00Z

Actually, there was some "config-based hooks" effort in Git a while ago (a quick googling led to this). I don't know if that series landed (@nasamuffin surely knows), but it's a good idea to at least understand what that patch series was about regardless.

joyously · 2024-05-13T13:53:25Z

I only care that from the perspective of a user of jj, I can make sure that jj doesn't let me upload a bad commit.

I know the jj model is different, but I just wanted to toss this in. The Bazaar/Breezy command structure is a little different from Git in that you work in a branch, merge the main branch in (it changes working copy only), do whatever you want including testing and fixing conflicts, then commit the working copy. This fixes the problem Git has with merge commits and not being able to test before it's merged(committed).
I'm not sure how this would fit in with jj committing everything all the time.
What if the result of a failed hook was visible in the log?

nasamuffin · 2024-05-13T16:24:02Z

config-based hooks

No, it did not land, but not for lack of interest or viability - just for lack of developer time. Google has been carrying the rest of that series downstream for years; we haven't had any problems with it. I'd absolutely recommend that approach for jj over the historical Git approach.

PhilipMetzger · 2024-05-13T19:05:39Z

Sorry for the late response here, I took some time off after that particular conversation. @matts1 and I had a private conversation off-thread.

⚠️, Warning! Wall of text ahed.

Finishing the user story:

jj describe -> write commit message -> submit it

jj-cli publishes an event saying that a commit description has been updated

(this could happen anytime from now to the very end) jj-cli returns, having succeeded

jj daemon subscribes to that event and reads hooks/jj-config.toml (or some config file), and now knows that we have a subscriber to the post-describe hook that will run binary jj run hooks/post-describe

jj daemon runs

jj daemon runs the subscriber via something roughly equivalent to jj run hooks/post-describe (or something similar)

The post-describe binary runs, and the commit message fails validation

And then the daemon re-opens $EDITOR with something like a jj describe @- --add-footer 'BUG=' etc. The UX should feel similar how Git hooks work currently, with a way nicer rollback mechanic if needed. And since subscribers are not bound to a local machine, we can add/create infrastructure which implement those checks on remote workers (like a mini-Tricorder) or build it directly into the forge.

However, the differences between hooks and a pub-sub system (from my perspective, at least) are:

Hooks can fail (and a failure generally results in the failure of the underlying command)

Hooks are synchronous (since you need to know whether the hook succeeded or not)

Hooks are configured by the user (or the repository)

While I agree that this is definitely a "Hook" in the Git sense, my opinion on that is still unchanged that we can achieve the same UX with a pub/sub part in a daemon. And since jj's transactions are cheap, I see no need to fail a command when we can immediately rewrite it via daemon.

pre / post-commit hooks are both unneccesary (jj run mostly makes these redundant) and difficult (it's very difficult to determine a commit intent in jj).

I'm no longer 100% on board with post-describe hooks. I use jj describe to write temporary names for things, similar to a git stash, and having those have to follow the conventions required for a commit that's going to be submitted seems like overkill, when we can just as easily make it a pre-upload hook.

I think there's a definite use case for using pre-describe hooks to pre-fill the commit description. However, the value proposition for it seems limited. Maybe setting the JJ_CHANGE_ID environment variable when running $EDITOR would allow users to configure their editor to do something similar (eg. inherit bugs from parent commits, add "Change-Id" to footer). My main concern here is that different repositories may have different requirements (eg. one that pushes to gerrit may require change-id in the footer, different remotes may have different formats for specifying bugs).

I agree on the conclusion but I think we can achieve the same feel with even with the daemon, as pointed out above by finishing the user story.

Addressing pre-upload hooks aka (`g4 presubmit`/`arc lint`)

I think what we should do is:

Only start by implementing pre-upload hooks

But do so by creating a generic mechanism for hooks, so we can easily add them later elsewhere if required

This is fine standalone, although I'd consider the pre-upload part another FR. 👍

Does your view take into account solo projects, not pushed to a remote? Or contributors that are stuck offline for awhile, and can't push? (just curious)

IIUC, what you're trying to ask here is "can we make sure that pre-upload hooks are able to be run without actually pushing anywhere?". My intention was something similar to jj git push --dry-run except it also runs hooks (we could allow dry-run to run pre-upload hooks, but having --dry-run potentially modify commits may go against what users expect).

Alternatively, we could create a specific command to run your pre-upload hooks.

At this point we've reached the design of g4 presubmit or arc lint which was always fine to me, and advocated for via jj fix/hg fix from @martinvonz.

With jj run, this can be reduced to:

Write code until cargo test passes

Run jj run -r 'mutable()::@' 'cargo fmt && cargo clippy && cargo test (or something along these lines)

jj git push

This is clearly superior, but IMO is still insufficient.

Thanks for recognizing jj run's value.

If I do it this way, there are so many ways to upload an invalid commit:

I sometimes forget to run jj run

I work on several different repositories

I need to both learn and remember precisely which jj run command to run for each repository

I either need to remember the precise jj run invocation (which I often get wrong) or I need to look through my bash history for it (which may have many other random jj run invocations in it)

Sometimes I run something and it passes the checks. I then make a change that I think won't affect things, so I don't re-run those commands, but then it turns out it has affected things (yes, I really am my own worst enemy)

I see the purpose of a pre-upload hook as being complementary to jj run, rather than redundant.

A pre-upload hook is designed to protect you from uploading bad commits

jj run allows you to easily run things on multiple commits at once

I'd like to combine these. For example, I might create a file hooks/pre-upload containing:

And as you allude to, this workflow makes sense with failing in the pre-upload/pre-push step. I don't see any issue with it. And if we have jj fix this concludes to "pre-upload for pr/12345" failed, please run 'jj fix --apply'.

I think all the related things which are proposed are fine as is with minor adjustments.

WDYT of also supporting protobuf as configuration possibility?

TL;DR

You have my buy-in for the reduced scope and moving it out of the CLI.

PhilipMetzger · 2024-05-13T19:14:41Z

Actually, there was some "config-based hooks" effort in Git a while ago (a quick googling led to this).

Looking at the provided technical docs, this feature really looks like the pub/sub system in this FR with the concept of "Events" and all. So if we can provide a better implementation on that front I'm happy.

And that is a really nice design, which sadly hasn't landed in Git yet 😦 .

matts1 · 2024-05-13T21:45:41Z

And then the daemon re-opens $EDITOR with something like a jj describe @- --add-footer 'BUG='

My point here was that by the point the validation had failed, the CLI command has already exited. If that's the case, then the daemon re-opening editor means that it can't simply re-open it in the same terminal. It would have to spin up a new terminal window containing a new instance of $EDITOR, which seems like a subpar user experience. This would work just fine if $EDITOR was a gui application, but if you use an editor such as vim, it doesn't seem great.

WDYT of also supporting protobuf as configuration possibility?

I quite like the idea I proposed where we just use the existing jj configuration file. This could mean that we could just alias jj fix to jj run --alias fix, and have the same for jj lint and jj test. What do you think about my proposal above? You never responded to it, so I don't know how you felt.

PhilipMetzger · 2024-05-14T21:01:15Z

My point here was that by the point the validation had failed, the CLI command has already exited. If that's the case, then the daemon re-opening editor means that it can't simply re-open it in the same terminal. It would have to spin up a new terminal window containing a new instance of $EDITOR, which seems like a subpar user experience.

Ack, that makes sense. I wouldn't consider it a major blocker though, as there surely is a way to make it friendlier.

This could mean that we could just alias jj fix to jj run --alias fix

jj fix will be a standalone command and that has been decided since run was designed.

What do you think about my proposal above? You never responded to it, so I don't know how you felt.

The proposal is fine, although I'd make some different choices.

A bunch of nitpicks:

I don't think run should be responsible for its aliases, as it will purely duplicate existing work.
The aliases table should just expand to normal run invocations¹.
Also there is no need to specify that the command writes, as it is the default.
Looking at the TOML spec, there is no good support for sets, so something better is needed.

There's also the question of virtual aliases, so the pre-upload you defined, as I'm not sure if it already is supported.

And my idea with the protobuf configuration would be for the forge anyway, as having a stable representation there is helpful.

these would be just strings which will be passed as <command> to run, like the aliases are now. ↩

matts1 · 2024-05-14T23:44:16Z

jj fix will be a standalone command and that has been decided since run was designed.

I couldn't find any issues for fix when I looked. Could you link to where this was discussed? I'm a little surprised, since jj fix seems like it's trivial to implement as an alias in your config.toml (eg. fix = ["run", "-r", "immutable_heads()..@", "--no-rebase", "cargo fmt"]' - the no-rebase is an option I would presume exists on jj run to have it not modify the tree of the children, only that commit exactly).

I don't think run should be responsible for its aliases, as it will purely duplicate existing work.

The aliases table should just expand to normal run invocations1.

My assumption was that with my virtual pre-upload step I defined, jj run would be able to come up with a dependency graph. Consider, for example:

[run-aliases]
# People might not expect pre-upload hooks modifying commits.
pre-upload = {"aliases": ["fix", "rebasing", "lint", "test"]}
fix = {write: "true", "cmd": "cargo fmt"}
lint = {"cmd": "cargo clippy"}
test = {"cmd": "cargo test"}

It gets told to run fix, rebasing, lint, then test
This means that there's an operation for fix, lint, and test, on every change in the revset.
It knows that the lint and test command can run at the same time, since they're read-only, but that running it on a given commit requires the fix step on that particular commit to be finished"

This seems difficult to do with a regular alias. But maybe it's not on the table for run at all, in which case I guess a regular alias would work just fine.

Also there is no need to specify that the command writes, as it is the default.

You've probably already considered this, but having write as the default may leave performance on the table, since if I run `jj run -r 'immutable_heads()..@' 'cargo lint', then since I know it's readonly, I should be able to run it in parallel on every revision at once.

Looking at the TOML spec, there is no good support for sets, so something better is needed.

Could you clarify what specifically you needed sets for. IMO, sets can be mostly replaced with lists, and if we really wanted a set type jj can validate that there were no duplicates.

There's also the question of virtual aliases, so the pre-upload you defined, as I'm not sure if it already is supported.

To the best of my knowledge, it's currently unsupported to have multiple jj commands in a single alias, but I do think it'd be helpful. I think something like the following would work relatively well:

[aliases]
lint = ["run", "cargo clippy"]
test = ["run", "cargo test"]
pre-upload = {
    "opts": {
        name: "revision",
        short: "r",
        default: "immutable_heads()..@"
    },
    "commands": [
        # Could potentially add metadata here to create a dependency graph and allow commands to run in parallel.
        ["lint", "--revision=$revision"],
        ["test", "--revision=$revision"],
    ]
}

And my idea with the protobuf configuration would be for the forge anyway, as having a stable representation there is helpful.

This seems like it's introducing yet another configuration file for the forge. Plenty of these exist already. It seems easier to just have the repo define the configuration in a version-controlled file (#3684), and have the forge just invoke jj aliases (eg. jj test -r @ to run the alias test = ["run", "cargo test"]). This would allow the forge to just use the pre-existing configuration files, and also allow the forge to essentially "test" the jj aliases.

PhilipMetzger · 2024-05-15T17:06:28Z

jj fix will be a standalone command and that has been decided since run was designed.

I couldn't find any issues for fix when I looked. Could you link to where this was discussed?

There never was a issue, as it was decided on here and in this Discord discussion (see #3647 for the actual difference to run).

I'm a little surprised, since jj fix seems like it's trivial to implement as an alias in your config.toml (eg. fix = ["run", "-r", "immutable_heads()..@", "--no-rebase", "cargo fmt"]' - the no-rebase is an option I would presume exists on jj run to have it not modify the tree of the children, only that commit exactly).

I get that you're surprised, but since fix is a specialization of run it should have its own implementation. And a --no-rebase flag would be another FR in #1869 butI think your use-case there is already covered by just passing --reparent.

My assumption was that with my virtual pre-upload step I defined, jj run would be able to come up with a dependency graph. Consider, for example:
[run-aliases]
# People might not expect pre-upload hooks modifying commits.
pre-upload = {"aliases": ["fix", "rebasing", "lint", "test"]}
fix = {write: "true", "cmd": "cargo fmt"}
lint = {"cmd": "cargo clippy"}
test = {"cmd": "cargo test"}
It gets told to run fix, rebasing, lint, then test

This means that there's an operation for fix, lint, and test, on every change in the revset.

It knows that the lint and test command can run at the same time, since they're read-only, but that running it on a given commit requires the fix step on that particular commit to be finished"

This seems difficult to do with a regular alias. But maybe it's not on the table for run at all, in which case I guess a regular alias would work just fine.

It never was on the table that run is able to model a dependency graph. The best I can give you is just expanding the aliases in order. Maybe it could be a future extension but IMO it shouldn't be run's feature.

Also there is no need to specify that the command writes, as it is the default.

You've probably already considered this, but having write as the default may leave performance on the table, since if I run `jj run -r 'immutable_heads()..@' 'cargo lint', then since I know it's readonly, I should be able to run it in parallel on every revision at once.

I'm happy to inform you that this is run is parallel by default. But it is an optimization to keep in mind.

Looking at the TOML spec, there is no good support for sets, so something better is needed.

Could you clarify what specifically you needed sets for. IMO, sets can be mostly replaced with lists, and if we really wanted a set type jj can validate that there were no duplicates.

I just understood your suggested syntax (looked like a python dict at first) as that, OTOH I don't think that TOML tables and arrays scale that well (cargo has a good reason to generate the lock file).

There's also the question of virtual aliases, so the pre-upload you defined, as I'm not sure if it already is supported.

To the best of my knowledge, it's currently unsupported to have multiple jj commands in a single alias, but I do think it'd be helpful.

👍

This seems like it's introducing yet another configuration file for the forge. Plenty of these exist already. It seems easier to just have the repo define the configuration in a version-controlled file (#3684), and have the forge just invoke jj aliases (eg. jj test -r @ to run the alias test = ["run", "cargo test"]). This would allow the forge to just use the pre-existing configuration files, and also allow the forge to essentially "test" the jj aliases.

Fair enough.

matts1 · 2024-05-15T23:10:34Z

After thinking about it, I see two possible nice solutions for pre-upload hooks. I quite like the second option:

Option 1: Look for the alias pre-upload

[aliases]
pre-upload = ["run", "-r", "@", "pre-commit run"]

When you run jj git push, it would:

See if aliases.pre-upload exists in the config's cargo.toml
If it exists, run that alias

However, doing so would mean that if we want to ensure that pre-upload hooks are kept in the daemon (and can thus run even if you invoke upload from a tool such as an IDE), we would need to put CLI parsing inside the daemon, which I'm not a huge fan of.

Option 2: Specific pre-upload config section

[pre-upload]
command = "pre-commit run"
# JJ would provide an extra revset here containing the set of revisions that we are planning on uploading.
revset = "mutable()::uploaded_changes()"
# One of rebase, reparent, and readonly
mode = "readonly"

Here, the pre-upload configuration would roughly equate to the command-line options for jj run (and would call whatever function jj run calls under the hood). This would have the advantage of no longer being an arbitrary command, and thus we can have nice TUI integration once it's ready, amongst other things (though we'd need to be careful to ensure that the TUI is only used in jj-cli).

Comments on `jj run`

@PhilipMetzger, a few suggestions on your design doc for jj run (is there a good way to give feedback on design docs on github - comments on a google doc work pretty well, but there doesn't seem to be anything like that for github):

If I understand correctly, exactly one of --dry-run, --rebase, --reparent, and --readonly must be chosen
- Probably worth calling out on there. It's very unclear, especially because --clean is stuck between them.
Based on the fact that you say that you can write by default, and that it's parallel by default, I assume that the default is --reparent?
- Also probably worth calling out
You write that by default, jj run works on @.
- I'd personally like jj run to work by default on mutable()::@ (If I wanted to run jj run -r @ 'command', I'd instead just run command directly).
- I think it'd be valuable to instead do the same thing we do for log, where we add revsets.edit (which could default to @)
No mention was made about the working directory, so I'm curious how relative paths would work
- IMO, we should set the $JJ_REPO environment variable so that a relative path to the root of the repo can be $JJ_REPO/foo
- Maintaining relative paths has value
  - A unix mount namespace could solve this by literally just overlaying our new virtual file system on top of the real one
    - But unclear how this'd work on non-unix systems

arxanas · 2024-05-16T04:01:08Z

I have a great deal written up against pre-commit hooks (as @PhilipMetzger alluded to), but it sounds like pre-commit hooks are not being discussed anymore, so no point in expounding further.

It sounds like there's a great deal of complexity:

pub/sub system
daemon
ordering
blocking
failure
configuration
run integration

This is primarily to solve an interface problem: that the user can accidentally run some commands without satisfying some invariants. Suppose that we just allowed the user to override the built-in commands with their own aliases? It seems like it would be substantially easier, and more flexible anyways. This should still solve the per-repo configuration issue by providing a one-time setup script to run.

For the case of large organizations, it seems they successfully get their developers to use blessed wrapper tools for things like builds, tests, code review, etc., so I don't think the hook thing is as big of a problem. jj is also designed so that individuals or organizations can provide their own custom build by plugging into jj-lib (which is what Google does right now?), so that seems like a natural place to insert any organizational checks, without having to work out a general solution for all the hooks that might ever exist.

I think the jj run implementation details are largely orthogonal, so they should be moved to a different thread. It suffices to decide or not that jj will invoke a run alias as a "hook". You can also see the interface for git test for related discussion.

For what it's worth, I use git test as a primary way to validate commits before sending them for review. I considered making my own wrapper script or integrating it into git submit directly, but the unfortunate fact is that that there is not always one set of suitable checks to run on each commit for a given push:

In some cases, I want to run certain heavy test suites;
in other cases, I want to skip even running formatters.

It's become apparent that one of the primary reasons that I didn't used to run more tests is simply that it was inconvenient, and being able to test/fix in parallel, with caching, without conflicts, improves the situation anyways.

If there are real pushes that need to be prevented (leaking credentials, uploading massive files), then they need to be on the server-side, rather than the client-side anyways.

mlcui-corp · 2024-05-16T05:29:13Z

This is primarily to solve an interface problem: that the user can accidentally run some commands without satisfying some invariants. Suppose that we just allowed the user to override the built-in commands with their own aliases? It seems like it would be substantially easier, and more flexible anyways. This should still solve the per-repo configuration issue by providing a one-time setup script to run.

it seems [large organizations] successfully get their developers to use blessed wrapper tools for things like builds, tests, code review, etc., so I don't think the hook thing is as big of a problem

I agree. Outside of large organisations - like open source GitHub repos - most (from n=3) repos I've seen on tend to encourage running tests (cargo test, npm test, etc.) and linting/formatting manually before uploading. Even internally in google3, you need to run linting/formatting manually before uploading. These checks are often run in CI after upload anyway.

I haven't seen any repository which uses something like https://pre-commit.com, or a wrapper which does checks before uploading, other than large Google git repositories (Android / ChromiumOS with repo, Chromium with git cl + depot-tools). I believe Phabricator also has pre-upload lint checks via arc lint.

On the other hand, this might be a symptom of the tools we're used to. Maybe we're too used to the status quo of "uploading doesn't run checks" because existing tools make it hard to do so, and jj could make it a nicer experience? Does the existence of git wrapper tools like repo, git cl and arc show a deficiency of git, which could be addressed by jj?

My personal opinion: I don't mind aliases to do this for most users, as aliases are very transparent to users and allows for fine-grained customisation. My current Chromium workflow uses git test run -x "my_presubmit.sh" -v "main..@" && git push origin HEAD:refs/for/main for pushing commits - expanding it out like this allows myself to modify the exact behaviour I want (for example, maybe I want to push a specific commit without checking it out, or just run presubmits for some commits), similar to @arxanas's use cases.

I would expect most wrapper tools would have similar behaviour to <run presubmit on REVSET> && jj forge upload [REVSET] [UPLOAD_FLAGS] - if anything more bespoke is needed (like changing upload parameters based on presubmit results), a wrapper tool would make sense. Getting users to add an alias to their repo-specific config is probably the same amount of effort as getting users to add a preupload hook IMO.

matts1 · 2024-05-16T06:08:38Z

It sounds like there's a great deal of complexity:

I don't think this is particularly accurate. These were all being discussed as potential ways to solve the problem, but my most recent proposal was extremely simple. It required

~~pub/sub system~~
~~daemon~~
~~ordering~~
~~blocking~~ (it blocks, but just only insofar as a synchronous function blocks)
~~failure~~ (they can fail, but it's the same as any other rust function failing)
configuration
run integration (sort of - it basically just calls the run function).

I'd estimate that this could be written in ~100 lines of code without too much difficulty.

the unfortunate fact is that that there is not always one set of suitable checks to run on each commit for a given push

I think that's true, but I believe that there is almost always a set of things you want to run before uploading that are sufficiently quick and reliable to put in a pre-upload check. I think that 99% of the time I'm going to want to run jj run -r 'stack(@)' '<fix> && <lint>' before uploading, but I'm also not proposing that you should be locked in to it. ChromeOS uses repo upload --no-verify to skip your presubmits, and I imagine that if we added pre-upload checks, we'd also add an option to skip them.

Suppose that we just allowed the user to override the built-in commands with their own aliases

An alias could solve things for most use cases, but will prevent proper integration with tools such as IDEs. Also, I was told a while back that we were explicitly choosing not to be able to override builtin commands (as doing so would break any tooling that invoked them).

For the case of large organizations, it seems they successfully get their developers to use blessed wrapper tools for things like builds, tests, code review, etc., so I don't think the hook thing is as big of a problem.

I don't think this is a solve for the problem. Sure, it mitigates it, but the way it is currently I need a mental model of which project I'm working on, so I know which wrapper tool to run. If I'm working on rust code in ChromeOS, for example, I need to run cros format to run my pre-upload checks. When I work on other bazel-bazed projects, I need to run buildifier. When I work on jj, I need to run cargo fmt.

Also, when I start contributing on a new project, I don't want to have to spend a few minutes trying to work out how to upload the change correctly (I've done this before). Sometimes it's not clear how to run the formatter (with bazel, for example, they tend to not provide a binary to run the formatter, and expect you to just know to run buildifier to fix and buildifier --lint to lint).

jj is also designed so that individuals or organizations can provide their own custom build by plugging into jj-lib (which is what Google does right now?), so that seems like a natural place to insert any organizational checks, without having to work out a general solution for all the hooks that might ever exist.

This doesn't really work. I work at google, and so I use google's jj, but I also work on a variety of open source projects hosted by a variety of companies. It's also a lot easier to convince a team of developers who primarily use git to just add a few lines of config to their repo to support jj developers than it is to convince them to literally maintain a whole rust project for that one jj developer who works on their repo.

On the other hand, this might be a symptom of the tools we're used to

I strongly believe that this is the case. I've been exposed to repo upload, so when I first switched to jj, I would never bother linting / fixing my commits, push directly to gerrit, and end up submitting bad code (the linter is intenttionally disabled on the remote for us, because they assume that you can't not run it, and thus if you submit code that fails lints there's a reason for it). Even if that weren't the case, it will be several minutes before the presubmits actually get run, so it would take ages before I was notified of a given failure.

I don't mind aliases to do this for most users

I'm not a fan of aliases. I use them because there's not a better solution, but I don't think it's the best solution. I think that uploading to Chrome (git cl upload), chromeos (repo upload), and any random open source project (git push) should all use the exact same command. If there is presubmits, the user should never be allowed to skip them when uploading unless they explicitly say something like --no-verify. What I want is not a new, safe path. I want guardrails on the existing path.

mlcui-corp · 2024-05-17T04:07:10Z

I am still not sure about how I feel about pre-upload hooks.

For me, this is a tradeoff, and possibly a philosophical question of how jj should work - should jj commands be minimal and focused, or be "batteries included" (to be more specific - "have space for batteries (hooks) for users, if they want to plug some in")?

I think there is elegance in supplying a minimal tool which does a single job well (just uploading to remotes), akin to the Unix philosophy. Maintenance of a minimal interface becomes easy, and tools can be very focused. Composition (like using aliases) would be encouraged.

On the other hand, having hooks built in would definitely reduce the friction to have an "all-in-one" upload command, apart from initial hook setup, compared to the alternatives:

With an alias like <run preupload on REVSET> && jj upload [REVSET] [UPLOAD_FLAGS], it might be difficult to pass [UPLOAD_FLAGS] into the push/upload command without writing a full Bash function / Python script.
(as @matts1 mentioned) Bespoke jj-lib usage just for having an "all-in-one" upload command would require writing a full Rust project.

Adding pre-upload hooks would also encourage a standard of sharing common commands of "lint / fix / test a revset", which would make it much easier to have the "exact same command" behaviour across repos. We may need to be careful about how much of jj run we ingrain in this - consider a hook implemented with jj-lib, which could analyse a series of commits without checking them out.

I think the "exact same command" benefit is a bit unrelated to the question of whether we want pre-upload hooks in the first place. A user can get the "exact same command" behaviour by creating some script which runs the things they want before upload, in any repository, and use run_preupload_for_repository [REVSET] && jj upload [REVSET] [UPLOAD_FLAGS] in any repository. However, as mentioned above, this is a fair amount of friction compared to a pre-upload hook, and would not have the easily shareable standard (a jj configuration) that a pre-upload hook would have.

My only concern about doing adding pre-upload hooks would be the extra cost to jj:

complexity cost in the implementation - it might make the implementation harder to work on. The ~100 line estimate by @matts1 feels promising, though.
complexity cost in the interface - we would need to add at least one flag (like --skip-hooks) to the CLI, and possibly more (like --dry-run, which could possibly be implemented as a different command).

If this cost isn't too great, I'd be for adding in pre-upload hooks - even if it is just for encouraging a standard of sharing lint / fix / test commands.

bbigras · 2024-05-17T05:57:18Z

I miss being able to use https://github.com/cachix/git-hooks.nix

PhilipMetzger · 2024-05-24T17:27:08Z

My opinion on pre-upload hooks is very neutral, as they're primarily for interop and not a concept I'd want to keep long-term, we should rather fix it in a server implementation when we get there.

For me, this is a tradeoff, and possibly a philosophical question of how jj should work - should jj commands be minimal and focused, or be "batteries included" (to be more specific - "have space for batteries (hooks) for users, if they want to plug some in")?

While the interface is generally simple and powerful, better forge integration has been a major point for a long time (#485). And providing something like pre-upload hooks are probably required to migrate off of Git, as a bunch of tooling sadly depends on it (like Nix via Cachix or Chromium's PRESUBMIT.py).

On the other hand, this might be a symptom of the tools we're used to

I strongly believe that this is the case.

I strongly agree too, but think we should fix it with our server implementation, when we get there. But as interop is important for jj, we should provide at least a minimal interface, which pre-upload hooks are.

Adding pre-upload hooks would also encourage a standard of sharing common commands of "lint / fix / test a revset", which would make it much easier to have the "exact same command" behaviour across repos.

I think it's a major benefit if its done via jj instead of having 30'000 different tools with custom flavors, essentially doing the same thing.

We may need to be careful about how much of jj run we ingrain in this - consider a hook implemented with jj-lib, which could analyse a series of commits without checking them out.

I don't think exposing another interface than jj run or jj fix should be planned, as these commands can cover many workflows as fix rewrites commits in memory anyway.

If this cost isn't too great, I'd be for adding in pre-upload hooks - even if it is just for encouraging a standard of sharing lint / fix / test commands.

Agreed, they're the best solution until we have a better design for a native future.

talios · 2024-09-21T09:06:21Z

I came here looking for git-hook support as well, specifically commit-msg, we use Gerrit for our code review system, which appends a Change-ID footer to the initial commit message to track revised patch-sets.

Looks like I've got a lot of reading above to checkout.

martinvonz · 2024-09-23T09:23:44Z

I came here looking for git-hook support as well, specifically commit-msg, we use Gerrit for our code review system, which appends a Change-ID footer to the initial commit message to track revised patch-sets.

Looks like I've got a lot of reading above to checkout.

In case it saves you or someone else some time, there's PR #2845 for adding a "jj gerrit send" command.

MrJohz · 2024-11-28T09:23:52Z

A case that I don't think I've seen mentioned yet in this issue, but that I'd find useful would be a hook for "Jujutsu has changed a file on disk".

As a practical example, I have a frontend project built using Node and NPM. NPM has a lock file (package-lock.json) which represents the exact set of dependencies that my project is using. Whenever I run an NPM command (e.g. to install a new module), the command also updates this lock file so that it stays in sync with the dependencies downloaded into my project. The lock file is kept under version control, but the dependency files themselves are ignored. This means that at any point, I can run the command npm ci which will reset the installed dependencies back to the state found in package-lock.json.

What I want is that when I switch between two branches/commits with different dependencies (i.e. where the lock file is different between the two checkouts), npm ci will automatically run to keep my dependencies in sync. With Git, I've used the post-checkout hook to do this — it checks if the lock file is in the list of changed files, and if so, runs npm ci. This works fairly well, especially when doing things like git bisect over a set of commits where I've updated dependencies, although it isn't quite perfect (e.g. it can be a bit funky during rebases).

This approach could also be useful for Python (i.e. keeping a venv up-to-date), and probably other languages which have a package manager that installs dependencies locally when it runs (Ruby, PHP, potentially others).

Alternatives:

In Rust projects, even though there's also a lock file and a target directory with the dependencies installed into it, this isn't such a problem because cargo generally updates the target directory whenever it runs (e.g. build, test). However, I'm quite tied into the JS ecosystem which assumes that the contents of node_modules is always the current set of project dependencies.
I could also set a file watcher to run whenever package-lock.json changes. The problem here is that NPM typically changes the lock file in sync with the downloaded dependencies — I don't need to run an extra npm ci in this case. The only changes to the lock file that don't sync with the downloaded dependencies are the ones done by my VCS, which is why I think a hook would be appropriate in this case.

This is mostly a quality-of-life feature, and it's possible to get by without it. But it could be very useful, particularly given how easy jj makes it to switch between different changes.

arxanas · 2024-11-28T21:06:11Z

@MrJohz That's interesting. I'm curious to know:

Is the result of npm ci deterministic given the tracked working copy state?
- Is it networked? Does it rely on the state of package indexes elsewhere?
Is the result of the command idempotent?
Is it "expensive" in some sense to run it? (Sounds like yes, given that you want to avoid running it unnecessarily?)
Are there cases where the files are out of sync, but you don't want to update them anyways (perhaps because it's expensive to do so and you know that the result shouldn't matter)?

There might be some overlap with the design for jj fix/jj run commands, given that the workflow involves "fixing" files in the working copy by running a certain command.

matts1 · 2024-11-28T22:39:42Z

The way I see it, pre-x hooks need to be hooks because there's a strict "this thing must happen before jj does this thing". Anything that's implemented as a post-x in git, I think would be better suited to a pub-sub event in the jj api (#3219, design doc). @MrJohz How do you feel about that?

arxanas · 2024-11-29T22:02:28Z

The way I see it, pre-x hooks need to be hooks because there's a strict "this thing must happen before jj does this thing". Anything that's implemented as a post-x in git, I think would be better suited to a pub-sub event in the jj api (#3219, design doc). @MrJohz How do you feel about that?

If I were doing this workflow (and maybe I'll try it out now, since I have some similar workflows 👀), I would be relying on the strict ordering between my post-checkout hook finishing and jj returning control to me. Otherwise, various commands might fail or produce garbage results if I switch commits but run them before the update command finishes.

[aside] There might be even more general infrastructure that would be helpful for various tooling, like a method of querying jj for the files that have changed in the working copy since a previous query's timestamp. Then my update command could inspect the list and more intelligently decide what work to do. This is a general problem that many organizations build tools to solve, for which it would be nice if source control were a little more helpful, but outside the scope of "hooks".

MrJohz · 2024-11-29T22:39:25Z

@arxanas

Is the result of npm ci deterministic given the tracked working copy state?

npm ci is deterministic modulo network. That is, if you assume that all relevant external repositories are accessible and behaving correctly, then npm ci will always install the same files to the same locations given a particular package-lock.json. There are additional flags that can be used to remove the network element and tell NPM to install from a local cache, in this case it is deterministic with the assumption that all relevant packages are available in the cache.

Is the result of the command idempotent?

Yes, if you rerun it multiple times, it will always produce the same really (again modulo network and cache).

Is it "expensive" in some sense to run it? (Sounds like yes, given that you want to avoid running it unnecessarily?)

Yes, generally. In my case, running npm ci takes maybe 3-5 seconds to run, during this time I can't really do much with my code because the dependencies aren't available yet. Also reducing unnecessary calls to the publicly available resource that is the NPM package repository is probably good citizenship! ;)

Are there cases where the files are out of sync, but you don't want to update them anyways (perhaps because it's expensive to do so and you know that the result shouldn't matter)?

Potentially, yes - in theory, I only need to run npm ci (assuming the lock file has changed) if I intend to run the code. If I just wanted to check a commit out to read the contents of the repository at that commit, I might not need to run npm ci (although it might still be useful for providing up-to-date LSP completions, type checking, etc).

That said, in my experience, even if I only intend to read the code, it's often still more useful to run the command anyway, because it's not unusual in that situation to later want to run the code as well, and running the command automatically saves me from forgetting what's going on.

There might be some overlap with the design for jj fix/jj run commands, given that the workflow involves "fixing" files in the working copy by running a certain command.

Potentially, although my impression is that those are principally for running a command across multiple revisions (with rebases if changes are made). In this case, it would only ever be useful to run npm ci for the revision currently checked out, as the command essentially only makes changes to a folder that is ignored by the VCS.

The way I see it, pre-x hooks need to be hooks because there's a strict "this thing must happen before jj does this thing". Anything that's implemented as a post-x in git, I think would be better suited to a pub-sub event in the jj api (#3219, design doc). @MrJohz How do you feel about that?

That could be a good solution. The only things I would be concerned about would be:

In general usage, being able to see when the command has finished running, as only then can I start working. In practice, it's often kind of obvious, but if I ever wanted to do something programmatic (i.e. a script that issues jj commands), I'd usually want to ensure that all of the event subscriptions had finished executing before doing my next step.
I have found the post-checkout hook in git really useful while bisecting to avoid confusion about which dependencies are currently installed. It would be important, particularly with a bisect run command, that the bisection test command only runs after all the event subscribers have finished running.

andrewaylett · 2025-02-01T16:23:18Z

I'm sure not every user of git hooks has the same model as me, but also I think jj is different enough that it makes sense to build stories about what people achieve with their hooks, and how we can model that (and even improve on it!) with the different way of looking at things that jj gives us.

The way I'd like git's pre-commit hook to translate to jj is more akin to a "post-update-change-with-new-commit" hook -- and I'd suggest it would be reasonable to treat it in a similar way to the very nice way that jj's automatic rebasing "never fails", by adding a flag to the change if the hook is unsuccessful or modifies files. I'd be quite happy if the hook ran asynchronously, maybe after a delay (or fully lazily) in case I make another change, but the important (to me) thing is that (as with conflicts) I can see that I'm not yet finished with a change if the hook isn't happy, even if I'm not necessarily going to do anything about that just yet.

That would be so much nicer than what git hooks give me today, especially when rebasing.

crystalthoughts · 2025-02-06T09:53:38Z

Wanting post-commit for ci/cd or auto-push is why I landed here

PhilipMetzger added the enhancement New feature or request label Apr 26, 2024

matts1 mentioned this issue Apr 30, 2024

Prototype for jj api #3601

Draft

4 tasks

matts1 mentioned this issue May 14, 2024

FR: Per-repo jj configuration #3684

Open

matts1 mentioned this issue Jun 14, 2024

FR: Enhance aliases to run multiple commands with conditional logic #3673

Open

matts1 mentioned this issue Jul 26, 2024

FR: Support jj fix -r X to fix X with minimal effects on X.. #3805

Open

PhilipMetzger mentioned this issue Aug 17, 2024

FR: Implement JJ run #1869

Open

5 tasks

This was referenced Nov 9, 2024

util: add exec command for arbitrary aliases #4759

Merged

FR: Daemon for managing multiple repositories. #4854

Open

david-crespo mentioned this issue Jan 9, 2025

Remove format action oxidecomputer/console#2642

Merged

PhilipMetzger mentioned this issue Jan 27, 2025

Integrate with https://pre-commit.com/ #405

Open

PhilipMetzger mentioned this issue Feb 6, 2025

git: skip running pre-push hooks when subprocessing to match git2 behaviour #5612

Merged

arxanas mentioned this issue Feb 8, 2025

Git operations with git.subprocess=true fail if hook prints unexpected text #5610

Closed

FR: Generalized hook support #3577

FR: Generalized hook support #3577

Comments

matts1 commented Apr 26, 2024

PhilipMetzger commented Apr 26, 2024

matts1 commented Apr 28, 2024

PhilipMetzger commented Apr 29, 2024

matts1 commented Apr 29, 2024 • edited Loading

martinvonz commented Apr 30, 2024

matts1 commented May 1, 2024

martinvonz commented May 1, 2024

matts1 commented May 1, 2024

Upload hooks

Pre-upload hooks

Post-upload hooks

Describe hooks

Commit hooks

Other Git hooks

joyously commented May 1, 2024

PhilipMetzger commented May 1, 2024

Describe hooks

matts1 commented May 2, 2024

matts1 commented May 13, 2024

matts1 commented May 13, 2024

joyously commented May 13, 2024

matts1 commented May 13, 2024 • edited Loading

joyously commented May 13, 2024

matts1 commented May 13, 2024

matts1 commented May 13, 2024 • edited Loading

matts1 commented May 13, 2024 • edited Loading

Step 1: Multi-layered config files

Step 2: jj run aliases

Step 3: The hook

martinvonz commented May 13, 2024

matts1 commented May 13, 2024

ilyagr commented May 13, 2024 • edited Loading

martinvonz commented May 13, 2024

joyously commented May 13, 2024

nasamuffin commented May 13, 2024

PhilipMetzger commented May 13, 2024

⚠️, Warning! Wall of text ahed.

Finishing the user story:

Addressing pre-upload hooks aka (g4 presubmit/arc lint)

TL;DR

PhilipMetzger commented May 13, 2024

matts1 commented May 13, 2024

PhilipMetzger commented May 14, 2024

Footnotes

matts1 commented May 14, 2024

PhilipMetzger commented May 15, 2024 • edited Loading

matts1 commented May 15, 2024 • edited Loading

Option 1: Look for the alias pre-upload

Option 2: Specific pre-upload config section

Comments on jj run

arxanas commented May 16, 2024

mlcui-corp commented May 16, 2024

matts1 commented May 16, 2024 • edited Loading

mlcui-corp commented May 17, 2024 • edited Loading

bbigras commented May 17, 2024

PhilipMetzger commented May 24, 2024

talios commented Sep 21, 2024

martinvonz commented Sep 23, 2024

MrJohz commented Nov 28, 2024

arxanas commented Nov 28, 2024

matts1 commented Nov 28, 2024

arxanas commented Nov 29, 2024

MrJohz commented Nov 29, 2024 • edited Loading

andrewaylett commented Feb 1, 2025

crystalthoughts commented Feb 6, 2025

matts1 commented Apr 29, 2024 •

edited

Loading

matts1 commented May 13, 2024 •

edited

Loading

matts1 commented May 13, 2024 •

edited

Loading

matts1 commented May 13, 2024 •

edited

Loading

Step 2: `jj run` aliases

ilyagr commented May 13, 2024 •

edited

Loading

Addressing pre-upload hooks aka (`g4 presubmit`/`arc lint`)

PhilipMetzger commented May 15, 2024 •

edited

Loading

matts1 commented May 15, 2024 •

edited

Loading

Comments on `jj run`

matts1 commented May 16, 2024 •

edited

Loading

mlcui-corp commented May 17, 2024 •

edited

Loading

MrJohz commented Nov 29, 2024 •

edited

Loading