Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Can't read a file on standard input #239

Closed
baodrate opened this issue Nov 5, 2021 · 9 comments · Fixed by #306
Closed

Can't read a file on standard input #239

baodrate opened this issue Nov 5, 2021 · 9 comments · Fixed by #306
Assignees
Labels
enhancement New feature or request
Milestone

Comments

@baodrate
Copy link
Contributor

baodrate commented Nov 5, 2021

Is it possible to use darker on a stdin stream, or if I assume correctly that it's not, could this be implemented? Of course darker relies on having access to the file path and git history to assess which formatted blocks to keep, but this can be handled by taking an explicit filepath as an argument.

Prior art to reference would be clang-format's --assume-filename, isort's --filename, prettier's --stdin-filepath, among others.

The motivation would be for improved integration with tooling (e.g. w/ ALE)

@akaihola akaihola added the enhancement New feature or request label Nov 5, 2021
@akaihola
Copy link
Owner

akaihola commented Nov 5, 2021

Hi @qubidt, thanks for the suggestion!

Yes, this sounds like a good idea if it enables use of Darker in yet another development environment.

Since you mentioned ALE, do you know if it assumes a name for an explicit filepath argument which would make it particularly easy to interoperate with Darker?

Among the argument names you mentioned, I think --stdin-filepath is the best match, since Darker already has an --stdout argument.

(update: Note that Black has an --stdin-filename argument)

@akaihola akaihola added this to the 1.5.0 milestone Nov 5, 2021
@baodrate
Copy link
Contributor Author

baodrate commented Nov 8, 2021

do you know if it assumes a name for an explicit filepath argument which would make it particularly easy to interoperate with Darker?

If I'm parsing this question correctly, yes. ALE has built in support for many formatters, (among them isort and black). Most of them support a stdin mode (some require some equivalent of a --stdin-filepath like parameter). For these it pipes the contents of the current buffer into the tool's stdin and (if required) passes the buffer's filename as an argument:

https://github.com/dense-analysis/ale/blob/f37cd1fd4fc17173a98649d8a0b2f37ce7ba61cf/autoload/ale/fixers/clangformat.vim#L17-L40

For the ones that can't operate on stream input it seems to write the buffer to a temporary file which it then calls the formatter on:

https://github.com/dense-analysis/ale/blob/f37cd1fd4fc17173a98649d8a0b2f37ce7ba61cf/autoload/ale/fixers/mix_format.vim#L11-L23

This obviously wouldn't work for darker because we need to know which file's git history to compare against.

Yes, this sounds like a good idea if it enables use of Darker in yet another development environment.

Terrific. If I have some time this (or next) week I'll try to work on this, if no one else gets to it first.

@akaihola
Copy link
Owner

ALE has built in support for many formatters, (among them isort and black). Most of them support a stdin mode (some require some equivalent of a --stdin-filepath like parameter). For these it pipes the contents of the current buffer into the tool's stdin and (if required) passes the buffer's filename as an argument:

And this would be our best approach, I think. But does ALE then need to be modified in order to add Darker as a supported formatter, along with the name of its stdin option? Or can you configure the name of the binary and options to pass without patching ALE?

For the ones that can't operate on stream input it seems to write the buffer to a temporary file which it then calls the formatter on:
[...]
This obviously wouldn't work for darker because we need to know which file's git history to compare against.

VSCode uses temporary files as well, and Darker 1.3.2 knows to translate a temporary file path like path/file.py.<hash>.tmp to the corresponding original Python file path like path/file.py before retrieving the old version from the repository.

How does ALE name and place the temporary files?

@akaihola
Copy link
Owner

akaihola commented Feb 24, 2022

I think the stdin mode should only be valid when rev2 of -r/--revision is :WORKTREE: (the default). Since if the revision to compare is already in Git history, Darker could just as well get the contents from there.

(edit: see discussion about ..:STDIN: further below)

@akaihola akaihola mentioned this issue Feb 24, 2022
14 tasks
@akaihola
Copy link
Owner

akaihola commented Feb 24, 2022

Oh, and @qubidt, isn't there the option to do without an --stdin-filepath argument and just make sure the positional PATH arguments only point to a single file if a boolean --stdin argument is enabled?

This would of course make Darker invocation a bit different from the other tools you mentioned, but on the other hand it would fit Darker's existing CLI better.

I mean, if an --stdin-filepath was provided on the command line, what would we expect from and do with positional PATH arguments? Disallow them? Ignore them? Assert that the option and PATH arguments are equivalent?

Examples of happy cases:

$ darker --stdin-filepath=src/mypkg/foo.py  # missing `PATH`s, assumed same as `--stdin-filepath`
$ darker --stdin-filepath=src/mypkg/foo.py src/mypkg/foo.py  # `PATH` equivalent to `--stdin-filepath`
$ darker --stdin src/mypkg/foo.py  # unique `PATH`, implied as the input filepath

Problematic cases:

$ darker --stdin-filepath=src/mypkg/foo.py src/  # `PATH` is a directory, what to do with all files within?
$ darker --stdin-filepath=src/mypkg/foo.py src/mypkg/*.py  # `PATH` expands to multiple files, what to do?
$ darker --stdin src/  # what if `PATH` directory has multiple `.py` files?

(update: Note that Black has an --stdin-filename argument)

@akaihola akaihola modified the milestones: 1.5.0, 1.6.0 Feb 24, 2022
@baodrate
Copy link
Contributor Author

baodrate commented Feb 24, 2022

I mean, if an --stdin-filepath was provided on the command line, what would we expect from and do with positional PATH arguments? Disallow them? Ignore them? Assert that the option and PATH arguments are equivalent?

clang-format behaves thusly:

$ clang-format                # clang-format blocks, waiting for input from stdin

$ clang-format main.cpp       # content of main.cpp is formatted and written to stdout

$ clang-format -i main.cpp    # main.cpp is formatted in place

$ clang-format < main.cpp     # main.cpp is formatted (can't tell if it simply assumes cpp or if it does any
                              # parsing) and written to stdout

$ clang-format -i --assume-filename test.cpp < main.cpp
error: cannot use -i when reading from stdin.

# all of these cases ignore the stdin and --assume-filename argument
$ clang-format mod.cpp < main.cpp
$ clang-format --assume-filename test.cpp mod.cpp < main.cpp
$ clang-format -i --assume-filename test.cpp mod.cpp < main.cpp

prettier:

$ prettier < javascript.js
[error] No parser and no file path given, couldn't infer a parser.

$ prettier        # I guess prettier detects if stdin is a tty
Usage: prettier [options] [file/dir/glob ...]
  ...

$ prettier foo.js          # formatted written to stdout

$ prettier -w foo.js       # formatted in-place

$ prettier --stdin-filepath bar.js < foo.js             # formatted js written to stdout

$ prettier -w --stdin-filepath bar.js < foo.js          # formatted js written to stdout (-w is ignored,
                                                        # whether # the hinted filepath exists or not)

$ prettier --stdin-filepath bar.js baz.js < foo.js      # formatted baz.js is written to stdout, stdin is ignored

$ prettier -w --stdin-filepath bar.js baz.js < foo.js   # # baz.js is formatted in-place, stdin is ignored

To summarize, both tools only read from stdin when no positional-arg paths are provided.
The difference from darker (and black) is that they do not format files in-place by default.
This suggests to me that you're right and a different interface would make sense (I haven't thought it all
through yet tho)


$ darker --stdin src/mypkg/foo.py  # unique `PATH`, implied as the input filepath
$ darker --stdin src/  # what if `PATH` directory has multiple `.py` files?
$ darker --stdin-filepath=src/mypkg/foo.py  # missing `PATH`s, assumed same as `--stdin-filepath`

This is why it makes more sense to me to make the parameter explicitly take an argument, which, unlike the
PATH positional-args, must be a file. That avoids this ambiguity.

  • If --stdin-filepath is specified always disallow positional args
    • this would go beyond the behavior of either the tools I referenced, but that seems reasonable to me since
      darker/black by default imply in-place behavior
  • If --stdin-filepath's argument must be a filepath

This kind of stuff can be tricky to implement in in argparse tho so maintainability should def be considered when picking a CLI design.

$ darker --stdin-filepath=src/mypkg/foo.py src/mypkg/foo.py  # `PATH` equivalent to `--stdin-filepath`
$ darker --stdin-filepath=src/mypkg/foo.py src/  # `PATH` is a directory, what to do with all files within?
$ darker --stdin-filepath=src/mypkg/foo.py src/mypkg/*.py  # `PATH` expands to multiple files, what to do?

All of these should error out.


That's just my immediate observations tho. I'm not super acquainted with either black or darker. (I use them a decent amount but mostly through automated hooks, I rarely work with them thru the CLI or programatically).

I'm not strongly opinionated about this and I obviously don't know darker that well so I'll of course defer to you when it comes to the design.

Sorry I haven't touched this yet, btw. I got busier and this fell off my plate

@akaihola
Copy link
Owner

akaihola commented Feb 24, 2022

Thanks @qubidt for your detailed examples and reasoning. It all makes sense to me.

I think we have two good candidates for the command line design:

  1. boolean --stdin argument and exactly one positional file path:
$ darker --stdin [...other options...] src/mypkg/foo.py
  1. --stdin-filepath=<path> with no positional file paths:
$ darker --stdin-filepath=src/mypkg/foo.py [...other options but no positional arguments...]

(also, I edited both my and your earlier comments to rename --input-filepath to --stdin-filepath to reduce confusion, sorry for messing that up earlier)

(update: Note that Black has an --stdin-filename argument)

@akaihola akaihola changed the title stdin mode? Can't reat a file on standard input Apr 6, 2022
@akaihola akaihola changed the title Can't reat a file on standard input Can't read a file on standard input Apr 6, 2022
@akaihola
Copy link
Owner

Black actually has

  --stdin-filename TEXT           The name of the file when passing it through
                                  stdin. Useful to make sure Black will
                                  respect --force-exclude option on some
                                  editors that rely on using stdin.

So I guess --stdin-filename is a better name for the option than --stdin-filepath since Black is a more closely related tool to Darker than Prettier.

@akaihola akaihola modified the milestones: 1.6.0, 1.7.0 Nov 15, 2022
@akaihola akaihola self-assigned this Jan 6, 2023
@akaihola
Copy link
Owner

akaihola commented Jan 6, 2023

In #306, I've been working on the design with --stdin-filename=<filepath> (and no positional paths allowed).

The :STDIN: design

Yet another possible command line design occurred to me:

darker --revision=..:STDIN: myfile.py

As with the proposed --stdin design earlier in this thread, :STDIN: as rev2 would only accept one positional path which must not be an existing directory on disk. It may be an existing file, but may as well not exist at all.

What does rev2 mean?

The :STDIN: design would elegantly work around the issue of what does rev2 in --revision=<rev1>..<rev2> really mean when --stdin or --stdin-filename= is used (I proposed earlier that it would have to be :WORKTREE:, or empty which defaults to :WORKTREE:).

Solving rev2 in the --stdin or --stdin-filename= design

:WORKTREE: as rev2 doesn't actually make sense when reading from standard input. The contents of the file aren't necessarily really in the worktree. The file might not exist at all, or its contents might be different from what is fed into the standard input of Darker.

We could actually solve this in the --stdin and --stdin-filename designs as well. Instead of requiring rev2 to be :WORKTREE:, we would require it to be :INPUT:, and set the default value to :INPUT:.

In non-stdin mode, the default is:

command line parsed revision range
darker f.py HEAD..:WORKTREE:

With the proposed --stdin --revision=<rev1..rev2> path.py or --stdin-filename=path.py --revision=<rev1..rev2> designs, defaults would be:

revision argument parsed revision range
--revision=HEAD..:STDIN: HEAD..:STDIN:
--revision=..:STDIN: HEAD..:STDIN:
--revision=.. HEAD..:STDIN:
(no --revision=) HEAD..:STDIN:
--revision=main..:STDIN: main..:STDIN:
--revision=main.. main..:STDIN:
--revision=:WORKTREE:..:STDIN: :WORKTREE:..:STDIN:
--revision=:WORKTREE:.. :WORKTREE:..:STDIN:
--revision=..HEAD ❗ ERROR ❗

Hybrid design

We could actually make defaults work both ways:

  • --revision=<rev1>..:STDIN: path.py would imply --stdin [...] path.py or --stdin-filename=path.py
  • --stdin [...] path.py or --stdin-filename=path.py would replace an empty rev2 with :STDIN:, and error out if a different rev2 was specified

What about the configuration file?

Would we accept stdin = true or stdin-filename = "path.py" in pyproject.toml?

Would those be overridable on the command line with --no-stdin or --no-stdin-filename?

Should --revision=..:STDIN: --no-stdin or --revision=..:STDIN: --no-stdin-filename cause an error?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
Development

Successfully merging a pull request may close this issue.

2 participants