Multiple-IO-redirection semantics ambiguous (how to swap stdout/stderr?) #733
Comments
This can be achieved by using
|
Thanks, @zzamboni. The issue is still a bug though. |
The cause is that Elvish always closes the LHS of redirects: Line 511 in f9ef4e6
So Elvish actually closes fd 2 when it sees We will need some sort of reference counting system to make this work correctly. |
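To illustrate why closing the left-hand side eagerly can break a swap, and how reference counting would help, here is a toy Python model. It is only a sketch: `FakeFile`, `redirect`, `swap_via_temp`, and the dict-based table are hypothetical names invented for illustration, and the model is an assumption about the failure mode, not a description of Elvish's actual implementation.

```python
class FakeFile:
    """Stand-in for an open file shared by one or more FD slots (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self.refs = 0
        self.closed = False


def redirect(table, dst, src, refcount):
    """Model `dst>&src`: make slot dst refer to the file currently in slot src."""
    new = table[src]
    old = table.get(dst)
    if refcount:
        new.refs += 1
        if old is not None:
            old.refs -= 1
            old.closed = old.refs == 0  # close only when the last reference is gone
    elif old is not None:
        old.closed = True               # eager close, even if another slot shares it
    table[dst] = new


def swap_via_temp(refcount):
    """Apply the model to 3>&1 1>&2 2>&3 and return the resulting table."""
    out, err = FakeFile("out"), FakeFile("err")
    out.refs = err.refs = 1
    table = {1: out, 2: err}
    for dst, src in [(3, 1), (1, 2), (2, 3)]:
        redirect(table, dst, src, refcount)
    return table
```

In this model, `swap_via_temp(refcount=False)` leaves both slot 1 and slot 2 pointing at closed files, while `swap_via_temp(refcount=True)` ends with the two files swapped and still open.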
Swapping file descriptors was never real easy to follow in Unix shells anyway, IMO - if you don't look at each redirection in the chain in terms of the dup() call the shell will perform it all seems backwards and confusing. Maybe it's better to think of other ways of expressing it? For instance, combine redirection with braced lists:
Or alternately, instead of treating redirections as a sequence of dup()'s performed in order, make the right-hand side file descriptors always refer to the file table as it was before the redirections were performed: so that it doesn't matter in what order a set of redirections is listed:
The file descriptors on the RHS always refer to the file table that the command would have received if there hadn't been any redirections at all. The file descriptors on the LHS always refer to the file table that will exist in the new command's process. This does potentially create problems when joining streams, however: A command like this to join the stdout and stderr of a command in a pipeline would work as long as we consider the pipe to already be part of the file table (and thus, &1 is the pipe) before we do redirections:
But joining stdout and stderr and redirecting to a file would be a problem:
That might be where we'd need to look at braced lists or something again:
Alternately, join streams inside a lambda, and redirect to the file outside the lambda:
|
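The difference between the two evaluation models discussed in the comment above can be made concrete with a small simulation. This is a sketch under stated assumptions: the FD table is modelled as a plain dict, each redirection is a `(dst, src)` pair meaning `dst>&src`, and the function names `apply_sequential` and `apply_snapshot` are made up for illustration.

```python
def apply_sequential(table, redirs):
    """POSIX-style: each dst>&src sees the effect of the redirections before it."""
    result = dict(table)
    for dst, src in redirs:
        result[dst] = result[src]
    return result


def apply_snapshot(table, redirs):
    """Proposed style: every src is looked up in the table as it was beforehand."""
    result = dict(table)
    for dst, src in redirs:
        result[dst] = table[src]
    return result


original = {1: "pipe", 2: "tty"}

# Naive swap: sequential evaluation loses the old stdout, snapshot does not.
print(apply_sequential(original, [(1, 2), (2, 1)]))  # {1: 'tty', 2: 'tty'}
print(apply_snapshot(original, [(1, 2), (2, 1)]))    # {1: 'tty', 2: 'pipe'}
print(apply_snapshot(original, [(2, 1), (1, 2)]))    # {1: 'tty', 2: 'pipe'}
```

Under the snapshot model the order of the redirections does not matter, which is the property the proposal is after. Under the sequential model a swap needs a temporary descriptor.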
@zakukai I like the idea of using a braced list to swap FDs. The other part of your proposal - making RHS refer to the original FD table - sounds problematic, as you have pointed out yourself. I am particularly worried that this subtle difference will be very confusing for people familiar with POSIX shell. |
I think that is generally the nature of this endeavour - creating a shell that is in some ways very similar to a POSIX shell but in some ways very, very different. There's always a balance to be struck between the familiar and the novel - but a shell like "elvish" is already moon-speak to someone steeped in the POSIX way of doing things (IMO anyway) and if they weren't open to trying a different approach they wouldn't be using it. :) Some old design choices are just past due to be revisited. Personally my take is that the POSIX shell way of handling this is just plain confusing anyway. For people who understand the POSIX redirections, or at least have memorized a few idiomatic use cases like joining (2>&1) and swapping (3>&1 1>&2 2>&3 3>&-), it's true, it will confuse them that it works differently. But I think my suggestion is at least easy to understand once people understand the concept. In proposing this I kind of had to shoot holes in it to see where it would have to lead. I think most of the "problems" I identified were solved pretty neatly, and it's really just when redirecting to named files (rather than other, already-open FDs) that it becomes challenging: The basic problem is that redirecting to a file carries side-effects, so redirecting to the same target twice (with one-to-one redirections) wouldn't work:
And I can't open the file and then join the streams using one-to-one redirects because under my proposal, redirections can't reference each other:
It's a problem but I think N-to-one redirects solve it pretty well. |
True. However, my rule of thumb is if a piece of Elvish code looks exactly like a piece of POSIX shell code, it should either do exactly the same thing, or something totally different, never something subtly different - the last case is the most confusing. I am perplexed about So to summarize, I like N-to-N redirections, but not N-to-one redirections. I find Another possibility is to "reify" the FD table as a dynamically scoped variable (#993), which provides an alternative syntax for redirections and can be used for complex situations such as swapping FDs. |
Indeed, supporting N-to-1 would not fit in with some of the general idioms of Elvish, which generally allows only N-to-N, in assignments, for instance. I had thought of proposing some kind of alternate syntax for N-to-1 redirect to keep it distinct from N-to-N, but ultimately I didn't see the need. But I can understand why you would consider this inconsistent and prefer to avoid it. Speaking of the dynamically-scoped FD table variable idea: it occurs to me that this could also be used to do away with the ampersand notation in redirects:
It's more verbose perhaps but the concept appeals to me: The ampersand notation for file descriptors exists basically because the shell didn't have any other way to refer to file descriptors. But Elvish has file objects that can be stored in variables, copied, passed as arguments, etc. So in principle if the FD table (or, rather, the standard IO streams - IMO that's all of "the FD table" that should be needed or provided) were made available as variables, and particularly if paired with a way to cleanly control the lifetime of the file object (so it doesn't have to wait for GC to close the file) it would simplify some of the scenarios like "give a command in the middle of a pipeline access to the pipeline's final stdout":
Not crazy about expressing redirections as variable assignments, at least in terms of syntax: But it at least fits neatly with the idea that open files should be managed as variables, not as FD numbers. I suppose one could also borrow this chestnut:
This style of redirect in Bash or Korn Shell would normally mean "find an available file table slot and bind the file there, and record the file descriptor number in the variable provided" - but in this case since the variable used in the redirect is "io:out", it would instead bind stderr to file descriptor 1 (stdout) when launching the command. |
@xiaq (minor nit) When you write “rectify” above, did you mean “reify”? |
@hanche ah yes, I meant "reify". I really like the observation that a reified FD table can be used to eliminate the ampersand forms, and if a reified FD table does get implemented, I'll consider deprecating and eventually removing the ampersand form; it's one of the more obscure parts of POSIX syntax. Regarding how to name reified FDs: I am more inclined to expose the full FD table as a list, instead of naming the 3 standard streams separately. On a Unix syscall level, the FDs 0, 1 and 2 are not actually special in any way; it is only a user-space convention that most processes have these 3 FDs "pre-opened", and libc internalizes that convention. The story is different on Windows, which does treat the standard streams in a special way, but that shouldn't stop Unix users from doing what the OS allows. The FD table can support I have also seen some scripts that make use of higher FDs. I haven't written a lot of such scripts myself, but I feel it's an under-utilized way of doing simple text-based IPC that is worth encouraging. Finally regarding |
Personally, generally when I have used higher-number FDs within a shell script, it has been essentially because the shell does not support other ways of managing open files. For instance, if I have a function that "returns a value" (that is, by writing the value to stdout so the caller can capture it) but the function still needs to communicate information to the user using actual stdout, I will dup "actual" stdout to another file. When it comes to invoking another program, I agree that full control over the set of numbered FDs should be supported. (I wasn't clear on that point) It's within the script that I think higher-numbered FDs shouldn't be used - within the idioms of the language I think it makes a lot more sense, if one needs additional files open, to use file objects (managed by variable scope, passed as arguments, exchanged between different threads using "put", etc.) rather than numbered file descriptors provided by redirection syntax. At the very least it's a pattern that should be discouraged IMO. |
Is this really a bug? That is, does it warrant the "bug" label? As the subject line states, the current documented behavior is ambiguous. It also makes some potentially useful semantics hard to write. This seems to me to fall into the enhancement category. |
In bash, multiple IO redirections are processed as though they were variable assignments. That is, one can swap stdout/stderr by doing
which amounts to
In elvish, the same does not quite seem to work.
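For reference, the bash swap idiom mentioned in this thread (`3>&1 1>&2 2>&3 3>&-`) corresponds to a short sequence of dup calls. Below is a minimal Python sketch of that sequence. `swap_fds` is a hypothetical helper name, and two pipes stand in for stdout and stderr so the demo leaves the real streams alone.

```python
import os


def swap_fds(a, b):
    """Swap what descriptors a and b refer to, using a temporary descriptor."""
    tmp = os.dup(a)        # 3>&1 : save the file behind a
    try:
        os.dup2(b, a)      # 1>&2 : a now refers to b's file
        os.dup2(tmp, b)    # 2>&3 : b now refers to a's original file
    finally:
        os.close(tmp)      # 3>&- : drop the temporary descriptor


# Two pipes play the roles of stdout and stderr.
r_out, w_out = os.pipe()
r_err, w_err = os.pipe()
swap_fds(w_out, w_err)
os.write(w_out, b"hello")   # now lands in the "stderr" pipe
print(os.read(r_err, 5))    # b'hello'
```

Each step is an assignment to one slot of the FD table, which is why the temporary descriptor is needed: without it, the file originally behind `a` would be unreachable after the first `dup2`.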
I have created test files `write.py`, which writes `stdout` to stdout and `stderr` to stderr:
and `filter.py`, which processes and marks its stdin:
By default, running `python write.py | python filter.py` works as expected, with only `stdout` being filtered and marked:
Now, if I try to swap stdout and stderr for `write.py`, I get errors:
This works in bash as expected, however: