--exec does not escape cmd.exe metacharacters (Windows) #155
Comments
This makes me wonder why we take the detour with On the other hand this enables features like shell output redirection without explicitly invoking a shell. I'm not certain this is worth all of the trouble of multiple layers of escaping. Looking forward to other thoughts on this.
The main reason for shelling out to a shell is because it's incredibly complex to properly parse shell syntax, and in doing so you will end up restricting yourself to whatever subset of shell functionality that your implementation supports.
Escaping currently only happens with a single line of code, so there aren't multiple layers -- just one. Line 95 in 614f576
Had a feeling this would happen with that crate though. Probably best that we do the escaping ourselves. The crate doesn't really handle the whole special character set on both platforms. |
We should simply check for any bytes within the range of 0...42, 59...64, 91...96, and 123...127, and then escape them accordingly. On *nix systems, that's the |
Something like this should do the job efficiently:

    use std::borrow::Cow;

    // Escape character understood by the platform's shell.
    #[cfg(windows)]
    const ESCAPE_CHAR: u8 = b'^';
    #[cfg(not(windows))]
    const ESCAPE_CHAR: u8 = b'\\';

    // True for bytes in 0..=42, 59..=64, 91..=96 and 123..=127.
    fn needs_escape(byte: u8) -> bool {
        byte < 43 || (byte > 58 && byte < 65)
            || (byte > 90 && byte < 97)
            || (byte > 122 && byte <= 127)
    }

    pub fn escape<'a>(input: &'a str) -> Cow<'a, str> {
        // Count first so that inputs without special characters are not copied.
        let chars_to_escape = input.as_bytes().iter().filter(|&&x| needs_escape(x)).count();
        if chars_to_escape == 0 {
            Cow::Borrowed(input)
        } else {
            let mut output = Vec::with_capacity(input.len() + chars_to_escape);
            for &character in input.as_bytes() {
                if needs_escape(character) {
                    output.push(ESCAPE_CHAR);
                }
                output.push(character);
            }
            // ESCAPE_CHAR is ASCII and is only inserted in front of ASCII bytes,
            // so the buffer is still valid UTF-8.
            let output = unsafe { String::from_utf8_unchecked(output) };
            Cow::Owned(output)
        }
    }
Might help if you can try out #158 and see if that fixes your issues. |
I agree, implementing shell syntax would be crazy. But that's not what I was thinking about. I was thinking about dropping the intermediary shell and with it all support for shell redirection etc. inside the
Exactly. Currently there is only one layer of escaping. The second layer (cmd escaping) is missing—that's what this issue is about. See https://blogs.msdn.microsoft.com/twistylittlepassagesallalike/2011/04/23/everyone-quotes-command-line-arguments-the-wrong-way/ for details.
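To make the two layers concrete, here is a minimal Rust sketch of the approach described in the linked article, as I understand it: first quote each argument for the target program's argument parser, then caret-escape the `cmd.exe` metacharacters listed in this issue. The function names `quote_arg` and `escape_for_cmd` are hypothetical and are not taken from fd's code.

```rust
/// Layer 1 (hypothetical helper): quote one argument so that a parser
/// following the usual Windows argv rules recovers it unchanged.
fn quote_arg(arg: &str) -> String {
    let needs_quotes = arg.is_empty()
        || arg.contains(|c: char| matches!(c, ' ' | '\t' | '\n' | '\x0B' | '"'));
    if !needs_quotes {
        return arg.to_string();
    }
    let mut out = String::from("\"");
    let mut backslashes = 0usize; // length of the current run of backslashes
    for c in arg.chars() {
        match c {
            '\\' => backslashes += 1,
            '"' => {
                // Double the pending backslashes and add one to escape the quote.
                out.push_str(&"\\".repeat(backslashes * 2 + 1));
                out.push('"');
                backslashes = 0;
            }
            _ => {
                // Backslashes not followed by a quote are taken literally.
                out.push_str(&"\\".repeat(backslashes));
                out.push(c);
                backslashes = 0;
            }
        }
    }
    // Double trailing backslashes so they do not escape the closing quote.
    out.push_str(&"\\".repeat(backslashes * 2));
    out.push('"');
    out
}

/// Layer 2 (hypothetical helper): prefix every cmd.exe metacharacter
/// with a caret so that cmd.exe passes it through literally.
fn escape_for_cmd(quoted: &str) -> String {
    let mut out = String::with_capacity(quoted.len());
    for c in quoted.chars() {
        if matches!(c, '(' | ')' | '%' | '!' | '^' | '"' | '<' | '>' | '&' | '|') {
            out.push('^');
        }
        out.push(c);
    }
    out
}
```

With these helpers, the file name `a&b c` first becomes `"a&b c"` and then `^"a^&b c^"`. Skipping either step leaves one of the two parsers free to misread the name.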
Could you please elaborate how you came up with these ranges? There's still the issue with properly quoting file names with spaces, which cannot be solved by only escaping on a character by character basis.
Thanks for the PR. It escapes all the cmd metacharacters I tested. But unfortunately files with spaces are still an issue:
This would majorly limit the capabilities of what you can do with command executions. There are many things you can do beyond redirections that having an intermediary shell provides, such as executing multiple statements, conditionals, shell expansions (such as process, brace, variable, array, etc. expansions), and entire scripts. So without the shell, you'd have to replicate these at some level, and that's effectively writing your own shell. And it'd be a shame to give up these capabilities just because
The range was created from the ASCII table, which includes all special characters that matter within the 7-bit range. It's surprising that |
Best course of action may just be to drop |
@reima Try the latest commit, which is using |
That's something I don't see myself using a single command line for. Anything as complex as conditionals I would put into a script that is spawned by Another argument against spawning a shell is that every time I don't use shell features, I pay for something I don't use. If I use It's also worth noting that someone coming from Maybe we could compromise on making the shell optional via a command line switch?
There is no issue with
Yes, but why did you choose these ranges specifically? E.g. why is 42 included, but 43 is not? What I'm getting at is that without further explanation these are just a bunch of magic numbers. There should at least be a comment in the source code about how these ranges came to be (e.g. what class of escapable characters they contain).
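As an illustration of the kind of comment being asked for, the proposed ranges could be written out with the ASCII characters each one covers. This is only a sketch that mirrors the `needs_escape` proposal from earlier in the thread. The per-range annotations come straight from the ASCII table and do not claim to explain the original author's reasoning.

```rust
/// Same predicate as the proposed `needs_escape`, with each range annotated.
fn needs_escape(byte: u8) -> bool {
    match byte {
        // Control characters, space, and ! " # $ % & ' ( ) *
        0..=42 => true,
        // ; < = > ? @
        59..=64 => true,
        // [ \ ] ^ _ `
        91..=96 => true,
        // { | } ~ and DEL
        123..=127 => true,
        // + , - . / 0-9 : A-Z a-z, and every non-ASCII byte (128..=255)
        _ => false,
    }
}
```

Written this way, the boundary in question becomes visible: 42 is `*`, which is escaped, while 43 is `+`, which is not. It also shows that harmless characters such as `_` (95) are escaped, simply because they fall inside a range.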
Sadly, that's how command line arguments work on Windows. In *nix, the kernel API for spawning a process ( |
Escaping works just fine now. But the performance impact of spinning up a new PowerShell for each file is huge (times are in seconds, ~60 files per run, each command was run multiple times in different orders, and the numbers are representative samples):
Using
I also tested with a WIP version where no shell is spawned, which is a little faster (as expected):
I tend to agree that executing the command in a shell is unnecessary. |
I am working on the logic to enable execution without a shell. It's just rather complicated on the Linux side of things. Namely, you need to write an argument splitter yourself and use a different set of rules for escaping inputs, so you're kind of re-inventing a basic shell parser. You can't really use quotes with inputs, because they can conflict with quotes provided by the user, but you can't use backslashes either, because then you'll just have additional backslashes in your input. So it's very, very tricky to handle properly.
@mmstick I've created PR #160 with my WIP version of removing the shell entirely. I decided to use the same syntax as find/xargs/parallel etc. and allow multiple arguments for the |
@reima @mmstick Sorry for the late reply. Thank you both for looking into this and for your contributions.
I believe I also agree with this. An additional shell layer would be nice to have in terms of functionality, but dropping it will speed things up and also dramatically simplify the involved logic concerning quoting. I'm afraid that we would keep running into similar issues as reported in this ticket - and each of these is, in effect, a rather serious problem (or even a vulnerability). Accidental spawning of unwanted processes might seem contrived, but people will run fd in folders with millions of files and one of them might indeed have a very unfortunate name 😃. |
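For readers wondering what "no shell" looks like in practice, here is a rough Rust sketch. It assumes the command is given as a list of separate arguments with a `{}` placeholder for the path, as in `find -exec`. The helper names `build_args` and `run_without_shell` are made up for this example and are not the code from #160.

```rust
use std::io;
use std::process::{Command, ExitStatus};

/// Hypothetical helper: substitute the path for every `{}` placeholder.
/// Each template entry stays a single argument, whatever the path contains.
fn build_args(template: &[&str], path: &str) -> Vec<String> {
    template.iter().map(|token| token.replace("{}", path)).collect()
}

/// Hypothetical helper: spawn the program directly, without `sh` or `cmd.exe`.
fn run_without_shell(template: &[&str], path: &str) -> io::Result<ExitStatus> {
    let args = build_args(template, path);
    let (program, rest) = args
        .split_first()
        .ok_or_else(|| io::Error::new(io::ErrorKind::InvalidInput, "empty command"))?;
    // The arguments are handed over as a list, so no shell ever parses them.
    Command::new(program).args(rest).status()
}
```

Because the file name is never pasted into a command string that a shell parses, a name such as `a & whoami.txt` reaches the program as one plain argument and no extra process is started.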
Closed via #160.
On Windows, `--exec` launches the command through `cmd.exe`. This works fine as long as the file names do not contain any of the metacharacters `(`, `)`, `%`, `!`, `^`, `"`, `<`, `>`, `&`, or `|`. These metacharacters are not escaped by `fd`. As a consequence, they are interpreted by `cmd.exe` with their special meaning.

This may even lead to unwanted processes being started by `fd`:

`reima` is my Windows user name, which is printed by calling `whoami`.

There is also an issue with the escaping of file names with spaces (which was not fixed with #135, at least on Windows):