-
Notifications
You must be signed in to change notification settings - Fork 803
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
PVF worker: Maintain lists of used syscalls #1663
Conversation
I ran rubocop --autocorrect on list-syscalls.rb. I needed a linter, but rubocop emitted 500 (minor) errors. I had rubocop fix them automatically.
Adds --only-list-syscalls option to list-syscalls.rb to produce these lists
# with musl libc library. Since this is not actually used to link any binaries | ||
# it should most likely work just fine. | ||
|
||
gcc "$@" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we need those extra wrappers, do we? We're only calling gcc/g++ with no extra flags
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We do, compiling with the x86_64-unknown-linux-musl
target will fail otherwise.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
isn't it enough to set CC_x86_64_unknown_linux_musl = { value = "gcc", force = true, relative = true }
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed, looks like that works (without relative = true
), though we lose the explanatory comments in the files. For that reason I'd rather keep the musl-gcc
files.
46 (sendmsg) | ||
53 (socketpair) | ||
55 (getsockopt) | ||
56 (clone) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see particularly dangerous syscalls here, like clone
Is there a possibility of further tailoring this list to the ones required strictly during PVF execution, after all the wasmtime setup is done? We only really need to install the filter right before executing the PVF, not during the prerequisite setup, which should enable us to tighten this list
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had an idea of having a "blacklist for the whitelist", that is, pick out all the I/O syscalls from this list and additionally block those. Makes sense to implement that at the same time as logging, and we'll see if any I/O calls get through the filter. For now, as we discussed, I just want some reasonable assurance that io_uring
is never called, so we can properly block networking in the interim. :)
@@ -0,0 +1,608 @@ | |||
#!/usr/bin/ruby |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just a thought: do we really want to commit to maintaining ruby code? I know I can barely read it 😝
Is there a plan to rewrite it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we really want to commit to maintaining ruby code? I know I can barely read it 😝 Is there a plan to rewrite it?
Sorry! So my excuse here is that originally I wrote this script as a quick experiment, and not necessarily with a "we'll include this in the CI" mindset, and for my personal throwaway scripts I always use Ruby since it takes fraction of the time and space to write compared to alternatives. This whole thing took me, like, I don't know, less than 30 minutes to write maybe? (I think Ruby is really underrated when it comes to scripting; it's a shame Python won.)
If you or someone else would like to volunteer rewrite it, be my guest. (:
That said, I think the language choice it actually the least dubious part here: ideally if you'd want to do this properly you would not do a quick hack job like this one where you parse objdump
output and YOLO it with regexps. Instead you'd properly disassemble the program into a strongly typed form and analyze that (so for that approach writing it in Rust would actually be the best, since you have ready made crates to deconstruct/disassemble the ELF files). But instead of 30 minutes of work that'd take, like, probably at least a week. And I have better things to spend a week on, which is why when I ran this experiment and wanted to get quick result this is what I wrote.
(Although to be honest it's probably unlikely this will break, and if it'll break it'd probably be like a 10 minute job for me to fix it.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I remember suggesting to RIIR some time ago.1 Would be fun to rewrite it, would help me understand how it works better. Not a big priority since it's only meant to run in CI. As a backup I also plan to log all syscall violations for a while, just in case this script missed some.
Footnotes
-
Because you know, performance and correctness are nice properties in a language. :P Just kidding, Ruby is always a pleasure to work with. Except for how ruby on MacOS doesn't work by default, the latest version of ruby-lint gave me a runtime error, gems randomly start failing, old links to docs give 404, etc. I especially love how I have an old Jekyll blog that literally won't build anymore, even in Docker. Kidding of course, love the language.↩
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If anybody wants to rewrite it, here there's a project I started with in Parity with a working example of how to extract the code segment from an ELF, disassemble it and analyze instructions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not a big priority since it's only meant to run in CI.
For now, yes. But I assume we plan to partially base the process of creating the seccomp whitelist on this script, so I think it has quite a high criticality in terms of the blast radius if it misbehaves (I'm not saying it does/will).
That said, I think the language choice it actually the least dubious part here
It's quite impressive for a 30-min job 😄
I agree with you that the language is the least dubious part though and support the idea of rewriting it with the hope of making it both easier to understand and comprehensive.
Personally, I believe this is a very difficult task to do correctly. I know at least one failed attempt with a past project I worked on, but the goal there was more ambitious - to also retrieve system call arguments (limited to flags), which turned out to be as difficult as solving the halting problem 😅
With all this said, I'm fine with keeping this as is for the time being (though I would like to see it rewritten some time, more so because it was designed to be a POC, from a code quality perspective). Coupling it with extensive auditing of the reported syscalls by validators (through SECCOMP_RET_LOG
), I think we can achieve reasonable confidence in the whitelist
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the script comes across something unexpected it should output a warning and cause the CI job to fail. Ofc it can still misbehave, but we plan to have some logging of violations for a while before switching on the seccomp filter. For now this should give us enough confidence to restrict networking by blocking io_uring
and socket
syscalls.
Also, if we'll eventually just replace all this with PolkaVM, then re-writing the script would not be a good use of resources.
I ran rubocop --autocorrect on list-syscalls.rb. I needed a linter, but rubocop emitted 500 (minor) errors. I had rubocop fix them automatically.
Adds --only-list-syscalls option to list-syscalls.rb to produce these lists
9d822ab
to
4bd6362
Compare
…rcnski/ci-list-syscalls
- ./list-syscalls.rb ../../../target/x86_64-unknown-linux-musl/production/polkadot-execute-worker --only-used-syscalls | diff -u execute-worker-syscalls - | ||
- ./list-syscalls.rb ../../../target/x86_64-unknown-linux-musl/production/polkadot-prepare-worker --only-used-syscalls | diff -u prepare-worker-syscalls - |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks a lot @altaua, it works! I'm just curious why there is a -
at the end?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We got some confirmation that it works. 😂
https://gitlab.parity.io/parity/mirrors/polkadot-sdk/-/jobs/3941675
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm just curious why there is a
-
at the end?
It's diff
between a file and stdin
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@altaua It would be ideal if we had an informative message on failure, something like
The list of syscalls in the worker binary has changed. Please review whether this is expected and update execute-worker-syscalls if so
bb900af
to
2c9d848
Compare
2c9d848
to
96a1500
Compare
@@ -523,3 +523,4 @@ test-syscalls: | |||
- if [[ "$CI_JOB_STATUS" == "failed" ]]; then | |||
printf "The x86_64 syscalls used by the worker binaries have changed. Please review if this is expected and update polkadot/scripts/list-syscalls/*-worker-syscalls as needed.\n"; | |||
fi | |||
allow_failure: true # TODO: remove this once we have an idea how often the syscall lists will change |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Was probably added to not confuse devs who would be blocked and not know what to do.
But the syscalls should change rarely and mostly only when updating the worker code.
It can stay not required for now until we have the syscall blocking in place.
Co-authored-by: Mira Ressel <[email protected]>
We're already working on sandboxing by [blocking all unneeded syscalls](#882). However, due to the wide scope it will take a while longer. This PR starts with a much smaller scope, only blocking network-related syscalls until the above is ready. For security we block the following with `seccomp`: - creation of new sockets - these are unneeded in PVF jobs, and we can safely block them without affecting consensus. - `io_uring` - as discussed [here](paritytech/polkadot#7334 (comment)), io_uring allows for networking and needs to be blocked. See below for a discussion on the safety of doing this. - `connect`ing to sockets - the above two points are enough for networking and is what birdcage does (or [used to do](phylum-dev/birdcage#47)) to restrict networking. However, it is possible to [connect to abstract unix sockets](https://lore.kernel.org/landlock/[email protected]/T/#u) to do some kinds of sandbox escapes, so we also block the `connect` syscall. (Intentionally left out of implementer's guide because it felt like too much detail.) `io_uring` is just a way of issuing system calls in an async manner, and there is nothing stopping wasmtime from legitimately using it. Fortunately, at the moment it does not. Generally, not many applications use `io_uring` in production yet, because of the numerous kernel CVEs discovered. It's still under a lot of development. Android outright banned `io_uring` for these reasons. Considering `io_uring`'s status, and that it very likely would get detected either by our [recently-added static analysis](#1663) or by testing, I think it is fairly safe to block it. If execution hits an edge case code path unique to a given machine, it's already taken a non-deterministic branch anyway. After all, we just care that the majority of validators reach the same result and preserve consensus. So worst-case scenario, there's a dispute, and we can always admit fault and refund the wrong validator. On the other hand, if all validators take the code path that results in a seccomp violation, then they would all vote against the current candidate, which is also fine. The violation would get logged (in big scary letters) and hopefully some validator reports it to us. Actually, a worst-worse-case scenario is that 50% of validators vote against, so that there is no consensus. But so many things would have to go wrong for that to happen: 1. An update to `wasmtime` is introduced that uses io_uring (unlikely as io_uring is mainly for IO-heavy applications) 2. The new syscall is not detected by our static analysis 3. It is never triggered in any of our tests 4. It then gets triggered on some super edge case in production on 50% of validators causing a stall (bad but very unlikely) 5. Or, it triggers on only a few validators causing a dispute (more likely but not as bad?) Considering how many things would have to go wrong here, we believe it's safe to block `io_uring`. Closes #619 Original PR in Polkadot repo: paritytech/polkadot#7334
* Add RococoBridgeHub <> WococoBridgeHub full 2 way bridge * Use StorageMapKeyProvider instead of account_info_storage_key() Avoid duplicating storage_map_final_key() * clippy + leftovers
--only-list-syscalls
option to get syscalls without extra noise. Warnings are still printed.Related
Closes #1652
Closes https://github.com/paritytech/ci_cd/issues/880 (CI pipeline request)