Experience in running Firejail inside a Docker container? #1210
I'm only going to comment on the stuff that I can reasonably answer here:
I agree, this should definitely be fixed (or perhaps a
Docker already uses (or at least should use) a good share of the same mechanisms Firejail does, with the exception of the seccomp-BPF filtering, because it needs namespaces to provide its own isolation. Nesting namespaces created by different tools can be tricky to get right (and is prone to random failures), and I believe that's why Firejail is complaining.
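To make the nesting concrete, here is a minimal sketch that lists the namespaces a process already belongs to, for example the ones Docker created before Firejail tries to add its own. It only reads the standard Linux `/proc/<pid>/ns` symlinks; the helper name `current_namespaces` is made up for this illustration and is not part of Firejail or Docker.

```python
import os


def current_namespaces(pid="self"):
    """Return {namespace_type: identifier} for a process, read from /proc.

    Hypothetical helper for illustration; it only inspects /proc/<pid>/ns.
    """
    ns_dir = f"/proc/{pid}/ns"
    namespaces = {}
    try:
        entries = os.listdir(ns_dir)
    except OSError:
        # No /proc (non-Linux), unknown pid, or no permission: report nothing.
        return namespaces
    for name in sorted(entries):
        try:
            # Each entry is a symlink whose target looks like "pid:[4026531836]".
            namespaces[name] = os.readlink(os.path.join(ns_dir, name))
        except OSError:
            continue
    return namespaces


if __name__ == "__main__":
    for name, ident in current_namespaces().items():
        print(f"{name:20} {ident}")
```

Running this once on the host and once inside the container should show different identifiers for the namespaces Docker has unshared, which is the layer Firejail would be nesting under.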
Agreed, this should be better documented.
Ideally this should be a switch. The current behavior is significantly safer than using the value of
I'm not 100% certain about this, but I will comment that there are quite a few applications (not even just graphical ones) which expect to at least have read access to most of
Probably not without at least a patched version of Docker, and possibly a patched kernel. The
While I have not tried this myself, based on my (limited) knowledge of the Linux VFS layer, I doubt it will work. As far as I understand it, a bind mount is kind of like a hardlink: you can change the properties of that particular link, but certain things can't be changed, and the features OverlayFS needs that the kernel's NFS client does not implement are among them. You may have slightly more luck hacking something together with FUSE (or possibly using a userspace NFS client), but I doubt you'll be able to get OverlayFS to work on NFS without patching the kernel.
Oh, also, sorry about double posting, but I just noticed your comment about requiring resource-level isolation for this. In short, to do that you'll need something else working together with Firejail, as Firejail doesn't (currently) do any resource-level isolation.
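As one possible "something else", here is a rough sketch that applies POSIX resource limits to a child process before it starts. This is an assumption about what could be paired with Firejail, not something Firejail provides or the thread prescribes; `run_with_limits` and its default values are hypothetical, and only the standard-library `resource` and `subprocess` modules are used.

```python
import resource
import subprocess
import sys


def _lower_limit(kind, value):
    """Set soft and hard limit to `value`, never above the current hard limit."""
    _soft, hard = resource.getrlimit(kind)
    if hard != resource.RLIM_INFINITY:
        value = min(value, hard)
    resource.setrlimit(kind, (value, value))


def run_with_limits(argv, cpu_seconds=2, max_bytes=1024 ** 3):
    """Run argv with a CPU-time cap and an address-space cap (hypothetical defaults)."""

    def apply_limits():
        # Runs in the child between fork() and exec(), so only the child is limited.
        _lower_limit(resource.RLIMIT_CPU, cpu_seconds)
        _lower_limit(resource.RLIMIT_AS, max_bytes)

    return subprocess.run(
        argv, preexec_fn=apply_limits, capture_output=True, text=True, timeout=30
    )


if __name__ == "__main__":
    probe = "import resource; print(resource.getrlimit(resource.RLIMIT_CPU)[0])"
    print(run_with_limits([sys.executable, "-c", probe]).stdout.strip())
```

The same `argv` could be a sandbox command line, so the limits would wrap the sandboxed program as well. Per-process rlimits are coarse; they do not cover things like total disk usage or network bandwidth.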
The current security technology in the Linux kernel is access-control technology: netfilter, SELinux (mandatory access control), PID namespaces (no access to the system PID namespace), etc. It works fine if the bad guy is outside and tries to bring in his exploit code and take over the system. If you bring untrusted user code into your system yourself, it will be more difficult for the kernel to deal with it. In this situation I would say your best approach would be:
Things to stay away from:
Good luck!
We are running arbitrary user programs and obviously need to isolate them in a sandbox. That is why I am extremely interested in Firejail, since running single-shot processes in QEMU seems like a good idea but is a handling and performance nightmare. I am opening this issue because I am looking for suggestions on whether Firejail is the right fit for our problems or whether something else would be better suited (I am not looking for another self-made solution), and to report some problems I encountered. I have only had a peek at some of Firejail's source code and the technologies it uses, so bear with me. If someone else is doing something similar, I would be extremely happy to read about how they are doing it.
We are trying to solve the following problems:
I would like to see this solved inside the service's Docker container, and not inside QEMU or yet another Docker container. The reason is that the filesystem of the service already holds all the data, so an OverlayFS would IMHO perform quite well. It would also reduce the workflow to calling a process, instead of creating an image, setting up a container/VM for it, running it, and recycling the whole thing. Additionally, it would replace the really bad connection we have between QEMU and Docker/Kubernetes with simple process calls in the same container.
It seems (or does it?) that Firejail can solve everything we need, so I tried it out. However, I am having a hard time setting this up. The following problems were encountered (if I should open an issue for each of them, just mumble the word):
a.) `--quiet` is definitely not quiet. The warnings seem OK, but I think they should be hidden behind another argument if `--quiet` is used. Also, there is definitely non-warning output when an overlay is used.
b.) Since the Firejail process runs inside Docker, `--force` was needed. Why is this restriction needed at all if it can simply be overridden?
c.) We need "ptrace" working inside Firejail, so `--allow-debuggers` was needed. The argument cannot be used with kernel 4.4 because of a serious bug, which was fixed in 4.8; see https://lwn.net/Articles/690685/ and "seccomp reordered after ptrace" in https://outflux.net/blog/archives/2016/10/04/security-things-in-linux-v4-8/. We are using Ubuntu 16.04, and thankfully there is a kernel upgrade available using `apt install --install-recommends linux-generic-hwe-16.04`. This could be better documented on the Firejail side.
d.) The Firejail home for the user is read using `getpwuid`, which forces one to make that particular directory writable. This should be changed so that the environment variable "HOME" is used (I hacked that in and can send it upstream if you like), or it should be settable using an argument.
e.) The Firejail home for the user needs to have the right UID and GID. Why? Isn't it enough that the user can read and write to it?
f.) The Firejail binary saves its runtime data to "/run/firejail" with UID=root and GID=root. It would be neat if this were documented somewhere and if it were a compile-time option.
g.) Is it possible to define a file to collect every seccomp (and other) violation?
h.) I only found `--overlay-clean` to clean up overlays, but what if I want to clean the overlay of a specific process?
i.) I tried to get Firejail running without marking the Docker container "privileged" and failed. Is it even possible, and if not, why not?
j.) I haven't looked into this yet, but why isn't it possible to use the `--private*` arguments with `--overlay`? I want to have an overlay over certain directories and files, but a clean FS for the rest, and then look at the changes after the process has died. Does anybody have a solution for this?
k.) We are using Vagrant with NFS to develop our product, and overlay does not play well with NFS. Has anybody had any luck with BindFS to trick OverlayFS into working?
l.) I will iterate over my feature branch again, hopefully at the end of next week, and try to get my tiny changes upstream. Is there an official configuration for a code formatter?
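For items a.) to c.), a minimal sketch of how the invocation could be assembled so that `--allow-debuggers` is only added on kernels at or above 4.8. The flag names are taken from this report and may differ between Firejail versions; the helpers `kernel_at_least` and `build_firejail_argv` are hypothetical names for this illustration, not part of Firejail.

```python
import platform
import re


def kernel_at_least(release, minimum=(4, 8)):
    """Compare the major.minor part of a kernel release string against `minimum`."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if not match:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    return (int(match.group(1)), int(match.group(2))) >= minimum


def build_firejail_argv(command, release=None, overlay=True):
    """Build a firejail command line from the flags discussed in this report."""
    release = release or platform.release()
    argv = ["firejail", "--quiet", "--force"]  # --force: running inside Docker
    if overlay:
        argv.append("--overlay")
    if kernel_at_least(release):
        # Item c.) reports this flag as unusable on 4.4 and fixed in 4.8.
        argv.append("--allow-debuggers")
    return argv + list(command)


if __name__ == "__main__":
    print(build_firejail_argv(["gdb", "./a.out"]))
```

The resulting list can be passed straight to `subprocess.run`, which avoids shell quoting issues with user-supplied program arguments.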
I am sure that I forgot some things but if anyone can help with a few of these problems I would be really happy.
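To illustrate the change proposed in item d.), here is a small sketch of the lookup order: prefer the "HOME" environment variable and fall back to the passwd entry returned by `getpwuid`. Firejail itself is written in C and this is not its code or the patch mentioned above; `resolve_home` is a hypothetical name used only for this example.

```python
import os
import pwd


def resolve_home(environ=None, uid=None):
    """Return $HOME if set and non-empty, otherwise the passwd home directory."""
    environ = os.environ if environ is None else environ
    home = environ.get("HOME")
    if home:
        return home
    uid = os.getuid() if uid is None else uid
    # Fallback: the lookup the report says is used today.
    return pwd.getpwuid(uid).pw_dir


if __name__ == "__main__":
    print(resolve_home())
```

One design caveat: the environment is controlled by the caller, so a program that runs with elevated privileges would need to validate such a path before trusting it.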