Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Out-of-bounds memory access in user-space BPF VM due to unsigned int overflow #379

Open
wingo opened this issue Aug 22, 2014 · 16 comments
Open

Comments

@wingo
Copy link

wingo commented Aug 22, 2014

To reproduce:

$ sudo /usr/sbin/tcpdump -i eth0 'ip[0xffffffee:4] == 0'

The BPF:

(000) ldh      [12]
(001) jeq      #0x800           jt 2    jf 5
(002) ld       [-4]
(003) jeq      #0x0             jt 4    jf 5
(004) ret      #262144
(005) ret      #0

Note that the resulting offset is 2**32 - 4.

Expected results: nothing prints, as the offset is not within the range of the packet

Actual results: depends on what's mapped at offset 2**32-4 from your packet start. On my system I don't get a crash but I do get passing packets. The reason is that the length check is performed against OFFSET + sizeof(int32), which overflows to 0.

Note that this vulnerability does not appear to affect the Linux kernel, as its range validation is different. The precise example above doesn't run on the kernel, because loading large constant offsets has special meaning to the kernel, and the constant is invalid, forcing the user-space VM to run. You can make the kernel try a similar codepath by making the offset a computed value. I wasn't able to cause a crash or unexpected passes, but I can't rule out the possibility of a vulnerability there.

@guyharris
Copy link
Member

Note that OFFSET + sizeof(int32) overflows to 0 on an ILP32 platform but not an LP64 platform, and probably not even on an LLP64 platform (cf. WinPcap) as long as sizeof is 64-bit. In particular, with

tcpdump -O 'link[0xFFFFFFFC:4] != 0xf00ba4f'

I didn't see any packets on 64-bit OpenBSD but did see them on 32-bit OpenBSD. (This was capturing on a network device, so that's the kernel BPF. That's not our bug, but it's the same issue, so we want to report that issue to various uses of the in-kernel BPF engine as well, if they're not already checking whether the starting offset of the halfword or word being fetched is within the packet data.)

@guyharris
Copy link
Member

For kernel BPFs, the -O is necessary because, without -O, it compiles to

(000) ld       [-4]
(001) jeq      #0xf00ba4f       jt 2    jf 3
(002) ret      #0
(003) ret      #65535

(yes, bpf_dump() needs to be fixed to print the offsets as signed; there's a bit of signed vs. unsigned cleanup to do on some kernel BPF interpreters as well), and the kernel's bpf_validate() rejects the first instruction, as the offset is bigger than the "maximum packet size" in the kernel's BPF interpreter, whereas, without -O, it compiles to

(000) ld       #0xfffffffc
(001) st       M[0]
(002) ldx      M[0]
(003) ld       [x + 0]
(004) st       M[1]
(005) ld       #0xf00ba4f
(006) st       M[2]
(007) ldx      M[2]
(008) ld       M[1]
(009) sub      x
(010) jeq      #0x0             jt 11   jf 12
(011) ret      #0
(012) ret      #65535

which contains nothing that bpf_validate() would reject, so the run-time range check for indexed loads needs to be fixed there.

The userland bpf_validate() doesn't do the check for a "maximum packet size". The "maximum packet size" for userland BPF is 2^32-1; that means that for "absolute" loads, userland bpf_validate() should check against 2^32-(size of the datum being loaded), which will cause the optimized program to be rejected.

However, we also need to make sure the indexed loads are checked correctly at run time in userland and all the kernel BPF interpreters. (The kernel's bpf_validate() also rejects offsets > the "maximum packet size" for those, which I guess is OK although it prevents loading stuff that precedes the current value of the X register, and also doesn't, as noted, obviate the need for run-time checks.)

@guyharris
Copy link
Member

I've checked change 26a3075 into the trunk; it makes the interpreter more like the FreeBSD interpreter, which does more bounds checking and catches this issue. Unless somebody finds a problem with it, I'll backport it to the 1.6 branch.

guyharris added a commit that referenced this issue Sep 1, 2014
This prevents the checks from incorrectly succeeding due to integer
overflow on 32-bit platforms, addressing GitHub issue #379.

It also catches integer overflow in indexed loads when adding the X
register to the offset; that's not necessary for bounds checking, but
we'll put it in anyway.

Also, make the variable "k", and the memory array, unsigned; that's the
way it is in most kernel BPF interpreters, and means we use unsigned
address arithmetic consistently.
@fxlb
Copy link
Member

fxlb commented Mar 13, 2023

After the change commited by Guy, any reason to keep this issue open?

@infrastation
Copy link
Member

@guyharris, do you think this matter still requires any attention in libpcap or elsewhere?

@guyharris
Copy link
Member

The command given above, modified not to assume a location for tcpdump or an interface on which to capture:

tcpdump -i <interface> 'ip[0xffffffee:4] == 0'

will either 1) have the filter accepted by the kernel, in which case any out-of-bounds reference is a bug in the OS kernel, not in libpcap, or 2) have the filter rejected by the kernel, in which case our current userland filter should handle it without an out-of-bounds access.

@fxlb
Copy link
Member

fxlb commented Dec 4, 2024

$ sudo ./tcpdump -i eth0 'ip[0xffffffee:4] == 0'                               
Warning: Kernel filter failed: Invalid argument
tcpdump: verbose output suppressed, use -v[v]... for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), snapshot length 262144 bytes

A warning. Nothing captured.

$ sudo ./tcpdump -c10  -i any 'ip[0xffffffee:4] == 0'                           
tcpdump: WARNING: any: That device doesn't support promiscuous mode
(Promiscuous mode not supported on the "any" device)
tcpdump: verbose output suppressed, use -v[v]... for full protocol decode
listening on any, link-type LINUX_SLL2 (Linux cooked v2), snapshot length 262144 bytes

Capture packets.

@fxlb
Copy link
Member

fxlb commented Dec 4, 2024

Warning: Kernel filter failed: Invalid argument

Should be an error and exit ?

@infrastation
Copy link
Member

Traditionally libpcap uses userland filtering if kernel filtering didn't work. It looks odd that the "any" interface didn't reject the filter.

@guyharris
Copy link
Member

Traditionally libpcap uses userland filtering if kernel filtering didn't work.

At least some kernel cBPF implementations impose a maximum size for filters ("here's a 3GB BPF program", on a 32-bit machine, is a bit rude :-)), so valid-but-too-large filters are rejected by the kernel; this allows the packets to be filtered, albeit without the advantage of having packets that aren't what the user wants to see being discarded early, before they have to be put into a limited-size kernel buffer or be copied up to userland.

It looks odd that the "any" interface didn't reject the filter.

I'll see what code gets generated for Ethernet rather than cooked Linux captures; perhaps the cooked-Linux BPF passes the validator.

@infrastation
Copy link
Member

Thank you.

@fxlb
Copy link
Member

fxlb commented Dec 4, 2024

Traditionally libpcap uses userland filtering if kernel filtering didn't work. It looks odd that the "any" interface didn't reject the filter.

Ah, yes.
Perhaps tcpdump could print:
Warning: Kernel filter failed: Invalid argument. Using userland filtering.

@guyharris
Copy link
Member

ubu22-04$ sudo tcpdump -i ens33 -d 'ip[0xffffffee:4] == 0'
(000) ldh      [12]
(001) jeq      #0x800           jt 2	jf 5
(002) ld       [-4]
(003) jeq      #0x0             jt 4	jf 5
(004) ret      #262144
(005) ret      #0
ubu22-04$ sudo tcpdump -i any -d 'ip[0xffffffee:4] == 0'
tcpdump: data link type LINUX_SLL2
(000) ldh      [0]
(001) jeq      #0x800           jt 2	jf 5
(002) ld       [2]
(003) jeq      #0x0             jt 4	jf 5
(004) ret      #262144
(005) ret      #0

The difference is that the sample expression is ip[{offset}:{length}], so it's relative to the beginning of the IPv4 payload. 0xffffffee is, as a signed integer, -18. The Ethernet header is only 14 octets; 14-18 is a negative number, so you have an offset from the beginning of the packet that's either negative and not a valid "special" value on Linux or a very large unsigned value, so the filter is rejected. The LINKTYPE_LINUX_SLL2 header is 20 octets; 20-18 is 2, so no problem (other than that it's not within the IPv4 header).

@guyharris
Copy link
Member

Ah, yes. Perhaps tcpdump could print: Warning: Kernel filter failed: Invalid argument. Using userland filtering.

EINVAL could either be that the kernel rejected it because it's too big or because it's not something the kernel would accept; "too big" means "more than BPF_MAXINSNS", and that's defined in <linux/bpf_common.h>, which is included by <linux/bpf.h>, which we don't include because it defines the BPF_ opcode values and those might produce redefinition warnings if we've already included our bpf.h header.

For now, we can report EINVAL as "filter is not supported by the kernel".

@infrastation
Copy link
Member

Another common reason for rejecting the filter on some OSes is presence of BPF_MOD or BPF_XOR in the bytecode.

@infrastation
Copy link
Member

In terms of the hypothetical new function mentioned in #1375 this could be:

p = pcap_create("mydevice", errbuf);
pcap_set_flags(p, pcap_get_flags(p) | PCAP_REQUIRE_KERNEL_FILTERING);
pcap_activate(p);

which would make pcap_activate() turn the warning into an error and disable the fallback to userland filtering.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

No branches or pull requests

4 participants