How does dynamic linking work?

This article explains how dynamic linking works at the Mach-O level. If you're not familiar with the Mach-O format, you can read the content in macho_parser first.

Assembly

Although mastering assembly is certainly not required, we will encounter two basic addressing concepts later.

RIP-Relative Addressing

RIP-relative addressing is an addressing mode written as offset(%rip). The actual address is %rip + offset. It's worth noting that %rip always points to the next instruction, not the one currently executing.

For example, the following instruction is at address 0x3f96. The size of the jmpq instruction is 6 bytes. At the moment of executing this instruction, the value of %rip is 0x3f9c (0x3f96 + 0x6). Thus the program will jump to address 0x401c (0x3f9c + 0x80).

0000000000003f96    jmpq    0x80(%rip)

Indirect Addressing

Indirect addressing is written with a leading asterisk (*). The aforementioned instruction will jump directly to 0x401c, while the following instruction will jump to the address stored at 0x401c.

0000000000003f96    jmpq    *0x80(%rip)

Indirect addressing can be broken down into two steps: 1. read the value at the specified address; 2. use the value as the target address.

Sample Code

First, let's build a simple dylib example. In the main function, we are accessing a global variable (llios_lib_str) and calling a function (llios_lib_func) defined in a dylib built from lib.c.

// lib.c
#include <stdio.h>

char *llios_lib_str = "str";
void llios_lib_func(char *str) {
    printf("%s\n", str);
}

// main.c
extern char *llios_lib_str;
extern void llios_lib_func(char *);

int main() {
    llios_lib_func(llios_lib_str);
    return 0;
}

(The sample code and build script are in this directory. It's best to build the code yourself and follow along with the article.)

Then we disassemble the __text section (where the code lives) with otool. This is the body of the main function.

$ otool -tv a.out
a.out:
(__TEXT,__text) section
_main:
0000000100003f70	pushq	%rbp
0000000100003f71	movq	%rsp, %rbp
0000000100003f74	subq	$0x10, %rsp
0000000100003f78	movq	0x81(%rip), %rax
0000000100003f7f	movl	$0x0, -0x4(%rbp)
0000000100003f86	movq	(%rax), %rdi
0000000100003f89	callq	0x100003f96
0000000100003f8e	xorl	%eax, %eax
0000000100003f90	addq	$0x10, %rsp
0000000100003f94	popq	%rbp
0000000100003f95	retq

Non-lazy binding

The first interesting part is line 4 of the disassembly: movq 0x81(%rip), %rax, which loads the value at 0x81(%rip). Using the RIP-relative addressing rule from before, we know 0x81(%rip) is 0x100004000 (0x100003f7f + 0x81, where 0x100003f7f is the address of the next instruction), which is in the __got section.

$ otool -s __DATA_CONST __got a.out
a.out:
Contents of (__DATA_CONST,__got) section
0000000100004000	00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00

The __got section is the global offset table, a table for non-lazy imported globals. This section contains an array of 64-bit pointers. In this case there are two elements. Each element is associated with an indirect symbol. (For how indirect symbols work, check here.)

$ otool -Iv a.out
...
Indirect symbols for (__DATA_CONST,__got) 2 entries
address            index name
0x0000000100004000     4 _llios_lib_str
0x0000000100004008     5 dyld_stub_binder
...

The values in __got are all 0x0 in the file, because the addresses are not known at build time. At launch time, dyld finds the non-lazy symbols in the dylibs and writes their addresses into the section. By the time main accesses _llios_lib_str, a valid address is already there (at 0x100004000). Global variables are non-lazily bound while functions are usually lazily bound.

Another symbol in __got is dyld_stub_binder. We will get there in a moment.

Lazy binding

Another interesting line is line 7: callq 0x100003f96. From main.c, we know the only function called is llios_lib_func, so this line must be calling it. Let's verify that.

0x100003f96 is in the __stubs section. There is only one jmpq instruction, because in this example there is only one imported global function.

$ otool -v -s __TEXT __stubs a.out
Contents of (__TEXT,__stubs) section
0000000100003f96	jmpq	*0x4064(%rip)

Note that this is indirect addressing. The target address is stored at 0x100008000 (0x100003f96 + 0x6 + 0x4064, where 0x6 is the size of the jmpq instruction). Where is 0x100008000 then? It's in the __la_symbol_ptr section.

$ otool -s __DATA __la_symbol_ptr a.out
Contents of (__DATA,__la_symbol_ptr) section
0000000100008000	ac 3f 00 00 01 00 00 00

Like the __got section, __la_symbol_ptr is an array of 64-bit pointers which are associated with indirect symbols. Using the same approach, we know 0x100008000 is indeed for the symbol _llios_lib_func. This confirms our guess.

$ otool -Iv a.out
Indirect symbols for (__DATA,__la_symbol_ptr) 1 entries
address            index name
0x0000000100008000     3 _llios_lib_func

Unlike __got, the value stored at 0x100008000 is not 0x0 in the file. Instead it's 0x100003fac (the bytes ac 3f 00 00 01 00 00 00 read in little-endian order). Interestingly, 0x100003fac is in the __stub_helper section.

$ otool -v -s __TEXT __stub_helper a.out
Contents of (__TEXT,__stub_helper) section
0000000100003f9c	leaq	0x4065(%rip), %r11          <-- line 1
0000000100003fa3	pushq	%r11
0000000100003fa5	jmpq	*0x5d(%rip)                 <-- line 3
0000000100003fab	nop
0000000100003fac	pushq	$0x0                        <-- line 5
0000000100003fb1	jmp	0x100003f9c

Note that 0x100003fac is at line 5. Following the code, the program jumps to 0x100003f9c, which is line 1. Eventually it gets to line 3 and jumps to *0x5d(%rip) (indirect addressing again). We should be good at this now. 0x5d(%rip) is 0x100004008 (0x100003fab + 0x5d, where 0x100003fab is the address of the next instruction). We have seen this location before: it's the 2nd element in __got, which is dyld_stub_binder. As you may remember, __got is the non-lazy binding section, and the address of dyld_stub_binder is written there at launch time. In other words, the first time it runs, the callq 0x100003f96 instruction (the one supposed to call llios_lib_func) actually ends up in dyld_stub_binder.

Here is what happens when calling a function in a dylib. The program calls into code in (__TEXT,__stubs), which reads the address stored in __la_symbol_ptr and jumps to it. The first time, that address points to (__TEXT,__stub_helper), which in turn calls into dyld_stub_binder. dyld_stub_binder finds the symbol in the dylib, writes its address back to __la_symbol_ptr and jumps to the real function. The next time the same function is called, __la_symbol_ptr holds the real address, so the program jumps to it directly.

The following diagram shows the binding process. The green arrow is non-lazy binding and the red arrows are the lazy binding flows (first time access).

(Figure: Dynamic Linking Binding)