Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

multiple UC_HOOK_MEM callbacks for unaligned access #1041

Closed
RYDB3RG opened this issue Nov 10, 2018 · 10 comments
Closed

multiple UC_HOOK_MEM callbacks for unaligned access #1041

RYDB3RG opened this issue Nov 10, 2018 · 10 comments

Comments

@RYDB3RG
Copy link

RYDB3RG commented Nov 10, 2018

It seems to fire multiple read/write callbacks for unaligned access. Is this expected?

void hook_mem(uc_engine *uc, uc_mem_type type,
    uint64_t address, int size, int64_t value, void *user_data)
{
    uint64_t rip;
    uc_reg_read(uc, UC_X86_REG_RIP, &rip);
    printf("%p %s %p %d\n", rip, (type == UC_MEM_WRITE) ? "write" : "read ", address, size);
}

int main(int argc, char **argv)
{
    const uint8_t code[] = {
        "\x48\x8b\x00"  //  mov         rax,qword ptr [rax]  
        "\x48\x89\x01"  //   mov         qword ptr [rcx],rax  
        "\xcc"
    };

    uc_engine *uc;
    uc_open(UC_ARCH_X86, UC_MODE_64, &uc);
    uc_mem_map(uc, 0, 0x10000, UC_PROT_ALL);
    uc_mem_write(uc, 0, code, sizeof(code) - 1);

    uc_hook hh;
    uc_hook_add(uc, &hh, UC_HOOK_MEM_READ | UC_HOOK_MEM_WRITE, hook_mem, nullptr, 1, 0);

    const uint64_t addr = 0x1fff;
    uc_reg_write(uc, UC_X86_REG_RAX, &addr);
    uc_reg_write(uc, UC_X86_REG_RCX, &addr);
    uc_emu_start(uc, 0, sizeof(code) - 1, 0, 2);

    return 0;
}

produces:

0000000000000000 read  0000000000001FFF 8
0000000000000000 read  0000000000001FF8 8
0000000000000000 read  0000000000002000 8
0000000000000003 write 0000000000001FFF 8
0000000000000003 write 0000000000002006 1
0000000000000003 write 0000000000002005 1
0000000000000003 write 0000000000002004 1
0000000000000003 write 0000000000002003 1
0000000000000003 write 0000000000002002 1
0000000000000003 write 0000000000002001 1
0000000000000003 write 0000000000002000 1
0000000000000003 write 0000000000001FFF 1
@firodj
Copy link

firodj commented Dec 6, 2018

it seems the hook called for read mem by the helper_le_ld_name in softmmu_template.h before the unaligned and also reread again by spanning into aligned address. (3 calls)

I think it shouldnot called the hook when there is no actual read on mem, (the 1st call).

@BrunoPujos
Copy link
Contributor

Hello,

So I had the same bug, the following python code allows to reproduce it:

from unicorn import *
from capstone import *
import unicorn.x86_const as x86C

# mov rax, 0x1FFF
# mov rbx, [rax]
code = '\xb8\xff\x1f\x00\x00H\x8b\x18'
base = 0x1000

mu = Uc(UC_ARCH_X86, UC_MODE_64)
mu.mem_map(base, 0x2000)
mu.mem_write(base, code)

# code at 0x1000
# src at 0x1fff

def hook_read(uc, access, address, size, value, user_data):
    print("HK Read addr=0x{:X}, size=0x{:X}".format(address, size))
    return True

md = Cs(CS_ARCH_X86, CS_MODE_64)

def hook_code(uc, address, size, user_data):
    global md
    ins = md.disasm(str(uc.mem_read(address, size)), address).next()
    print("HK Code at 0x{:X}, size = 0x{:X}: {} {}".format(address, size, ins.mnemonic, ins.op_str))

mu.hook_add(UC_HOOK_CODE, hook_code)
mu.hook_add(UC_HOOK_MEM_READ, hook_read)

mu.emu_start(base, base + len(code))

## RESULT:
#HK Code at 0x1000, size = 0x5: mov eax, 0x1fff
#HK Code at 0x1005, size = 0x3: mov rbx, qword ptr [rax]
#HK Read addr=0x1FFF, size=0x8
#HK Read addr=0x1FF8, size=0x8
#HK Read addr=0x2000, size=0x8

This bug is indeed link to the helper_le_ld_name function in softmmu_template.h: when an unaligned access between page is triggered the function calls itself recursively for getting aligned access on each page before computing the result (https://github.com/unicorn-engine/unicorn/blob/master/qemu/softmmu_template.h#L339):

    /* Handle slow unaligned access (it spans two pages or IO).  */
    if (DATA_SIZE > 1
        && unlikely((addr & ~TARGET_PAGE_MASK) + DATA_SIZE - 1
                    >= TARGET_PAGE_SIZE)) {
        target_ulong addr1, addr2;
        DATA_TYPE res1, res2;
        unsigned shift;
    do_unaligned_access:
#ifdef ALIGNED_ONLY
        //cpu_unaligned_access(ENV_GET_CPU(env), addr, READ_ACCESS_TYPE,
        //                     mmu_idx, retaddr);
        env->invalid_addr = addr;
#if defined(SOFTMMU_CODE_ACCESS)
        env->invalid_error = UC_ERR_FETCH_UNALIGNED;
#else
        env->invalid_error = UC_ERR_READ_UNALIGNED;
#endif
        cpu_exit(uc->current_cpu);
        return 0;
#endif
        addr1 = addr & ~(DATA_SIZE - 1);
        addr2 = addr1 + DATA_SIZE;
        /* Note the adjustment at the beginning of the function.
        Undo that for the recursion.  */
        res1 = helper_le_ld_name(env, addr1, mmu_idx, retaddr + GETPC_ADJ); // HERE: recursive call 1
        res2 = helper_le_ld_name(env, addr2, mmu_idx, retaddr + GETPC_ADJ); // HERE: recursive call 2
        shift = (addr & (DATA_SIZE - 1)) * 8;

        /* Little-endian combine.  */
        res = (res1 >> shift) | (res2 << ((DATA_SIZE * 8) - shift));
        goto _out;
    }

As the hooks on the memory access are triggered at the beginning of the helper_le_ld_name function (https://github.com/unicorn-engine/unicorn/blob/master/qemu/softmmu_template.h#L259 for the UC_MEM_READ) it will be trigger by the first call and by the two recursive calls.

I think the easiest way to fix this problem would be to not trigger the hooks when qemu is performing a recursive call. This could be done by adding an argument to the helper_le_ld_name function or by defining a new function. As I am not sure about the consequences of such a patch, in particular link to the unmap and protect callback I did not propose a patch for this particular problem.

I also think this should be categorize as a bug.

@aquynh
Copy link
Member

aquynh commented Jul 31, 2019

We can check if we are in recursive mode. Can be done by adding a variable to uc_struct, so no need to modify this code. Please pull req.

@BrunoPujos
Copy link
Contributor

Hello,

So I did a patch as you suggested (patch_recur_hook.txt):

  1. Add a bool recur_mem_access in uc_struct (uc_priv.h)
  2. Add a check in the HOOK_FOREACH macro for recur_mem_access (uc_priv.h)
  3. Set and unset the boolean when needed in softmmu_template.h

The patch work for read and write access but there is a problem. If an access is made between two pages and the second one (higher address) is not mapped the hook for the unmap was triggered only inside the recursive call for the second page. With the patch I did, the unmap hook will not be triggered as the hook are disabled for the recursive call.

Here is a more complete test for the unmap case:

from unicorn import *
from capstone import *
import unicorn.x86_const as x86C

    
# mov rax, 0x1FFF
# mov rbx, [rax]
# mov [rax], rbx
code = '\xb8\xff\x1f\x00\x00H\x8b\x18H\x89\x18'
base = 0x1000

mu = Uc(UC_ARCH_X86, UC_MODE_64)
mu.mem_map(base, 0x1000)
mu.mem_write(base, code)

# code at 0x1000
# src at 0x1fff

def hook_mem(uc, access, address, size, value, user_data):
    print("HK {} addr=0x{:X}, size=0x{:X}".format("Read" if access == UC_MEM_READ else "Write", address, size))
    return True

md = Cs(CS_ARCH_X86, CS_MODE_64)

def hook_code(uc, address, size, user_data):
    global md
    ins = md.disasm(str(uc.mem_read(address, size)), address).next()
    print("HK Code at 0x{:X}, size = 0x{:X}: {} {}".format(address, size, ins.mnemonic, ins.op_str))

def hook_mem_unmap(uc, access, address, size, value, user_data):
    print("HK Unmap {} addr=0x{:X}, size=0x{:X}".format("Read" if access == UC_MEM_READ_UNMAPPED else "Write", address, size))
    uc.mem_map(base + 0x1000, 0x1000)
    return True

mu.hook_add(UC_HOOK_CODE, hook_code)
mu.hook_add(UC_HOOK_MEM_READ | UC_HOOK_MEM_WRITE, hook_mem)
mu.hook_add(UC_HOOK_MEM_READ_UNMAPPED | UC_HOOK_MEM_WRITE_UNMAPPED, hook_mem_unmap)

mu.emu_start(base, base + len(code))

Without my patch we have several hook for read and write, with it the hook_mem_unmap is never called and the code fails.

Should I disable only the hook for UC_HOOK_MEM_READ and UC_HOOK_MEM_WRITE or is it a more complex problem which need more re-factoring ?

@aquynh
Copy link
Member

aquynh commented Aug 2, 2019

A simple solution is to detect if the memory address is at the edge, then apply the recursive condition checking wisely. You can do some experiments to see if there is a perfect solution (quite tricky i think).

@BrunoPujos
Copy link
Contributor

Could you check the PR #1113 and tell me if this seems a correct fix for you ? The test with AppVeyor seems to have failed but this appears to be a problem with AppVeyor and not the patch (or I did not get what was wrong).

Thank you

@RYDB3RG
Copy link
Author

RYDB3RG commented Aug 28, 2019

#1113 works beautifully for me, output for above code is now:

0000000000000000 read  0000000000001FFF 8
0000000000000003 write 0000000000001FFF 8

Thanks

@Waterman178
Copy link
Contributor

thanks

@wtdcode wtdcode closed this as completed Oct 3, 2021
@firodj
Copy link

firodj commented Jul 18, 2022

Hi @aquynh cc: @BrunoPujos, this issue re-appear on master branch, probably since Unicorn 2. Could you check again? thanks.

@wtdcode
Copy link
Member

wtdcode commented Jul 18, 2022

@firodj Did you check #1651 ?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants