hackmud fails to launch sometimes, failed assertion in Mono #131

Closed
mogery opened this issue Oct 11, 2021 · 12 comments · Fixed by #132

Comments

@mogery
Contributor

mogery commented Oct 11, 2021

I get the same message as in #105, except I don't get any error messages about opcodes.

TYPE: 31
* Assertion at mini-amd64.c:209, condition 'amd64_is_imm32 (disp)' not met

Stacktrace:

* Assertion: should not be reached at interp-stubs.c:105

Here are links to the exact lines in the source code on GitHub:
mini-amd64.c
interp-stubs.c

@ptitSeb
Owner

ptitSeb commented Oct 11, 2021

Yes, I still don't know why some pointers are not 32-bit.

@mogery
Contributor Author

mogery commented Oct 11, 2021

@ptitSeb If I can do anything to help you figure this out, let me know.

@ptitSeb
Owner

ptitSeb commented Oct 11, 2021

I need to understand where this pointer came from. I assume it comes from an mmap or malloc or something similar, but I haven't found the source yet.
But maybe it comes from something else, some API call or something like that?

@mogery
Contributor Author

mogery commented Oct 11, 2021

For some reason I'm having trouble getting this to trigger from within GDB.

@mogery
Contributor Author

mogery commented Oct 11, 2021

Ok, yeah, after like an hour of running gdb over and over and over again, this does not happen in GDB.

It still can happen if I attach GDB afterwards, but backtrace fails:

(gdb) backtrace
#0  0x0000ffffb8133138 in ?? ()
#1  0x0000000000000050 in ?? ()
Backtrace stopped: previous frame inner to this frame (corrupt stack?)

@mogery
Contributor Author

mogery commented Oct 12, 2021

Oh my, after hours of digging through mono source code, adding debug messages, and recompiling, I finally figured out why it crashes.

So, the issue happens when mono's JIT tries to emulate a call to its IL's FREM opcode.
It emulates this opcode with a pointer to fmod.

#ifdef MONO_ARCH_EMULATE_FREM
	register_opcode_emulation (OP_FREM, "__emul_frem", "double double double", fmod, "fmod", FALSE);
	register_opcode_emulation (OP_RREM, "__emul_rrem", "float float float", fmodf, "fmodf", FALSE);
#endif

So, when we run the code, we crash at a JIT'd call to __emul_frem (= fmod).
Let's see some debug logs I've printed out:

=== CEE_MONO_JIT_ICALL_ADDR CAUGHT ===
method name: __icall_wrapper___emul_frem
...

== CALL INFO ==
name: __emul_frem
sym : fmod
ip  : 0x130acc
func: 0xa149e4e0

This means that we attempt to call fmod, which is located at 0xa149e4e0, from our JIT'd code, which is located at 0x130acc. If we do a quick subtraction, we see that 0xa149e4e0 - 0x130acc = 0xa136da14, which is out of signed 32-bit range, and therefore fails amd64_is_imm32:

#define amd64_is_imm32(val) ((gint64)val >= -((gint64)1<<31) && (gint64)val <= (((gint64)1<<31)-1))
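
As a quick sanity check of the arithmetic above, here is a minimal sketch in C. The names fits_imm32 and call_disp are hypothetical helpers, not Mono functions, and the displacement is taken as simply target minus call site, as in the subtraction above (the real encoding may be measured from the end of the call instruction, which does not change the conclusion here).

```c
#include <stdint.h>

/* Same range test as the amd64_is_imm32 macro, written as a function.
   Hypothetical helper name, not part of Mono. */
static int fits_imm32(int64_t val)
{
    return val >= INT32_MIN && val <= INT32_MAX;
}

/* Distance from the call site to the target.
   Simplifying assumption: plain "target minus call site". */
static int64_t call_disp(uint64_t ip, uint64_t func)
{
    return (int64_t)(func - ip);
}
```

With the values from the log, call_disp(0x130acc, 0xa149e4e0) is 0xa136da14 and fits_imm32 returns 0 for it, whereas any target below 0x7fffffff would pass from that call site.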

If we take a look at the memory mapping of the process, this is how it looks:

0012f000-00135000 rwxp 00000000 00:00 0 # this is where Mono puts the JIT'd code

00400000-00401000 r-xp 00000000 b3:02 535223 hackmud # this is where the main executable is mounted

a1269000-a1549000 rw-p 00000000 00:00 0 [heap] # this is where wrapped libm is mounted

A possible fix would be to make sure that box64 maps memory and loads libraries below 0x7fffffff.

@ptitSeb
Owner

ptitSeb commented Oct 12, 2021

Oh, wow, well done!

Do you know if any other functions are called, other than fmod and fmodf?

@ptitSeb
Owner

ptitSeb commented Oct 12, 2021

Also, "signed i32". Damn, that's half of what I was counting on.
The easy solution would be to allocate the bridges closer. I currently use posix_memalign(...) to get the bridge addresses (which is where fmod gets its address on box64). I guess I will have to use mmap(...) instead, and ask for an address in the low 32 bits.

@mogery
Contributor Author

mogery commented Oct 12, 2021

> Oh, wow, well done!
>
> Do you know if any other functions are called, other than fmod and fmodf?

I have no idea, but it might be good practice to use mmap in librarian with the MAP_32BIT flag for all libraries.

I'll test and open a PR.

@ptitSeb
Owner

ptitSeb commented Oct 12, 2021

It's not librarian here. It's the bridges, because libm is a wrapped one.

Note that I will not be able to test box64 in the next few days. But I should still be able to look at a PR (and there is CI with Travis on box64 now too).

@mogery
Contributor Author

mogery commented Oct 12, 2021

Oh alright, thanks for letting me know. I'll play around with the code.

@ptitSeb
Owner

ptitSeb commented Oct 12, 2021

Look in tools/bridge.c: the function NewBrick(...) does the allocation. Use my_mmap(...) if you want to use MAP_32BIT, as that flag doesn't exist on AArch64. Or look at my_mmap(...) to see how that flag is emulated (using some functions in custommem.c).
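
For readers unfamiliar with the idea, here is a rough sketch in C of one way to get a low mapping on a platform without MAP_32BIT. This is not box64's my_mmap or its custommem.c logic; alloc_below_2g is a hypothetical helper, and the starting hint and step size are arbitrary assumptions. It passes low address hints to plain mmap and keeps the first result that lands entirely below 2 GiB.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

/* Hypothetical helper (not box64 code): try to map `size` bytes so that
   the whole region lies below 2 GiB. Returns NULL on failure. */
static void *alloc_below_2g(size_t size)
{
    const uintptr_t limit = (uintptr_t)1 << 31; /* 2 GiB boundary */
    const uintptr_t step  = 0x100000;           /* assumed 1 MiB stride */

    if (size == 0 || size > limit)
        return NULL;

    for (uintptr_t hint = 0x10000000;           /* assumed starting hint */
         hint < limit && size <= limit - hint;
         hint += step) {
        /* Without MAP_FIXED the address is only a hint, so the kernel
           may place the mapping elsewhere; check where it ended up. */
        void *p = mmap((void *)hint, size, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (p == MAP_FAILED)
            continue;
        if ((uintptr_t)p < limit && size <= limit - (uintptr_t)p)
            return p;                           /* fully below 2 GiB */
        munmap(p, size);                        /* too high, try next hint */
    }
    return NULL;
}
```

A caller would release the region with munmap(p, size). On x86-64 Linux the same effect is available directly by adding MAP_32BIT to the mmap flags.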

mogery added a commit to mogery/box64 that referenced this issue Oct 12, 2021
This fixes an issue with mono where JIT compiled code would
near-call wrapped libraries, but fail because the difference
between PC and the call address did not fit into an imm32.

This was fixed by replacing posix_memalign with my_mmap and
providing the MAP_32BIT flag.

Fixes ptitSeb#131