MEGA65: Switch "zero page" to $1600 - $16ff #315
I assume it's because LLVM still thinks that zero page variables are at $00xx. In some addressing modes, this is important - zero page variables aren't always accessed using zero page modes. |
What I found surprising was, that |
Variables marked as "zero page" will utilize zero page addressing modes. The value being |
@asiekierka I tried to use $1602 as
|
Yes, because it's trying to fit |
This is the linker check I mentioned on Discord. On most CPUs, 5636 isn't a valid address you can access with a zero page addressing mode, but on the mega65 CPU, it is, since you can set BP to make that access legal. The linker needs to be informed about this possibility, so that it doesn't check that bits 8-15 of the 32-bit virtual address for the symbol are zero. |
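To make the check concrete, here is a minimal sketch in C of the two rules being contrasted. The function names are hypothetical and are not taken from the llvm-mos linker; the BP-aware variant assumes the linker is simply told which base page the program promises to have loaded.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the range check on a zero page relocation.
 * `addr` is the virtual address of the symbol; only bits 8-15 are examined. */

/* Classic 6502 rule: the high byte of the 16-bit address must be zero. */
bool zp_reloc_ok_fixed(uint32_t addr) {
    return ((addr >> 8) & 0xFFu) == 0;
}

/* BP-aware rule (an assumption, not existing linker behavior):
 * the high byte must match the base page the program promises to use. */
bool zp_reloc_ok_with_bp(uint32_t addr, uint8_t assumed_bp) {
    return ((addr >> 8) & 0xFFu) == assumed_bp;
}
```

With these definitions, address 5636 ($1604) is rejected by the fixed rule but accepted by the BP-aware rule when the assumed base page is $16.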
Ok, I rejected the idea as wrong anyway. I just thought that I might be able to work around the problem temporarily. |
I wouldn't say it's wrong; if an address llvm-mos accesses via the zero page addressing mode actually is $1602, then its symbol value should be $1602. There's no need to lie that way; ideally, symbol values that you'd read out of an ELF file should correspond to actual addresses on the system. |
That's insufficient - there's still the We can either add a compilation flag to prohibit the compiler from knowing how to convert a zero page pointer to a regular pointer, or a compilation option to configure where exactly the zero page is redirected (this option could also effectively generalize across to the HuC6280 and 65C816, by the way!). |
That's true; it's necessary, but not sufficient. I don't think anything but the pointer case has any implications on the compiler though; it just tags symbols as either being in the zero page or not, and it uses different addressing modes. In both cases, the actual symbol values are assigned by the linker. But, for zero page pointers, there isn't actually enough information available at runtime to extend the pointer. So it makes sense to me to make that decision by fiat; probably as a target attribute. Then, you could set it on a per-function basis using C attributes; that's about as general as we could reasonably support. I think your idea for an option to prevent 8->16 conversions makes sense; it would play nicely with per-function overrides as a way to catch mistakes. |
Yes, but if the BP register is set to $16, you automatically access $1602 when you use the address $02. Thus, starting the zero page at $02 is still correct; where it is actually located in memory depends on the BP register. |
Actually, now that I think about it again, there is! You can use The real problem, in my eyes, is the fact the "zero page" also includes the virtual registers - this is why I don't think you can really support this on a per-function basis well, at least not for a "version 1" implementation; it breaks the calling convention, for example. |
The zero page isn't really a region of memory; it's an addressing mode. For a given instruction, it says "treat the number I hand you this way, and use it to find the real address". BP just changes the way that mapping occurs. But the underlying addresses are just addresses; those are the ones that go into linker scripts and get assigned to symbols. The exception is the two high bits of the 32-bit symbol addresses; we use those as generic "bank numbers", since addresses are 16-bits for the most part. |
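The mapping described here can be written down in one line. The following is a small illustrative model in C; the function name is made up for this sketch and it only covers the 16-bit address, not the bank bits.

```c
#include <stdint.h>

/* Effective address of an instruction that uses the zero page (base page)
 * addressing mode: BP supplies the high byte, the operand the low byte.
 * On a CPU without a BP register this behaves as if BP were always $00. */
uint16_t bp_effective_address(uint8_t bp, uint8_t operand) {
    return (uint16_t)(((uint16_t)bp << 8) | operand);
}
```

For example, operand $02 resolves to $0002 with BP = $00 and to $1602 with BP = $16; the instruction encoding is identical in both cases.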
That's great, and we should definitely do that.
That's a good point, I'm going to have to think more on this. Imaginary registers may be worth special casing somehow. |
So here's my thoughts. (Apologies if this retreads some ground already discussed; I'd only been loosely following the discussion.) Imaginary registers pull a lot of weight. Like modern registers, they provide a compact way to refer to arguments, parameters, and temporary values of functions. The callee/caller-saved convention allows the registers to be shared between callers and callees without conflict, and relatively efficiently. There are alternative schemes like SPARC's register windows that also accomplish the above, but without the cost of saving and restoring registers. We could probably do this on the 65816, but not on the mega65, where the BPs don't overlap. But even on the 65816, it adds a one cycle penalty to all zero page accesses, and they're supposed to be as fast as the hardware can go. So it looks like they're here to stay. That being said, as @asiekierka pointed out, they're not just memory locations; they're a contract between callers and callees. Callers and callees need to agree on the memory locations used, but otherwise, they can be arbitrary, so long as they're accessible with the zero page addressing mode. This would allow an interrupt chain or thread to use a completely different set of imaginary registers, which is highly desirable. This means that e.g. on Mega65 Presently, the compiler leaves the numeric value of the register up to the linker; it just issues references to Still, the compiler has to emit something when it wants to refer to an imaginary register, whether in absolute or zero paged addressing modes. But, the same function could be called at runtime with many different sets of imaginary registers, so it can't statically know the actual addresses of these. The linker can't know either, so it can't be as simple as a symbol reference to a real address. Here's a proposal. 
Instead of linker symbol values, we could consider imaginary registers to be on the other end of 8-bit pointers, where the values of these pointers are set by symbol references. The linker scripts would be able to place the registers as per usual, but for targets with variable BP, the symbols would always range from 0x0000 to 0x00ff, and the compiler would need to itself issue the logic to extend the pointer using BP to a 16-bit address if needed. This would allow a single function to be responsive to runtime calls by callers with varying BP. Zero page variables and their zero page pointers, on the other hand, have actual fixed BPs, and those don't vary at runtime. The linker will assign them actual real memory locations, and it's up to the program to make sure that BP is set appropriately to access them. This does raise a wrinkle in extending zero page pointers using BP; it would have to assume that the current value of BP is sufficient to access the value. This essentially amounts to treating zero page pointer extensions as accesses, which seems reasonable. Also, I can't actually recall anywhere where we use an imaginary register symbol with absolute addressing in the compiler proper; the only place I can recall offhand is |
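As a rough illustration of the proposal, the sketch below simulates memory and BP on the host and treats imaginary registers as 8-bit offsets that are resolved through the current BP. All names (`RC0`, `RC1`, `imag_reg`, `set_bp`, and so on) are hypothetical stand-ins, and the offsets $02/$03 are arbitrary choices for the example.

```c
#include <stdint.h>

static uint8_t memory[0x10000]; /* simulated 64 KiB address space */
static uint8_t bp;              /* simulated BP register */

/* Hypothetical 8-bit offsets of two imaginary registers; in the proposal
 * these would be symbol values in the range 0x0000-0x00ff. */
enum { RC0 = 0x02, RC1 = 0x03 };

void set_bp(uint8_t value) { bp = value; }
uint8_t peek(uint16_t addr) { return memory[addr]; }
void poke(uint16_t addr, uint8_t value) { memory[addr] = value; }

/* Extend an 8-bit pointer to 16 bits using the current BP. This counts
 * as an access: it is only meaningful if BP is currently correct. */
uint16_t extend_zp_pointer(uint8_t p) {
    return (uint16_t)(((uint16_t)bp << 8) | p);
}

/* Resolve an imaginary register through the current BP. */
uint8_t *imag_reg(uint8_t offset) {
    return &memory[extend_zp_pointer(offset)];
}

/* Stand-in for a compiled function: rc0 += rc1. It never names a
 * 16-bit address, so it works for whichever register set BP selects. */
void add_rc1_to_rc0(void) {
    *imag_reg(RC0) = (uint8_t)(*imag_reg(RC0) + *imag_reg(RC1));
}
```

Calling `add_rc1_to_rc0()` once with BP = $16 and once with BP = $17 updates $1602 and $1702 respectively, which is the behavior an interrupt chain or thread with its own register set would rely on.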
It's a decent proposal, though I would make sure to limit the changes to the compiler only. If we do it by adding a new address space ( |
What would extending |
You're right, but that means I'm missing something in my mental model and I'm not sure what it is. |
I just want to interject that I consider this to be an important discussion, and I also feel that some of the proposed solutions have far-reaching long-term implications. I suggest that we continue to talk about this problem for a bit longer before we make a decision here. I was considering the idea of introducing a relocation type for this particular use case... Namely, there might be a new relocation that emits a tab instruction inline if it cannot be demonstrated that the BP is correct for the current instruction. This may have far-reaching implications too, and I am not at all sure it is the way to go. But it may be possible. Another possibility would be a new relocation type that simply truncates the top 8 bits of the 16 bit address, and slams the lower 8 bits into the instruction. This feels wrong to me somehow, like we are throwing away an important clue, but it may be possible. I am very open to different approaches here. |
I'm looking at approaches other compilers have taken. ca65 seems to deal with it by not dealing with it, although Greg King started a fork to provide support for base pages. vasm seems to introduce a new assembler directive, .setbp, which tells following code that it can assume that the B register is set to a specified value. If that is correct, then .setbp is a bad name for the directive, as it doesn't actually set the B register. |
This is how e.g. "range extension" is done on most architectures; if you have a branch that is too far, or a PC-relative address that is too far away, then it will insert a "thunk" or materialize a constant nearby and redirect to that thunk or constant. In this case though, the linker wouldn't be able to do even trivial reasoning about the value of the B register, since it has no notion of control flow, functions, etc; it barely has a model of machine instructions. It would then need to conservatively set BP any time such a relocation appears; in that case, one may as well use absolute addressing instead. Contrast the thunks above; these are based on distances, so the linker can reason what ranges a thunk can be reused within.
This is the semantics I'd argue the zero page relocation should have on the 65CE02 and 65816. In the absence of BP inserting relocs as per the above, a zero page reloc would amount essentially to a promise that BP is already correctly set to the high byte of the symbol address. There would be no way for the linker to verify this, especially since the same function could (incorrectly) be called from several contexts with other BPs. |
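A sketch of those semantics in C follows, under the assumption that the relocation keeps only the low byte and that the high byte becomes an unchecked promise about BP. The type and function names are invented for illustration and do not correspond to an existing relocation type.

```c
#include <stdint.h>

/* Result of applying a hypothetical truncating zero page relocation. */
typedef struct {
    uint8_t operand;     /* byte written into the instruction */
    uint8_t required_bp; /* value BP is promised to hold at runtime */
} BpReloc;

/* Keep the low byte as the operand; the high byte of the 16-bit address
 * is not encoded anywhere, it is only the implied promise. */
BpReloc apply_bp_reloc(uint32_t symbol_addr) {
    BpReloc r;
    r.operand = (uint8_t)(symbol_addr & 0xFFu);
    r.required_bp = (uint8_t)((symbol_addr >> 8) & 0xFFu);
    return r;
}

/* Address the instruction actually touches for a given runtime BP. */
uint16_t runtime_address(BpReloc r, uint8_t actual_bp) {
    return (uint16_t)(((uint16_t)actual_bp << 8) | r.operand);
}
```

For a symbol at $1602, the operand is $02 and the promise is BP = $16. If the same code runs with BP = $00, it silently touches $0002 instead, which is exactly the failure the linker has no way to detect.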
Hello, the sod responsible for the 45GS02 here :) Some key things that come to mind are:
Now some thoughts of processor tweaks that I could provide at relatively low cost:
|
I think if there's interest in supporting the prototype, and it doesn't do horrible things to the compiler, and someone wants to do it, it's a good idea. Those three ifs multiply together though to make a probability that seems pretty low to me. I definitely have no special interest in the C65; and only a cursory one for FPGA 6502 derivatives. But I wouldn't want to stand in the way of the enthusiasm of others, excepting the cases those various enthusiasms come into conflict. I can say I don't remember anyone expressing an interest in supporting the C65 so far.
This is rough; as long as I've been a compiler engineer, such lists are really hard to come by, and they can usually only be created by domain experts with an absolutely ridiculous amount of experience; everyone else doesn't know how much they don't know. That's hard to come by in this space; I have a ton of compiler/linker experience (relatively speaking), but much less with the various target systems, save a couple. Others are extremely familiar with target systems, but usually lack compiler/linker experience. Lacking this, usually the best you can do is just put stuff out there "in the soup", and let it be broken. When it breaks, it breaks in a specific, tangible way; and that fosters discussion. The right people will eventually complain about it, and then you can go in and fix it. Honestly the mega65 target is already pretty much in that state; this whole thread is the right people complaining about it. So I'd expect a simple BP management approach to be similar.
I think most of this is Quality of Life stuff; being able to work around stuff like that in a compiler is table stakes. Were those present, we wouldn't use them right now, just like we don't use any of the other 45GS02 extensions, just for lack of dev time. Accordingly, I don't think slosh in managing BP would compare to not using e.g. stack-relative addressing.
Similarly, getting to the point where the backend would be able to use something like this is a long ways off. I do think we want to support the equivalent of the various x86's near, far, and huge memory models for code and data, but while we've talked off and on about it at length, it's not really at the top of anyone's TODO list. (Except maybe if @asiekierka gets a new computer ;) ) |
The LLVM backend does distinguish between the 4510 and 45GS02. However, that mostly means we provide correct assembler support for all 6502 subtypes. The question of what's supported for compilation by the LLVM backend is separate; for example, on 65C02 and its derivatives, we currently compile Likewise, we could implement something for the 45GS02 only, but not the 4510. The question is mostly that of the effort required and whether someone will put it in.
Your best option is to get involved with LLVM-MOS development, I'm afraid; the best way to deeply understand how the compiler thinks is to study and work with it at a low level. Note also that other compilers handle things differently (for example, SDCC has its own intermediate representation which predates the popularity of SSA), so things which work well for us might not work so well for others. |
@gardners @mysterymath @asiekierka I've taken the liberty of creating a new issue, #317, specifically for continuing the important discussion of possible architectural changes to the 45GS02. Hopefully this will allow us to continue the discussion re "correct" handling of direct page in the presence of a B register here. |
Hello! MEGA65 ROM developer here. I'm excited to see the brainstorming in this thread, and can see how it could result in useful improvements to MEGA65 support specifically and llvm-mos in general. The opportunity to use target CPU features like the relocatable base page could be compelling for a bunch of reasons. I do want to amend the problem statement in this ticket, though. The motivating premise was that the MEGA65 KERNAL reserves the entirety of the zero page. This is only true in the same sense as other Commodores: the KERNAL uses a limited range of upper ZP, BASIC uses the rest, and if machine code generated by a compiler doesn't need to return to BASIC, it can safely use BASIC's ZP region and still keep the KERNAL active (screen terminal, IRQ handler, I/O calls). If such a program does want to return cleanly to BASIC (as in, an RTS to the SYS command, not just a warm boot), it has to preserve/restore BASIC's ZP region. (And of course, a program that installs new hardware IRQ handlers and never calls the KERNAL can do whatever it wants.) We've been telling early MEGA65 developers to avoid touching ZP entirely while we confirm what the KERNAL API contract should say about its use of the ZP. This is only meant to be temporary, and we're close to being able to document $02-$8F as available, similar to other Commodores. $03-$0B are currently part of the JMPFAR/JSRFAR KERNAL calls, but if llvm-mos doesn't call that, it can use the full range, or just use $0C-$8F for safety. From the KERNAL's perspective, it's really just a testing and documentation project at this point. (We need to confirm that we didn't accidentally add anything to the KERNAL that uses lower ZP addresses, for example.) I want to finish and document this as soon as next month. This issue has come up a few times on our side because people start out writing llvm-mos programs that try to return normally at the end of main(), only to see BASIC misbehave afterwards. 
I wonder if there is canonical behavior we can add to the MEGA65 target's main() exit to improve this. I don't know what Commodore C programmers normally expect (warm boot? "press a key" followed by warm boot? infinite loop?), so I yield to your judgement. We could even have llvm-mos's MEGA65 target document $1600-$16FF as reserved, and do the BASIC ZP copy/restore in main() entry/exit by default. I'm sure we can get a MEGA65 dev to implement the details. If there's anything we can do on the KERNAL side to simplify things, please let me know. |
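The copy/restore idea could look roughly like the following C sketch. It works on a plain byte array standing in for memory so that it can run on a host; the range $02-$8F is the one mentioned above as close to being documented, and the function names and the location of the backup buffer are assumptions.

```c
#include <stdint.h>
#include <string.h>

/* BASIC's zero page region as described above; subject to the final
 * KERNAL documentation. */
enum { ZP_FIRST = 0x02, ZP_LAST = 0x8F, ZP_LEN = ZP_LAST - ZP_FIRST + 1 };

/* Backup buffer. On the real target this could live in the proposed
 * reserved page; here it is an ordinary static array. */
static uint8_t zp_backup[ZP_LEN];

/* To be called on entry to main(): remember BASIC's zero page state. */
void save_basic_zp(const uint8_t *mem) {
    memcpy(zp_backup, mem + ZP_FIRST, ZP_LEN);
}

/* To be called on exit from main(): put BASIC's state back before RTS. */
void restore_basic_zp(uint8_t *mem) {
    memcpy(mem + ZP_FIRST, zp_backup, ZP_LEN);
}
```

Only the bytes in $02-$8F are touched by the restore; anything outside that range, including the KERNAL's upper zero page area, is left as the program found it.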
On the outside, this feels like a mistake. I think there's three aspects of the
If any of the above weren't true, there'd be a way to use the resulting PRG files such that they cleanly returned, but as is it feels like one of the second two points should change. We may want to either split the commodore targets into BASIC and no-BASIC variants, or if it's possible, provide a link-time configuration symbol for this. Whatever the default ends up being, it should infinite loop if it isn't safe to return to BASIC; that at least may let one inspect the output.
This is also a reasonable option, and it's come up before. I'll admit to not being sure what the best way to proceed here is; it probably bears some thought and discussion. The easiest change we could make to make this consistent would be to change the EDIT: Looking at the git blame, it looks like this was my fault, from day one of my original SDK rework. Still, we don't have a strong backcompat guarantee beyond using Semantic Versioning, so we should definitely fix this situation. |
Please follow the discussion here: https://discord.com/channels/1058149494107148399/1058149494107148402/1210065267716259880
Problem:
The memory between $0000 - $15FF is currently documented as reserved on the MEGA65. llvm-mos currently uses the addresses $0002 - $0090 for imaginary registers (and more) and therefore conflicts with various ROM routines, which expect to use the so-called zero page for their own data as well.
Proposed solution:
The CPU on the MEGA65 provides a BP (Base Page) register that is used as high byte for every Load or Store operation with 8bit addresses (so called Zero Page addressing mode). Thus, the Zero-Page (now called Base-Page) can be relocated freely within the first bank.
This solution proposes to use the BP register to relocate all accesses using the 8-bit addressing mode (base-page addressing) to $1600 - $16FF by setting BP to $16. llvm-mos can therefore still keep the imaginary registers in fast memory by using a BP other than $00.
Inline assembly that needs to call any ROM routine must set the BP register back to $00 before entering the kernel and switch it back to $16 afterwards.
The performance impact can be held quite low by using this concept.
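The switching protocol for ROM calls can be sketched as a small wrapper. The version below is a host-side model in C with a simulated BP variable; on hardware the two assignments would presumably each become a load of A followed by a `tab` instruction, and `call_rom`, `fake_rom_routine`, and `bp_seen_by_rom` are hypothetical names used only for this illustration.

```c
#include <stdint.h>

static uint8_t bp = 0x16; /* simulated BP, as set up by the startup code */

uint8_t bp_seen_by_rom;   /* records what a ROM routine would observe */

typedef void (*rom_routine)(void);

uint8_t current_bp(void) { return bp; }

/* Call a ROM routine with BP temporarily switched back to $00. */
void call_rom(rom_routine fn) {
    uint8_t saved = bp;
    bp = 0x00;  /* on hardware (assumed): lda #$00 / tab */
    fn();
    bp = saved; /* on hardware (assumed): lda #$16 / tab */
}

/* Stand-in for a ROM routine that relies on the real zero page. */
void fake_rom_routine(void) {
    bp_seen_by_rom = bp;
}
```

The cost per ROM call is two short register updates, which is consistent with the expectation that the performance impact stays low as long as ROM calls are not in hot loops.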
Current State:
I've modified unmap-basic.S so that it sets BP to $16.
I thought about changing __basic_zp_start from $0002 to $1602, but I decided against it. imag-regs.ld uses this address as the base for defining the addresses of the imaginary registers. As these registers should be in the base page and we still want to use BP addressing for them, it would make no sense to use 16-bit addresses, IMHO.
Unfortunately, I've discovered that printf() and malloc() do not work after this change, so there are side effects. The reason for this still needs to be investigated.