x86 - How do a kernel's internal function calls resolve after paging is enabled? - Stack Overflow

Date: 2025-01-06 admin Industry

I was recently learning about kernel development and came across the concept of higher-half kernels. Until now I thought the entire kernel had to be identity-mapped (1:1) once paging is enabled, but it seems that's not the case, which brings me to my question. After linking, all the function calls and jumps are just hardcoded addresses (I think), so how does the kernel resolve those addresses if they point to physical memory? Or are the addresses virtual after linking? If so, how does the linker script figure out what the virtual address of a function will be? And how can the kernel even be loaded if those virtual addresses (which get filled in during linking) are different from the physical addresses?

asked 23 hours ago by Hououin_kyouma


1 Answer

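Direct x86 calls/jumps use relative addressing (e.g. jmp rel32 does RIP += sign_extend(rel32)), so they are position-independent: they keep working wherever the code runs, as long as the branch and its target stay the same distance apart. Only absolute addresses, like data and function pointers, would need to be fixed up when the kernel is relocated, if you want them to work at both virtual addresses (the one the kernel starts running at and the one it is remapped to).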
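To make the relative-addressing point concrete, here is a small sketch of the arithmetic in Python. The link address, load address, and offsets are made-up values for illustration, and the 5-byte length assumes a direct call rel32 (opcode E8). The same encoded displacement reaches the function whether the code runs at the address it was linked for or at the address it was loaded at.

```python
MASK64 = (1 << 64) - 1

def sign_extend32(value):
    """Interpret the low 32 bits of value as a signed two's-complement integer."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value

def encode_rel32(insn_addr, target_addr, insn_len=5):
    """Displacement stored in a direct `call rel32` (assumed 5 bytes long)."""
    return (target_addr - (insn_addr + insn_len)) & 0xFFFFFFFF

def branch_target(insn_addr, rel32, insn_len=5):
    """Where the CPU goes: address of the next instruction plus the displacement."""
    return (insn_addr + insn_len + sign_extend32(rel32)) & MASK64

# Hypothetical layout: linked for the higher half, loaded at 1 MiB physical.
LINK_BASE = 0xFFFFFFFF80000000
LOAD_BASE = 0x0000000000100000
CALL_OFFSET, FUNC_OFFSET = 0x40, 0x1230  # offsets inside the kernel image

# The linker computes the displacement once, using link-time addresses.
rel32 = encode_rel32(LINK_BASE + CALL_OFFSET, LINK_BASE + FUNC_OFFSET)

# Same encoded bytes executed at two different addresses.
target_high = branch_target(LINK_BASE + CALL_OFFSET, rel32)
target_low = branch_target(LOAD_BASE + CALL_OFFSET, rel32)

# An absolute pointer, by contrast, always holds the link-time address.
absolute_ptr = LINK_BASE + FUNC_OFFSET
```

Here target_low lands on the function inside the image as loaded at 1 MiB, while absolute_ptr still refers to the higher-half address, which is only usable once that address is mapped.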

If your bootloader jumps to your kernel entry point at a virtual address that isn't what you want, you can map the desired virtual addresses to the same physical pages you're currently executing from, then jump there. (It's fine to have multiple virtual mappings reference the same physical page; x86 CPU caches are required to handle that without corrupting anything.)
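As a rough model of that double mapping, the sketch below uses a Python dict in place of real page tables and assumes x86-64 4-level paging with 4 KiB pages. The load address, link address, and image size are hypothetical. It maps the kernel's physical pages twice, once 1:1 and once at a higher-half alias, and shows both virtual addresses translating to the same physical byte.

```python
PAGE_SIZE = 4096

def table_indices(vaddr):
    """Split a 48-bit x86-64 virtual address into (PML4, PDPT, PD, PT) indices."""
    return tuple((vaddr >> shift) & 0x1FF for shift in (39, 30, 21, 12))

def map_range(page_table, virt_base, phys_base, size):
    """Record virt -> phys mappings, one entry per 4 KiB page."""
    for offset in range(0, size, PAGE_SIZE):
        page_table[virt_base + offset] = phys_base + offset

def translate(page_table, vaddr):
    """Look up the physical address for vaddr, roughly as the MMU would."""
    page = vaddr & ~(PAGE_SIZE - 1)
    return page_table[page] + (vaddr & (PAGE_SIZE - 1))

KERNEL_PHYS = 0x100000            # hypothetical physical load address
KERNEL_VIRT = 0xFFFFFFFF80000000  # hypothetical higher-half link address
KERNEL_SIZE = 0x4000              # hypothetical image size (4 pages)

page_table = {}
map_range(page_table, KERNEL_PHYS, KERNEL_PHYS, KERNEL_SIZE)  # identity mapping
map_range(page_table, KERNEL_VIRT, KERNEL_PHYS, KERNEL_SIZE)  # higher-half alias

low_view = translate(page_table, KERNEL_PHYS + 0x1234)
high_view = translate(page_table, KERNEL_VIRT + 0x1234)
```

Both low_view and high_view come out as the same physical address, so the early code can keep executing through the identity mapping until it jumps to the higher-half alias. table_indices shows which slot of each table level a given virtual address uses.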

If your whole kernel isn't position-independent, a sensible design would be to have some code in the kernel entry point which sets up mappings before the main part of the kernel runs at all. This special part of the kernel could be hand-written in asm, or compiled as position-independent.

Since the rest of the kernel will only ever run from one virtual address, you can just tell the linker to link your kernel at that address, e.g. 1 or 2 GiB below the top of the virtual address space. (That's what gcc -mcmodel=kernel is for, like non-PIE Linux kernels used to use: with the kernel in the top 2 GiB, absolute addresses can be used as sign-extended 32-bit immediates or displacements for stuff like mov eax, [array + rdi*4].)
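The reason the top 2 GiB is special can be checked with a few lines of arithmetic. This sketch only tests whether a 64-bit address equals the sign extension of its own low 32 bits, which is the condition for using it as a 32-bit immediate or displacement. The example addresses are illustrative, not taken from any particular kernel.

```python
MASK64 = (1 << 64) - 1

def fits_simm32(addr):
    """True if the 64-bit address equals the sign extension of its low 32 bits."""
    low = addr & 0xFFFFFFFF
    if low & 0x80000000:
        extended = low | 0xFFFFFFFF00000000  # upper 32 bits become all ones
    else:
        extended = low                       # upper 32 bits become all zeros
    return extended == (addr & MASK64)

top_2gib = fits_simm32(0xFFFFFFFF80000000)          # start of the top 2 GiB
canonical_high = fits_simm32(0xFFFF800000000000)    # start of the upper canonical half
just_above_2gib = fits_simm32(0x0000000080000000)   # first address past the low 2 GiB
```

Only the first of these is True: an address at the start of the whole upper canonical half, or just past the low 2 GiB, cannot be encoded in 32 bits, which is why a kernel built this way is linked within 2 GiB of the top of the address space.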

You'd need some mechanism for the early-boot part of the kernel to tell the main kernel which physical memory it used for the page table, and which physical pages are holding the kernel's code+data+stack.
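One possible shape for that hand-off is a small fixed-layout record that the early-boot code fills in and the main kernel reads. The sketch below models it with Python's struct module. The field names, their order, and the example values are all hypothetical. In a real kernel this would be a C struct or a few registers.

```python
import struct

# Hypothetical layout: four little-endian unsigned 64-bit fields.
BOOT_INFO_FORMAT = "<QQQQ"
BOOT_INFO_FIELDS = (
    "page_table_root_phys",  # physical address of the top-level page table
    "kernel_phys_start",     # first physical byte of kernel code+data
    "kernel_phys_end",       # one past the last physical byte
    "stack_phys_top",        # physical address of the top of the boot stack
)

def pack_boot_info(**fields):
    """Serialize the record as the early-boot code would leave it in memory."""
    values = (fields[name] for name in BOOT_INFO_FIELDS)
    return struct.pack(BOOT_INFO_FORMAT, *values)

def unpack_boot_info(raw):
    """Parse the record as the main kernel would read it."""
    return dict(zip(BOOT_INFO_FIELDS, struct.unpack(BOOT_INFO_FORMAT, raw)))

raw = pack_boot_info(
    page_table_root_phys=0x1000,
    kernel_phys_start=0x100000,
    kernel_phys_end=0x104000,
    stack_phys_top=0x108000,
)
info = unpack_boot_info(raw)
```

With this information the main kernel knows which physical pages are already in use, so its physical memory allocator can avoid handing them out again.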