Running multiple operating systems on an IA32
Plex86 research document
Kevin Lawton

* Summary: Virtualizing an Intel chip is pure pain

* Expanded Summary: This paper describes how to virtualize an Intel chip. The major complications are:

1) Unprivileged user-mode instructions that read system registers;
2) Instructions that have different effects depending on processor mode.

The machine emulator must trap every attempt by guest code (operating system or application) to execute these instructions. The basic approach is to place software traps before privileged instructions, which lets the monitor regain control and emulate the effect of the instruction.

The implementation is tricky, to say the least. The big problem is self-modifying code, which rules out a one-time scan of the code pages. Another problem is that guest code can read the modified code page, which would reveal the virtualization.

The basic approach is as follows. Initially, all pages are marked invalid. Upon a page fault, scan the page and insert traps before: a) bad instructions; b) branch instructions (more on this below). Then mark the page read-only. Any write attempt invalidates the scan for that page.

As an optimization, we can omit pre-branch traps if the branch target has already been scanned. This requires maintaining state about the scan status of each instruction (or at least of every basic block). A few cases arise:

1) The branch target is dynamic: not much we can do here, so trap.
2) The branch target is static and in the current page: this is easy; the branch can safely execute without trapping.
3) The branch target is static and on a different page: if the target page has been scanned, the branch can safely pass through. But if the target page is ever modified, we need to invalidate every branch that jumps into that page. Ouch.

Another concern is that modified code can examine itself and learn about the inserted software traps. Any program that assumes a precise instruction sequence would be affected. Two possible solutions:

1) Create a private copy of every modified page; execution occurs in the private copy, while reads are served from the user copy. Intel's segmentation functions can make this happen.
2) Exploit the separation of the data and instruction TLBs to emulate execute-only code. A much better solution, but not available on certain Intel clones.
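
The scan-and-trap scheme and the private-copy idea can be sketched in a few lines of Python. This is a toy model under loud assumptions, not Plex86's implementation: the four-opcode instruction set, the page size counted in instructions, and every name here (`Monitor`, `Insn`, `fetch`, `read`, `on_write`) are hypothetical and exist only for illustration. Real IA32 decoding and page-table handling are not modeled. As a further simplification, a write to a page invalidates the whole scan of each page that branches into it, rather than re-trapping individual branches.

```python
from dataclasses import dataclass
from typing import Optional

PAGE_SIZE = 4   # toy page size, counted in instructions (assumption)
TRAP = "trap"   # marker standing in for an inserted software trap

@dataclass(frozen=True)
class Insn:
    op: str                        # "nop", "sensitive", "jmp", or "jmp_indirect"
    target: Optional[int] = None   # static branch target address, if known

class Monitor:
    def __init__(self, guest_code):
        self.guest = list(guest_code)  # the copy guest reads see; never patched
        self.scanned = set()           # pages that are scanned and read-only
        self.traps = {}                # page -> addresses trapped in the private copy
        self.inbound = {}              # target page -> pages with untrapped branches into it

    def on_page_fault(self, page):
        """Scan an invalid page and decide where the traps go."""
        traps = set()
        start = page * PAGE_SIZE
        for addr in range(start, min(start + PAGE_SIZE, len(self.guest))):
            insn = self.guest[addr]
            if insn.op in ("sensitive", "jmp_indirect"):
                traps.add(addr)            # bad instruction, or case 1: dynamic target
            elif insn.op == "jmp":
                tpage = insn.target // PAGE_SIZE
                if tpage == page:
                    continue               # case 2: static target, same page
                if tpage in self.scanned:  # case 3: static target, scanned page
                    self.inbound.setdefault(tpage, set()).add(page)
                else:
                    traps.add(addr)        # target not scanned yet: keep the trap
        self.traps[page] = traps
        self.scanned.add(page)

    def on_write(self, addr, insn):
        """A guest write lands in the guest copy and invalidates the scan."""
        self.guest[addr] = insn
        self._invalidate(addr // PAGE_SIZE)

    def _invalidate(self, page):
        if page not in self.scanned:
            return
        self.scanned.discard(page)
        self.traps.pop(page, None)
        # Pages with untrapped branches into this page must be rescanned too.
        for src in self.inbound.pop(page, set()):
            self._invalidate(src)

    def fetch(self, addr):
        """Execution path: uses the private, trap-carrying view of the page."""
        page = addr // PAGE_SIZE
        if page not in self.scanned:
            self.on_page_fault(page)
        return TRAP if addr in self.traps[page] else self.guest[addr].op

    def read(self, addr):
        """Data read by the guest: always sees the original instruction."""
        return self.guest[addr].op

# Two toy pages: addresses 0-3 and 4-7.
program = [
    Insn("nop"), Insn("sensitive"), Insn("jmp", 0), Insn("jmp", 4),
    Insn("nop"), Insn("jmp_indirect"), Insn("jmp", 1), Insn("nop"),
]
m = Monitor(program)
m.fetch(0)                        # faults in page 0; traps land on addresses 1 and 3
m.fetch(4)                        # faults in page 1; the jump at 6 into page 0 is left untrapped
print(m.fetch(1), m.read(1))      # trap sensitive
m.on_write(0, Insn("nop"))        # a write to page 0 invalidates page 0 and page 1
print(m.fetch(6))                 # trap
```

Keeping the trap set apart from `guest` is what stands in for the private copy here: execution goes through `fetch`, reads go through `read`, so the guest never observes an inserted trap. The `inbound` map is the bookkeeping that case 3 calls for, and it is where the cost of cross-page branches shows up.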