NetBSD i386 PAE patches

This page documents some patches I've thrown together to support PAE on NetBSD i386. The summary is "It works, more or less", and the patches are:

http://netbsd.org/~jmorse/pae/corebits.diff
  pmap and locore changes to support PAE addressing, as well as a swathe of format string fixes to cope with paddr_t changing size

http://netbsd.org/~jmorse/pae/nx.diff
  Enables the i386 NX bit, preventing data pages from being executed

http://netbsd.org/~jmorse/pae/pckb_wedge.diff
  pckb fix, removes an infinite loop in the pckbport driver

All patches are against NetBSD-5.

Core patch

The bulk of these alterations are locore/pmap changes. Following how XEN+PAE works, third-level PDPs are constantly mapped to page directories and are essentially ignored except during pmap creation/destruction. No locking changes were required. However, this adds quite a bit to the #ifdef soup that is x86 pmap.c, and as there are multiple combinations of {i386,amd64}/xen/pae that use the x86 pmap, these changes will need reviewing to ensure they don't break anything.

A large number of printfs/format strings make the (not unreasonable) assumption that a paddr_t is always going to be a long, and cause warnings when it isn't. I've solved this by always specifying %jx and casting paddr_ts up to uintmax_t; PRIxPADDR and PRIxVADDR don't appear to be available in NetBSD-5.

UVM makes similar assumptions, treating paddr_ts as vaddr_ts on occasion (ie ptoa(x)) and using vsize_t instead of psize_t to count segment sizes in uvm_page_init. Aside from these mishaps though, almost all of UVM coped perfectly well with managing PAE memory, which was pleasantly surprising.

bus_dma(9) limits PCI DMA allocations to < 4GB when PAE is turned on; changing this to allow drivers to use > 4GB addresses led to all sorts of weird behavior. No doubt this is because the sizes of bus_addr_t and bus_size_t have not changed (something I've not attempted, and which would probably lead to trauma anyway).
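To illustrate why an unchanged bus address width is a plausible culprit, here is a minimal user-space sketch of what happens when a physical address above 4GB is stored in a 32-bit type. The type and function names (pae_paddr_t, narrow_bus_addr_t, to_bus_addr, fits_in_bus_addr, format_paddr) are hypothetical stand-ins invented for this example, not the kernel's definitions, and this is not code from the patches. The formatting helper also shows the %jx plus uintmax_t cast convention mentioned above.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical stand-ins, not the kernel's definitions: a 64-bit
 * physical address (as paddr_t becomes under PAE) and a bus address
 * type that is still 32 bits wide.
 */
typedef uint64_t pae_paddr_t;
typedef uint32_t narrow_bus_addr_t;

/*
 * Narrow a physical address to the 32-bit bus type.  Bits 32 and up
 * are silently discarded by the conversion.
 */
static narrow_bus_addr_t
to_bus_addr(pae_paddr_t pa)
{
	return (narrow_bus_addr_t)pa;
}

/* Returns 1 if the address survives the narrowing unchanged, else 0. */
static int
fits_in_bus_addr(pae_paddr_t pa)
{
	return (pae_paddr_t)to_bus_addr(pa) == pa;
}

/*
 * Format an address with %jx and a cast to uintmax_t, so the format
 * string does not depend on the width of the address type.
 * Returns the number of characters snprintf would have written.
 */
static int
format_paddr(char *buf, size_t len, pae_paddr_t pa)
{
	assert(buf != NULL && len > 0);
	return snprintf(buf, len, "0x%jx", (uintmax_t)pa);
}
```

With these definitions, to_bus_addr(0x123456000) yields 0x23456000: the address now points at an unrelated page below 4GB, which is the kind of silent corruption that could explain the weird behavior. A check like fits_in_bus_addr() is one way a caller could refuse such addresses instead of truncating them.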
For now, bus_dma continues to allocate below 4GB.

One final note: approximately 116MB of memory was allocated as vm_pages to manage my 6GB of memory (see below). PAE's limit is 64GB, which at the same ratio (116MB × 64/6) would require about 1.2GB of vm_pages. Clearly anyone wishing to use PAE needs to consider how much kernel VMA it is going to consume.

NX bit

NX is already supported and enabled on AMD64, but requires PAE to operate on i386. Enabling it was as simple as setting an MSR, plus a clause in trap.c to tell uvm_fault when execute access is desired.

Slightly troublesome were the cpu_feature flags. I'm not familiar with how these are organised on x86, and the existing implementation doesn't correctly detect NX on my processor. I fixed this by adding a cpu_feature3 set of flags, which corresponds to ci_feature3_flags, and testing that for the NX feature bit. However, this almost certainly breaks AMD64.

One gotcha was discovered: when new processors hatch and enable paging, pages marked NX (ie, the stack) are interpreted by the MMU as having a reserved bit set, so the processor faults. This was solved by enabling the NX MSR bit before paging is enabled in mptramp.S.

Happily the NX bit works just fine, and can be tested by running a test program, see http://netbsd.org/~jmorse/pae/nx_test.c , which tries to execute a trap instruction in the data segment. On a non-NX system, such a program will execute the trap instruction and receive SIGTRAP. On an NX-enabled system, it will fault when executing data and receive SIGSEGV.

Casualties

For some bizarre reason the pckb controller stops working when the console is initialized, and the driver enters an infinite loop (see the patch above to fix this).

X fails to start on my laptop (i915 chipset) with an out-of-memory error. I haven't put much time into fixing this, although it could be the problem described here: http://mail-index.netbsd.org/port-xen/2009/12/05/msg005575.html

Enabling UVM_PAGE_TRKOWN leads to an assertion failing in ffs_alloc, for reasons unknown. I don't know whether this is due to PAE, or something else (ie, WAPBL).

libkvm: I've no experience with this, but I imagine it needs modifying for PAE addressing.

Testing

I happen to have a quad-core PC with 6GB of physical memory and an i386 install. Twiddling a BIOS option gives me two memory configurations: the first with 3.1GB of memory reported (up to the PCI address hole), the second with 2GB reported between addresses 0GB -> 2GB and 4GB between addresses 4GB -> 8GB.

Once the system was stable I checked that the machine could happily use high memory, first by running six instances of ``build.sh release''. Unfortunately this failed to use much over 1.5GB of memory. Instead I used a few test programs to allocate a total of 6.3GB and repeatedly fill it with zeros. That confirmed high memory was being used; it also invoked the pager, which worked fine.

My main PC has been running a PAE kernel for a week now with X/gnome and many kernel builds, with no ill effects.
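
The allocate-and-zero test programs mentioned under Testing are not reproduced on this page, so the following is only a rough sketch of what such a program might look like. The function names (zero_chunk, fill_with_zeros) and the chunked allocation strategy are assumptions made for this example. Since a single i386 process cannot map more than its 32-bit address space, several instances would have to run at once to reach a total above 4GB.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * Zero a buffer byte by byte through a volatile pointer, so the
 * compiler cannot discard the stores even though the buffer is
 * freed afterwards without being read.
 */
static void
zero_chunk(volatile unsigned char *p, size_t len)
{
	size_t i;

	for (i = 0; i < len; i++)
		p[i] = 0;
}

/*
 * Allocate nchunks buffers of chunk_bytes each, zero all of them
 * `passes` times, then free them.  Returns the total number of bytes
 * written, or 0 if any allocation failed (or if passes is 0).
 */
static unsigned long long
fill_with_zeros(size_t nchunks, size_t chunk_bytes, unsigned passes)
{
	unsigned char **chunks;
	unsigned long long written = 0;
	size_t i, allocated = 0;
	unsigned p;

	assert(nchunks > 0 && chunk_bytes > 0);

	chunks = calloc(nchunks, sizeof(*chunks));
	if (chunks == NULL)
		return 0;

	/* Grab all the memory first so every chunk is live at once. */
	for (i = 0; i < nchunks; i++) {
		chunks[i] = malloc(chunk_bytes);
		if (chunks[i] == NULL)
			break;
		allocated++;
	}

	/* Only run the fill passes if every allocation succeeded. */
	if (allocated == nchunks) {
		for (p = 0; p < passes; p++) {
			for (i = 0; i < nchunks; i++) {
				zero_chunk(chunks[i], chunk_bytes);
				written += chunk_bytes;
			}
		}
	}

	for (i = 0; i < allocated; i++)
		free(chunks[i]);
	free(chunks);
	return written;
}
```

A wrapper main() could call, say, fill_with_zeros(1024, 1024 * 1024, 10) to hold 1GB and sweep it ten times. Repeated passes matter because the first pass only forces the pages to be allocated, while later passes touch pages that may since have been paged out.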