NetBSD Documentation: Why implement traditional vfork()


Introduction

vfork() is designed for the specific case where the child will exec() another program and the parent can block until that happens. A traditional fork() had to duplicate every page of the parent process in the child, a significant overhead.
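
The usage pattern described above can be sketched in C roughly as follows. This is a minimal illustration, not code from the NetBSD tree: spawn_and_wait() is a hypothetical helper name, and the exit code 127 for a failed exec is only a shell-style convention adopted here.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * spawn_and_wait() is a hypothetical helper (not a NetBSD API).
 * It runs argv[0] via vfork() + execvp() and returns the child's
 * exit status, or -1 if the child could not be created or waited for.
 */
int
spawn_and_wait(char *const argv[])
{
	int status;
	pid_t pid = vfork();

	if (pid == -1)
		return -1;		/* vfork() itself failed */

	if (pid == 0) {
		/*
		 * Child: it may be running in the parent's address space,
		 * so it does nothing except exec or _exit.
		 */
		execvp(argv[0], argv);
		_exit(127);		/* exec failed (assumed convention) */
	}

	/* Parent: resumes only once the child has exec'd or exited. */
	if (waitpid(pid, &status, 0) == -1)
		return -1;
	return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Called with an argument vector such as { "/bin/sh", "-c", "exit 3", NULL }, the helper should return 3. The child deliberately touches nothing but execvp() and _exit(), since anything it changed could be visible to the blocked parent.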

The Mach VM system added copy-on-write (COW), which made fork() much cheaper, and in 4.4BSD vfork() was made synonymous with fork(). After NetBSD 1.3, a traditional vfork() was reimplemented.

A good amount of effort was directed at making COW better in UVM, but an address-space-sharing vfork() still turns out to be a win. It shaves several seconds off a build of libc on a 200 MHz Pentium Pro.

vfork()/exec() using the 4.4BSD vfork() and COW

  • Traverse parent's vm_map, marking the writable portions of the address space COW. This means invoking the pmap, modifying PTEs, and flushing the TLB.

  • Create a vm_map for the child, copy the parent's vm_map entries into the child's vm_map. Optionally, invoke the pmap to copy PTEs from the parent's page tables into the child's page tables.

  • Block parent.

  • Child runs. If PTEs were not copied, take a page fault to get a physical mapping for the text page at the current program counter.

  • Child execs, unmaps the entire address space that was just created, and creates a new one. This implies that the parent's vm_map has to be traversed again to mark the COW portions not-COW.

  • Unblock parent.

  • Parent runs, takes page fault when modifying previously R/W data that was marked R/O for COW. No data is copied at this time.
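
From user space, all of this bookkeeping is visible only as fork()'s guarantee that a write in the child never reaches the parent. The short sketch below illustrates that guarantee. It is an assumption-laden example rather than kernel code: parent_value_after_child_write() is a hypothetical name, and the program cannot tell whether the kernel copied the page lazily (COW) or eagerly.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static int value = 1;	/* lives in a writable data page */

/*
 * Hypothetical helper: fork, let the child overwrite `value`, then
 * return the value the parent sees afterwards, or -1 on error.
 */
int
parent_value_after_child_write(void)
{
	int status;
	pid_t pid = fork();

	if (pid == -1)
		return -1;

	if (pid == 0) {
		/* Child: under COW this write faults and gets a private copy. */
		value = 2;
		_exit(value == 2 ? 0 : 1);
	}

	if (waitpid(pid, &status, 0) == -1)
		return -1;
	if (!WIFEXITED(status) || WEXITSTATUS(status) != 0)
		return -1;

	/* Parent: its own copy of the page was never modified. */
	return value;
}
```

The function returns 1: the child's store went to its own copy of the page. Providing that isolation is exactly the work that the steps above spend on marking, faulting, and unmarking pages.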

The 3.0BSD/NetBSD vfork(), using address space sharing

  • Take reference to parent's vmspace structure.

  • Block parent.

  • Child runs. No page faults occur because the parent's page tables are being used, and the PTEs are already valid.

  • Child execs, drops the reference it had to the parent's vmspace structure, and creates a new vmspace of its own.

  • Unblock parent.

  • Parent runs. (No page faults occur because the parent's vm_map was not modified.)

So, when you're going to fork and then exec, the address-space-sharing vfork() is clearly faster. Even if your COW algorithms are good, the COW path still has to do a lot more work than the vmspace-sharing path!

