DragonFly kernel List (threaded) for 2007-01
From: Matthew Dillon <email@example.com>
Subject: Re: VKernel progress update - 9 Jan 2006
Date: Thu, 11 Jan 2007 17:40:44 -0800 (PST)
:On Thu, Jan 11, 2007 at 04:53:52PM -0800, Bill Huey wrote:
:> This is more recent.
:Jeff Dike's modifications for address space handling in the host kernel.
This is more interesting. It looks like SKAS is a lot closer to
what I have been doing, though it is still unclear to me what they
are using to manage the multiple emulated user VM spaces. In DragonFly
I built direct VM space support into the (real) kernel and the virtual
kernel can simply manipulate them however it wishes, and just tells
the real kernel to switch into one.
The original UML used one real-kernel process for each UML emulated
process and had some fairly serious security issues related to the
visibility of the UML kernel to said processes.
The new SKAS stuff is using two primary processes under the real kernel,
one running the UML kernel, and one running whichever emulated user
process the UML kernel wants to run.
In DragonFly there is just one user process and N VM spaces and the
virtual kernel simply tells the real kernel which VM space to run in
that process. Some of the Linux slides and comments imply that there
is less overhead splitting that into two separate processes, but I
don't see how that can be the case since VM space swapping is exactly
the same as context switching. Signals within the virtual kernel
and emulated virtual user process are entirely handled by the virtual
kernel. Any real-kernel signal simply causes the real kernel to swap
the virtual kernel's VM space back in and then delivers the signal
Dealing with VM/CPU contexts is nasty as hell, no matter how you twist
it. Switching VM spaces requires reloading the MMU page directory
(%cr3), swapping out the register frame, swapping out the TLS (three
descriptors in the CPU's GDT), and the LDT. If floating point
is involved we can at least lazy swap the FP registers, but it is still
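To make the list of per-switch state concrete, here is a toy model in C. It is only a sketch: every structure, field and function name is hypothetical and none of it is DragonFly kernel code. The page directory, register frame, TLS descriptors and LDT are copied on every switch, while the FP area is copied only when the incoming context actually touches floating point.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Toy model: every name here is hypothetical, not a DragonFly structure. */
struct vm_context {
    uint32_t cr3;        /* MMU page directory base */
    uint32_t regs[8];    /* stand-in for the register frame */
    uint64_t tls[3];     /* three TLS descriptors from the GDT */
    uint32_t ldt;        /* handle for the LDT */
    uint8_t  fpu[512];   /* saved FP state */
};

struct toy_cpu {
    struct vm_context live;        /* state currently loaded on the "cpu" */
    struct vm_context *current;    /* context that owns the eager state */
    struct vm_context *fpu_owner;  /* context whose FP state is loaded */
    bool fpu_trap;                 /* true: next FP use must swap FP state */
    unsigned fp_swaps;             /* number of FP save/restore cycles */
};

/* Copy everything except the FP area. */
static void copy_eager(struct vm_context *dst, const struct vm_context *src)
{
    dst->cr3 = src->cr3;
    memcpy(dst->regs, src->regs, sizeof dst->regs);
    memcpy(dst->tls, src->tls, sizeof dst->tls);
    dst->ldt = src->ldt;
}

/* Eager part of the switch: always paid, FP state is left alone. */
void toy_switch(struct toy_cpu *cpu, struct vm_context *next)
{
    assert(next != NULL);
    if (cpu->current != NULL)
        copy_eager(cpu->current, &cpu->live);   /* save outgoing state */
    copy_eager(&cpu->live, next);               /* load incoming state */
    cpu->current = next;
    cpu->fpu_trap = (cpu->fpu_owner != next);   /* arm the lazy FP swap */
}

/* Lazy part: called when the current context first uses floating point. */
void toy_fp_use(struct toy_cpu *cpu)
{
    assert(cpu->current != NULL);
    if (!cpu->fpu_trap)
        return;                                 /* FP state is already ours */
    if (cpu->fpu_owner != NULL)
        memcpy(cpu->fpu_owner->fpu, cpu->live.fpu, sizeof cpu->live.fpu);
    memcpy(cpu->live.fpu, cpu->current->fpu, sizeof cpu->live.fpu);
    cpu->fpu_owner = cpu->current;
    cpu->fpu_trap = false;
    cpu->fp_swaps++;
}
```

In this model, switching A -> B -> A costs two eager copies but no FP copy as long as B never calls toy_fp_use(), which is the saving that lazy FP swapping buys.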
The memory mapping is another big issue. The real kernel is managing
the VM spaces which means that it is also managing the PMAPs. On BSD
systems (including ours), PMAPs are always throw-away entities so the
real kernel overhead is almost nonexistent. Since the DragonFly
virtual kernel manipulates the entire emulated user VM space with a
single real-kernel mmap() using MAP_VPAGETABLE to slave the
emulated user VM space to the virtual kernel's 'memory', the vm_map
overhead in the real kernel is almost nonexistent.
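The idea of slaving an emulated user address space to the virtual kernel's own memory can be illustrated with a toy software page table. This is a sketch under loud assumptions: the single-level table, the entry layout and all names are invented for illustration and are not the MAP_VPAGETABLE format or any DragonFly API. What it shows is that once the mapping exists, changing what the emulated process sees is just a store into a table the virtual kernel owns.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_PAGE_SIZE 4096u
#define TOY_NPAGES    16u
#define TOY_PTE_VALID 0x1u

/*
 * Toy model (hypothetical layout): the virtual kernel owns one block of
 * 'memory' plus a page table describing the emulated user address space.
 * Each entry is (byte offset of a page in memory) | flags.
 */
struct toy_vkernel {
    uint8_t  memory[TOY_NPAGES * TOY_PAGE_SIZE];
    uint32_t pagetable[TOY_NPAGES];   /* indexed by user virtual page */
};

/* Map user virtual page vpage onto page mpage of the vkernel's memory. */
int toy_map(struct toy_vkernel *vk, uint32_t vpage, uint32_t mpage)
{
    assert(vk != NULL);
    if (vpage >= TOY_NPAGES || mpage >= TOY_NPAGES)
        return -1;
    vk->pagetable[vpage] = (mpage * TOY_PAGE_SIZE) | TOY_PTE_VALID;
    return 0;
}

/* Resolve an emulated user address; NULL stands in for a page fault. */
uint8_t *toy_translate(struct toy_vkernel *vk, uint32_t uaddr)
{
    uint32_t vpage = uaddr / TOY_PAGE_SIZE;
    uint32_t pte;

    assert(vk != NULL);
    if (vpage >= TOY_NPAGES)
        return NULL;
    pte = vk->pagetable[vpage];
    if ((pte & TOY_PTE_VALID) == 0)
        return NULL;
    return &vk->memory[(pte & ~(TOY_PAGE_SIZE - 1)) + uaddr % TOY_PAGE_SIZE];
}
```

For example, after toy_map(&vk, 2, 5) the emulated address 0x2010 resolves to byte 0x10 of page 5 in the virtual kernel's memory, and nothing outside the table had to be touched to get there.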
Oooh, my buildworld finished!
vkernel# /usr/bin/time make -j 8 buildworld
3748.21 real 0.00 user 3337.10 sys
2071.23 real 2621.60 user 331.64 sys
(the user time is zero in the virtual kernel only because I haven't
fixed up the real/user/sys context test in the virtual kernel build
That isn't bad considering that all the I/O is synchronous and I haven't
made any effort to optimize anything yet.
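For a rough sense of scale, the two timing lines can be turned into ratios. This is a sketch and rests on an assumption: that the second line of output is the same buildworld run outside the virtual kernel, which the output as shown does not label. Since the virtual kernel's user/sys split is not yet meaningful, the CPU comparison uses user+sys. The struct and function names are made up for the example.

```c
#include <assert.h>

/* One line of /usr/bin/time output, in seconds. */
struct time_sample {
    double real, user, sys;
};

/* Wall-clock ratio of run a to run b. */
double wall_ratio(struct time_sample a, struct time_sample b)
{
    assert(b.real > 0.0);
    return a.real / b.real;
}

/* CPU-time ratio of run a to run b, comparing user+sys totals. */
double cpu_ratio(struct time_sample a, struct time_sample b)
{
    assert(b.user + b.sys > 0.0);
    return (a.user + a.sys) / (b.user + b.sys);
}
```

Fed { 3748.21, 0.00, 3337.10 } for the vkernel run and { 2071.23, 2621.60, 331.64 } for the other run, wall_ratio() comes out at roughly 1.81 and cpu_ratio() at roughly 1.13.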