DragonFly kernel List (threaded) for 2004-02
Re: Diary entry RFS (request for submissions)
: Something good that one of you can add to the diary would be a
: reference to the recently discovered pmap races. This reference
: can point to the recent mailing list thread and would benefit both
: the FreeBSD and DragonFly communities.
: FWIW, Peter Wemm suspected that a similar race existed and made a
: mention of it to me over 2 months ago, but both having been swamped
: with other work at the time had neglected to pursue details. So now
: that it's been defined and exposed, it would be a shame to lose the
:Bosko Milekic * bmilekic@xxxxxxxxxxxxxxxx * bmilekic@xxxxxxxxxxx
What I can do is collect together my correspondence with Tor and
Alan, get their permission to publish the email (that wasn't posted
to a public group), and throw it up on the web site.
Once Alan explained the TLB writeback issue it became glaringly
obvious. David Rhodus (I think) brought up the possibility a few
months ago too, from reading something, but I didn't twig to the fact
that the race was against userland and thought that the MP lock protected
us. It doesn't: the MP lock only serializes kernel code, so it cannot
protect us when the race is against userland.
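
To make the userland race concrete, here is a minimal userland sketch,
assuming the "TLB writeback issue" refers to a cpu setting the dirty bit
in a PTE while the kernel is changing that same PTE. The bit layout and
the names (pte_t, pte_clear_racy, pte_load_clear, pte_write_protect,
pte_demo) are hypothetical and are not taken from the FreeBSD or
DragonFly sources. No kernel lock helps here because the other writer
is the hardware, so the update itself has to be atomic.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical bit layout, for illustration only. */
#define PTE_VALID 0x001u
#define PTE_RW    0x002u
#define PTE_DIRTY 0x040u

typedef _Atomic uint32_t pte_t;

/* Racy: a dirty bit set by another cpu between the load and the
 * store is overwritten and never seen by the caller. */
uint32_t pte_clear_racy(pte_t *pte)
{
    uint32_t old = atomic_load(pte);
    /* window: "hardware" may set PTE_DIRTY here */
    atomic_store(pte, 0);
    return old;
}

/* Read and clear in one atomic step, so a late dirty bit is either
 * in the returned value or the entry is already gone. */
uint32_t pte_load_clear(pte_t *pte)
{
    return atomic_exchange(pte, 0);
}

/* Remove write permission without disturbing bits that are set
 * concurrently; returns the previous value. */
uint32_t pte_write_protect(pte_t *pte)
{
    return atomic_fetch_and(pte, ~PTE_RW);
}

/* Single-threaded walk-through of the two safe operations. */
uint32_t pte_demo(void)
{
    pte_t pte = PTE_VALID | PTE_RW | PTE_DIRTY;

    uint32_t before = pte_write_protect(&pte);
    assert(before & PTE_RW);
    assert((atomic_load(&pte) & PTE_RW) == 0);

    /* The dirty bit survives and is handed back to the caller. */
    return pte_load_clear(&pte);
}
```

Atomic updates only close the lost-update window in this sketch. They
do not by themselves deal with stale TLB entries on other cpus.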
There is also some significant Intel cpu errata related to TLB races
against PTE changes between cpus. For example, when flushing a dirty
page (making it clean again) or when issuing I/O on a page which is
also memory mapped into userland, the kernel must remove write permissions
on the page. When invalidating or in low memory situations the kernel
must remove the page table entry entirely. Normally this would simply
cause an instruction running in the user process to fault, the kernel
would remap the page, and resume the userland process. But it is
possible in certain very rare situations for the cpu to retire the
EFLAGS effects of the faulted instruction BEFORE faulting it, causing
the wrong EFLAGS to be restored when the kernel resumes the process.
The only workable solution is to force the cpus involved in the race
to enter into a known state before modifying the page table entry.
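
The "known state" idea can be sketched in userland, with threads
standing in for cpus and a shared flag standing in for an IPI. This is
only an illustration under those assumptions and is not the DragonFly
IPI code. All names here (cpu_loop, rendezvous, remove_write,
rendezvous_demo, fake_pte) are hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdlib.h>

#define NCPUS     3       /* hypothetical number of other cpus */
#define PTE_VALID 0x1u    /* hypothetical bit layout */
#define PTE_RW    0x2u

static atomic_int  stop_req;    /* stand-in for a pending IPI */
static atomic_int  parked;      /* cpus currently in the known state */
static atomic_int  updating;    /* set while the PTE is being changed */
static atomic_int  violations;  /* accesses seen during an update */
static atomic_int  shutting_down;
static atomic_uint fake_pte = PTE_VALID | PTE_RW;

static void *cpu_loop(void *arg)
{
    (void)arg;
    while (!atomic_load(&shutting_down)) {
        if (atomic_load(&stop_req)) {
            /* "IPI handler": park until the initiator is done. */
            atomic_fetch_add(&parked, 1);
            while (atomic_load(&stop_req))
                sched_yield();
            atomic_fetch_sub(&parked, 1);
            continue;
        }
        /* "Userland" access to the mapping. */
        if (atomic_load(&updating))
            atomic_fetch_add(&violations, 1);
        (void)atomic_load(&fake_pte);
    }
    return NULL;
}

/* Run fn(arg) only while every other cpu is parked. */
static void rendezvous(void (*fn)(void *), void *arg)
{
    atomic_store(&stop_req, 1);
    while (atomic_load(&parked) != NCPUS)
        sched_yield();
    atomic_store(&updating, 1);
    fn(arg);
    atomic_store(&updating, 0);
    atomic_store(&stop_req, 0);
    while (atomic_load(&parked) != 0)   /* wait for everyone to resume */
        sched_yield();
}

static void remove_write(void *arg)
{
    atomic_fetch_and((atomic_uint *)arg, ~PTE_RW);
}

/* Returns the number of accesses that overlapped the update. */
int rendezvous_demo(void)
{
    pthread_t t[NCPUS];

    for (int i = 0; i < NCPUS; i++)
        if (pthread_create(&t[i], NULL, cpu_loop, NULL) != 0)
            abort();
    rendezvous(remove_write, &fake_pte);
    atomic_store(&shutting_down, 1);
    for (int i = 0; i < NCPUS; i++)
        pthread_join(t[i], NULL);
    assert(atomic_load(&fake_pte) == PTE_VALID);
    return atomic_load(&violations);
}
```

Calling rendezvous_demo() should return 0, since no thread touches the
fake PTE while it is being changed. In a real kernel the parked cpus
would presumably also invalidate the affected TLB entry before
resuming, which this sketch does not model.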
Note that the commit made recently to FreeBSD does not solve the problem;
it is only a partial solution. There are a ton of races in the PMAP
code. My current patch set fixes the PMAP code but does not yet deal with
the VM page dirty bit testing code (it will soon; that's actually
easier to fix than the pmap code). Alan and Tor have their work cut out
for them. It was easy to do in DragonFly because we already have an IPI
messaging subsystem to leverage.