DragonFly kernel List (threaded) for 2003-12
On Tue, 9 Dec 2003, Matthew Dillon wrote:
> Trusted BSD is an interesting project but I don't like the mess it makes
> in the FreeBSD-5 kernel, and it does not solve the most basic problem of
> bugs in system calls creating root holes.
> NetBSD or OpenBSD (I forget which) has a system call masking feature which
> is probably more effective.
> I don't dislike the idea of having compartmentalized security, I just
> think it is far safer to have it all in one place... e.g. like a loadable
> 'filter' on the syscall messages (and VFS messages, and DEV messages),
> instead of having to go in and modify individual system calls, filesystems,
> and so forth.
> If the only way to get into the kernel is via a syscall message, and the
> only way to access a filesystem is via a VFS message, and the only way to
> access a device is via a device message, then that is where we code up
> our security mechanisms.
I agree and disagree both in principle and practice :-). Actually, I
think you'll find that if you dig a little deeper, the placement of MAC
Framework entry points is done exactly on the philosophy you describe. In
order to prevent race conditions, you have to perform access control
checks on the actual objects, not the names provided in system calls. We
place our checks at the front ends of various subsystems: i.e., the top
layer of VFS, the top layer of the process signalling pieces, etc. This
is the point where the name has been resolved to the object, and the
correct locks are held to make sure you can perform a consistent check. In
a traditional UNIX kernel, you cannot do this safely at the system call
layer using wrappers, because that involves multiple lookups, which can be
raced (time of check, time of use).
However, if you have a compartmentalized kernel (i.e., microkernel) with
message passing between subsystems, subsystems can perform the checks at
the point where the message enters, which might accomplish both what you
want and what I want architecturally :-). However, that relies on
cleaning up object naming, perhaps in the style of Mach
ports/capabilities, so that the names used in messages are "authoritative"
and safe to control with.
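
As a rough illustration of checks at the point where a message enters a
subsystem, the toy model below uses messages that carry a handle to an
object rather than a string name. It is not the DragonFly messaging API or
Mach; Handle, Message, Subsystem, and the filter are all hypothetical names
invented for this sketch.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Handle:
    # Hypothetical capability: refers to one object and cannot be re-pointed.
    obj_id: int

@dataclass(frozen=True)
class Message:
    op: str
    target: Handle

class Subsystem:
    def __init__(self, objects):
        self.objects = objects      # obj_id -> contents
        self.filters = []           # loadable checks run at message entry

    def load_filter(self, check):
        self.filters.append(check)

    def receive(self, msg):
        # Every request enters here, so this is the single enforcement point.
        if not all(check(msg) for check in self.filters):
            return "EACCES"
        return self.objects[msg.target.obj_id]

def demo():
    vfs = Subsystem({1: "public data", 2: "secret data"})
    # Hypothetical policy: reads only, and never on object 2.
    vfs.load_filter(lambda m: m.op == "read" and m.target.obj_id != 2)
    allowed = vfs.receive(Message("read", Handle(1)))
    denied = vfs.receive(Message("read", Handle(2)))
    return allowed, denied
```

Because the filter and the operation both see the same immutable handle,
there is no second lookup for anyone to race.
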
The problem with system call level wrappers is pretty hard to fix,
however, and sometimes wrappers cause more security vulnerabilities than
they solve. Take, for example, systrace's interception and replacement of
path names. There are actually two race conditions here: first, a dual
copyin, which can be raced by threaded processes and shared memory -- this
is fixed through proper encapsulation of system call arguments. Second, a
semantic race in the implementation of the file system code, which is a
lot harder to solve. Systrace's lookup occurs before the kernel has
resolved the file name from the string passed by the process, so the
lookup actually occurs twice: once in the wrapper for the control, and
once when the actual system call does the work. Niels has explored using
a "look aside buffer" to cache system call arguments to address the first
problem, but I think the "Separation" in DragonFly will solve this much
more cleanly. The second can't be avoided unless the name used for the
test acts more like a capability (or, you combine the checks with the same
locked reference used in the file system code, as in FreeBSD).
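
The first race (the dual copyin) can be sketched with a simulation in which
a bytearray stands in for user memory shared with another thread. This is
an assumption-laden model, not systrace or kernel code: copyin() here is
just a snapshot function, and the path policy and function names are
hypothetical.

```python
def copyin(user_buf):
    # Stand-in for the kernel's copyin(): snapshot user memory.
    return bytes(user_buf)

def path_allowed(path):
    # Hypothetical policy: only paths under /tmp/ are permitted.
    return path.startswith(b"/tmp/")

def dual_copyin_call(user_buf, other_thread):
    if not path_allowed(copyin(user_buf)):  # first copy: used for the check
        return None
    other_thread(user_buf)                  # sibling thread rewrites the buffer
    return copyin(user_buf)                 # second copy: what the call acts on

def single_copyin_call(user_buf, other_thread):
    kpath = copyin(user_buf)                # one copy, made before any decision
    if not path_allowed(kpath):
        return None
    other_thread(user_buf)                  # too late: the kernel copy is fixed
    return kpath

def rewrite(buf):
    # What a racing thread might do to the shared argument buffer.
    buf[:] = b"/etc/passwd"

def demo():
    raced = dual_copyin_call(bytearray(b"/tmp/scratch"), rewrite)
    safe = single_copyin_call(bytearray(b"/tmp/scratch"), rewrite)
    return raced, safe
```

Copying the arguments once closes only this first race. The checked path
string is still resolved to an object later, so the second, semantic race
remains.
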
Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
robert@xxxxxxxxxxxxxxxxx Senior Research Scientist, McAfee Research