DragonFly BSD
DragonFly commits List (threaded) for 2005-08

Re: cvs commit: src/sys/kern vfs_cache.c vfs_syscalls.c vfs_vnops.c vfs_vopops.c src/sys/sys namecache.h stat.h


To: Matthew Dillon <dillon@xxxxxxxxxxxxxxxxxxxx>
From: Raphael Marmier <raphael@xxxxxxxxxxx>
Date: Fri, 26 Aug 2005 01:20:30 +0200

How does this FSMID mechanism compare to the (similar?) one recently introduced in Darwin/MacOSX 10.4? On that OS, an application can ask to receive notification of filesystem changes as they take place. This is necessary for applications like Spotlight, which uses it to keep its index up to date.
On my three-year-old Mac, it's very fast and there doesn't seem to be much overhead involved.
It would be great to have such a capability on DragonFly.


Raphael

Matthew Dillon wrote:
:
:To shift the subject somewhat, would this be at all usable for
:implementing something that works like tripwire?
:
:-Devon

    On a live system, yes, the FSMIDs can tell you that something 'might' have
    been changed (but not that it has been changed for sure... the journal
    data would be needed for that).  The journal itself is an even better
    auditing tool if you can stand the performance hit.  In fact, if one
    has decided to take the plunge and not only turn on journaling, but
    route the journaling data to another box on the lan, then the auditing
    tools could be running on that other far more secure box rather than
    on the box running the potentially compromising software.

    So, e.g. if you see the FSMID for '/' change, then you know that
    something has modified something on any of the filesystem mounts on
    the box (maybe that's not so useful :-)).  If you monitor the FSMID
    for /usr/local then you can detect that changes have been made to
    anything under /usr/local.  And so on and so forth.  Things that are
    supposed to not change, like /bin, /sbin, /usr/bin, /usr/sbin, etc...
    those would be really easy to monitor as coarse trip points.  But to
    then figure out exactly what in those subhierarchies has actually changed
    you would either need to record the FSMID's for all the files and dirs
    in the subhierarchy and compare them, or you would need to be running a
    live journal and index the filepaths recorded in the journal (the whole
    filepath is recorded in the journal transaction... kinda wasteful of
    space, and eventually I'll have a shortcut/compression method to avoid
    it, but that's how the journal works now).


dhcp62# fsmid / /tmp /usr
/       2916
/tmp    2427
/usr    199

dhcp62# echo > /tmp/x
dhcp62# fsmid / /tmp /tmp/x /usr
/       2923
/tmp    2923
/tmp/x  2923
/usr    199

/*
 * Simple program to display the fsmid for specified paths
 */
#include <sys/types.h>
#include <sys/stat.h>
#include <stdio.h>

int
main(int ac, char **av)
{
    struct stat st;
    int i;

    for (i = 1; i < ac; ++i) {
        if (stat(av[i], &st) == 0)
            printf("%s\t%lld\n", av[i], (long long)st.st_fsmid);
        else
            perror(av[i]);      /* report paths that cannot be stat'd */
    }
    return(0);
}
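
    As an illustration of the "record the FSMID's and compare them"
    approach, here is a minimal sketch of the comparison step.  It assumes
    the FSMIDs were already collected at two points in time (for example
    with a program like the one above); the struct fsmid_snap type and the
    report_changed() function are hypothetical names, not part of the
    commit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * One recorded observation: a path and the FSMID seen for it.
 * Hypothetical type, used only for this illustration.
 */
struct fsmid_snap {
    const char *path;
    long long fsmid;
};

/*
 * Compare two snapshots covering the same n paths in the same order and
 * print every path whose FSMID differs.  Returns the number of paths
 * that differ.  A differing FSMID only says that something at or below
 * the path might have changed, not what changed.
 */
int
report_changed(const struct fsmid_snap *before,
               const struct fsmid_snap *after, size_t n)
{
    size_t i;
    int changed = 0;

    assert(before != NULL && after != NULL);
    for (i = 0; i < n; ++i) {
        if (before[i].fsmid != after[i].fsmid) {
            printf("%s\t%lld -> %lld\n", before[i].path,
                   before[i].fsmid, after[i].fsmid);
            ++changed;
        }
    }
    return (changed);
}
```

    Fed the two fsmid runs shown earlier (/ 2916 then 2923, /tmp 2427 then
    2923, /usr 199 both times), this would flag / and /tmp and leave /usr
    alone.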

    The FSMID's are insufficient just after a crash... well, obviously
    with this commit since we aren't recording them in the inode yet :-)  But
    once we do start recording them (and using real transaction id's rather
    than incrementing a number from 0), then after a crash the FSMIDs can
    help us figure out how far back we have to rerun the journal.  Making
    crash recovery fully coherent in the presence of a journal+mirror
    also requires being able to 'rollback' any data that made it to the
    filesystem on-disk, but didn't make it to the journal (due to the
    crash).  That is a far more difficult, but still very solvable problem.

    One solution would be to delay the disk writes related to unacknowledged
    journal records... kind of a bastardized and much simplified version of
    softupdates, in fact.  Then there wouldn't be a problem because the
    journal would always be ahead of the on-disk data.  Since filesystem
    operations are almost universally asynchronous anyway, this is almost
    the case we have now and we would not create any additional performance
    issues above and beyond those involved with writing out a journal in
    the first place.


    There are other solutions too that do not require tight synchronization
    with the journal... I was just talking to David Rhodus about how we could
    mark cylinder groups as being 'active', then have an idle timeout on the
    cylinder group to mark it inactive again.  That could be combined with
    a requirement that the related journal data be acknowledged prior to
    marking the cylinder group inactive.  After a crash, all inodes
    related to active cylinder groups would be resynchronized from the
    journal+mirror to clean out any extraneous data.  The FSMID's would then
    be in sync again.
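
    A rough sketch of the active/inactive bookkeeping for one cylinder
    group might look like the following.  It models only the in-memory
    decision; getting the 'active' mark onto disk before the first write
    is omitted.  All names (struct cg_track, cg_note_write(),
    cg_try_deactivate(), cg_needs_resync()) and the tick-based idle
    timeout are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-cylinder-group tracking state. */
struct cg_track {
    int      active;        /* nonzero while the group is marked active */
    uint64_t last_write;    /* tick of the most recent modification */
    uint64_t last_seq;      /* journal record of that modification */
};

/* Any modification marks the group active and restarts the idle timer. */
void
cg_note_write(struct cg_track *cg, uint64_t now, uint64_t seq)
{
    cg->active = 1;
    cg->last_write = now;
    cg->last_seq = seq;
}

/*
 * Mark the group inactive only if it has been idle for idle_ticks AND
 * the journal data for its last modification has been acknowledged.
 * Returns 1 if the group is inactive on return, 0 if it stays active.
 */
int
cg_try_deactivate(struct cg_track *cg, uint64_t now, uint64_t idle_ticks,
                  uint64_t acked_seq)
{
    assert(now >= cg->last_write);
    if (!cg->active)
        return (1);
    if (now - cg->last_write < idle_ticks)
        return (0);         /* not idle long enough */
    if (cg->last_seq > acked_seq)
        return (0);         /* journal data not yet acknowledged */
    cg->active = 0;
    return (1);
}

/* After a crash, only groups still marked active need resynchronizing. */
int
cg_needs_resync(const struct cg_track *cg)
{
    return (cg->active);
}
```

    The point of the two-part test in cg_try_deactivate() is that a group
    found inactive after a crash is known to be fully covered by
    acknowledged journal data.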

    My preference is to use a block level solution because it would be
    outside of the filesystem code (since the kernel manages filesystem
    buffers and the journal, all the logic could be emplaced in higher kernel
    layers).

-Matt
Matthew Dillon <dillon@xxxxxxxxxxxxx>


