DragonFly commits List (threaded) for 2005-08
Re: cvs commit: src/sys/kern vfs_cache.c vfs_syscalls.c vfs_vnops.c vfs_vopops.c src/sys/sys namecache.h stat.h
:On Thu, Aug 25, 2005 at 11:34:17AM -0700, Matthew Dillon wrote:
:> Implement FSMID. Use one of the spare 64 bit fields in the stat structure
:> for the FSMID. The FSMID is a recursively updated field which allows one
:> to determine whether a subdirectory hierarchy has changed simply by checking
:> the base directory of the desired hierarchy. The new field is st_fsmid.
:Please don't do it. This kind of functionality can be synthesised by
:imon under Linux or kqueue on the BSDs. It is therefore redundant.
:The approach doesn't solve most of the problems and just provides the
:means necessary to detect something changed, it still needs to recurse
:into the directory hierarchy. It's IMO also not reliable since a vnode
:change does not necessarily reach all parent directories with entries
:for this vnode, simply because they might never have been read. It can
:also add a considerable overhead for deeply nested filesystems, which
:shouldn't be done lightly.
I don't know about imon under Linux, but kqueue on the BSDs doesn't
even come *close* to providing the functionality needed, let alone
providing us with a way to monitor changes across distinct invocations.
Using kqueue for that sort of thing is a terrible idea.
I'm not sure I understand what you mean about not reaching all
parent directories. Perhaps you did not read the patch set. It
most certainly DOES reach all parent directories, whether they've been
read or not. That's the whole point. It goes all the way to '/'.
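
As a rough illustration of the propagation described above, here is a toy
user-space model in Python. It is not the kernel code from the commit: the
Node class, the touch() helper and the global counter are hypothetical names
invented for this sketch, and the only behavior taken from the text is that a
modification is stamped on every parent all the way to '/'.

```python
import itertools

# Hypothetical global transaction counter: each modification gets the next id.
_next_id = itertools.count(1)


class Node:
    """Toy stand-in for a namecache entry with a parent link and an FSMID."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.fsmid = 0
        self.children = {}

    def add(self, name):
        """Create a child entry; creating it counts as a modification."""
        child = Node(name, parent=self)
        self.children[name] = child
        touch(child)
        return child


def touch(node):
    """Stamp a new id on the node and on every ancestor up to the root."""
    tid = next(_next_id)
    while node is not None:
        node.fsmid = tid
        node = node.parent


root = Node("/")
usr = root.add("usr")
src = usr.add("src")
home = root.add("home")

before = (root.fsmid, usr.fsmid, home.fsmid)
touch(src)  # modify something deep in /usr/src

# /usr and / now carry the new id; /home is untouched.
print(before, (root.fsmid, usr.fsmid, home.fsmid))
```

In this model the cost of a modification is proportional to the depth of the
modified entry, which is the nesting overhead discussed further down.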
And as far as searching directories goes... the whole point is to
reduce the number of directories that have to be searched to JUST the
portions of the hierarchy containing the modifications. If one is
trying to synchronize a huge filesystem, such as many people now have,
it is extremely important to be able to restrict such synchronization
to just the elements that have changed, and to do so without having to
constantly monitor the entire filesystem.
It's a very good fit, taking the middle ground between a backup
method like tar/dump which must scan the entire filesystem in batch,
and a live journal which requires real time monitoring of all filesystem
operations. FSMID gives you an ability to do tar/dump-like mirror
synchronizations in batch (distinct invocations, without real time
monitoring), but without having to scan the entire directory structure
of a large terabyte filesystem. kqueue can't do that, and I really
doubt that imon could do that either.
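
A minimal sketch of such a batch synchronization pass, again in Python and
again a toy model rather than anything from the patch set. It assumes the
mirror tool saved the FSMID of every directory on its previous run; the
nested-dict layout and the d() and changed_dirs() helpers are hypothetical.
The pass descends only into subtrees whose FSMID differs from the saved one.

```python
def d(fsmid, **children):
    """Build a toy directory record: its FSMID plus named subdirectories."""
    return {"fsmid": fsmid, "children": children}


def changed_dirs(node, old, path="/"):
    """Yield directories whose FSMID differs from the previous run.

    A subtree whose FSMID matches the saved value is skipped entirely.
    Directories that were removed since the last run are not reported here.
    """
    if old is not None and node["fsmid"] == old["fsmid"]:
        return  # nothing below this point has changed
    yield path
    old_children = old["children"] if old is not None else {}
    for name, child in sorted(node["children"].items()):
        child_path = path.rstrip("/") + "/" + name
        yield from changed_dirs(child, old_children.get(name), child_path)


# State saved by the previous run, and the state seen now.
previous = d(3, usr=d(2, src=d(2), bin=d(1)), home=d(3))
current = d(4, usr=d(4, src=d(4), bin=d(1)), home=d(3))

to_scan = list(changed_dirs(current, previous))
print(to_scan)  # ['/', '/usr', '/usr/src']
```

Only three of the five directories are visited; /home and /usr/bin are never
opened because their FSMIDs match the saved values.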
The methodology behind the transaction id assignments can make this
a 100% reliable operation on a *RUNNING*, *LIVE* system. Detecting
in-flight changes is utterly trivial.
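
One plausible way to detect in-flight changes, sketched under assumptions:
sample the FSMID of the base directory before the pass, sample it again
afterwards, and repeat the pass if the two differ. The message does not spell
out the actual method, so sync_until_stable() and the read_fsmid and copy_tree
callbacks are hypothetical names used only for this illustration.

```python
def sync_until_stable(read_fsmid, copy_tree, max_attempts=5):
    """Repeat the copy until the base FSMID is the same before and after.

    Returns the number of passes that were needed.
    """
    for attempt in range(1, max_attempts + 1):
        before = read_fsmid()
        copy_tree()
        if read_fsmid() == before:
            return attempt  # nothing changed while we were copying
    raise RuntimeError("hierarchy kept changing during every pass")


# Simulated filesystem state: a writer modifies the tree during the first pass.
state = {"fsmid": 10, "copies": 0}


def read_fsmid():
    return state["fsmid"]


def copy_tree():
    state["copies"] += 1
    if state["copies"] == 1:
        state["fsmid"] += 1  # concurrent modification during the first pass


attempts = sync_until_stable(read_fsmid, copy_tree)
print(attempts)  # 2
```

A real tool would presumably rescan only the subtrees whose FSMID changed
during the pass rather than repeating the whole copy.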
Nesting overhead is an issue, but not a big one. It's a very solvable
problem and certainly should not hold up an implementation. The only
real issue occurs when someone does a write() vs someone else stat()ing
a directory along the parent path. Again, very solvable and certainly
not a show stopper in any way.
:It should also be kept in mind that persistent storage is almost fully a
:dream, since no current filesystem allows it nor is it really possible
:to correctly implement the behaviour without adding a lot of nasty hacks
:e.g. restores as well.
Not sure what you mean by no filesystem allowing it. It's an almost
trivial matter to add it to UFS. It certainly isn't difficult. It
is certainly entirely possible to correctly implement the desired
behavior.
When UFS was originally developed there was some discussion about
propagating e.g. ctime back to the root of the mount point. It wasn't
done due to concerns about overhead, but I think also because the rest
of the system simply wasn't designed to be able to accommodate the caching
infrastructure required to support that sort of thing. Well, DragonFly's
new namecache infrastructure is *FULLY* capable of supporting that sort
of thing, and it would be a lot easier to implement such a beast in
DragonFly than, say, in FreeBSD.