DragonFly users List (threaded) for 2009-02
From: Matthew Dillon <email@example.com>
Subject: Re: OT - was Hammer or ZFS based backup, encryption
Date: Wed, 25 Feb 2009 09:10:56 -0800 (PST)
Generally speaking the idea with HAMMER's snapshotting and mirroring
is that everything is based on transaction-ids stored in the B-Tree.
The mirroring functionality does not require snapshotting per se,
because EVERY sync HAMMER does to the media (including the automatic
filesystem syncs done by the kernel every 30-60 seconds) is effectively
a snapshot.
There is a downside to the way HAMMER manages its historical data store
and it is unclear how much of a burden this will wind up being without some
specific tests. The downside is that the historical information is stored
in the HAMMER B-Tree side-by-side with current information.
If you make 50,000 modifications to the same offset within a file,
for example, with an fsync() in between each one, and assuming you don't
prune the filesystem, then you will have 50,000 records for that HAMMER
data block in the B-Tree. This can be optimized... HAMMER doesn't have
to scan 50,000 B-Tree elements. It can seek to the last (most current)
one when it traverses the tree. I may not be doing that yet but there is
no data structure limitation that would prevent it. Even with the
optimization there will certainly be some overhead.
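
To make the seek-versus-scan point concrete, here is a rough sketch in Python. It is not HAMMER code: the (offset, create_tid) record layout, the function names, and the sorted list standing in for the B-Tree are all assumptions made for illustration. It compares visiting every version of one block against a binary search that goes straight to the newest version at or before a given transaction id.

```python
import bisect

# Hypothetical model: each record is (offset, create_tid), kept sorted
# the way keys in an ordered index would be. Not HAMMER's real layout.
def build_history(offset, n_writes, first_tid=1):
    return [(offset, first_tid + i) for i in range(n_writes)]

def lookup_scan(records, offset, as_of_tid):
    """Linear scan: look at every version of the block."""
    best = None
    visited = 0
    for rec_offset, tid in records:
        visited += 1
        if rec_offset == offset and tid <= as_of_tid:
            best = (rec_offset, tid)
    return best, visited

def lookup_seek(records, offset, as_of_tid):
    """Seek: binary search to the newest version with tid <= as_of_tid."""
    i = bisect.bisect_right(records, (offset, as_of_tid))
    if i == 0:
        return None
    rec = records[i - 1]
    return rec if rec[0] == offset else None

# 50,000 modifications to the same offset, one record per sync.
records = build_history(offset=4096, n_writes=50_000)
newest, visited = lookup_scan(records, 4096, 50_000)
assert newest == lookup_seek(records, 4096, 50_000)
```

The scan touches all 50,000 records while the seek needs only a logarithmic number of comparisons, but the records still occupy space in the index either way, which is the overhead that remains.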
The mitigating factor is, of course, that the HAMMER B-Tree is pruned
every night to match the requested snapshot policy.
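
As a sketch of what pruning to a snapshot policy could look like, assume each record carries a creation and a deletion transaction id, with 0 meaning "still live". That layout, the names, and the policy of keeping only what is live or visible at a retained snapshot are assumptions for illustration, not a description of the actual pruning code.

```python
# Hypothetical record: (create_tid, delete_tid); delete_tid == 0 means live.
def visible_at(rec, tid):
    """True if the record was the current version as of transaction id tid."""
    create_tid, delete_tid = rec
    return create_tid <= tid and (delete_tid == 0 or tid < delete_tid)

def prune(records, snapshot_tids):
    """Keep live records plus any record visible at a retained snapshot."""
    return [
        r for r in records
        if r[1] == 0 or any(visible_at(r, s) for s in snapshot_tids)
    ]

# Five successive versions of one block; only the last is still live.
versions = [(1, 2), (2, 3), (3, 4), (4, 5), (5, 0)]

# Retain snapshots taken at transaction ids 2 and 4.
kept = prune(versions, snapshot_tids=[2, 4])
print(kept)  # [(2, 3), (4, 5), (5, 0)]
```

Versions that no retained snapshot can see are dropped, so the 50,000-record case above shrinks to roughly one record per retained snapshot plus the live one.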
It would be cool if someone familiar with both ZFS's mirroring and
HAMMER's mirroring could test the feature and performance set. What
I like most about HAMMER's mirroring is that the mirroring target can
have a different history retention policy than the master.
HAMMER's current mirror streaming feature is also pretty cool if I do
say so myself. Since incremental mirroring is so fast, the hammer
utility can poll for changes every few seconds and since the stream
isn't queued it can be killed and restarted at any time. Network
outages don't really affect it.
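
The reason a kill or an outage is harmless can be sketched as follows. This is a toy model and not the hammer utility's protocol: the Target class, mirror_pass(), and the fail_after knob are invented for illustration. The only idea taken from the above is that the target knows the last transaction id it has, so each pass simply asks for whatever is newer.

```python
class Target:
    """Hypothetical mirror target that remembers how far it has gotten."""
    def __init__(self):
        self.records = []
        self.last_tid = 0  # highest transaction id applied so far

    def apply(self, tid, payload):
        self.records.append((tid, payload))
        self.last_tid = tid

def mirror_pass(master, target, fail_after=None):
    """Copy records newer than target.last_tid (master is sorted by tid).

    fail_after simulates an outage after that many records are sent.
    """
    sent = 0
    for tid, payload in master:
        if tid <= target.last_tid:
            continue  # target already has this one
        if fail_after is not None and sent >= fail_after:
            raise ConnectionError("simulated network outage")
        target.apply(tid, payload)
        sent += 1
    return sent

master = [(tid, f"change-{tid}") for tid in range(1, 11)]
target = Target()

try:
    mirror_pass(master, target, fail_after=4)  # dies part way through
except ConnectionError:
    pass

resumed = mirror_pass(master, target)  # restart picks up at tid 5
```

Nothing is queued on the sending side, so there is no state to lose when the stream is interrupted; the next pass recomputes what is missing from the target's last transaction id.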
I also added a very cool feature to the hammer mirror-stream directive
which allows you to limit the bandwidth, preventing the mirroring
operation from interfering with production performance.
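
For anyone curious how a bandwidth cap like this can work in principle, here is a minimal sketch of a rate limiter. It is not how the hammer utility implements it; the Throttle class and its parameters are made up for illustration, and the clock and sleep functions are injectable only so the example can be checked without actually waiting.

```python
import time

class Throttle:
    """Hypothetical sender-side limiter holding output to bytes_per_sec."""
    def __init__(self, bytes_per_sec, clock=time.monotonic, sleep=time.sleep):
        self.rate = bytes_per_sec
        self.clock = clock
        self.sleep = sleep
        self.start = clock()
        self.sent = 0

    def account(self, nbytes):
        """Record nbytes as sent, then pause if we are ahead of the rate."""
        self.sent += nbytes
        earliest = self.start + self.sent / self.rate
        delay = earliest - self.clock()
        if delay > 0:
            self.sleep(delay)
        return max(delay, 0.0)

# Simulated clock so the demonstration finishes instantly.
now = [0.0]
def fake_clock():
    return now[0]
def fake_sleep(seconds):
    now[0] += seconds

throttle = Throttle(1000, clock=fake_clock, sleep=fake_sleep)
for _ in range(5):
    throttle.account(500)  # five 500-byte chunks at 1000 bytes/sec

elapsed = now[0]  # 2.5 simulated seconds
```

In a real sender, account() would be called after each chunk is written to the network, leaving the rest of the link's capacity for production traffic.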