Re: hammer: big file changes very often

From: Matthew Dillon <dillon@xxxxxxxxxxxxxxxxxxxx>
Date: Sat, 9 Aug 2008 09:24:19 -0700 (PDT)

    A database file which is modified in-place should be ok for the most
    part, though there are some further optimizations I can make in the
    HAMMER code to make the history less invasive to performance.

    I would caution against using 'nohistory' unless you really need to.
    Such files will have no history and cannot be snapshotted.  For
    example, I think it's fine to make, say, /tmp, /var/tmp, and /usr/obj
    nohistory, but production partitions like /home, /, /usr, and so forth
    should not be.  I'd recommend keeping a full history on anything

    Being able to manage multiple snapshots is a big part of what HAMMER
    is about, and it necessitates thinking about storage a bit differently
    than you would normally.  The idea with HAMMER is that your disk should
    be large enough so administrative functions (pruning, reblocking, snapshot
    management) can be handled with a nightly cron job.  If you feel
    pressed for space the partition is probably not big enough to
    be suitable for HAMMER use.


    The reality is that you really only have as much storage as you can
    easily back up to another machine and/or off-site.  There's not much
    point getting a terabyte disk if you don't get a second one to back up to.
    If you want to do the backups right your backup box will manage a
    multitude of snapshots covering weeks, months, even years.  HAMMER
    is designed to make that sort of management easy.  Clearly you want
    to back up things like /, /usr, /home, and not things like /usr/obj :-)

    Backups put a fairly hard cap not only on how large your production data
    sets can be but also on how much can change, on average, in a day.
    HAMMER plays into these realities very well.

    I have a backup box for the DragonFly machines + my personal machines.
    It has one 730G HAMMER partition and the off-site backup has another
    700G+ of storage.  A fresh backup eats about 25% (175G) of that
    storage and each day adds another 3.7G or so (0.5%), giving me
    around 150 days worth of daily snapshots on my backup box.  I can
    extend that by making the older snapshots coarser-grained (only
    retain a weekly snapshot for anything older than 2 months, etc).
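
    As a rough sketch of the arithmetic above, and of the "weekly once
    older than 2 months" idea, something like the following works.  The
    function names and the 60-day / 7-day defaults are made up for
    illustration; none of this is part of HAMMER or its utilities, and
    it only picks which snapshots to keep.  How much space pruning
    actually frees depends on how much data the pruned snapshots shared
    with their neighbors.

```python
from fractions import Fraction
from math import floor


def retention_days(capacity_gb, baseline_gb, daily_growth_gb):
    """Whole days of daily snapshots that fit after the first full backup.

    Fractions built from the decimal text avoid binary float rounding
    (3.7 is not exactly representable as a float).
    """
    free = Fraction(str(capacity_gb)) - Fraction(str(baseline_gb))
    return floor(free / Fraction(str(daily_growth_gb)))


def thin_snapshots(ages_in_days, keep_daily_for=60, stride=7):
    """Return the snapshot ages to keep under a simple thinning policy.

    Hypothetical policy: keep every snapshot up to keep_daily_for days
    old, and beyond that only those whose age is a multiple of stride.
    """
    return [
        age
        for age in ages_in_days
        if age <= keep_daily_for or age % stride == 0
    ]


# Figures from the post: 730G partition, 175G baseline, 3.7G per day.
days = retention_days(730, 175, "3.7")

# Ages 0..74 days: everything through day 60, then weekly after that.
kept = thin_snapshots(range(75))
```

    Passing the daily growth as the string "3.7" keeps the division
    exact; a plain float floor-division would come up one day short.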

    So even though I have 6-7 terabytes worth of live disks across all
    the boxes I can only reasonably back up a small portion of the total
    data set.  Fortunately most of that space is used for packages, temporary
    build space, core dumps, testing, etc... and does not need to be
    backed up.

					Matthew Dillon 

