DragonFly BSD
DragonFly kernel List (threaded) for 2013-06
[Date Prev][Date Next]  [Thread Prev][Thread Next]  [Date Index][Thread Index]

Re: [GSOC] Introduction to HAMMER2 compression feature


From: Freddie Cash <fjwcash@xxxxxxxxx>
Date: Wed, 12 Jun 2013 09:08:41 -0700

On Wed, Jun 12, 2013 at 8:13 AM, Daniel Flores <daniel5555@gmail.com> wrote:

> Hello everyone,
>
> My name is Daniel Flores, and my proposal, "HAMMER2 compression
> feature", was accepted for this year's Google Summer of Code. I already
> posted the draft of my proposal [1] in this kernel list, so I will not
> repeat much of it; instead I want to focus on some design decisions
> that I've already made. Since I'm an inexperienced developer at this
> point, I'd be happy to receive your suggestions and criticism.
>
> The feature itself attempts to compress a HAMMER2 block, which is
> 64KB in size. The result should be a block of 32KB, 16KB, 8KB, etc.
> in size.
>
> Currently I'm searching for the algorithms most appropriate for a
> file system. In particular, I'm looking for algorithms that are very
> fast, don't require a lot of memory or processing power, and offer a
> fairly good compression ratio (though not necessarily the best one of
> all). Right now I have two candidates, DEFLATE [2] and LZO [3]. Both
> have available implementations which I intend to use, as Alex
> suggested in his review of my proposal.
>
> DEFLATE seems a good choice because it works with small amounts of
> data and has a 32KB sliding window, which fits a 64KB block nicely.
> It is based on another algorithm, LZ77, which is used successfully in
> the NTFS compression feature, so hopefully DEFLATE will do well here
> too.
>
> LZO seems a good choice because, similarly, it works on small
> amounts of data, is as fast as DEFLATE, and was specifically designed
> for very fast decompression.
>

You might want to consider LZ4 [1].  It was just recently added to ZFS
[2].  It's multi-threaded-aware, extremely fast for both compression and
decompression, and gives very good compression ratios.  It also includes a
short-circuit design to prevent wasting time trying to compress already
compressed blocks or highly incompressible blocks.  It's also OSS and
several reference implementations are available in various languages.

[1] http://code.google.com/p/lz4/
[2] http://wiki.illumos.org/display/illumos/LZ4+Compression

-- 
Freddie Cash
fjwcash@gmail.com

--001a11c2cf4631976104def73a2e
Content-Type: text/html; charset=UTF-8
Content-Transfer-Encoding: quoted-printable

<div dir=3D"ltr"><div class=3D"gmail_extra"><br><div class=3D"gmail_quote">=
On Wed, Jun 12, 2013 at 8:13 AM, Daniel Flores <span dir=3D"ltr">&lt;<a hre=
f=3D"mailto:daniel5555@gmail.com"; target=3D"_blank">daniel5555@gmail.com</a=
>&gt;</span> wrote:<br>

<blockquote class=3D"gmail_quote" style=3D"margin:0px 0px 0px 0.8ex;border-=
left:1px solid rgb(204,204,204);padding-left:1ex"><div dir=3D"ltr">Hello ev=
eryone,<br><br>My name is Daniel Flores and my proposal called =E2=80=9CHAM=
MER2 compression feature=E2=80=9D was accepted for this year=E2=80=99s Goog=
le Summer of Code. I already posted the draft of my proposal [1] in this ke=
rnel list, so I will not repeat much of it, but instead I want to focus on =
some design decisions that I=E2=80=99ve already made. Since I=E2=80=99m an =
inexperienced developer at this point, I=E2=80=99d be happy to receive your=
 suggestions and criticism.<br>


<br>The feature itself consists in that it attempts to compress a HAMMER2 b=
lock, which is of 64KB in size. The result should be a block of 32KB, 16KB,=
 8KB, etc. in size.<br><br>Currently I=E2=80=99m searching for the algorith=
ms that are the most appropriate for a file system. Particularly I=E2=80=99=
m searching for algorithms that are very fast; don=E2=80=99t require a lot =
of memory and processing power and offer fairly good compression rate (but =
not necessarily the best compression rate out of all). Right now I have two=
 candidates =E2=80=93 DEFLATE [2] and LZO [3]. Both of them have some avail=
able implementations which I intend to use, as Alex suggested in his review=
 of my proposal.<br>


<br>DEFLATE seems to be a good choice, because it works with small amounts =
of data and has a sliding window of 32KB =E2=80=93 just nice for a 64KB blo=
ck. It is based on another algorithm =E2=80=93 LZ77, which is successfully =
used in compression feature for NTFS, so hopefully DEFLATE would be good as=
 well.<br>


<br>LZO seems to be a good choice, because, similarly, it works on small am=
ounts of data, it is as fast as DEFLATE and was specifically designed to ha=
ve a very fast decompression speed.<br></div></blockquote><div><br></div>

<div>You might want to consider LZ4 [1].=C2=A0 It was just recently added t=
o ZFS [2].=C2=A0 It&#39;s multi-threaded-aware, extremely fast for both com=
pression and decompression, and gives very good compression ratios.=C2=A0 I=
t also includes a short-circuit design to prevent wasting time trying to co=
mpress already compressed blocks or highly incompressible blocks.=C2=A0 It&=
#39;s also OSS and several reference implementations are available in vario=
us languages.<br>

</div></div><br>[1] <a href=3D"http://code.google.com/p/lz4/";>http://code.g=
oogle.com/p/lz4/</a><br>[2] <a href=3D"http://wiki.illumos.org/display/illu=
mos/LZ4+Compression">http://wiki.illumos.org/display/illumos/LZ4+Compressio=
n</a><br>
<br>-- <br>Freddie Cash<br><a href=3D"mailto:fjwcash@gmail.com"; target=3D"_=
blank">fjwcash@gmail.com</a>
</div></div>

--001a11c2cf4631976104def73a2e--


