DragonFly BSD
DragonFly kernel List (threaded) for 2013-06
[Date Prev][Date Next]  [Thread Prev][Thread Next]  [Date Index][Thread Index]

[GSOC] Introduction to HAMMER2 compression feature


From: Daniel Flores <daniel5555@xxxxxxxxx>
Date: Wed, 12 Jun 2013 17:13:44 +0200

--001a11c380d4b678d204def67515
Content-Type: text/plain; charset=windows-1252
Content-Transfer-Encoding: quoted-printable

Hello everyone,

My name is Daniel Flores and my proposal called =93HAMMER2 compression
feature=94 was accepted for this year=92s Google Summer of Code. I already
posted the draft of my proposal [1] in this kernel list, so I will not
repeat much of it, but instead I want to focus on some design decisions
that I=92ve already made. Since I=92m an inexperienced developer at this po=
int,
I=92d be happy to receive your suggestions and criticism.

The feature itself consists in that it attempts to compress a HAMMER2
block, which is of 64KB in size. The result should be a block of 32KB,
16KB, 8KB, etc. in size.

Currently I=92m searching for the algorithms that are the most appropriate
for a file system. Particularly I=92m searching for algorithms that are ver=
y
fast; don=92t require a lot of memory and processing power and offer fairly
good compression rate (but not necessarily the best compression rate out of
all). Right now I have two candidates =96 DEFLATE [2] and LZO [3]. Both of
them have some available implementations which I intend to use, as Alex
suggested in his review of my proposal.

DEFLATE seems to be a good choice, because it works with small amounts of
data and has a sliding window of 32KB =96 just nice for a 64KB block. It is
based on another algorithm =96 LZ77, which is successfully used in
compression feature for NTFS, so hopefully DEFLATE would be good as well.

LZO seems to be a good choice, because, similarly, it works on small
amounts of data, it is as fast as DEFLATE and was specifically designed to
have a very fast decompression speed.

My initial proposal only covers the implementation of one algorithm, but if
the time allows, I would try to add both of them and maybe add some more.
The design will be modular, so it will not be difficult to change the
algorithm.

My initial strategy, as I already mentioned in my proposal, is to develop,
first, a prototype application to test those algorithms. The application
would take a file, break it in 64KB blocks, compress them one by one, and
then decompress them. After that the resulting uncompressed blocks would be
compared with the original blocks and if they are identical, then it would
mean that the algorithm works correctly.

I will try to compare the algorithms in terms of efficiency and speed, and
choose the best one. I also expect that I=92ll have to modify a bit the
implementations, because most likely they won=92t produce a block of an exa=
ct
size as required, so I would have to add some padding to it.

After the exhaustive testing with the prototype application, I will add the
compression feature in HAMMER2 itself and do real-life testing. The utility
to set up the compression feature would also be done at this point. By the
end of summer, the compression feature should be fully and correctly
implemented and ready for use.


Daniel

[1] =96 http://lists.dragonflybsd.org/pipermail/kernel/2013-April/031228.ht=
ml
[2] =96 http://en.wikipedia.org/wiki/DEFLATE
[3] =96 http://en.wikipedia.org/wiki/LZO

--001a11c380d4b678d204def67515
Content-Type: text/html; charset=windows-1252
Content-Transfer-Encoding: quoted-printable

<div dir=3D"ltr">Hello everyone,<br><br>My name is Daniel Flores and my pro=
posal called =93HAMMER2 compression feature=94 was accepted for this year=
=92s Google Summer of Code. I already posted the draft of my proposal [1] i=
n this kernel list, so I will not repeat much of it, but instead I want to =
focus on some design decisions that I=92ve already made. Since I=92m an ine=
xperienced developer at this point, I=92d be happy to receive your suggesti=
ons and criticism.<br>
<br>The feature itself consists in that it attempts to compress a HAMMER2 b=
lock, which is of 64KB in size. The result should be a block of 32KB, 16KB,=
 8KB, etc. in size.<br><br>Currently I=92m searching for the algorithms tha=
t are the most appropriate for a file system. Particularly I=92m searching =
for algorithms that are very fast; don=92t require a lot of memory and proc=
essing power and offer fairly good compression rate (but not necessarily th=
e best compression rate out of all). Right now I have two candidates =96 DE=
FLATE [2] and LZO [3]. Both of them have some available implementations whi=
ch I intend to use, as Alex suggested in his review of my proposal.<br>
<br>DEFLATE seems to be a good choice, because it works with small amounts =
of data and has a sliding window of 32KB =96 just nice for a 64KB block. It=
 is based on another algorithm =96 LZ77, which is successfully used in comp=
ression feature for NTFS, so hopefully DEFLATE would be good as well.<br>
<br>LZO seems to be a good choice, because, similarly, it works on small am=
ounts of data, it is as fast as DEFLATE and was specifically designed to ha=
ve a very fast decompression speed.<br><br>My initial proposal only covers =
the implementation of one algorithm, but if the time allows, I would try to=
 add both of them and maybe add some more. The design will be modular, so i=
t will not be difficult to change the algorithm.<br>
<br>My initial strategy, as I already mentioned in my proposal, is to devel=
op, first, a prototype application to test those algorithms. The applicatio=
n would take a file, break it in 64KB blocks, compress them one by one, and=
 then decompress them. After that the resulting uncompressed blocks would b=
e compared with the original blocks and if they are identical, then it woul=
d mean that the algorithm works correctly.<br>
<br>I will try to compare the algorithms in terms of efficiency and speed, =
and choose the best one. I also expect that I=92ll have to modify a bit the=
 implementations, because most likely they won=92t produce a block of an ex=
act size as required, so I would have to add some padding to it.<br>
<br>After the exhaustive testing with the prototype application, I will add=
 the compression feature in HAMMER2 itself and do real-life testing. The ut=
ility to set up the compression feature would also be done at this point. B=
y the end of summer, the compression feature should be fully and correctly =
implemented and ready for use.<br>
<br><br>Daniel<br><br>[1] =96 <a href=3D"http://lists.dragonflybsd.org/pipe=
rmail/kernel/2013-April/031228.html">http://lists.dragonflybsd.org/pipermai=
l/kernel/2013-April/031228.html</a><br>[2] =96 <a href=3D"http://en.wikiped=
ia.org/wiki/DEFLATE">http://en.wikipedia.org/wiki/DEFLATE</a><br>
[3] =96 <a href=3D"http://en.wikipedia.org/wiki/LZO";>http://en.wikipedia.or=
g/wiki/LZO</a><br><br></div>

--001a11c380d4b678d204def67515--



[Date Prev][Date Next]  [Thread Prev][Thread Next]  [Date Index][Thread Index]