Huffman Compression / Decompression in OCaml

I worked lately on a simple application in OCaml that compresses files using the Huffman compression algorithms. The application is a showcase of how to use the Huffman’s Static compression, decompression algorithms compared to the dynamic (adaptive) algorithms (Vitter algorithm is used in the application).

The source code is available on github here https://github.com/martani/Huffman-compression--OCaml-. Although the code is not as elegant as it should be, we have used excessively the imperative features of OCaml for optimization reasons, especially for the adaptive algorithm where the tree changes with every introduction of a new symbol to code. So it is not really the best place for OCaml beginners who should be baptized with the functional beauty of the language before they see its evil imperative side :).

Although, It might be a great place too to see how the imperative and functional programming could be done on the same land, I don’t promise anything though, since we did the project in the hurry with just few days in our hands to prepare for exams and … you know the rest of the story.

So the project uses only the primitive OCaml modules, there is no libraries to install or anything, there is also an associated Makefile to compile and test the executable.

The Makefile has a make test directive to test the executable on a file named ”input” in the working directory (that you should include yourself obviously), basically it stars a static compression followed by a decompression then compares the original and the decompressed files to ensure the program works correctly. It does the same with the dynamic method too.

On a side note, we were surprised that while compiling the code with ocamlopt, we get sometimes programs that run 15x times faster than the same program compiled with ocamlc.

For a file about 5.8 MB we got some executing time around

   1: ***@****-Studio-1558:~/workspace/huffman$ make test   2:     3: Testing... static huffman   4: Compress:   5: 1.18 user    6: 0.00 system   7: 0:01.19 elapsed    8: 99% CPU (0avgtext+0avgdata 5504maxresident)   9:    10: Uncompress  11: 2.26 user   12: 0.00 system   13: 0:02.27 elapsed   14: 99% CPU   15: <<<<<<<<<<<<<<<<Compressed = Uncompress ??>>>>>>>>>>>>>>>>  16:    17: Testing... dynamic huffman  18: Compress  19: 8.11 user   20: 0.01 system   21: 0:08.13 elapsed   22: 99%CPU   23:    24: Uncompress  25: 4.04 user   26: 0.00 system   27: 0:04.05 elapsed   28: 99%CPU   29: <<<<<<<<<<<<<<<Compressed = Uncompress ??>>>>>>>>>>>>>>>>>

As you can notice, the static version of the program is faster than the dynamic one, this is due to the fact that we optimized the the static version to use actual bits on compression / decompression, but we used lists to represent the bits (0’s and 1’s as actual 32 bit integers!!), so that was expected.

Link for the source code : https://github.com/martani/Huffman-compression--OCaml-

<Martani/> Blog

Huffman Compression / Decompression in OCaml

Archives

Popular Posts

Categories