ARRAY COMPRESSION LIBRARY


Because CPU performance follows Moore’s law while other components progress at a slower rate,
data compression can be used to effectively increase the
bandwidth of both inter-process communication and secondary storage. Application scalability can be
tuned by trading CPU cycles for reduced bandwidth requirements. Standard file compression algorithms
(e.g., those in bzip2 and gzip) are typically formulated for character-based data. A library designed
for real and integer data, built on fixed encoding schemes (skip lists, bit-packed data, variable-size
encoding), would give application and library developers a way to trade CPU cycles for bandwidth
or storage.
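
As an illustration of the bit-packed encoding mentioned above, the following is a minimal sketch in Python, not the library's implementation. The function names pack_bits and unpack_bits and the big-endian, zero-padded layout are assumptions made for this example. It shows why a fixed scheme has a predictable compression ratio: n values at w bits each always occupy ceil(n*w/8) bytes.

```python
def pack_bits(values, width):
    """Pack non-negative integers into bytes, using `width` bits per value.

    Hypothetical helper for illustration; the layout (big-endian, values
    stored back to back, zero padding in the last byte) is an assumption.
    """
    if width <= 0:
        raise ValueError("width must be positive")
    acc = 0
    for v in values:
        if v < 0 or v >> width:
            raise ValueError(f"{v} does not fit in {width} bits")
        acc = (acc << width) | v  # append this value's bits to the stream
    nbits = width * len(values)
    nbytes = (nbits + 7) // 8
    acc <<= nbytes * 8 - nbits  # zero-pad the tail to a whole number of bytes
    return acc.to_bytes(nbytes, "big")


def unpack_bits(data, width, count):
    """Recover `count` integers of `width` bits each from pack_bits output."""
    acc = int.from_bytes(data, "big")
    acc >>= len(data) * 8 - width * count  # drop the padding bits
    mask = (1 << width) - 1
    return [(acc >> (width * (count - 1 - i))) & mask for i in range(count)]
```

A single Python integer is used as the bit accumulator to keep the sketch short; a production version would pack into a fixed-size buffer word by word. With this layout, 1000 values that each fit in 12 bits occupy 1500 bytes instead of the 4000 bytes needed for 32-bit integers.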

In collaboration with Mathematical and Computational Sciences (MCS) staff at Argonne National Laboratory
(ANL) and staff at Pacific Northwest National Laboratory (PNNL), we have identified and implemented fixed
compression algorithms that have high performance and well-defined compression ratios. These and other
algorithms will be the basic functional units of the resulting library. We have interfaces to the lossless
compression algorithms bzip2 and gzip, as well as fixed floating-point compression algorithms based on simple
extensions to the IEEE floating-point representations. We have developed Davidson (in C) and Davidson-Liu
(in Fortran 77) application kernels for testing this library.
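
The text does not specify which extensions to the IEEE representation are used, so the following Python sketch shows only one plausible fixed floating-point scheme, chosen for illustration: keep the leading bytes of each IEEE 754 double (sign, exponent, and the high-order mantissa bits) and discard the rest. The names compress_doubles, decompress_doubles, and keep_bytes are hypothetical. The compression ratio is exactly 8/keep_bytes regardless of the data.

```python
import struct


def compress_doubles(values, keep_bytes):
    """Keep the `keep_bytes` most significant bytes of each IEEE 754 double.

    Assumed scheme for illustration only. At least 2 bytes are needed to
    retain the sign bit and the full 11-bit exponent.
    """
    if not 2 <= keep_bytes <= 8:
        raise ValueError("keep_bytes must be between 2 and 8")
    out = bytearray()
    for v in values:
        out += struct.pack(">d", v)[:keep_bytes]  # big-endian: high bytes first
    return bytes(out)


def decompress_doubles(data, keep_bytes):
    """Rebuild doubles by zero-filling the discarded low-order mantissa bytes."""
    if not 2 <= keep_bytes <= 8:
        raise ValueError("keep_bytes must be between 2 and 8")
    if len(data) % keep_bytes:
        raise ValueError("data length is not a multiple of keep_bytes")
    pad = bytes(8 - keep_bytes)
    return [
        struct.unpack(">d", data[i:i + keep_bytes] + pad)[0]
        for i in range(0, len(data), keep_bytes)
    ]
```

With keep_bytes = 4, each value retains 20 mantissa bits, so the storage is halved and the relative error for normal finite values stays below 2^-20. With keep_bytes = 8 the round trip is lossless. Truncation rounds toward zero; a scheme that rounds to nearest would halve the worst-case error at the cost of slightly more work per value.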