CombiHeader: Minimizing the number of shim headers in redundancy elimination systems

DOI: 10.1109/INFCOMW.2011.5928920
Authors: Sumanta Saha, Andrey Lukyanenko, Antti
Redundancy elimination (RE) has been used in many places to improve network performance. The algorithms typically split data into chunks, fingerprint them, and compare the fingerprints against a cache to identify similar chunks. Matching chunks are then removed from the data and shim headers are inserted in their place. However, this approach has two crucial shortcomings: depending on the chunk size, either many headers must be inserted, or the probability of missing similar regions increases. Algorithms that try to overcome missed similarity detection by expanding chunk boundaries suffer from excessive memory accesses due to byte-by-byte comparison. This situation leads us to propose a novel algorithm, CombiHeader, which allows near-maximum similarity detection using small chunk sizes while employing a chunk-aggregation technique to transmit very few headers with few memory accesses. CombiHeader uses a specialized directed graph to track and merge adjacent popular chunks. By generating different generations of CombiNodes, CombiHeader can detect similarity regions of different lengths while using the smallest possible number of headers. Experiments show that CombiHeader transmits fewer than 25% of the headers required by general elimination algorithms, and this number improves with the number of cache hits. For certain situations, the memory accesses required to detect the maximal similarity region are in the range of 1%-5% of those of comparable algorithms. CombiHeader is implemented as a pluggable module that can be used with any existing redundancy elimination algorithm.

Recent works (6-8) have shown that redundancy elimination has a wider scope of use in different network scenarios and deployments. This discovery has triggered the invention of several new RE algorithms for different parts of the network. A typical RE algorithm inspects the contents of an IP packet and generates content-dependent chunks from it.
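The text above does not spell out CombiHeader's data structures, but the chunk-aggregation idea can be sketched roughly as follows. This is an illustrative assumption, not the paper's actual design: the class name `CombiGraph`, the `merge_threshold` parameter, and the merging policy are all hypothetical.

```python
from collections import defaultdict

class CombiGraph:
    """Sketch of a CombiHeader-style chunk-adjacency graph.

    Nodes are chunk fingerprints; a directed edge (a, b) counts how
    often chunk b immediately followed chunk a in matched traffic.
    Once an edge is popular enough, the two chunks are merged into a
    higher-generation CombiNode so that a single shim header can
    reference the whole combined region.
    """

    def __init__(self, merge_threshold=3):
        self.merge_threshold = merge_threshold
        self.edge_hits = defaultdict(int)   # (fp_a, fp_b) -> hit count
        self.merged = {}                    # (fp_a, fp_b) -> combined fp

    def observe(self, chunk_fps):
        """Record one matched sequence of chunk fingerprints and return
        the (possibly aggregated) list of shim-header references."""
        refs, i = [], 0
        while i < len(chunk_fps):
            if i + 1 < len(chunk_fps):
                pair = (chunk_fps[i], chunk_fps[i + 1])
                self.edge_hits[pair] += 1
                if self.edge_hits[pair] >= self.merge_threshold:
                    # Popular adjacent chunks: one reference covers both.
                    combined = self.merged.setdefault(pair, hash(pair))
                    refs.append(combined)
                    i += 2
                    continue
            refs.append(chunk_fps[i])
            i += 1
        return refs
```

This captures the abstract's key property: the number of headers shrinks as the same adjacent chunks are hit repeatedly, since popular neighbors collapse into a single reference.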
This chunking is mostly done using the well-known Rabin fingerprinting (9) method. The chunks are then fingerprinted for easy searching and retrieval. Later, incoming packets are likewise fingerprinted and checked for containment in the existing cache. If parts of a new packet are found in the cache, they are stripped off and a shim header is inserted in their place. The stripped packets are then transported as far as possible before finally being reconstructed at a downstream network element with the help of the shim headers. One very important parameter in any RE algorithm is the average chunk size: while smaller chunks allow better similarity detection, bigger chunks require fewer shim headers in the outgoing packet, resulting in a better compression rate. Most previous work (6, 7, 10) simply uses a small chunk size to report high similarity detection, but fails to mention the increased number of headers to transmit and the added operational complexity. While the importance of choosing the right chunk size is well understood and analyzed in (11) and (8), no work so far has proposed any solution to combine the advantages of using a small chunk size (better similarity detection) with those of a large chunk size (fewer shim headers).
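Rabin-style chunking computes a rolling fingerprint over a sliding byte window and declares a chunk boundary wherever the fingerprint matches a bit pattern, so boundaries depend on content rather than byte offsets. A minimal sketch of the idea, using a simplified polynomial rolling hash rather than true Rabin polynomials over GF(2), with arbitrary window, mask, and minimum-size values:

```python
def chunk_boundaries(data: bytes, window=8, mask=0x3F, min_size=16):
    """Content-defined chunking sketch: cut wherever the rolling hash
    of the last `window` bytes has its low bits equal to zero."""
    BASE, MOD = 257, (1 << 31) - 1
    pow_w = pow(BASE, window - 1, MOD)  # coefficient of the oldest byte
    h, start, chunks = 0, 0, []
    for i, b in enumerate(data):
        if i >= window:
            # Slide the window: drop the oldest byte before adding the new one.
            h = (h - data[i - window] * pow_w) % MOD
        h = (h * BASE + b) % MOD
        # Boundary where the low bits of the hash are zero (expected
        # chunk length is roughly mask + 1 bytes).
        if i + 1 - start >= min_size and (h & mask) == 0:
            chunks.append(data[start:i + 1])
            start = i + 1
    if start < len(data):
        chunks.append(data[start:])
    return chunks
```

Because a boundary depends only on the bytes inside the window, identical content produces identical chunks regardless of its offset in the stream, which is what lets the cache recognize repeated regions inside otherwise different packets.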