Sign in
Author
|
Conference
|
Journal
|
Organization
|
Year
|
DOI
Look for results that meet for the following criteria:
since
equal to
before
between
and
Search in all fields of study
Limit my searches in the following fields of study
Agriculture Science
Arts & Humanities
Biology
Chemistry
Computer Science
Economics & Business
Engineering
Environmental Sciences
Geosciences
Material Science
Mathematics
Medicine
Physics
Social Science
Multidisciplinary
Keywords
(12)
Bloom Filter
Data Structure
Flash Memory
Forest Structure
Indexation
Physical Characteristic
Processing Speed
Reading and Writing
Space Use
False Positive Rate
Random Access Memory
Solid State Drive
Subscribe
Academic
Publications
A Forest-structured Bloom Filter with flash memory
A Forest-structured Bloom Filter with flash memory,10.1109/MSST.2011.5937232,Guanlin Lu,Biplob Debnath,David H. C. Du
Edit
A Forest-structured Bloom Filter with flash memory
(
Citations: 1
)
BibTex
|
RIS
|
RefWorks
Download
Guanlin Lu
,
Biplob Debnath
,
David H. C. Du
A
Bloom Filter
(BF) is a
data structure
based on probability to compactly represent/record a set of elements (keys). It has wide applications on efficiently identifying a key that has been seen before with minimum amount of recording space used. BF is heavily used in chunking based data de-duplication. Traditionally, a BF is implemented as in-RAM data structure; hence its size is limited by the available RAM space on the machine. For certain applications like data de-duplication that require a big BF beyond the size of available RAM space, it becomes necessary to store a BF into a secondary storage device. Since BF operations are inherently random in nature, magnetic disk provides worse performance for the random read and write operations. It will not be a good fit for storing the large BF.
Flash memory
based
Solid State Drive
(SSD) has been considered as an emerging storage device that has superior performance and can potentially replace disks as the preferred secondary storage devices. However, several special characteristics of
flash memory
make designing a
flash memory
based BF very challenging. In this paper, our goal is to design an efficient
flash memory
based BF that is fully aware of these physical characteristics. To this end, we propose a Forest-structured BF design (FBF). FBF uses a combination of RAM and
flash memory
to design a BF. BF is stored on the flash, while RAM helps to mitigate the impact of slow write performance of flash memory. In addition, in-flash BF is organized in a forest-like structure in order to improve the lookup performance. Our experimental results show that FBF design achieves 2 times faster
processing speed
with 50% less number of flash write operations when compared with the existing
flash memory
based BF designs. I. INTRODUCTION A
Bloom Filter
(BF) is a bit vector that compactly represents a set of items (keys) and supports key query/insert operations. It can definitely tell if a key is not present, but it may not tell with guarantee that a key is indeed present. In other word, an answer given by a BF bears certain
false positive
rate. To keep this
false positive rate
low, traditional BF designs have to set
Bloom Filter
size a priori to be a few times larger than the maximum number of items represented. Bloom Filters are heavily used in chunking based data de-duplication. Chunking based de-duplication is an efficient technique to eliminate data redundancy within both backup data and data stored in primary storage. Traditionally, chunk- ing based de-duplication requires a chunk index that consists of each chunk's identifier (i.e., a SHA1 hashed value computed based on chunk's content) and its resided location in disk. This
Conference:
Symposium on Mass Storage Systems - MSS
, pp. 1-6, 2011
DOI:
10.1109/MSST.2011.5937232
Cumulative
Annual
View Publication
The following links allow you to view full publications. These links are maintained by other sources not affiliated with Microsoft Academic Search.
(
ieeexplore.ieee.org
)
(
ieeexplore.ieee.org
)
Citation Context
(1)
...Recent deduplication design has mainly focused on its write performance by: (1) improving the duplication detection and chunk existence querying efficiency with efficient chunking, faster hash indexing [2], locality-preserving index caching [1], and efficient bloom filters [
3
]; (2) compressing the unique chunks and performing (fixed-size) large writes through containers [4], [1] or similar structures [5]...
Youngjin Nam
,
et al.
Chunk Fragmentation Level: An Effective Indicator for Read Performance...
References
(10)
Space/time trade-offs in hash coding with allowable errors
(
Citations: 2025
)
Burton H. Bloom
Journal:
Communications of The ACM - CACM
, vol. 13, no. 7, pp. 422-426, 1970
PlanetP: Using Gossiping to Build Content Addressable Peer-to-Peer Information Sharing Communities
(
Citations: 195
)
Francisco Matias Cuenca-acuna
,
Christopher Peery
,
Richard P. Martin
,
Thu D. Nguyen
Conference:
IEEE International Symposium on High Performance Distributed Computing - HPDC
, pp. 236-249, 2003
BloomFlash: Bloom Filter on Flash-Based Storage
(
Citations: 2
)
Biplob Debnath
,
Sudipta Sengupta
,
Jin Li
,
David J. Lilja
,
David H. C. Du
Conference:
International Conference on Distributed Computing Systems - ICDCS
, pp. 635-644, 2011
Secure Hash Algorithm - 1
(
Citations: 17
)
D. Eastlake
,
P. Jones
Published in 2001.
Theory and Network Applications of Dynamic Bloom Filters
(
Citations: 37
)
Deke Guo
,
Jie Wu
,
Honghui Chen
,
Xueshan Luo
Conference:
IEEE INFOCOM - INFOCOM
, pp. 1-12, 2006
Sort by:
Citations
(1)
Chunk Fragmentation Level: An Effective Indicator for Read Performance Degradation in Deduplication Storage
Youngjin Nam
,
Guanlin Lu
,
Weijun Xiao
,
David H. C. Du
Conference:
High Performance Computing and Communications - HPCC
, 2011