Exploring data reliability tradeoffs in replicated storage systems (Citations: 7)
This paper explores the feasibility of a cost-efficient storage architecture that offers the reliability and access performance characteristics of a high-end system. This architecture exploits two opportunities. First, scavenging idle storage from LAN-connected desktops offers not only low-cost storage space but also high I/O throughput, by aggregating the I/O channels of the participating nodes. Second, the two components of data reliability, durability and availability, can be decoupled to control overall system cost. To capitalize on these opportunities, we integrate two types of components: volatile, scavenged storage and dedicated, yet low-bandwidth, durable storage. On the one hand, the durable storage forms a low-cost back-end that enables the system to restore the data the volatile nodes may lose. On the other hand, the volatile nodes provide a high-throughput front-end. While integrating these components has the potential to offer a unique combination of high throughput, low cost, and durability, a number of concerns need to be addressed to architect and correctly provision the system. To this end, we develop analytical and simulation-based tools to evaluate the impact of system characteristics (e.g., bandwidth limitations on the durable and the volatile nodes) and design choices (e.g., replica placement scheme) on data availability and the associated system costs (e.g., maintenance traffic). Further, we implement and evaluate a prototype of the proposed architecture, namely a GridFTP server that aggregates volatile resources. Our evaluation demonstrates a transfer throughput of up to 800 MBps for the new GridFTP service.
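The decoupling described in the abstract can be illustrated with a rough back-of-the-envelope sketch. It is not the paper's analytical model. It assumes that each volatile node is online independently with the same probability, and the names `availability`, `replicas_needed`, and `p_online` are hypothetical. Because the durable back-end is what protects against data loss, the volatile replica count in this sketch is sized for availability alone.

```python
def availability(n_replicas: int, p_online: float) -> float:
    """Probability that at least one of n volatile replicas is online.

    Assumption (not from the paper): nodes are online independently,
    each with probability p_online.
    """
    if n_replicas < 0 or not 0.0 <= p_online <= 1.0:
        raise ValueError("need n_replicas >= 0 and 0 <= p_online <= 1")
    return 1.0 - (1.0 - p_online) ** n_replicas


def replicas_needed(target: float, p_online: float) -> int:
    """Smallest replica count whose availability meets the target.

    Durability is assumed to come from the durable back-end, so this
    count covers availability only.
    """
    if not 0.0 <= target < 1.0 or not 0.0 < p_online <= 1.0:
        raise ValueError("need 0 <= target < 1 and 0 < p_online <= 1")
    n = 0
    while availability(n, p_online) < target:
        n += 1
    return n


if __name__ == "__main__":
    # Hypothetical inputs: desktops online half the time, 99% target.
    print(replicas_needed(0.99, 0.5))
```

Under these assumptions, a lower availability target or a higher per-node uptime reduces the number of volatile replicas, and with it the maintenance traffic, without affecting durability.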
    • ...Peer-to-peer (P2P) storage networks aim at aggregating the storage in today’s resource-abundant computers to form a large, shared storage space at a much lower price [1, 2]. Due to the prevalence of peer dynamics (i.e., churn), a fundamental issue in P2P storage systems is how to tolerate failures and departures of peers [3, 4]. To meet the service level agreements (SLA), a system must provide a certain level of durability for the data stored in it (notice that recent studies [2, ...
    • ...Many storage systems replicate data for durability [2, 7, 8]. A common way, known as reactive replication, is to detect node failures and make response by regenerating new replicas...

    Zhi Yang et al. AutoProc: An automatic proactive replication scheme for P2P storage

    • ...We use a lower target availability level for the Maze-like system because the system is more dynamic and it is desirable to mask more transient failures, trading imperfect availability for low replication cost and high scalability [17]...

    Zhi Yang et al. Protector: A Probabilistic Failure Detector for Cost-Effective Peer-to...

    • ...In order to move towards practical system deployment while still leveraging users’ resources, hybrid architectures, where both servers and peers coexist, have been very recently proposed in various contexts [14]...
    • ...Such a server-centric approach is also to be found in [14], where authors...

    Serge Defrance et al. Efficient peer-to-peer backup services through buffering at the edge

    • ...In this paper, we solve R(t) by modeling replica dynamics as the continuous-time Markov chain which is widely adopted in literatures [16]–[19]...
    • ...Markov chain, as adopted by this paper, assumes the durations of the peer staying online and offline follow exponential distributions [16]–[19], while alternating renewal process allows the durations follow arbitrary distribution [27], [28]...

    Yuanjian Xing et al. On the QoS of Offline Download in Retrieving Peer-Side File Resource

    • ...Gharaibeh et. al. proposed a low-cost reliable storage system built on a combination of scavenged storage of desktop computers and a set of low-bandwidth dedicated storage such as Automated Tape Library (ATL) or remote storage system such as Amazon S3 [10]...

    Heshan Lin et al. MOON: MapReduce On Opportunistic eNvironments
