Backup server using direct I/O scheme

Kyung Woo Hur, Won Vien Park, Wan Yeon Lee, Jin Kim, Young Woong Ko

Research on network backup servers has advanced rapidly in recent years and has attracted considerable attention. In this paper we focus on performance analysis of the I/O subsystem of a network backup server. The key issue is to analyse the trade-off between the direct I/O and buffered I/O mechanisms. We investigate how the two mechanisms affect the overall performance of a network backup server. We designed and implemented a backup server and conducted several experiments. The experimental results show that direct I/O outperforms the traditional buffered I/O mechanism in overall performance.
Recent advances in Internet technology have made network backup convenient, and it has replaced the traditional tape backup method. Network backup systems are widely used in the current storage market. However, current network backup servers have drawbacks. First, they cannot deal efficiently with large volumes of duplicated files. Duplicated files waste storage capacity and network bandwidth. Therefore, to improve the performance of a backup server, we have to eliminate duplicated files. Moreover, the data duplication problem extends to the block level, because duplicated blocks can exist within a file. The easiest way to find a duplicated file is to exploit file metadata such as file name, size, and date. With this information, we can easily find duplicated versions of files on a backup server. A more powerful mechanism for finding duplicated data is hashing, for example with MD5 (1) or SHA1 (2). We can confirm whether two files are identical simply by comparing their hash values.
Second, the performance of network backup servers is not satisfactory. It is well known that the storage system causes a significant delay in processing I/O requests, because almost all storage systems are based on mechanical devices. To increase the performance of a network backup server, we have to minimize disk I/O latency.
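The hash-based duplicate check described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `content_digest` and `find_duplicates`, the chunk size, and the choice of SHA1 as the default are assumptions made for illustration. File size is used as a cheap metadata filter before any hashing is done.

```python
import hashlib
import os
from collections import defaultdict

CHUNK_SIZE = 1 << 20  # read 1 MiB at a time (arbitrary choice for this sketch)


def content_digest(path, algorithm="sha1"):
    """Return the hex digest of a file's contents, read in chunks."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(CHUNK_SIZE), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(paths):
    """Group paths with identical contents; returns lists of two or more paths."""
    # Metadata filter first: files of different sizes cannot be identical.
    by_size = defaultdict(list)
    for path in paths:
        by_size[os.path.getsize(path)].append(path)

    groups = []
    for same_size in by_size.values():
        if len(same_size) < 2:
            continue  # unique size, so no hashing is needed
        by_digest = defaultdict(list)
        for path in same_size:
            by_digest[content_digest(path)].append(path)
        groups.extend(g for g in by_digest.values() if len(g) > 1)
    return groups
```

The same idea could be applied per block rather than per file to catch duplicated blocks within a file, at the cost of keeping one digest per block.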
To address these problems, we applied a direct I/O scheme to a network backup system to reduce I/O overhead. The well-known direct I/O schemes are raw disk I/O and O_DIRECT, both of which bypass the kernel buffer cache. Generally, during a backup procedure, data blocks are copied from user application memory to the kernel buffer cache, and the blocks in the kernel buffer cache are written to the hard disk later. We believe that a network backup server does not need the kernel buffer cache because cached data blocks are rarely reused in a backup system. Therefore, direct I/O schemes are much more efficient for a network backup server. In this paper, we describe how to reduce I/O latency for a network backup server using direct I/O schemes. We analyse the performance of a network backup server using direct I/O and compare it with the traditional buffered I/O scheme. We also conducted several experiments to show detailed performance results.
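To make the difference between the two write paths concrete, here is a minimal sketch of an O_DIRECT write with a buffered fallback. It is an illustration under stated assumptions, not the server described in the paper: the names `write_backup` and `_write_direct` are hypothetical, the 4096-byte alignment is an assumed value (the required alignment depends on the device and filesystem), and O_DIRECT is only available on some platforms, mainly Linux.

```python
import mmap
import os

BLOCK = 4096  # assumed alignment; the real requirement depends on the device


def _write_direct(path, data):
    """Write data with O_DIRECT from a page-aligned buffer."""
    padded = -(-len(data) // BLOCK) * BLOCK  # round length up to a block multiple
    buf = mmap.mmap(-1, max(padded, BLOCK))  # anonymous mmap is page-aligned
    try:
        buf[:len(data)] = data
        flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC | os.O_DIRECT
        fd = os.open(path, flags, 0o644)
        try:
            with memoryview(buf) as view:
                offset = 0
                while offset < padded:
                    offset += os.write(fd, view[offset:padded])
            os.ftruncate(fd, len(data))  # drop the zero padding
            os.fsync(fd)  # O_DIRECT alone does not guarantee durability
        finally:
            os.close(fd)
    finally:
        buf.close()


def write_backup(path, data, direct=True):
    """Write data to path; returns "direct" or "buffered" for the mode used."""
    if direct and hasattr(os, "O_DIRECT"):
        try:
            _write_direct(path, data)
            return "direct"
        except OSError:
            pass  # filesystem rejected O_DIRECT (some do); fall back
    # Buffered path: data passes through the kernel buffer cache first.
    with open(path, "wb") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())
    return "buffered"
```

A call such as `write_backup("/backup/chunk.bin", payload)` (the path is a placeholder) returns which mode was actually used, so a benchmark can confirm that it measured the direct path rather than the fallback.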
Published in 2011.