Global-scale distributed I/O with ParaMEDIC

Global-scale distributed I/O with ParaMEDIC, 10.1002/cpe.1590, Concurrency and Computation: Practice and Experience, Pavan Balaji, Wu-chun Feng, Heshan L

Global-scale distributed I/O with ParaMEDIC (Citations: 1)
Achieving high performance for distributed I/O on a wide-area network continues to be an elusive holy grail. Despite enhancements in network hardware as well as software stacks, achieving high performance remains a challenge. In this paper, our worldwide team took a completely new and non-traditional approach to distributed I/O, called ParaMEDIC: Parallel Metadata Environment for Distributed I/O and Computing, by utilizing application-specific transformation of data to orders-of-magnitude smaller metadata before performing the actual I/O. Specifically, this paper details our experiences in deploying a large-scale system to facilitate the discovery of missing genes and the construction of a genome similarity tree by encapsulating the mpiBLAST sequence-search algorithm into ParaMEDIC. The overall project involved nine computational sites spread across the U.S. and generated more than a petabyte of data that was 'teleported' to a large-scale facility in Tokyo for storage.
Journal: Concurrency and Computation: Practice and Experience - CONCURRENCY, vol. 22, no. 16, pp. 2266-2281, 2010
    • ...Our basic idea is to compare all Open Reading Frames (ORFs) greater than or equal to a minimum length from all fully-sequenced genomes using BLAST [9] based on a novel high-performance approach [10]...
    • ...mpi-BLAST parallelizes BLAST using database fragmentation, query segmentation [13], parallel input-output [14], and advanced scheduling [15]; further details are given in [10]...
    • ...In order to find the missing genes using all fully sequenced prokaryotic genomes we have used an innovative high-performance methodology [10]...
    • ...The performance of both mpiBLAST [13] and ParaMEDIC [10] theoretically scales well if the number of processors that are used scales as the square of the number of replicon sequences...

    Andrew S. Warren et al. Missing genes in the annotation of prokaryotic genomes
