Frameworks for Multi-core Architectures: A Comprehensive Evaluation Using 2D/3D Image Registration

DOI: 10.1007/978-3-642-19137-4_6. Authors: Richard Membarth, Frank

Frameworks for Multi-core Architectures: A Comprehensive Evaluation Using 2D/3D Image Registration   (Citations: 1)
The development of standard processors has changed in recent years, moving from bigger, more complex, and faster cores to placing several simpler cores on one chip. This has also changed the way programs are written in order to leverage the processing power of multiple cores of the same processor. In the beginning, programmers had to divide and distribute the work to the available cores by hand and manage threads themselves in order to use more than one core. Today, several frameworks exist to relieve the programmer of such tasks. In this paper, we present five such frameworks for parallelization on shared memory multi-core architectures, namely OpenMP, Cilk++, Threading Building Blocks, RapidMind, and OpenCL. To evaluate these frameworks, a real-world application from medical imaging is investigated: 2D/3D image registration. In an empirical study, a fine-grained data parallel approach and a coarse-grained task parallel approach are used to evaluate and estimate different aspects of each framework, such as usability, performance, and overhead.
Conference: Architektur von Rechensystemen - ARCS, pp. 62-73, 2011
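
The distinction the abstract draws between fine-grained data parallelism and coarse-grained task parallelism can be illustrated with a minimal OpenMP sketch. This is not the paper's implementation: the similarity measure (sum of squared differences), the cyclic shift standing in for a candidate transformation, and all function names are assumptions chosen only to keep the example small.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Assumed similarity measure: sum of squared differences (SSD) between two
// equally sized images stored as flat arrays.
// Fine-grained data parallelism: the pixel loop itself is split across cores.
double ssd_data_parallel(const std::vector<float>& a,
                         const std::vector<float>& b) {
    assert(a.size() == b.size());
    double sum = 0.0;
    const long n = static_cast<long>(a.size());
    #pragma omp parallel for reduction(+:sum)
    for (long i = 0; i < n; ++i) {
        const double d = static_cast<double>(a[i]) - b[i];
        sum += d * d;
    }
    return sum;
}

// Sequential SSD of a reference against an image cyclically shifted by
// `shift` elements (a hypothetical stand-in for one candidate transformation).
double ssd_shifted(const std::vector<float>& ref,
                   const std::vector<float>& img,
                   std::size_t shift) {
    assert(ref.size() == img.size());
    const std::size_t n = ref.size();
    double sum = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        const double d = static_cast<double>(ref[i]) - img[(i + shift) % n];
        sum += d * d;
    }
    return sum;
}

// Coarse-grained task parallelism: each candidate is an independent unit of
// work that runs sequentially inside; only the candidates are distributed.
std::vector<double> evaluate_candidates_task_parallel(
        const std::vector<float>& ref,
        const std::vector<float>& img,
        const std::vector<std::size_t>& shifts) {
    std::vector<double> scores(shifts.size());
    const long m = static_cast<long>(shifts.size());
    #pragma omp parallel for schedule(dynamic)
    for (long k = 0; k < m; ++k) {
        scores[k] = ssd_shifted(ref, img, shifts[k]);
    }
    return scores;
}
```

Compiled with OpenMP enabled (for example `-fopenmp` with GCC or Clang), both loops run on multiple cores; without it the pragmas are ignored and the code runs sequentially with the same results. In the data parallel variant the scheduling overhead is paid per similarity evaluation, while in the task parallel variant it is paid once per batch of candidates, which is the kind of trade-off an overhead comparison between frameworks would measure.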
    • ...In a previous study [4], we have evaluated different frameworks for standard shared memory processors like OpenMP, Cilk++, Threading Building Blocks (TBB), RapidMind, as well as OpenCL...
    • ...A more detailed description of the used 2D/3D image registration with mathematical formulas can be found in [5] and [4]...
    • ...The execution times of the implementation of each framework on the Quadro FX 5800 and the Tesla C2050 is shown in Table II. The table shows also the execution time of our reference implementation as well as the execution time of a task parallel implementation of the 2D/3D image registration on a 24 core system using TBB (see [4])...

    Richard Membarth et al. Frameworks for GPU Accelerators: A comprehensive evaluation using 2D/3...
