Sign in
Author

Conference

Journal

Organization

Year

DOI
Look for results that meet for the following criteria:
since
equal to
before
between
and
Search in all fields of study
Limit my searches in the following fields of study
Agriculture Science
Arts & Humanities
Biology
Chemistry
Computer Science
Economics & Business
Engineering
Environmental Sciences
Geosciences
Material Science
Mathematics
Medicine
Physics
Social Science
Multidisciplinary
Keywords
(10)
Data Structure
Dense Linear Algebra
High Performance Computer
level 3 blas
Linear Equations
Matrix Computation
Matrix Equation
Matrix Factorization
Memory Hierarchy
Software Implementation
Related Publications
(11)
Recursive Array Layouts and Fast Matrix Multiplication
Vector and parallel algorithms for Cholesky factorization on IBM 3090
Organizing matrices and matrix operations for paged memory systems
Tiling, Block Data Layout, and Memory Hierarchy Performance
A recursive formulation of Cholesky factorization of a matrix in packed storage
Subscribe
Academic
Publications
Recursive Blocked Algorithms and Hybrid Data Structures for Dense Matrix Library Software
Recursive Blocked Algorithms and Hybrid Data Structures for Dense Matrix Library Software,10.1137/S0036144503428693,Siam Review,Erik Elmroth,Fred Gust
Edit
Recursive Blocked Algorithms and Hybrid Data Structures for Dense Matrix Library Software
(
Citations: 83
)
BibTex

RIS

RefWorks
Download
Erik Elmroth
,
Fred Gustavson
,
Isak Jonsson
,
Bo Kagstrom
Matrix computations are both fundamental and ubiquitous in computational science and its vast application areas. Along with the development of more advanced computer systems with complex memory hierarchies, there is a continuing demand for new algorithms and library software that efficiently utilize and adapt to new architecture features. This article reviews and details some of the recent advances made by applying the paradigm of recursion to dense matrix computations on today's memorytiered computer systems. Recursion allows for efficient utilization of a
memory hierarchy
and generalizes existing fixed blocking by introducing automatic variable blocking that has the potential of matching every level of a deep memory hierarchy. Novel recursive blocked algorithms offer new ways to compute factorizations such as Cholesky and QR and to solve matrix equations. In fact, the whole gamut of existing
dense linear algebra
factorization is beginning to be reexamined in view of the recursive paradigm. Use of recursion has led to using new hybrid data structures and optimized superscalar kernels. The results we survey include new algorithms and library software implementations for level 3 kernels, matrix factorizations, and the solution of general systems of
linear equations
and several common matrix equations. The software implementations we survey are robust and show impressive performance on today's high performance computing systems.
Journal:
Siam Review  SIAM REV
, vol. 46, no. 1, pp. 345, 2004
DOI:
10.1137/S0036144503428693
Cumulative
Annual
View Publication
The following links allow you to view full publications. These links are maintained by other sources not affiliated with Microsoft Academic Search.
(
www.csc.kth.se
)
(
www.mat.uc.pt
)
(
adsabs.harvard.edu
)
(
ngssc.uppmax.uu.se
)
(
www.cs.umu.se
)
(
link.aip.org
)
(
www.ngssc.vr.se
)
(
www8.cs.umu.se
)
More »
Citation Context
(54)
...It is wellknown that utilization of recursive storage is advantageous when implementing dense matrix operations ([
1
])...
Michele Martone
,
et al.
On BLAS Operations with Recursively Stored Sparse Matrices
...By storing matrices hierarchically [10], [
13
], [19] with one level of indirection, each submatrix block can be easily demarcated in order to determine data dependencies between each task in the resulting algorithmsbyblocks...
Ernie Chan
,
et al.
Transforming linear algebra libraries: From abstraction to parallelism
...By storing matrices hierarchically [
12
] and viewing submatrix blocks as the unit of data and operations with blocks (tasks) as the unit of computation, we reintroduced the concept of algorithmsbyblocks [19]...
Ernie Chan
,
et al.
Managing the complexity of lookahead for LU factorization with pivotin...
...On optimization of linear algebra library, contemporary research focuses on an algorithmic level [2,
3
]...
Yun Xu
,
et al.
Optimization of Triangular Matrix Functions in BLAS Library on Loongso...
...The choice of storing the nonzeros within blocks in a recursive layout is opposite to the common wisdom for storing dense matrices [
18
]...
Aydin Buluç
,
et al.
Parallel sparse matrixvector and matrixtransposevector multiplicati...
References
(102)
On Optimal Batching Policies for VideoonDemand Storage Servers
(
Citations: 180
)
Charu C. Aggarwal
,
Joel L. Wolf
,
Philip S. Yu
Conference:
International Conference on Multimedia Computing and Systems/International Conference on Multimedia and Expo  ICME(ICMCS)
, pp. 253258, 1996
Silo, Rainbow, and Caching Token: Schemes for Scalable, Fault Tolerant Stream Caching
(
Citations: 56
)
Youngsu Chae
,
Katherine Guo
,
Milind M. Buddhikot
,
Subhash Suri
,
Ellen W. Zegura
Journal:
IEEE Journal on Selected Areas in Communications  JSAC
, 2000
Minimizing Bandwidth Requirements for OnDemand Data Delivery
(
Citations: 207
)
Derek L. Eager
,
Mary K. Vernon
,
John Zahorjan
Conference:
Workshop on Multimedia Information Systems  MIS
, pp. 8087, 1999
Find Service Paths in Service Proxy Network
(
Citations: 2
)
Dongyan Xu
,
Klara Nahrstedt
Published in 2002.
How to Model an Internetwork
(
Citations: 1033
)
Ellen W. Zegura
,
Kenneth L. Calvert
,
Samrat Bhattacharjee
Conference:
IEEE INFOCOM  INFOCOM
, vol. 2, pp. 594602, 1996
Sort by:
Citations
(83)
Multilevel Optimization of Matrix Multiplication for GPUequipped Systems
(
Citations: 1
)
Kazuya Matsumoto
,
Naohito Nakasato
,
Tomoya Sakai
,
Hideki Yahagi
,
Stanislav G. Sedukhin
Journal:
Procedia Computer Science
, vol. 4, pp. 342351, 2011
Scheduling dense linear algebra operations on multicore processors
(
Citations: 8
)
Jakub Kurzak
,
Hatem Ltaief
,
Jack Dongarra
,
Rosa M. Badia
Journal:
Concurrency and Computation: Practice and Experience  CONCURRENCY
, vol. 22, no. 1, pp. 1544, 2010
On BLAS Operations with Recursively Stored Sparse Matrices
(
Citations: 4
)
Michele Martone
,
Salvatore Filippone
,
Marcin Paprzycki
,
Salvatore Tucci
Conference:
Symbolic and Numeric Algorithms for Scientific Computing  SYNASC
, 2010
Solving path problems on the GPU
(
Citations: 5
)
Aydin Buluç
,
John R. Gilbert
,
Ceren Budak
Journal:
Parallel Computing  PC
, vol. 36, no. 56, pp. 241253, 2010
Fast recursive matrix multiplication for multicore architectures
(
Citations: 1
)
Gudula Rünger
,
Michael Schwind
Journal:
Procedia Computer Science
, vol. 1, no. 1, pp. 6776, 2010