Keywords: Algorithm Design, Decomposition Algorithm, Field Programmable Gate Array, Floating Point Arithmetic, Hardware Accelerator, Hardware Implementation, High Performance, Matrix Decomposition, Memory Access, Memory Management, Parallel Processing, Software Implementation
FPGA-Based High-Performance and Scalable Block LU Decomposition Architecture
Manish Kumar Jaiswal, Nitin Chandrachoodan
Decomposition of a matrix into lower and upper triangular matrices (LU decomposition) is a vital part of many scientific and engineering applications, and the block LU decomposition algorithm is an approach well suited to parallel hardware implementation. This paper presents an approach to speed up implementation of the block LU decomposition algorithm using FPGA hardware. Unlike most previous approaches reported in the literature, the approach does not assume the matrix can be stored entirely on chip. The memory accesses are studied for various FPGA configurations, and a schedule of operations for scaling well is shown. The design has been synthesized for FPGA targets and can be easily retargeted. The design outperforms previous hardware implementations, as well as tuned software implementations including the ATLAS and MKL libraries on workstations.
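For readers unfamiliar with the block LU decomposition the abstract refers to, the following is a minimal software sketch of the generic right-looking variant. It is not the paper's FPGA architecture or schedule. It assumes no pivoting (so it is only safe for matrices such as diagonally dominant ones), and the names `block_lu`, `matmul`, and the block size parameter `b` are hypothetical, chosen for illustration.

```python
def block_lu(a, b):
    """Right-looking block LU without pivoting (illustrative sketch only).

    a: square matrix as a list of lists; b: block size (hypothetical name).
    Returns (L, U) with L unit lower triangular and U upper triangular.
    """
    n = len(a)
    m = [[float(x) for x in row] for row in a]  # work on a copy
    for k in range(0, n, b):
        e = min(k + b, n)  # current diagonal block covers rows/cols k..e-1
        # 1. Factor the diagonal block in place (unblocked elimination).
        for j in range(k, e):
            for i in range(j + 1, e):
                m[i][j] /= m[j][j]
                for c in range(j + 1, e):
                    m[i][c] -= m[i][j] * m[j][c]
        # 2. Row panel: solve L11 * U12 = A12 by forward substitution.
        for c in range(e, n):
            for i in range(k, e):
                for j in range(k, i):
                    m[i][c] -= m[i][j] * m[j][c]
        # 3. Column panel: solve L21 * U11 = A21.
        for r in range(e, n):
            for j in range(k, e):
                for i in range(k, j):
                    m[r][j] -= m[r][i] * m[i][j]
                m[r][j] /= m[j][j]
        # 4. Trailing update: A22 -= L21 * U12 (the matrix-multiply-heavy step).
        for r in range(e, n):
            for c in range(e, n):
                m[r][c] -= sum(m[r][j] * m[j][c] for j in range(k, e))
    # Split the packed result into explicit L and U factors.
    L = [[m[i][j] if j < i else (1.0 if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    U = [[m[i][j] if j >= i else 0.0 for j in range(n)] for i in range(n)]
    return L, U


def matmul(x, y):
    """Plain dense product of two square matrices (helper for checking)."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


# Diagonally dominant example, so skipping pivoting is acceptable here.
A = [[4, 1, 2, 0, 1],
     [1, 5, 1, 2, 0],
     [2, 1, 6, 1, 1],
     [0, 2, 1, 7, 2],
     [1, 0, 1, 2, 8]]
L, U = block_lu(A, 2)
P = matmul(L, U)
err = max(abs(P[i][j] - A[i][j]) for i in range(5) for j in range(5))
print("max reconstruction error:", err)
```

In this formulation each outer iteration only touches one diagonal block, one row panel, one column panel, and the trailing submatrix, which is why blocked variants are commonly used when the whole matrix does not fit in fast memory.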
Journal: IEEE Transactions on Computers (TC), vol. 61, no. 1, pp. 60-72, 2012
DOI: 10.1109/TC.2011.24