Keywords
(7)
Binary Translation
Computer Simulation
Dynamic Binary Translation
Parallel Applications
Parallel Architecture
Parallel Computer
Performance Analysis
Parallelisation of the Valgrind Dynamic Binary Instrumentation Framework
(Citations: 4)
Daniel Robson, Peter E. Strazdins
Valgrind is a dynamic binary translation and instrumentation framework. It is suited to analysing memory usage and is used in memory validation and profiling tools. Currently, Valgrind is restricted to executing a guest with serialised thread scheduling. This results in lost opportunity for performance when analysing highly parallel applications on parallel architectures. We have extended the framework to allow parallel execution of guest threads. Code caching mechanisms have been made thread-safe, by delaying flushing of translated code, while preserving critical areas of performance. Three methods which preserve atomicity of instructions are implemented and evaluated with respect to speed, reliability and instrumentation effects. Serialising both store and atomic operations preserves atomicity in the strongest sense, but suffers unacceptable performance overhead. Serialising only atomic instructions or utilising host atomic instructions provides speedup in line with native execution. These methods show average slowdowns of only 2.6× and multithreaded 2.2× over native parallel execution respectively.
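The two lock-based strategies named in the abstract can be illustrated with a toy model. The sketch below is not Valgrind code: `GuestMemory`, `big_lock`, `serialise_stores` and `run_counter` are hypothetical names invented for illustration, and the third strategy (mapping guest atomics onto host atomic instructions) is not shown because Python does not expose them. It assumes a single global lock that always guards emulated atomic read-modify-write operations and optionally guards plain stores as well.

```python
import threading


class GuestMemory:
    """Toy guest memory (hypothetical; not Valgrind's data structures)."""

    def __init__(self, serialise_stores):
        self.cells = {}
        self.big_lock = threading.Lock()  # one global lock for the whole guest
        self.serialise_stores = serialise_stores

    def store(self, addr, value):
        # Strategy A: plain stores also take the global lock, so a store can
        # never land in the middle of an emulated atomic operation.
        # Strategy B: plain stores bypass the lock (cheaper, weaker guarantee).
        if self.serialise_stores:
            with self.big_lock:
                self.cells[addr] = value
        else:
            self.cells[addr] = value

    def atomic_add(self, addr, delta):
        # Emulated atomic read-modify-write: always serialised.
        with self.big_lock:
            old = self.cells.get(addr, 0)
            self.cells[addr] = old + delta
            return old


def run_counter(serialise_stores, n_threads=4, n_iters=2000):
    """Run guest threads that all increment one shared counter atomically."""
    mem = GuestMemory(serialise_stores)
    mem.store(0, 0)

    def worker():
        for _ in range(n_iters):
            mem.atomic_add(0, 1)

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return mem.cells[0]
```

In this model both settings keep a purely atomic counter exact, because every increment holds the lock. The difference is that with `serialise_stores=False` a plain store issued by another thread could fall between the read and the write inside `atomic_add`, which is one plausible reading of why the abstract calls serialising stores as well the strongest form of atomicity.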
Conference: IEEE International Symposium on Parallel and Distributed Processing with Applications - ISPA, pp. 113-121, 2008
DOI: 10.1109/ISPA.2008.94
Citation Context (3)
...NUMAgrind is developed within our parallelised Valgrind framework [34] as a functional cache profiler [35]-[37] based on binary-translation [38], [39]...
Rui Yang, et al. Profiling Directed NUMA Optimization on Linux Systems: A Case Study of...
...This has been tried by the authors of the pValgrind [28] project...
Aaron Pohle, et al. Capability wrangling made easy: debugging on a microkernel with valgri...
...One solution that we are pursuing in other work is to use cache and interconnect simulation to provide cache misses broken down by memory locality domain [8]...
Rui Yang, et al. A Simple Performance Model for Multithreaded Applications Executing on...
References (11)
The NAS Parallel Benchmarks (Citations: 596)
David Bailey, E. Barszcz, J. T. Barton, D. S. Browning, R. L. Carter, L. Dagum, R. A. Fatoohi, P. O. Frederickson, T. A. Lasinski, R. S. Schreiber, H. D. Simon, V. Venkatakrishnan
Journal: International Journal of High Performance Computing Applications - IJHPCA, vol. 5, no. 3, pp. 63-73, 1991
QEMU, a Fast and Portable Dynamic Translator (Citations: 269)
Fabrice Bellard
Conference: USENIX Technical Conference - USENIX, pp. 41-46, 2005
An API for Runtime Code Patching (Citations: 317)
Bryan Buck, Jeffrey K. Hollingsworth
Journal: International Journal of High Performance Computing Applications - IJHPCA, vol. 14, no. 4, pp. 317-329, 2000
Dynamic Instrumentation of Production Systems (Citations: 173)
Bryan Cantrill, Michael W. Shapiro, Adam H. Leventhal
Conference: USENIX Technical Conference - USENIX, pp. 15-28, 2004
A Tool Suite for Simulation Based Analysis of Memory Access Behavior (Citations: 28)
Josef Weidendorfer, Markus Kowarschik, Carsten Trinitis
Conference: International Conference on Computational Science - ICCS, pp. 440-447, 2004
Citations (4)
Profiling Directed NUMA Optimization on Linux Systems: A Case Study of the Gaussian Computational Chemistry Code
Rui Yang, Joseph Antony, Alistair Rendell, Danny Robson, Peter Strazdins
Published in 2011.
Capability wrangling made easy: debugging on a microkernel with valgrind
Aaron Pohle, Björn Döbel, Michael Roitzsch, Hermann Härtig
Journal: Sigplan Notices - SIGPLAN, pp. 3-12, 2010
Capability wrangling made easy: debugging on a microkernel with valgrind
Aaron Pohle, Björn Döbel, Michael Roitzsch, Hermann Härtig
Conference: International Conference on Virtual Execution Environments - VEE, pp. 3-12, 2010
A Simple Performance Model for Multithreaded Applications Executing on Non-uniform Memory Access Computers (Citations: 3)
Rui Yang, Joseph Antony, Alistair P. Rendell
Conference: High Performance Computing and Communications - HPCC, pp. 79-86, 2009