-
...this paper presents some of the unique issues faced by an instruction scheduler for the pa-8000. several features of the micro-architecture are presented along...features include latency, resource constraints, instruction polarity, cache interfaces, and memory dependences. the performance results in the paper show that instruction scheduling remains an important compiler optimization...
-
...gwennap, 1994), an innovative dynamically scheduled machine which is the first implementation of the 64 bit pa 2.0 member of the hp pa-risc architecture family. this wide...ample hardware resources, many of the optimizing transformations which proved effective for the pa-8000 served to augment its ability to exploit the available bandwidth and to hide...
-
...instruction scheduling is one of the most important steps for improving the performance of object code produced by a compiler. the local instruction scheduling problem is to find a minimum length instruction schedule for a basic block subject to...
-
...fundamental source of information regarding the resource dependencies, intra-instruction and inter-instruction parallelism, and the cost-performance of the architecture. additionally, we propose a...can obtain a better static schedule of the assembly level instructions. for better understanding of the techniques developed, we have presented...
Published in 1996.
-
...combining speculative execution with on-the-fly instruction reordering. the heart of the machine, the instruction reorder buffer, provides out-of...hardware. we also implemented dynamic instruction reordering in hardware to maximize instruction-level parallelism available to the execution units. the pa-8000 connects to a high-bandwidth...
-
...a key role in unlocking the performance of the pa-8000, an innovative dynamically-scheduled machine which is the first implementation of the 64-bit pa 2.0 member of the...ample hardware resources, many of the optimizing transformations which proved effective for the pa-8000 served to augment its ability to exploit the available bandwidth and to hide...
-
...is capable of transparently improving the performance of a native instruction stream as it executes on the processor. the input native instruction stream to dynamo can be...native binary. this paper evaluates the dynamo system in the latter, more challenging situation, in order to emphasize the limits, rather than the potential, of the system. our experiments demonstrate that...
-
...implemented side-by-side on the same machine to allow full performance evaluation. the hewlett-packard pa-8000 microprocessor implements both a simple dynamic prediction scheme and static prediction, selectable by the application programmer. this paper studies the pa-8000's trade-off between static and dynamic prediction, and the compiler optimizations needed to support...
-
...parameters move through different domains. for example, modeling unrealistic caches can under- or over-state the benefits of better prediction or a larger instruction window. avoiding such pitfalls requires...sampling full-length runs with the spec reference inputs. in particular, the results show that branch mispredictions limit the benefits of larger instruction windows, that better branch pre...