Out-of-order processing: a new architecture for high-performance stream systems

Out-of-order processing: a new architecture for high-performance stream systems, Proceedings of the VLDB Endowment, Jin Li, Kristin Tufte, Vladislav Shkap

Out-of-order processing: a new architecture for high-performance stream systems   (Citations: 15)
Many stream-processing systems enforce an order on data streams during query evaluation to help unblock blocking operators and purge state from stateful operators. Such in-order processing (IOP) systems not only must enforce order on input streams, but also require that query operators preserve order. This order-preserving requirement constrains the implementation of stream systems and incurs significant performance penalties, particularly for memory consumption. Especially for high-performance, potentially distributed stream systems, the cost of enforcing order can be prohibitive. We introduce a new architecture for stream systems, out-of-order processing (OOP), that avoids ordering constraints. The OOP architecture frees stream systems from the burden of order maintenance by using explicit stream progress indicators, such as punctuation or heartbeats, to unblock and purge operators. We describe the implementation of OOP stream systems and discuss the benefits of this architecture in depth. For example, the OOP approach has proven useful for smoothing workload bursts caused by expensive end-of-window operations, which can overwhelm internal communication paths in IOP approaches. We have implemented OOP in two stream systems, Gigascope and NiagaraST. Our experimental study shows that the OOP approach can significantly outperform IOP in a number of aspects, including memory, throughput and latency.
Journal: Proceedings of the VLDB Endowment (PVLDB), vol. 1, no. 1, pp. 274-288, 2008
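The idea summarized in the abstract, using progress indicators rather than input order to unblock and purge operators, can be illustrated with a minimal sketch. This is not the implementation used in Gigascope or NiagaraST; the class name `OOPWindowCount`, the tumbling-window count, the integer timestamps, and the punctuation semantics ("no later tuple will carry a timestamp below this value") are all assumptions made for illustration.

```python
from collections import defaultdict


class OOPWindowCount:
    """Hypothetical tumbling-window count that accepts tuples in any order."""

    def __init__(self, window_size):
        self.window_size = window_size
        self.counts = defaultdict(int)  # window id -> partial count

    def on_tuple(self, timestamp):
        # No sorting or reorder buffer: update the partial aggregate
        # of whichever window the tuple belongs to.
        self.counts[timestamp // self.window_size] += 1

    def on_punctuation(self, low_bound):
        # Assumed semantics: no future tuple has timestamp < low_bound.
        # Any window ending at or before low_bound is therefore complete,
        # so it can be emitted (unblock) and its state dropped (purge).
        closed = sorted(
            w for w in self.counts
            if (w + 1) * self.window_size <= low_bound
        )
        return [(w * self.window_size, self.counts.pop(w)) for w in closed]


if __name__ == "__main__":
    op = OOPWindowCount(window_size=10)
    for ts in [3, 12, 7, 15, 1]:   # arrival order differs from timestamp order
        op.on_tuple(ts)
    print(op.on_punctuation(10))   # window [0, 10) is complete
    op.on_tuple(11)
    print(op.on_punctuation(20))   # window [10, 20) is complete
```

In this sketch the operator holds one counter per open window instead of buffering and sorting tuples, which is the kind of memory saving the abstract attributes to avoiding order enforcement.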
    • ...A CQ often contains data-reducing operators, such as aggregation and sampling, and memory needs are minimized if we can move stream elements through the query to such operators without ordering them [7]. Such a distinction between physical and logical streams is observed (either implicitly or explicitly) in many DSMSs, both in academia [4, 5, 6, 7, 10, 11] and industry [22, 25]...

    Badrish Chandramouli et al. Physically Independent Stream Merging

    • ...Correctness guarantees with low latency are provided due to the system’s ability to (1) output speculative [1, 10] results based on potentially incomplete or inaccurate sets of events, (2) compensate [1, 22] (or correct) incorrect output as late events and/or more accurate payloads are received at the system’s input, and (3) identify which output is guaranteed [4, 6] to be correct, i.e., cannot change in the future, based on received (or ...
    • ...While the exact nature of the control parameter associated with events varies across systems [1, 10, 17], two common notions are: (1) an event generation time, and (2) a duration, which indicates the period of time over which an event can influence output...

    Mohamed H. Ali et al. The extensibility framework in Microsoft StreamInsight

    • ...Knowing a bound on the amount of disorder has been traditionally used to determine the size of a buffer required to restore the order, but more recent work takes disorder more into the account for specific operators and provides related optimizations [30, 29]...

    Peter M. Fischer et al. Stream schema: providing and exploiting static metadata for data strea...

    • ...There have been new architectures proposed for handling out of order processing [20], [21] for a variety of problems, with the main concern being memory footprint...

    Sorabh Gandhi et al. Space-efficient online approximation of time series data: Streams, amn...

    • ...In [13] it is shown that this kind of processing can lead to a lower memory consumption, a lower average latency as well as to a lower runtime of queries...

    Jonas Jacobi et al. A physical operator algebra for prioritized elements in data streams
