Using Magpie for Request Extraction and Workload Modelling

Paul Barham, Austin Donnelly, Rebecca Isaacs, Richard Mortier

Citations: 215
Tools to understand complex system behaviour are essential for many performance analysis and debugging tasks, yet there are many open research problems in their development. Magpie is a toolchain for automatically extracting a system's workload under realistic operating conditions. Using low-overhead instrumentation, we monitor the system to record fine-grained events generated by kernel, middleware and application components. The Magpie request extraction tool uses an application-specific event schema to correlate these events, and hence precisely capture the control flow and resource consumption of each and every request. By removing scheduling artefacts, whilst preserving causal dependencies, we obtain canonical request descriptions from which we can construct concise workload models suitable for performance prediction and change detection. In this paper we describe and evaluate the capability of Magpie to accurately extract requests and construct representative models of system behaviour.
    • ...Magpie [4] is a system for monitoring and modeling server workload...

    Lenin Ravindranath Sivalingam et al. AppInsight: Mobile App Performance Monitoring in the Wild

    • ...Log analysis for failure diagnosis: Existing log analysis work focuses on post-mortem diagnosis using logs, learning statistical signatures [3, 8, 16, 29, 56] or inferring partial execution paths and run-time states [57]...

    Ding Yuan et al. Improving software diagnosability via log enhancement

    • ...Many build on low-overhead end-to-end tracing (e.g., [4, 7, 9, 11, 31, 34]), which captures the flow (i.e., path and timing) of individual requests within and across the components of a distributed system. ... For example, with such rich information about a system’s operation, researchers have developed new techniques for detecting anomalous request flows [4], spotting large-scale departures from performance models [33], and comparing observed behaviour to manually-constructed expectations [26]. ... Several efforts, including Magpie [4], Whodunit [7], Pinpoint [9], X-Trace [10, 11], Google’s Dapper [31], and Stardust [34] have independently implemented such tracing and shown that it can be used continuously with low overhead, especially when request sampling is supported [10, 28, 31]. ... With regard to the latter, Magpie [4] and recent versions of both Stardust [34] and X-Trace [10] explicitly account for concurrency by embedding information about thread synchronization in their traces (see Figure 2). These implementations are a natural fit for request-flow comparison, as they can disambiguate true structural differences from false ones caused by alternate interleavings of concurrent activity. ... Due to space constraints, mocked-up graphs are shown in which nodes represent the type of component accessed. ... tier systems [4]. ... Conversely, anomaly detection techniques, as implemented by Magpie [4] and Pinpoint [9], mine a single period’s request flows to identify rare ones that differ greatly from others. ... The number of categories could be further reduced by using unsupervised clustering algorithms, such as those used in Magpie [4], to bin similar but not necessarily identical requests into the same category. ... Our experiences with unsupervised learning algorithms, such as clustering [4, 29], for merging categories indicate they are inadequate. ... Magpie [4], Pinpoint [9], WAP5 [27], and Xu [38], all identify anomalous requests by finding rare ones that differ greatly from others...

    Raja R. Sambasivan et al. Diagnosing performance changes by comparing request flows

    • ...Of the literature on the determination of causal paths, only Magpie [13] collects enough information about resource usage along paths that detailed response time modeling might be attempted...

    Shuyi Chen et al. Using Link Gradients to Predict the Impact of Network Latency on Multi...

    • ...Previous work, especially those on path-based analysis [14, 7, 13, 8, 19, 26, 18, 27], has largely addressed the important problem of generating and correlating runtime information from executions of a distributed system. ... Previous work [14, 7, 13, 8, 20, 18, 27] on correlating runtime information has effectively addressed this shortcoming by capturing common causal relationship in distributed systems. ... Runtime overhead. Runtime overhead for emitting events and edges is comparable to those in previous work on capturing causal dependencies [14, 7, 13, 8, 19, 26, 18, 27]. ... It is clearly related to path-based analysis [14, 13, 26, 7, 8, 27], where a path is often defined as a sequence of events that is triggered by a client request. ... Path instances can either be collected through annotation [14, 13, 26] and schemas [8] provided by developers, or statistically inferred from inter-machine communications [7]. ... For example, Magpie [8] aims to analyze workload models from path instances; PinPoint [14, 13] uses statistical methods to find components that are highly correlated to failed requests; Pip [26] checks these instances against specifications of expected system behavior defined by users. ... Technically, path-based techniques [14, 13, 26, 7, 8, 19] could be applied on these forward slices and therefore integrated into G2. We plan to investigate this feasibility in the future...

    Zhenyu Guo et al. G2: A Graph Processing System for Diagnosing Distributed Systems
