This paper describes a new approach to identifying relevant flow records in large-scale flow datasets. We propose a method that leverages the well-known PageRank algorithm to extract the most relevant flows. We introduce a dependency relation between flows that is based on a simple and efficient causal relationship. The strength of each dependency is determined by time-related information. We tested our method on datasets collected from our campus network.
