Keywords (13)
Algorithm Design
Belief Propagation
Data Mining
Drug Abuse
Fixed Point
Graph Algorithm
Graph Mining
Indexing Terms
Line Graph
Parallel Algorithm
Scaling Up
Social Network
Web Graph
Mining large graphs: Algorithms, inference, and discoveries
(Citations: 1)
U. Kang, Duen Horng Chau, Christos Faloutsos
How do we find patterns and anomalies in graphs with billions of nodes and edges that do not fit in memory? How can we use parallelism for such terabyte-scale graphs? In this work, we focus on inference, which often corresponds, intuitively, to "guilt by association" scenarios. For example, if a person is a drug abuser, their friends probably are too; if a node in a social network is of male gender, his dates are probably female. We show how to do inference on such huge graphs through our proposed HADOOP LINE GRAPH FIXED POINT (HA-LFP), an efficient parallel algorithm for sparse billion-scale graphs, using the HADOOP platform. Our contributions include (a) the design of HA-LFP, observing that it corresponds to a fixed point on a line graph induced from the original graph; (b) scalability analysis, showing that our algorithm scales up well with the number of edges, as well as with the number of machines; and (c) experimental results on two private graphs, as well as two of the largest publicly available graphs: the Web Graphs from Yahoo! (6.6 billion edges and 0.24 terabytes) and the Twitter graph (3.7 billion edges and 0.13 terabytes). We evaluated our algorithm using M45, one of the top 50 fastest supercomputers in the world, and we report patterns and anomalies discovered by our algorithm, which would be invisible otherwise.
Index Terms: HA-LFP, Belief Propagation, Hadoop, Graph
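To make the "fixed point on a line graph" idea concrete, here is a minimal single-machine sketch in Python. It is not the authors' Hadoop implementation: the abstract does not give the update rule, so the sketch assumes the standard sum-product belief propagation update, and every name in it (`directed_line_graph`, `run_bp`, `beliefs`, the `priors` and `psi` inputs) is hypothetical. Each directed edge of the original graph becomes a node of the line graph, and the messages are iterated until they stop changing.

```python
def directed_line_graph(edges):
    """For each directed edge (i, j), list the directed edges (k, i), k != j,
    whose messages feed it. These lists are the edges of the line graph."""
    directed = [(i, j) for i, j in edges] + [(j, i) for i, j in edges]
    return {
        (i, j): [(k, t) for (k, t) in directed if t == i and k != j]
        for (i, j) in directed
    }


def run_bp(edges, priors, psi, max_iters=100, tol=1e-9):
    """Iterate sum-product messages (an assumed update rule) to a fixed point."""
    incoming = directed_line_graph(edges)
    n_states = len(psi)
    # One message vector per line-graph node, initialised to uniform.
    msgs = {e: [1.0 / n_states] * n_states for e in incoming}
    for _ in range(max_iters):
        new = {}
        for (i, j), feeders in incoming.items():
            out = []
            for xj in range(n_states):
                total = 0.0
                for xi in range(n_states):
                    term = priors[i][xi] * psi[xi][xj]
                    for f in feeders:
                        term *= msgs[f][xi]
                    total += term
                out.append(total)
            z = sum(out)
            new[(i, j)] = [v / z for v in out]
        # Largest change of any message entry in this sweep.
        delta = max(abs(a - b) for e in msgs for a, b in zip(msgs[e], new[e]))
        msgs = new
        if delta < tol:
            break
    return msgs


def beliefs(priors, msgs):
    """Combine each node's prior with all messages arriving at that node."""
    result = {}
    for node, prior in priors.items():
        b = list(prior)
        for (_, target), m in msgs.items():
            if target == node:
                b = [bv * mv for bv, mv in zip(b, m)]
        z = sum(b)
        result[node] = [v / z for v in b]
    return result


# Toy "guilt by association" chain 0 - 1 - 2: node 0 is probably in state 0,
# the others are unknown, and neighbours tend to share a state.
toy_edges = [(0, 1), (1, 2)]
toy_priors = {0: [0.9, 0.1], 1: [0.5, 0.5], 2: [0.5, 0.5]}
toy_psi = [[0.8, 0.2], [0.2, 0.8]]
toy_beliefs = beliefs(toy_priors, run_bp(toy_edges, toy_priors, toy_psi))
```

On a tree such as this toy chain the iteration settles after a few sweeps; on graphs with cycles, convergence of this kind of iteration is not guaranteed in general, which is why the sketch caps the number of sweeps with `max_iters`.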
Conference: International Conference on Data Engineering (ICDE), pp. 243-254, 2011
DOI: 10.1109/ICDE.2011.5767883
Citation Context (1)
...It also provides a distributed file system (HDFS) and data processing tools such as PIG [13] and Hive. Due to its extreme scalability and ease of use, HADOOP is widely used for large scale data mining [9, 8] ...
U Kang, et al. Spectral Analysis for Billion-Scale Graphs: Discoveries and Implementa...
References (23)
Reverend Bayes on Inference Engines: A Distributed Hierarchical Approach (Citations: 85)
Judea Pearl
Conference: National Conference on Artificial Intelligence (AAAI), pp. 133-136, 1982
Understanding belief propagation and its generalizations (Citations: 415)
J. S. Yedidia, W. T. Freeman, Y. Weiss
Conference: International Joint Conference on Artificial Intelligence (IJCAI), 2003
Efficient Belief Propagation for Early Vision (Citations: 257)
Pedro F. Felzenszwalb, Daniel P. Huttenlocher
Journal: International Journal of Computer Vision (IJCV), vol. 70, no. 1, pp. 41-54, 2006
Detecting Fraudulent Personalities in Networks of Online Auctioneers (Citations: 19)
Duen Horng Chau, Shashank Pandit, Christos Faloutsos
Conference: Principles of Data Mining and Knowledge Discovery (PKDD), pp. 103-114, 2006
SNARE: a link analytic system for graph labeling and risk detection (Citations: 4)
Mary McGlohon, Stephen Bay, Markus G. Anderle, David M. Steier, Christos Faloutsos
Conference: Knowledge Discovery and Data Mining (KDD), pp. 1265-1274, 2009
Citations (1)
Spectral Analysis for Billion-Scale Graphs: Discoveries and Implementation (Citations: 1)
U Kang, Brendan Meeder, Christos Faloutsos