Top publications in data mining 1–100 of 71,208 results
Publications Citations  
1
Classification and Regression Trees (1984) 9405
2
Data Mining: Concepts and Techniques (2000) 5979
3
UCI repository of machine learning databases (1998) 5440
4
Introduction to Modern Information Retrieval (1984) 4976
5
Modern Information Retrieval (1999) 4930
6
Mining association rules between sets of items in large databases (1993) 4908
7
A Tutorial on Support Vector Machines for Pattern Recognition (1998) 4844
8
The anatomy of a large-scale hypertextual Web search engine (1998) 4676
9
Fast Algorithms for Mining Association Rules (1994) 4575
10
Algorithms for Clustering Data (1988) 4196
11
Social Network Analysis: Methods and Applications (1994) 4081
12
Data Mining: Practical Machine Learning Tools and Techniques (2005) 3991
13
Indexing by Latent Semantic Analysis (1990) 3902
14
Authoritative sources in a hyperlinked environment (1999) 3773
15
Data clustering: a review (1999) 3497
16
Some methods for classification and analysis of multivariate observations (1967) 3438
17
The Elements of Statistical Learning (2001) 3423
18
Finding Groups in Data: An Introduction to Cluster Analysis (1990) 2937
19
Random Forests (2001) 2856
20
Robust regression and outlier detection (1987) 2746
21
Generalized linear models (1984) 2730
22
Working knowledge: how organizations manage what they know (2000) 2630
23
Text categorization with support vector machines: Learning with many relevant features (1998) 2357
24
The PageRank Citation Ranking: Bringing Order to the Web (1998) 2185
25
Mining Sequential Patterns (1995) 2019
26
Equation of state calculations by fast computing machines (1993) 1991
27
Latent Dirichlet allocation (2003) 1957
28
Mining frequent patterns without candidate generation (2000) 1939
29
Machine learning in automated text categorization (2002) 1901
30
Cluster Analysis (1993) 1846
31
An Introduction to Variable and Feature Selection (2003) 1821
32
The elements of statistical learning: data mining, inference, and prediction (2002) 1782
33
Cluster analysis for applications (1973) 1779
34
A comparative study on feature selection in text categorization (1997) 1766
35
Data Mining: Practical Machine Learning Tools and Techniques with Java Implementations (1999) 1742
36
Binary Codes Capable of Correcting Deletions, Insertions and Reversals (1966) 1733
37
The EM algorithm and extensions (2000) 1708
38
Automatic Text Processing: The Transformation, Analysis, and Retrieval of Information by Computer (1989) 1703
39
Machine learning in automated text categorization (2002) 1676
40
Some Methods for Classification and Analysis of MultiVariate Observations (1967) 1665
41
A Density-Based Algorithm for Discovering Clusters in Large Spatial Databases with Noise (1996) 1646
42
A vector space model for automatic indexing (1975) 1621
43
Multidimensional binary search trees used for associative searching (1975) 1617
44
Advances in Knowledge Discovery and Data Mining (1996) 1602
45
A survey of approaches to automatic schema matching (2001) 1591
46
Latent Dirichlet Allocation (2001) 1548
47
GroupLens: an open architecture for collaborative filtering of netnews (1994) 1541
48
Community structure in social and biological networks (2002) 1515
49
Empirical Analysis of Predictive Algorithms for Collaborative Filtering (1998) 1477
50
Spectral Graph Theory (1997) 1460
51
Statistical analysis of finite mixture distributions (1985) 1446
52
Fast Algorithms for Mining Association Rules in Large Databases (1994) 1387
53
Combining Labeled and Unlabeled Data with Co-training (1998) 1371
54
Outliers in statistical data (1994) 1358
55
BIRCH: an efficient data clustering method for very large databases (1996) 1331
56
Learning Bayesian networks: The combination of knowledge and statistical data (1994) 1320
57
Fast Discovery of Association Rules (1996) 1294
58
Fast Effective Rule Induction (1995) 1278
59
On Spectral Clustering: Analysis and an algorithm (2001) 1230
60
A re-examination of text categorization methods (1999) 1228
61
Using collaborative filtering to weave an information tapestry (1992) 1198
62
Introduction to Data Mining (2005) 1172
63
Algorithms for Nonnegative Matrix Factorization (2000) 1162
64
Human behavior and the principle of least effort 1153
65
Models and issues in data stream systems (2002) 1125
66
Human behavior and the principle of least effort 1122
67
Principles of Data Mining (2001) 1113
68
Item-based collaborative filtering recommendation algorithms (2001) 1104
69
What is your strategy for managing knowledge (1999) 1092
70
A comparison of event models for Naive Bayes text classification (1998) 1072
71
Text Classification from Labeled and Unlabeled Documents using EM (2000) 1058
72
Multi-interval discretization of continuous-valued attributes for classification learning (1993) 1038
73
Privacy-Preserving Data Mining (2000) 1030
74
Knowledge Acquisition via Incremental Conceptual Clustering (1987) 1025
75
Toward the Next Generation of Recommender Systems: A Survey of the State-of-the-Art and Possible Extensions (2005) 1024
76
k-Anonymity: A Model for Protecting Privacy (2002) 1017
77
Discriminant analysis and statistical pattern recognition (1992) 1015
78
An overview of data warehousing and OLAP technology (1997) 1010
79
From Data Mining to Knowledge Discovery: An Overview (1996) 1003
80
Graph structure in the Web (2000) 996
81
An Information-Theoretic Definition of Similarity (1998) 990
82
Mining Sequential Patterns: Generalization and Performance Improvements (1996) 986
83
Evaluating collaborative filtering recommender systems (2004) 965
84
Detection of Abrupt Changes: Theory and Applications (1992) 964
85
Automatic subspace clustering of high dimensional data for data mining applications (1998) 957
86
On the Optimality of the Simple Bayesian Classifier under Zero-One Loss (1997) 951
87
Binary codes capable of correcting deletions, insertions and reversals (1965) 945
88
Text Categorization with Support Vector Machines: Learning with Many Relevant Features (1998) 944
89
Selection of Relevant Features and Examples in Machine Learning (1997) 943
90
Multivariate Density Estimation: Theory, Practice, and Visualization (1992) 942
91
Probabilistic latent semantic indexing (1999) 941
92
Mining Generalized Association Rules (1995) 931
93
An evaluation of statistical approaches to text categorization (1999) 928
94
Comparing partitions (1985) 905
95
Ensemble Methods in Machine Learning (2000) 903
96
On the limited memory BFGS method for large scale optimization (1989) 900
97
An Empirical Comparison of Voting Classification Algorithms: Bagging, Boosting, and Variants (1999) 895
98
Mixture models: inference and applications to clustering (1988) 881
99
Data Mining: An Overview from a Database Perspective (1996) 879
100
Efficient and Effective Querying by Image Content (1994) 878