
Classification and Regression Trees (1984)

9405


Data Mining: Concepts and Techniques (2000)

5979


UCI repository of machine learning databases (1998)

5440


Introduction to Modern Information Retrieval (1984)

4976


Modern Information Retrieval (1999)

4930


Mining association rules between sets of items in large databases (1993)

4908


A Tutorial on Support Vector Machines for Pattern Recognition (1998)

4844


The anatomy of a large-scale hypertextual Web search engine (1998)

4676


Fast Algorithms for Mining Association Rules (1994)

4575


Algorithms for Clustering Data (1988)

4196


Social Network Analysis: Methods and Applications (1994)

4081


Data Mining: Practical Machine Learning Tools and Techniques (2005)

3991


Indexing by Latent Semantic Analysis (1990)

3902


Authoritative sources in a hyperlinked environment (1999)

3773


Data clustering: a review (1999)

3497


Some methods for classification and analysis of multivariate observations (1967)

3438


The Elements of Statistical Learning (2001)

3423


Finding Groups in Data: An Introduction to Cluster Analysis (1990)

2937


Random Forests (2001)

2856


Robust regression and outlier detection (1987)

2746


Generalized linear models (1984)

2730


Working knowledge: how organizations manage what they know (2000)

2630


Text categorization with support vector machines: Learning with many relevant features (1998)

2357


The PageRank Citation Ranking: Bringing Order to the Web (1998)

2185


Mining Sequential Patterns (1995)

2019


Equations of state calculations by fast computing machines (1993)

1991


Latent dirichlet allocation (2003)

1957


Mining frequent patterns without candidate generation (2000)

1939


Machine learning in automated text categorization (2002)

1901


Cluster Analysis (1993)

1846


An Introduction to Variable and Feature Selection (2003)

1821


The elements of statistical learning: data mining, inference, and prediction (2002)

1782


Cluster analysis for applications (1973)

1779


A comparative study on feature selection in text categorization (1997)

1766


Data Mining: Practical Machine Learning Tools and Techniques with Java Implementations (1999)

1742


Binary Codes Capable of Correcting Deletions, Insertions and Reversals (1966)

1733


The em algorithm and extensions (2000)

1708


Automatic Text Processing: The Transformation, Analysis, and Retrieval of Information by Computer (1989)

1703


Machine learning in automated text categorization (2002)

1676


Some Methods for Classification and Analysis of Multivariate Observations (1967)

1665


A Density-Based Algorithm for Discovering Clusters in Large Spatial Databases with Noise (1996)

1646


A vector space model for automatic indexing (1975)

1621


Multidimensional binary search trees used for associative searching (1975)

1617


Advances in Knowledge Discovery and Data Mining (1996)

1602


A survey of approaches to automatic schema matching (2001)

1591


Latent Dirichlet Allocation (2001)

1548


GroupLens: an open architecture for collaborative filtering of netnews (1994)

1541


Community structure in social and biological networks (2002)

1515


Empirical Analysis of Predictive Algorithms for Collaborative Filtering (1998)

1477


Spectral Graph Theory (1997)

1460


Statistical analysis of finite mixture distributions (1985)

1446


Fast Algorithms for Mining Association Rules in Large Databases (1994)

1387


Combining Labeled and Unlabeled Data with Co-Training (1998)

1371


Outliers in statistical data (1994)

1358


BIRCH: an efficient data clustering method for very large databases (1996)

1331


Learning Bayesian networks: The combination of knowledge and statistical data (1994)

1320


Fast Discovery of Association Rules (1996)

1294


Fast Effective Rule Induction (1995)

1278


On Spectral Clustering: Analysis and an algorithm (2001)

1230


A re-examination of text categorization methods (1999)

1228


Using collaborative filtering to weave an information tapestry (1992)

1198


Introduction to Data Mining (2005)

1172


Algorithms for Non-negative Matrix Factorization (2000)

1162


Human behavior and the principle of least effort

1153


Models and issues in data stream systems (2002)

1125


Human behavior and the principle of least effort

1122


Principles of Data Mining (2001)

1113


Item-based collaborative filtering recommendation algorithms (2001)

1104


What is your strategy for managing knowledge? (1999)

1092


A comparison of event models for Naive Bayes text classification (1998)

1072


Text Classification from Labeled and Unlabeled Documents using EM (2000)

1058


Multi-interval discretization of continuous-valued attributes for classification learning (1993)

1038


Privacy-Preserving Data Mining (2000)

1030


Knowledge Acquisition via Incremental Conceptual Clustering (1987)

1025


Toward the Next Generation of Recommender Systems: A Survey of the State-of-the-Art and Possible Extensions (2005)

1024


k-Anonymity: A Model for Protecting Privacy (2002)

1017


Discriminant analysis and statistical pattern recognition (1992)

1015


An overview of data warehousing and OLAP technology (1997)

1010


From Data Mining to Knowledge Discovery: An Overview (1996)

1003


Graph structure in the Web (2000)

996


An Information-Theoretic Definition of Similarity (1998)

990


Mining Sequential Patterns: Generalization and Performance Improvements (1996)

986


Evaluating collaborative filtering recommender systems (2004)

965


Detection of Abrupt Changes: Theory and Applications (1992)

964


Automatic subspace clustering of high dimensional data for data mining applications (1998)

957


On the Optimality of the Simple Bayesian Classifier under Zero-One Loss (1997)

951


Binary codes capable of correcting deletions, insertions and reversals (1965)

945


Text Categorization with Support Vector Machines: Learning with Many Relevant Features (1998)

944


Selection of Relevant Features and Examples in Machine Learning (1997)

943


Multivariate Density Estimation: Theory, Practice, and Visualization (1992)

942


Probabilistic latent semantic indexing (1999)

941


Mining Generalized Association Rules (1995)

931


An evaluation of statistical approaches to text categorization (1999)

928


Comparing partitions (1985)

905


Ensemble Methods in Machine Learning (2000)

903


On the limited memory BFGS method for large scale optimization (1989)

900


An Empirical Comparison of Voting Classification Algorithms: Bagging, Boosting, and Variants (1999)

895


Mixture models: inference and applications to clustering (1988)

881


Data Mining: An Overview from a Database Perspective (1996)

879


Efficient and Effective Querying by Image Content (1994)

878
