Keywords
(11)
Complex Objects
Data Cleansing
Data Mining
High Performance
Indexation
K Nearest Neighbor
Nearest Neighbor Classification
Similarity Join
Similarity Search
K Means
Nearest Neighbor
Related Publications
(7)
Closest Pair Queries in Spatial Databases
Incremental Distance Join Algorithms for Spatial Databases
Supporting KDD Applications by the k-Nearest Neighbor Join
Evaluation of Iceberg Distance Joins
The k-Nearest Neighbour Join: Turbo Charging the KDD Process
High Performance Data Mining Using the Nearest Neighbor Join
(Citations: 22)
Christian Böhm, Florian Krebs
The similarity join has become an important database primitive to support similarity search and data mining. A similarity join combines two sets of complex objects such that the result contains all pairs of similar objects. Well-known are two types of the similarity join: the distance range join, where the user defines a distance threshold for the join, and the closest point query or k-distance join, which retrieves the k most similar pairs. In this paper, we investigate an important third similarity join operation called the k-nearest neighbor join, which combines each point of one point set with its k nearest neighbors in the other set. It has been shown that many standard algorithms of Knowledge Discovery in Databases (KDD), such as k-means and k-medoid clustering, nearest neighbor classification, data cleansing, postprocessing of sampling-based data mining, etc., can be implemented on top of the k-nn join operation to achieve performance improvements without affecting the quality of the result of these algorithms. We propose a new algorithm to compute the k-nearest neighbor join using the multipage index (MuX), a specialized index structure for the similarity join. To reduce both CPU and I/O cost, we develop optimal loading and processing strategies.
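To make the difference between the three join types in the abstract concrete, here is a minimal brute-force sketch in Python. It is not the MuX-based algorithm the paper proposes; it only illustrates what each join returns. The function names, the Euclidean distance, and the sample point sets `R` and `S` are assumptions chosen for illustration.

```python
import heapq
import math


def dist(p, q):
    """Euclidean distance between two points given as equal-length tuples."""
    return math.dist(p, q)


def distance_range_join(r, s, eps):
    """Distance range join: all pairs (p, q) with dist(p, q) <= eps."""
    return [(p, q) for p in r for q in s if dist(p, q) <= eps]


def k_distance_join(r, s, k):
    """k-distance join: the k closest pairs over the whole cross product."""
    pairs = ((p, q) for p in r for q in s)
    return heapq.nsmallest(k, pairs, key=lambda pq: dist(*pq))


def knn_join(r, s, k):
    """k-nearest neighbor join: each p in r with its k nearest points in s."""
    return {p: heapq.nsmallest(k, s, key=lambda q: dist(p, q)) for p in r}


# Hypothetical sample data, two small 2-D point sets.
R = [(0.0, 0.0), (10.0, 10.0)]
S = [(0.0, 1.0), (0.0, 3.0), (10.0, 12.0), (50.0, 50.0)]

range_result = distance_range_join(R, S, eps=2.0)
closest_pairs = k_distance_join(R, S, k=3)
knn_result = knn_join(R, S, k=2)
```

With this data, the k-nearest neighbor join returns exactly k partners for every point of `R`, whereas the k-distance join returns k pairs in total and may represent some points of `R` more often than others. The nested loops cost O(|R|·|S|) distance computations, which is the cost an index-based method such as the one in the paper aims to avoid.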
Conference: IEEE International Conference on Data Mining (ICDM), pp. 43-50, 2002
DOI: 10.1109/ICDM.2002.1183884
Citation Context
(15)
...A more general version is the kNN-Join problem [7], [8], [11], [31], [33]: Given a data set P and a query set Q, for each point q ∈ Q we would like to retrieve its k nearest neighbors from points in P...
...Finally, the kNN-Join has also been studied [7], [8], [11], [31], [33]...
Bin Yao, et al. K nearest neighbor queries and kNN-Joins in large relational databases...
...Such nearest neighbor and a distance-based join operations have been used as a basic and underlying operation in many data mining applications, multimedia and spatial GIS databases, online decision support, and Internet search applications [12], [13], [14], [15]...
You Jung Kim, et al. Performance Comparison of the R*-Tree and the Quadtree for kNN and Dist...
...Nearest Neighbor Search (NNS) is an important technique in a variety of applications including pattern recognition [6], vision [13], or data mining [1,5]...
Eva Gómez-Ballester, et al. Combining Elimination Rules in Tree-Based Nearest Neighbor Search Algo...
...Bohm and Krebs [3] discuss the k-nearest neighbor join, which associates two sets of spatial data objects DA and DB and a cardinality threshold k; the output is a set of pairs from DA and DB that include, for each data object from DA, its k NNs in DB. Shou et al. [37] study the iceberg distance join where, given two spatial data sets DA and DB, a distance threshold ε, and a cardinality threshold k, the target is to retrieve all pairs of ...
Yunjun Gao, et al. Optimal-Location-Selection Query Processing in Spatial Databases
...The clustering model was based on link analysis over the customers’ relationship records, followed by a clustering approach using the nearest neighbor technique [6]...
Carlos André Reis Pinheiro, et al. Customer's Relationship Segmentation Driving the Predictive Modeling f...
References
(26)
Efficient Processing of Spatial Joins Using R-Trees (Citations: 408)
Thomas Brinkhoff, Hans-Peter Kriegel, Bernhard Seeger
Journal: SIGMOD Record, vol. 22, no. 2, pp. 237-246, 1993

OPTICS: Ordering Points To Identify the Clustering Structure (Citations: 534)
Mihael Ankerst, Markus M. Breunig, Hans-Peter Kriegel, Jörg Sander
Conference: International Conference on Management of Data (SIGMOD), pp. 49-60, 1999

Independent Quantization: An Index Compression Technique for High-Dimensional Data Spaces (Citations: 127)
Stefan Berchtold, Christian Böhm, H. V. Jagadish, Hans-Peter Kriegel, Jörg Sander
Conference: International Conference on Data Engineering (ICDE), pp. 577-588, 2000

Fast clustering based on high-dimensional similarity joins (Citations: 13)
C. Böhm, B. Braunmüller, M. M. Breunig, H.-P. Kriegel
Conference: International Conference on Information and Knowledge Management (CIKM), 2000

A Generic Approach to Bulk Loading Multidimensional Index Structures (Citations: 102)
Jochen Van Den Bercken, Bernhard Seeger, Peter Widmayer
Conference: Very Large Data Bases (VLDB), pp. 406-415, 1997
Citations
(22)
A disk-aware algorithm for time series motif discovery (Citations: 1)
Abdullah Mueen, Eamonn J. Keogh, Qiang Zhu, Sydney S. Cash, M. Brandon Westover, Nima Bigdely Shamlo
Journal: Data Mining and Knowledge Discovery (DATAMINE), vol. 22, no. 1-2, pp. 73-105, 2011

RanKloud: Scalable Multimedia Data Processing in Server Clusters (Citations: 2)
K. Selçuk Candan, Jong Wook Kim, Parth Nagarkar, Mithila Nagendra, Renwei Yu
Journal: IEEE Multimedia (IEEEMM), vol. 18, no. 1, pp. 64-77, 2011

K nearest neighbor queries and kNN-Joins in large relational databases (almost) for free (Citations: 3)
Bin Yao, Feifei Li, Piyush Kumar
Conference: International Conference on Data Engineering (ICDE), pp. 4-15, 2010

Performance Comparison of the R*-Tree and the Quadtree for kNN and Distance Join Queries (Citations: 2)
You Jung Kim, Jignesh M. Patel
Journal: IEEE Transactions on Knowledge and Data Engineering (TKDE), vol. 22, no. 7, pp. 1014-1027, 2010

Combining Elimination Rules in Tree-Based Nearest Neighbor Search Algorithms (Citations: 1)
Eva Gómez-Ballester, Luisa Micó, Franck Thollard, José Oncina, Francisco Moreno-Seco
Conference: International Workshop on Structural and Syntactic Pattern Recognition (SSPR), pp. 80-89, 2010