Join queries on uncertain data: Semantics and efficient processing
Tingjian Ge
Uncertain data is quite common nowadays in a variety of modern database applications. At the same time, the join operation is one of the most important but expensive operations in SQL. However, join queries on uncertain data have not been adequately addressed thus far. In this paper, we study the SQL join operation on uncertain attributes. We observe and formalize two kinds of join operations on such data, namely v-join and d-join. They are each useful for different applications. Using probability theory, we then devise efficient query processing algorithms for these join operations. Specifically, we use probability bounds that are based on the moments of random variables to either early accept or early reject a candidate v-join result tuple. We also devise an indexing mechanism and an algorithm called Two-End Zigzag Join to further save I/O costs. For d-join, we first observe that it can be reduced to a special form of similarity join in a multidimensional space. We then design an efficient algorithm called condensed d-join and an optimal condensation scheme based on dynamic programming. Finally, we perform a comprehensive empirical study using both real datasets and synthetic datasets.
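The idea of accepting or rejecting a join candidate early from moments alone can be illustrated with a small sketch. It is not the paper's algorithm: the abstract does not define the v-join predicate, so the code assumes a hypothetical one, Pr(|X - Y| <= eps) >= tau, for independent attributes X and Y whose means and variances are known. The names `Moments` and `classify_candidate` are invented for illustration. The lower bound comes from Markov's inequality applied to (X - Y)^2, and the upper bound from Cantelli's one-sided inequality.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Moments:
    """First two moments of an uncertain attribute (hypothetical container)."""
    mean: float
    variance: float


def classify_candidate(x: Moments, y: Moments, eps: float, tau: float) -> str:
    """Decide a candidate pair under the ASSUMED predicate Pr(|X - Y| <= eps) >= tau.

    Assumes X and Y are independent and eps > 0.
    Returns "accept", "reject", or "undecided" (exact evaluation still needed).
    """
    # Moments of the difference D = X - Y under independence.
    mu = x.mean - y.mean
    var = x.variance + y.variance

    # Markov on D^2: Pr(|D| > eps) <= E[D^2] / eps^2, with E[D^2] = var + mu^2.
    lower = 1.0 - (var + mu * mu) / (eps * eps)
    if lower >= tau:
        return "accept"

    # Cantelli: if |mu| exceeds eps by `gap`, then
    # Pr(|D| <= eps) <= var / (var + gap^2).
    gap = abs(mu) - eps
    if gap > 0:
        upper = var / (var + gap * gap)
        if upper < tau:
            return "reject"

    # The bounds are too loose to decide; fall back to an exact computation.
    return "undecided"


if __name__ == "__main__":
    print(classify_candidate(Moments(0.0, 0.01), Moments(0.1, 0.01), eps=1.0, tau=0.9))
    print(classify_candidate(Moments(10.0, 1.0), Moments(0.0, 1.0), eps=1.0, tau=0.5))
    print(classify_candidate(Moments(1.0, 1.0), Moments(0.0, 1.0), eps=1.0, tau=0.5))
```

Only candidates that come back "undecided" would need the full distributions, which is where a bound-based filter of this kind saves work.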
Conference: International Conference on Data Engineering - ICDE, pp. 697-708, 2011
DOI: 10.1109/ICDE.2011.5767888