Effective keyword search in relational databases

DOI: 10.1145/1142473.1142536
Authors: Fang Liu, Clement T. Yu, Weiyi Meng, Abdur Chowdhury

Citations: 112
With the amount of available text data in relational databases growing rapidly, the need for ordinary users to search such information is dramatically increasing. Even though the major RDBMSs provide full-text search capabilities, they still require users to know the database schemas and to use a structured query language to search for information. This search model is complicated for most ordinary users. Inspired by the success of information retrieval (IR) style keyword search on the web, keyword search in relational databases has recently emerged as a new research topic. The differences between text databases and relational databases result in three new challenges: (1) Answers needed by users are not limited to individual tuples; results assembled by joining tuples from multiple tables form answers in the shape of tuple trees. (2) A single score for each answer (i.e., a tuple tree) is needed to estimate its relevance to a given query. These scores are used to rank the most relevant answers as high as possible. (3) Relational databases have much richer structures than text databases, so existing IR strategies are inadequate for ranking relational outputs. In this paper, we propose a novel IR ranking strategy for effective keyword search. We are the first to conduct comprehensive experiments on search effectiveness using a real-world database and a set of keyword queries collected by a major search company. Experimental results show that our strategy is significantly better than existing strategies. Our approach can be used at the application level or be incorporated into an RDBMS to support keyword-based search in relational databases.

1. INTRODUCTION

The amount of structured data available to ordinary users (on the Internet, on intranets, or even on personal desktops) is growing rapidly.
Besides data types such as numbers, dates, and times, structured databases usually also contain a large amount of text data, such as names of people, organizations, and products; titles of books, songs, and movies; street addresses; descriptions or reviews of products; contents of papers; and lyrics of songs. The need for ordinary users to find information in the text of these databases is dramatically increasing. The objective of this paper is to provide effective search of text information in relational databases.

We take a lyrics database (Figure 1) as an example to illustrate the problem. There are five tables in the lyrics database. Table Artist has one text column: Name. Table Album has one text column: Title. Table Song has two text columns: Title and Lyrics. The tuples of Table Artist and those of Table Album have m:n relationships (an album may be produced by multiple artists and an artist may produce more than one album), and Table Artist-Album is the corresponding relationship table. Table Song-Album is also a relationship table, capturing the m:n relationships between tuples of Album and Song (a song may be contained in multiple albums and an album may contain more than one song). Note that Table Artist-Album and Table Song-Album do not have other columns except their primary keys and foreign keys.
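
As a rough illustration of the lyrics database described above, the following Python sketch builds the five tables in an in-memory SQLite database and assembles answers by joining tuples across them. The key column names (artist_id, album_id, song_id), the underscore table names Artist_Album and Song_Album, the sample rows, and the helper find_tuple_trees are all hypothetical and not taken from the paper. Plain LIKE matching stands in for keyword matching here; it is not the ranking strategy the paper proposes.

```python
import sqlite3

# Hypothetical schema for the five tables; key column names are assumptions.
SCHEMA = """
CREATE TABLE Artist (artist_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE Album  (album_id  INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE Song   (song_id   INTEGER PRIMARY KEY, title TEXT, lyrics TEXT);
-- Relationship tables hold only keys, as the text describes.
CREATE TABLE Artist_Album (
    artist_id INTEGER REFERENCES Artist(artist_id),
    album_id  INTEGER REFERENCES Album(album_id),
    PRIMARY KEY (artist_id, album_id)
);
CREATE TABLE Song_Album (
    song_id  INTEGER REFERENCES Song(song_id),
    album_id INTEGER REFERENCES Album(album_id),
    PRIMARY KEY (song_id, album_id)
);
"""


def build_lyrics_db():
    """Create an in-memory database with a few made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO Artist VALUES (?, ?)",
                     [(1, "Ann Example"), (2, "Bob Sample")])
    conn.execute("INSERT INTO Album VALUES (10, 'Blue Sky')")
    conn.executemany("INSERT INTO Song VALUES (?, ?, ?)", [
        (100, "Morning Rain", "rain falls on the quiet road"),
        (101, "Night Drive", "headlights on the empty highway"),
    ])
    # Both artists produced the album; both songs appear on it (m:n links).
    conn.executemany("INSERT INTO Artist_Album VALUES (?, ?)",
                     [(1, 10), (2, 10)])
    conn.executemany("INSERT INTO Song_Album VALUES (?, ?)",
                     [(100, 10), (101, 10)])
    return conn


def find_tuple_trees(conn, artist_keyword, song_keyword):
    """Join Artist-Album-Song tuples where each keyword matches its table."""
    sql = """
        SELECT Artist.name, Album.title, Song.title
        FROM Artist
        JOIN Artist_Album ON Artist_Album.artist_id = Artist.artist_id
        JOIN Album        ON Album.album_id = Artist_Album.album_id
        JOIN Song_Album   ON Song_Album.album_id = Album.album_id
        JOIN Song         ON Song.song_id = Song_Album.song_id
        WHERE Artist.name LIKE ?
          AND (Song.title LIKE ? OR Song.lyrics LIKE ?)
        ORDER BY Artist.name, Song.title
    """
    artist_pat = f"%{artist_keyword}%"
    song_pat = f"%{song_keyword}%"
    return conn.execute(sql, (artist_pat, song_pat, song_pat)).fetchall()


if __name__ == "__main__":
    db = build_lyrics_db()
    for row in find_tuple_trees(db, "Ann", "rain"):
        print(row)
```

Each returned row corresponds to one joined chain of Artist, Album, and Song tuples, which is the kind of multi-table answer the abstract calls a tuple tree. Note that the user has to know the schema to write such a join by hand, which is the burden that keyword search is meant to remove.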
    • ...Liu et al. [47] proposed a new ranking strategy to solve the search effectiveness problem for relational databases...

    Guoliang Li et al. Providing built-in keyword search capabilities in RDBMS

    • ...Edges are either directed [2,15,16,19,23] or undirected [1,8,10,13,14,22,24]...
    • ...The direction of an edge has been an issue in traversing algorithms [16,19,23]...
    • ...[23] improves the ranking method in [14] by adopting several normalization techniques...
    • ...IR-Style [14] TF-IDF Sparse algorithm, (Global) Pipeline algorithm Effectiveness [23] Normalization N/A...
    • ...The ranking formula is improved in [23] by using several refined weighting schemes that focus on the effectiveness issues and normalization factors: tuple size normalization, refined document length normalization, document frequency normalization, and inter-document weight normalization...
    • ...On the other hand, relational databases hold evidences in an explicitly structured space, following the lead of [13,14,23]...
    • ...However, for keyword search in relational databases, only a few approaches [13,14,8,23,24] have tried to avoid exhaustive processing by introducing a top-k processing algorithm such as a pipelined algorithm...

    Sang-goo Lee. Keyword search in relational databases

    • ...The database research community has recently recognized the benefits of keyword search and has been introducing keyword search capability into relational databases [1], [3], [10], [14], [18], [20], [33], [36], [37], [39], XML databases [8], [15], [19], [21], [29], [32], [34], [40], [41], [42], [35], graphs [17], [22], [31], and heterogenous data sources [27], [30]...

    Guoliang Li et al. KEMB: A Keyword-Based XML Message Broker

    • ...The ranking issues were also discussed in [4,15,24]...

    Lu Qin et al. Scalable keyword search on large data streams

    • ...Keyword query, one of the most popular and easy-to-use ways to retrieve useful information from a collection of plain documents, is being extended to RDBMSs to retrieve information from text-rich attributes [2], [3], [4], [5], [6], [7], [8], [9], [10], [11], [12], [13], [14], [15], [16]...
    • ...The first type uses SQL to find the connected trees [2], [7], [6], [10], [12], [13]...

    Bolin Ding et al. Efficient Keyword-Based Search for Top-K Cells in Text Cube
