Indexing by Latent Semantic Analysis

DOI: 10.1002/(SICI)1097-4571(199009)41:6<391::AID-ASI1>3.0.CO;2-9, Journal of The American Society for Information Scie

Indexing by Latent Semantic Analysis (Citations: 3902)
A new method for automatic indexing and retrieval is described. The approach is to take advantage of implicit higher-order structure in the association of terms with documents (“semantic structure”) in order to improve the detection of relevant documents on the basis of terms found in queries. The particular technique used is singular-value decomposition, in which a large term-by-document matrix is decomposed into a set of ca. 100 orthogonal factors from which the original matrix can be approximated by linear combination. Documents are represented by ca. 100 item vectors of factor weights. Queries are represented as pseudo-document vectors formed from weighted combinations of terms, and documents with supra-threshold cosine values are returned. Initial tests find this completely automatic method for retrieval to be promising.
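
The pipeline the abstract describes (decompose the term-by-document matrix, keep a small number of factors, fold the query in as a pseudo-document, return documents above a cosine threshold) can be sketched on a toy example. This is a minimal illustration using NumPy, not the authors' implementation: the five-term vocabulary, the four documents, `k = 2` (instead of ca. 100), the threshold of 0.9, and the choice to compare vectors scaled by the singular values are all assumptions made for the sketch, and scaling conventions vary between LSA implementations.

```python
import numpy as np

def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    denom = np.linalg.norm(u) * np.linalg.norm(v)
    return float(u @ v / denom) if denom > 0 else 0.0

# Hypothetical toy term-by-document matrix (rows = terms, columns = documents).
terms = ["car", "automobile", "engine", "flower", "petal"]
A = np.array([
    [1, 0, 0, 0],  # car
    [0, 1, 0, 0],  # automobile
    [1, 1, 0, 0],  # engine
    [0, 0, 1, 1],  # flower
    [0, 0, 1, 0],  # petal
], dtype=float)

# Singular-value decomposition, truncated to k orthogonal factors.
k = 2  # assumed for the toy data; the abstract uses ca. 100
U, s, Vt = np.linalg.svd(A, full_matrices=False)
U_k, s_k, V_k = U[:, :k], s[:k], Vt[:k, :].T

# Rank-k approximation of the original matrix (linear combination of factors).
A_k = (U_k * s_k) @ V_k.T

# Each document is a k-item vector of factor weights, scaled by singular values.
doc_vecs = V_k * s_k

# Query "car" as a pseudo-document: project its term vector onto the factors.
q = np.zeros(len(terms))
q[terms.index("car")] = 1.0
q_vec = q @ U_k

# Return documents whose cosine with the query exceeds the (assumed) threshold.
scores = [cosine(q_vec, d) for d in doc_vecs]
threshold = 0.9
hits = [j for j, sc in enumerate(scores) if sc > threshold]

# For comparison: cosine in the raw term space, without the decomposition.
raw_scores = [cosine(q, A[:, j]) for j in range(A.shape[1])]
print("LSA scores:", np.round(scores, 3), "hits:", hits)
print("raw scores:", np.round(raw_scores, 3))
```

In this toy data the second document contains "automobile" and "engine" but not "car", so its raw cosine with the query is zero. In the reduced space it is retrieved anyway, because it shares "engine" with the document that does contain "car"; this is the kind of higher-order term association the abstract refers to.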
    • ...At the document level, they are effective for applications including information retrieval (Salton and McGill, 1983; Deerwester et al., 1990), document clustering (Deerwester et al., 1990; Xu et al., 2003), search relevance measurement (Baeza-Yates and Ribeiro-Neto, 1999) and cross-lingual document retrieval (Platt et al., 2010). At the word level, vector representations have been used to measure word similarity (Deerwester et al., 1990; Turney and Littman, 2005; Turney, 2006; Turney, 2001; Lin, 1998; Agirre et al., 2009; Reisinger and Mooney, 2010) and for language modeling (Bellegarda, 2000; Coccaro and Jurafsky, 1998). Deerwester et al. (1990) defines a metric for measuring word similarity based on LSA, and it has been used in (Landauer and Dumais, 1997; Landauer et al., 1998) to answer word similarity questions derived from the Test of English as a Foreign Language (TOEFL). Latent Semantic Analysis (Deerwester et al., 1990) is a widely used method for representing words and documents in a low dimensional vector space...

    Scott Yih et al. Polarity Inducing Latent Semantic Analysis

    • ...In this approach, we treated each sentence in the training data as a “document” and performed latent semantic analysis (Deerwester et al., 1990) to obtain a 300 dimensional vector representation of each word in the vocabulary...

    Geoffrey Zweig et al. A Challenge Set for Advancing Language Modeling

    • ...1990), probabilistic latent semantic analysis (Hofmann 2001), and latent Dirichlet allocation (Blei et al...

    Ronny Luss et al. Predicting abnormal returns from news using text classification

    • ...1990) and its variants, and measures of distributional similarity (Lin 1998; Lee 1999), attempt to derive aspects of the meanings of words by statistical analysis, and statistical information is often used when parsing to determine sentence structure (Collins 1997)...

    Daoud Clarke. A Context-Theoretic Framework for Compositionality in Distributional S...

    • ...1990) as a technique to discover the existence of latent structure in the pattern of word usage across documents...

    Wenwen Li et al. Towards geospatial semantic search: exploiting latent semantic relatio...
