GoogleLing: The Web as a Linguistic Corpus

Joseph Smarr, Tim Grow

Citations: 1
We describe software to transform any search engine or searchable corpus into a tool for linguistic research with a rich query syntax. We provide support for case-sensitive searches, within-sentence and within-N-words match constraints, part-of-speech restrictions on words, and "smart" verb-ending inflection wildcards. The software generalizes the query for the underlying search engine and then processes the resulting pages with a set of natural language processing tools to extract matching sentences. Preliminary evaluation suggests that this greatly enhances linguists' ability to use the web as a linguistic corpus.
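The two-stage idea in the abstract (generalize the query, then filter the returned pages sentence by sentence) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the regex-based sentence splitter, and the reading of "within N words" as a token-position difference of at most N are all assumptions made for illustration, and part-of-speech restrictions and inflection wildcards are left out.

```python
import re


def generalize_query(terms):
    """Hypothetical helper: reduce a rich query to plain lowercase keywords
    that an ordinary search engine would accept."""
    return " ".join(t.lower() for t in terms)


def split_sentences(text):
    """Naive splitter (assumption): break after '.', '!' or '?' plus whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def within_n_words(sentence, first, second, n, case_sensitive=True):
    """True if `first` and `second` occur at token positions differing by at most n."""
    tokens = re.findall(r"\w+", sentence)
    if not case_sensitive:
        tokens = [t.lower() for t in tokens]
        first, second = first.lower(), second.lower()
    pos_first = [i for i, t in enumerate(tokens) if t == first]
    pos_second = [i for i, t in enumerate(tokens) if t == second]
    return any(abs(i - j) <= n for i in pos_first for j in pos_second)


def extract_matches(page_text, first, second, n, case_sensitive=True):
    """Return the sentences of a retrieved page that satisfy the constraint."""
    return [
        s for s in split_sentences(page_text)
        if within_n_words(s, first, second, n, case_sensitive)
    ]


if __name__ == "__main__":
    page = "The cat sat on the mat. A Cat is not a dog. The dog chased the very old cat."
    print(generalize_query(["Cat", "dog"]))
    print(extract_matches(page, "cat", "dog", 4, case_sensitive=False))
```

The broad keyword query is deliberately looser than the linguist's query; the strict constraints (case, proximity, sentence boundaries) are enforced only in the second pass over the retrieved text.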