The OXPath to success in the deep web

The OXPath to success in the deep web,10.1145/1963192.1963352,Andrew Jon Sellers,Georg Gottlob

The OXPath to success in the deep web  
BibTex | RIS | RefWorks Download
The world wide web provides access to a wealth of data. Collecting and maintaining such large amounts of data necessitates automated processing for extraction, since appropriate automation can perform extraction tasks that would be otherwise infeasible. Modern web interfaces, however, are generally designed primarily for human users, delivering sophisticated interactions through the use of client-side scripting and asynchronous server communication. To this end, we introduce OXPath, a careful extension of XPath that facilitates data extraction from the deep web. OXPath exploits XPath's familiarity and theoretical foundations. OXPath, then, achieves favourable evaluation complexity and optimal page buffering, storing only a constant number of pages for non-recursive queries. Further, OXPath provides a lightweight interface, which is easy to use and embed. This paper outlines the motivation, theoretical framework, current implementation, and preliminary results obtained so far. We conclude with proposed future work on OXPath, including an investigation of how to deploy OXPath efficiently in a highly elastic computing framework (cloud).
Conference: World Wide Web Conference Series - WWW , pp. 409-414, 2011
Cumulative Annual
View Publication
The following links allow you to view full publications. These links are maintained by other sources not affiliated with Microsoft Academic Search.