Statistics Gathering for Learning from Distributed, Heterogeneous and Autonomous Data Sources

Doina Caragea, Jaime Reinoso, Adrian Silvescu, Vasant Honav

Citations: 3
With the growing use of distributed information networks, there is an increasing need for algorithmic and system solutions for data-driven knowledge acquisition using distributed, heterogeneous and autonomous data repositories. In many applications, practical constraints require such systems to provide support for data analysis where the data and the computational resources are available. This presents us with distributed learning problems. We precisely formulate a class of distributed learning problems; present a general strategy for transforming traditional machine learning algorithms into distributed learning algorithms based on the decomposition of the learning task into hypothesis generation and information extraction components; formally define the information required for generating the hypothesis (sufficient statistics); and show how to gather the sufficient statistics from distributed, heterogeneous, autonomous data sources, using a query decomposition (planning) approach. The resulting algorithms are provably exact in that the hypothesis constructed from distributed data is identical to that obtained by the corresponding algorithm when run in the batch setting.
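The split between information extraction and hypothesis generation can be illustrated with a small sketch. It is not the authors' algorithm: it assumes a naive Bayes learner, horizontally fragmented data (each site holds complete rows), and count-based sufficient statistics, and the names `local_counts`, `gather`, and `hypothesis` are hypothetical. It only shows why a hypothesis built from per-site counts can match the batch result exactly.

```python
from collections import Counter


def local_counts(rows):
    """Information extraction at one site: class counts and
    (attribute index, value, class) counts. Each row is a tuple whose
    last element is the class label (an assumption of this sketch)."""
    counts = Counter()
    for *attrs, label in rows:
        counts[("class", label)] += 1
        for i, value in enumerate(attrs):
            counts[("attr", i, value, label)] += 1
    return counts


def gather(sites):
    """Combine per-site statistics; only counts leave each site, not rows."""
    total = Counter()
    for rows in sites:
        total.update(local_counts(rows))  # Counter.update adds counts
    return total


def hypothesis(counts):
    """Hypothesis generation: naive Bayes estimates from counts alone."""
    n = sum(c for key, c in counts.items() if key[0] == "class")
    priors = {key[1]: c / n for key, c in counts.items() if key[0] == "class"}
    conditionals = {
        key[1:]: c / counts[("class", key[3])]
        for key, c in counts.items()
        if key[0] == "attr"
    }
    return priors, conditionals


# Two hypothetical sites holding disjoint rows of the same schema.
site_a = [("sunny", "hot", "no"), ("rainy", "mild", "yes")]
site_b = [("sunny", "mild", "yes"), ("rainy", "hot", "no"), ("sunny", "hot", "no")]

distributed = hypothesis(gather([site_a, site_b]))
batch = hypothesis(local_counts(site_a + site_b))
print(distributed == batch)
```

Because the counts are additive across sites, the learner never needs the raw rows. Heterogeneous schemas and query planning, which the abstract also covers, are outside the scope of this sketch.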