An Empirical Study of Massively Parallel Bayesian Networks Learning for Sentiment Extraction from Unstructured Text

DOI: 10.1007/978-3-642-20291-9_47, Wei C

Extracting sentiment from unstructured text has emerged as an important problem in many disciplines, for example in mining online opinions from the Internet. Many algorithms have been applied to this problem, but most of them fail to handle large-scale web data. In this paper, we present a parallel algorithm for Bayesian network (BN) structure learning from large-scale datasets using a MapReduce cluster. We then apply this parallel BN learning algorithm to capture the dependencies among words and, at the same time, to find a vocabulary that is efficient for extracting sentiment. The benefits of using MapReduce for BN structure learning are discussed. The performance of BN-based sentiment extraction is demonstrated by applying it to real web blog data. Experimental results on the web data set show that our algorithm selects a parsimonious feature set with substantially fewer predictor variables than the full data set, and that it predicts sentiment orientation better than several commonly used methods.
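The abstract does not spell out the algorithm, so the following is only an illustrative sketch of why BN structure learning can fit a MapReduce pattern: dependency scores between variables are computed from counts, and counts can be gathered per document and then summed. It is not the paper's method. The function names (`map_counts`, `reduce_counts`, `mutual_information`), the toy documents, and the choice of mutual information as the dependency score are all assumptions, and Python's built-in `map` and `functools.reduce` stand in for a real cluster.

```python
from collections import Counter
from functools import reduce
from math import log


def map_counts(doc):
    """Map step (hypothetical): emit local counts for one labelled document."""
    words, label = doc
    counts = Counter()
    counts[("label", label)] += 1
    for w in set(words):  # word presence, not frequency
        counts[("word", w)] += 1
        counts[("pair", w, label)] += 1
    return counts


def reduce_counts(a, b):
    """Reduce step (hypothetical): merge two partial count tables."""
    return a + b


def mutual_information(counts, n_docs, word, labels):
    """Mutual information (in nats) between word presence and the label."""
    mi = 0.0
    n_w = counts[("word", word)]
    for present in (True, False):
        n_x = n_w if present else n_docs - n_w
        for y in labels:
            n_y = counts[("label", y)]
            n_wy = counts[("pair", word, y)]
            n_xy = n_wy if present else n_y - n_wy
            if n_xy == 0:
                continue  # empty cells contribute nothing
            mi += (n_xy / n_docs) * log(n_xy * n_docs / (n_x * n_y))
    return mi


# Toy data, invented for illustration only.
docs = [
    (["great", "movie"], "pos"),
    (["great", "plot"], "pos"),
    (["awful", "movie"], "neg"),
    (["awful", "plot"], "neg"),
]
totals = reduce(reduce_counts, map(map_counts, docs), Counter())
labels = ["pos", "neg"]
score_great = mutual_information(totals, len(docs), "great", labels)
score_movie = mutual_information(totals, len(docs), "movie", labels)
```

In this toy data, "great" scores higher than "movie" because it separates the two labels while "movie" appears equally under both, which is the kind of signal a vocabulary-selection step could use.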
Conference: Asia-Pacific Web Conference - APWeb , pp. 424-435, 2011