Learning Bayesian networks by hill climbing: efficient methods based on progressive restriction of the neighborhood

DOI: 10.1007/s10618-010-0178-6

Citations: 2
Learning Bayesian networks is known to be an NP-hard problem, which is why heuristic search has proven advantageous in many domains. This learning approach is computationally efficient and, even though it does not guarantee an optimal result, many previous studies have shown that it obtains very good solutions. Hill climbing algorithms are particularly popular because of their good trade-off between computational demands and the quality of the models learned. Despite this efficiency, these algorithms can still be improved when dealing with high-dimensional datasets, and that is the goal of this paper. We present an approach to improve hill climbing algorithms based on dynamically restricting the candidate solutions to be evaluated during the search process. This proposal, dynamic restriction, is new because other studies of restricted search in the literature are based on two stages rather than the single stage presented here. In addition to the aforementioned advantages of hill climbing algorithms, we show that under certain conditions the model they return is a minimal I-map of the joint probability distribution underlying the training data, which is a nice theoretical property with practical implications. We also provide theoretical results that guarantee that, under these same conditions, the proposed algorithms also output a minimal I-map. Furthermore, we experimentally test the proposed algorithms over a set of different domains, some of them quite large (up to 800 variables), in order to study their behavior in practice.
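The idea of restricting the neighborhood during a single search stage can be illustrated with a minimal sketch. It is not the algorithm from the paper: the restriction rule (forbid an arc addition for the rest of the search once it is found not to improve the score), the function names, and the toy score are all assumptions made for illustration. The sketch only assumes a decomposable score, so that a move on arc x -> y changes the local score of y alone.

```python
from itertools import permutations


def creates_cycle(parents, src, dst):
    """Return True if adding the arc src -> dst would close a directed cycle."""
    # A cycle appears exactly when dst is already an ancestor of src,
    # so walk upwards from src through the parent sets looking for dst.
    stack, seen = [src], set()
    while stack:
        node = stack.pop()
        if node == dst:
            return True
        if node not in seen:
            seen.add(node)
            stack.extend(parents[node])
    return False


def restricted_hill_climb(nodes, local_score):
    """Greedy arc addition/deletion with a forbidden list that grows during search.

    local_score(node, parent_set) is a hypothetical decomposable score supplied
    by the caller. The restriction rule used here is an assumption, not the
    rule defined in the paper.
    """
    parents = {v: set() for v in nodes}
    forbidden = set()  # ordered pairs (x, y): adding x -> y is never re-evaluated
    while True:
        best_gain, best_move = 0.0, None
        for x, y in permutations(nodes, 2):
            current = local_score(y, parents[y])
            if x in parents[y]:
                gain = local_score(y, parents[y] - {x}) - current
                move = ("delete", x, y)
            elif (x, y) in forbidden or creates_cycle(parents, x, y):
                continue
            else:
                gain = local_score(y, parents[y] | {x}) - current
                move = ("add", x, y)
                if gain <= 0:
                    # Dynamic restriction: drop this candidate from later neighborhoods.
                    forbidden.add((x, y))
            if gain > best_gain:
                best_gain, best_move = gain, move
        if best_move is None:
            return parents, forbidden  # local optimum reached
        op, x, y = best_move
        if op == "add":
            parents[y].add(x)
        else:
            parents[y].remove(x)


# Toy score (hypothetical): reward parents from a known chain a -> b -> c,
# penalize any other parent.
TRUE_PARENTS = {"a": set(), "b": {"a"}, "c": {"b"}}


def toy_score(node, parent_set):
    return len(parent_set & TRUE_PARENTS[node]) - len(parent_set - TRUE_PARENTS[node])


learned, forbidden = restricted_hill_climb(["a", "b", "c"], toy_score)
```

Because forbidden candidates are skipped in every later iteration, the number of score evaluations per step shrinks as the search proceeds. A rule this simple can also discard an arc that would have become useful later, so it should be read as a heuristic illustration only.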
Journal: Data Mining and Knowledge Discovery - DATAMINE , vol. 22, no. 1-2, pp. 106-148, 2011
    • ...2. This problem is exponential, and there are papers which use an exhaustive search equipped with a parameter k to limit the size of parent sets as in Friedman and Koller (2003) and Teyssier and Koller (2005)...
    • ...The first improvement, used in previous studies (Friedman and Koller 2003; de Campos and Puerta 2001; Teyssier and Koller 2005), allows the reduction of the number of computations of parent sets needed in any insert operation Insert(�, i, j), from n to |i - j| + 1. Thus, the computation of the parent sets for those variables which are not located between the variables at positions i and j can be omitted, as the set of variables ...

    Juan I. Alonso-Barba, Luis et al. Structural learning of Bayesian networks using local algorithms based ...

    • ...Furthermore, it is possible to improve upon the scalability of GES by methods successfully used in the D-space [10,5,6], which basically consist in restricting the search space...
    • ...In this second proposal, denoted as GESC, we apply a similar approach to the progressive restriction of the neighborhood used in [6]...

    Juan I. Alonso-Barba et al. Scaling Up the Greedy Equivalence Search Algorithm by Constraining the...
