Learning complementary multiagent behaviors: a case study

Authors: Shivaram Kalyanakrishnan, Peter Stone
DOI: 10.1145/1558109.1558293

Citations: 4
As machine learning is applied to increasingly complex tasks, it is likely that the diverse challenges encountered can only be addressed by combining the strengths of different learning algorithms. We examine this aspect of learning through a case study grounded in the robot soccer context. The task we consider is Keepaway, a popular benchmark for multiagent reinforcement learning from the simulation soccer domain. Whereas previous successful results in Keepaway have limited learning to an isolated, infrequent decision that amounts to a turn-taking behavior (passing), we expand the agents' learning capability to include a much more ubiquitous action (moving without the ball, or getting open), such that at any given time, multiple agents are executing learned behaviors simultaneously. We introduce a policy search method for learning "GetOpen" to complement the temporal difference learning approach employed for learning "Pass". Empirical results indicate that the learned GetOpen policy matches the best hand-coded policy for this task, and outperforms the best policy found when Pass is learned. We demonstrate that Pass and GetOpen can be learned simultaneously to realize tightly-coupled soccer team behavior.
Conference: RoboCup International Symposium - RoboCup, pp. 1359-1360, 2009
    • ...This behavior outperforms several other methods on this domain, including policy search algorithms [18,17,4,5,6]...

    Matteo Leonetti et al. Reinforcement Learning through Global Stochastic Search in N-MDPs

    • ...Whereas previous successful results in the Keepaway task have limited learning to an isolated, infrequent decision that amounts to a turn-taking behavior among players (Pass), we expand the agents’ learning capability to include the more ubiquitous action of moving without the ball (GetOpen) [4]...

    Shivaram Kalyanakrishnan. Integrating Value Function-Based and Policy Search Methods for Sequent...

    • ...In order to make the plan representation and execution clear, we show a simple example borrowed by the Keepaway Soccer domain proposed by Stone and Sutton [18, 10]...
    • ...We refer to Stone et al. [18] and especially to the more recent work by Kalyanakrishnan and Stone [10] as representatives of the “RL way” to face Keepaway Soccer and we show our methodology applied to this task...
    • ...In previous works [18, 10] the best results are about 16 seconds of hold time and they take tens of thousands of episodes to be learned...
    • ...Notice that the implementation of Kalyanakrishnan and Stone [10] fixes the behavior of the agent everywhere except for the two aspects they want to learn actually implementing a HAM...

    Matteo Leonetti. Plan Refinement Through Experience 2nd Year PhD Report
