Sander G. van Dijk and Daniel Polani. Look-Ahead Relevant Information: Reducing Cognitive Burden over Prolonged Tasks. In IEEE Symposium on Artificial Life, pages 46-53, Paris, France, 2011. [ bib ]
Based on the fact that information processing is costly, we study in this paper the trade-off between performance and informational requirements. Most importantly, we are interested in how local decisions can alleviate future cognitive burden, measured by the amount of sensory information an agent processes, without conceding performance. We introduce look-ahead information as a novel concept to capture the long-term informational requirements and present an iterative method to determine the value of this quantity. Using an example problem, we show how these long-term considerations enable an agent to predict future effects of its actions on its informational burden, and to shape the course of the world to achieve more informationally parsimonious behaviour.
Sander G. van Dijk and Daniel Polani. Grounding Subgoals in Information Transitions. In IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning, pages 105-111, Paris, France, 2011. [ bib ]
In reinforcement learning problems, the construction of subgoals has been identified as an important step to speed up learning and to enable skill transfer. For this purpose, one typically extracts states from various saliency properties of an MDP transition graph, most notably bottleneck states. Here we introduce an alternative approach to this problem: assuming a family of MDPs with multiple goals but with a fixed transition graph, we introduce the relevant goal information as the amount of Shannon information that the agent needs to maintain about the current goal at a given state to select the appropriate action. We show that there are distinct transition states in the MDP at which new relevant goal information has to be considered for selecting the next action. We argue that these transition states can be interpreted as subgoals for the current task class, and we use these states to automatically create a hierarchical policy, according to the well-established Options model for hierarchical reinforcement learning.
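As an illustrative sketch only, not the authors' implementation: the goal information used at a single state can be pictured as the mutual information, in bits, between the goal and the action chosen there. The function name `goal_action_information`, the two-goal prior, and the toy policies below are assumptions made for this example. It also evaluates one given policy, whereas the abstract's relevant goal information concerns what the agent needs to maintain; it does not perform that minimisation.

```python
import math


def goal_action_information(goal_prior, policy):
    """Mutual information I(G; A) in bits at one fixed state.

    goal_prior: list with goal_prior[g] = p(g)
    policy:     list of rows with policy[g][a] = pi(a | g, state)
    (Both structures are hypothetical, chosen for illustration.)
    """
    n_actions = len(policy[0])
    # Marginal action distribution: p(a) = sum_g p(g) * pi(a | g)
    action_marginal = [
        sum(p_g * row[a] for p_g, row in zip(goal_prior, policy))
        for a in range(n_actions)
    ]
    info = 0.0
    for p_g, row in zip(goal_prior, policy):
        for a, p_a_given_g in enumerate(row):
            # Zero-probability terms contribute nothing to the sum.
            if p_g > 0 and p_a_given_g > 0:
                info += p_g * p_a_given_g * math.log2(
                    p_a_given_g / action_marginal[a]
                )
    return info


# Corridor-like state: both goals call for the same action,
# so the action carries no information about the goal.
corridor = goal_action_information([0.5, 0.5], [[1.0, 0.0], [1.0, 0.0]])

# Junction-like state: each goal calls for a different action,
# so the agent must know which of the two goals is active.
junction = goal_action_information([0.5, 0.5], [[1.0, 0.0], [0.0, 1.0]])
```

In this toy setting, a state where the value jumps from zero to a positive number plays the role of the transition states described in the abstract.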
Valerio Lattarulo and Sander G. van Dijk. Application of the “Alliance Algorithm” to Energy Constrained Gait Optimization. In 15th annual RoboCup International Symposium, Istanbul, Turkey, 2011. [ bib ]
This paper deals with the problem of energy constrained gait optimization for bipedal walking. We present a solution to this problem obtained by applying a recently introduced heuristic method, the Alliance Algorithm (AA), and compare its performance against a Genetic Algorithm (GA). We show experimentally that the intrinsic ability of the AA to handle hard constraints enables it to find solutions significantly better than the GA. With the constraint removed, the AA also shows more reliable optimization results. Finally, we show that the final gait obtained through this method outperforms most solutions to this problem presented in previous works, in terms of walking speed.
Sander G. van Dijk, Daniel Polani, and Chrystopher L. Nehaniv. What do You Want to do Today? Relevant-Information Bookkeeping in Goal-Oriented Behaviour. In Harold Fellermann, Mark Dörr, Martin Hanczyc, Lone L. Ladegaard, Sarah Maurer, Daniel Merkle, Pierre-Alain Monnard, Kasper Støy, and Steen Rasmussen, editors, Artificial Life XII: The 12th International Conference on the Synthesis and Simulation of Living Systems, pages 176-183, Odense, Denmark, 2010. MIT Press. [ bib | .pdf ]
We extend existing models and methods for the informational treatment of the perception-action loop to the case of goal-oriented behaviour and introduce the notion of relevant goal information as the amount of information an agent necessarily has to maintain about its goal. Starting from the hypothesis that organisms use information economically, we study the structure of this information and how goal-information parsimony can guide behaviour. It is shown how these methods lead to a general definition and quantification of sub-goals and how the biologically motivated hypothesis of information parsimony gives rise to the emergence of behavioural properties such as least-commitment and goal-concealing.
Sander G. van Dijk, Daniel Polani, and Chrystopher L. Nehaniv. Hierarchical Behaviours: Getting the Most Bang for your Bit. In Proc. European Conference on Artificial Life 2009, Budapest, Hungary, 2009. [ bib | .pdf ]
Hierarchical structuring of behaviour is prevalent in natural and artificial agents and can be shown to be useful for learning and performing tasks. To advance a systematic understanding of these benefits, we study the effect of hierarchical architectures on the required information processing capability of an optimally acting agent. We show that an information-theoretical approach provides important insights into why factored and layered behaviour structures are beneficial.
This study investigates the use of Hierarchical Reinforcement Learning (HRL) and automatic sub-goal discovery methods in continuous environments. These are inspired by the RoboCup 3D Simulation environment and supply navigation tasks with clear bottleneck situations. The goal of this research is to enhance the learning performance of agents performing these tasks. This is done by implementing existing learning algorithms, extending these to continuous environments and by introducing new methods to improve the algorithms.