Download a PDF with the full list of our publications: Robot-Intelligence-Lab-Publications-2021.pdf

A comprehensive list can also be found at Google Scholar, or by searching for the publications of author Kormushev, Petar.

Citation

BibTeX format

@inproceedings{Pardo:2018,
author = {Pardo, F and Levdik, V and Kormushev, P},
title = {Q-map: A convolutional approach for goal-oriented reinforcement learning},
url = {http://arxiv.org/abs/1810.02927v1},
year = {2018}
}

RIS format (EndNote, RefMan)

TY  - CPAPER
AB  - Goal-oriented learning has become a core concept in reinforcement learning (RL), extending the reward signal as a sole way to define tasks. However, as parameterizing value functions with goals increases the learning complexity, efficiently reusing past experience to update estimates towards several goals at once becomes desirable but usually requires independent updates per goal. Considering that a significant number of RL environments can support spatial coordinates as goals, such as on-screen location of the character in ATARI or SNES games, we propose a novel goal-oriented agent called Q-map that utilizes an autoencoder-like neural network to predict the minimum number of steps towards each coordinate in a single forward pass. This architecture is similar to Horde with parameter sharing and allows the agent to discover correlations between visual patterns and navigation. For example, learning how to use a ladder in a game could be transferred to other ladders later. We show how this network can be efficiently trained with a 3D variant of Q-learning to update the estimates towards all goals at once. While the Q-map agent could be used for a wide range of applications, we propose a novel exploration mechanism in place of epsilon-greedy that relies on goal selection at a desired distance followed by several steps taken towards it, allowing long and coherent exploratory steps in the environment. We demonstrate the accuracy and generalization qualities of the Q-map agent on a grid-world environment and then demonstrate the efficiency of the proposed exploration mechanism on the notoriously difficult Montezuma's Revenge and Super Mario All-Stars games.
AU  - Pardo,F
AU  - Levdik,V
AU  - Kormushev,P
PY  - 2018///
TI  - Q-map: A convolutional approach for goal-oriented reinforcement learning.
UR  - http://arxiv.org/abs/1810.02927v1
UR  - http://hdl.handle.net/10044/1/71861
ER  -
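
The abstract's idea of updating the estimates towards all goals at once can be illustrated with a small tabular sketch. This is not the paper's convolutional network or its training code. It is a hypothetical stand-in on a 3x3 grid, assuming a reward of 1 when a goal cell is reached, a discount factor GAMMA, and deterministic moves. All names (step, update_all_goals, estimated_steps) are made up for this illustration.

```python
import math

GAMMA = 0.9  # assumed discount factor
SIZE = 3     # assumed 3x3 grid world
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right
CELLS = [(r, c) for r in range(SIZE) for c in range(SIZE)]

# q[state][action_index][goal] holds a discounted value in [0, 1].
q = {s: [{g: 0.0 for g in CELLS} for _ in ACTIONS] for s in CELLS}


def step(pos, action):
    """Deterministic move; bumping into a border leaves the agent in place."""
    r = min(max(pos[0] + action[0], 0), SIZE - 1)
    c = min(max(pos[1] + action[1], 0), SIZE - 1)
    return (r, c)


def update_all_goals(state, a, next_state, lr=1.0):
    """A single transition updates the estimate for every goal cell."""
    for g in CELLS:
        if next_state == g:
            target = 1.0  # goal reached, no bootstrapping
        else:
            target = GAMMA * max(q[next_state][b][g] for b in range(len(ACTIONS)))
        q[state][a][g] += lr * (target - q[state][a][g])


def estimated_steps(state, goal):
    """Decode a step count from the best value: value = GAMMA ** (steps - 1)."""
    best = max(q[state][a][goal] for a in range(len(ACTIONS)))
    if best <= 0.0:
        return None  # goal not yet known to be reachable
    return round(1 + math.log(best) / math.log(GAMMA))


# Sweep every (state, action) pair a few times instead of collecting
# experience with an agent; enough for values to propagate across the grid.
for _ in range(10):
    for s in CELLS:
        for a in range(len(ACTIONS)):
            update_all_goals(s, a, step(s, ACTIONS[a]))

print(estimated_steps((0, 0), (2, 2)))
```

Under these assumptions, each stored value decays by GAMMA per extra step, so the step count to any goal can be read back from the largest value over actions. In the paper the table is replaced by a network that outputs all goal values in one forward pass.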

Contact us

Senior Lecturer (Associate Professor)
Dyson School of Design Engineering
Address: 25 Exhibition Road, South Kensington, London, SW7 2DB
LinkedIn of Petar Kormushev