Image Understanding and Knowledge-Based Systems
TUM School of Computation, Information and Technology
Technical University of Munich


Informatik IX

Image Understanding and Knowledge-Based Systems

Boltzmannstrasse 3
85748 Garching

info@iuks.in.tum.de




Approximating the Value Function for Continuous Space Reinforcement Learning in Robot Control
by S Buck, M Beetz and T Schmitt
Abstract:
Many robot learning tasks are very difficult to solve: their state spaces are high dimensional, variables and command parameters are continuously valued, and system states are only partly observable. In this paper, we propose to learn a continuous space value function for reinforcement learning using neural networks trained from data of exploration runs. The learned function is guaranteed to be a lower bound for, and reproduces the characteristic shape of, the accurate value function. We apply our approach to two robot navigation tasks, discuss how to deal with possible problems occurring in practice, and assess its performance.
Reference:
Approximating the Value Function for Continuous Space Reinforcement Learning in Robot Control (S Buck, M Beetz and T Schmitt), In Proc. of the IEEE Intl. Conf. on Intelligent Robots and Systems, 2002. 
Bibtex Entry:
@inproceedings{buck_approximating_2002,
 author = {S Buck and M Beetz and T Schmitt},
 title = {Approximating the Value Function for Continuous Space Reinforcement
	Learning in Robot Control},
 booktitle = {Proc. of the {IEEE} Intl. Conf. on Intelligent Robots and Systems},
 year = {2002},
 abstract = {Many robot learning tasks are very difficult to solve: their state
	spaces are high dimensional, variables and command parameters are
	continuously valued, and system states are only partly observable.
	In this paper, we propose to learn a continuous space value function
	for reinforcement learning using neural networks trained from data
	of exploration runs. The learned function is guaranteed to be a lower
	bound for, and reproduces the characteristic shape of, the accurate
	value function. We apply our approach to two robot navigation tasks,
	discuss how to deal with possible problems occurring in practice,
	and assess its performance.},
}
