
On the Limitations of Scalarisation for Multi-objective Reinforcement Learning of Pareto Fronts

  • Conference paper
AI 2008: Advances in Artificial Intelligence (AI 2008)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5360)


Abstract

Multi-objective reinforcement learning (MORL) extends reinforcement learning (RL) to problems with multiple, conflicting objectives. This paper argues that MORL systems should be designed to produce a set of solutions approximating the Pareto front, and shows that scalarisation, the most common MORL technique, has fundamental limitations when used to find Pareto-optimal policies. The argument is supported by three new MORL benchmarks with known Pareto fronts.
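
The limitation referenced in the abstract is the standard convexity argument: a policy maximising a linear scalarisation w · V of the value vector must lie on the convex hull of the achievable set, so Pareto-optimal policies in concave regions of the front are unreachable for every choice of weights. A minimal numerical sketch of this effect follows; the three policies and their two-objective returns are hypothetical values chosen for illustration, not taken from the paper.

import numpy as np

# Illustrative two-objective returns for three deterministic policies
# (hypothetical values). A and C are the extremes of the front; B is
# Pareto-optimal but sits in a concave region: no policy dominates it,
# yet w @ B = 0.45 for every weight vector w with w1 + w2 = 1, while
# max(w @ A, w @ C) = max(w1, 1 - w1) >= 0.5.
policies = {
    "A": np.array([1.0, 0.0]),
    "B": np.array([0.45, 0.45]),
    "C": np.array([0.0, 1.0]),
}

winners = set()
for w1 in np.linspace(0.0, 1.0, 101):
    w = np.array([w1, 1.0 - w1])  # sweep all normalised linear weightings
    best = max(policies, key=lambda name: float(w @ policies[name]))
    winners.add(best)

print(winners)  # {'A', 'C'}: B is never optimal under any linear weighting

However finely the weights are swept, only A and C are ever returned, which is why a learner driven by linear scalarisation cannot recover the full Pareto front when that front is non-convex.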





Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Vamplew, P., Yearwood, J., Dazeley, R., Berry, A. (2008). On the Limitations of Scalarisation for Multi-objective Reinforcement Learning of Pareto Fronts. In: Wobcke, W., Zhang, M. (eds) AI 2008: Advances in Artificial Intelligence. Lecture Notes in Computer Science, vol 5360. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-89378-3_37

Download citation

  • DOI: https://doi.org/10.1007/978-3-540-89378-3_37

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-89377-6

  • Online ISBN: 978-3-540-89378-3

  • eBook Packages: Computer Science, Computer Science (R0)
