Abstract
Tuning the hyperparameters of a machine learning model is crucial to its performance, and Bayesian optimization has recently emerged as the de facto method for this task. Hyperparameters are usually tuned by measuring model performance on a validation set, with Bayesian optimization searching for the hyperparameter setting that maximizes this performance. However, when the training or validation set contains only a limited number of data points, the function mapping hyperparameters to validation performance often exhibits spurious sharp peaks. In such cases, Bayesian optimization tends to converge to these sharp peaks rather than to broader, more stable ones, and a model trained with the resulting hyperparameters can suffer a dramatic drop in performance when deployed in the real world. We address this problem with a novel stable Bayesian optimization framework built around a new acquisition function that steers the search away from sharp peaks. We provide a theoretical analysis guaranteeing that Bayesian optimization with the proposed acquisition function prefers stable peaks over unstable ones. Experiments on synthetic function optimization and on hyperparameter tuning for Support Vector Machines demonstrate the effectiveness of the proposed framework.
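The abstract does not specify the form of the proposed acquisition function. One plausible way to make an acquisition function prefer stable (broad) peaks over sharp ones, sketched below purely as an illustration and not as the paper's actual method, is to average a standard GP-UCB acquisition over small random perturbations of the candidate input: a narrow peak loses value because perturbed points fall off it, while a broad peak retains its score. All function names here (`gp_posterior`, `stable_ucb`) and the perturbation scale `eps` are illustrative assumptions.

```python
import numpy as np

def rbf_kernel(A, B, ls=0.2):
    # Squared-exponential kernel between row-vector sets A and B.
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / ls**2)

def gp_posterior(X, y, Xs, noise=1e-6):
    # Standard GP regression posterior mean/variance at test points Xs.
    K = rbf_kernel(X, X) + noise * np.eye(len(X))
    Ks = rbf_kernel(X, Xs)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    mu = Ks.T @ alpha
    v = np.linalg.solve(L, Ks)
    var = np.diag(rbf_kernel(Xs, Xs) - v.T @ v)
    return mu, np.maximum(var, 1e-12)

def stable_ucb(X, y, Xs, beta=2.0, eps=0.05, n_mc=32, rng=None):
    """UCB averaged over Gaussian input perturbations of scale eps.
    Sharp, narrow peaks score lower because most perturbed points
    land off the peak; broad peaks are largely unaffected."""
    rng = rng or np.random.default_rng(0)
    vals = np.zeros(len(Xs))
    for _ in range(n_mc):
        Xp = Xs + rng.normal(0.0, eps, Xs.shape)
        mu, var = gp_posterior(X, y, Xp)
        vals += mu + beta * np.sqrt(var)
    return vals / n_mc
```

With `eps=0` this reduces exactly to plain GP-UCB; increasing `eps` trades peak height for peak width, which is the qualitative behavior the abstract attributes to the proposed acquisition function.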
Acknowledgement
This work is partially supported by the Telstra-Deakin Centre of Excellence in Big Data and Machine Learning.
Copyright information
© 2017 Springer International Publishing AG
Cite this paper
Nguyen, T.D., Gupta, S., Rana, S., Venkatesh, S. (2017). Stable Bayesian Optimization. In: Kim, J., Shim, K., Cao, L., Lee, JG., Lin, X., Moon, YS. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2017. Lecture Notes in Computer Science(), vol 10235. Springer, Cham. https://doi.org/10.1007/978-3-319-57529-2_45
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-57528-5
Online ISBN: 978-3-319-57529-2