Abstract
We present two approximate versions of the proximal subgradient method for minimizing the sum of two convex functions (not necessarily differentiable). At each iteration, the algorithms require inexact evaluations of the proximal operator, as well as approximate subgradients of the functions (namely, the \(\epsilon \)-subgradients). The methods use different error criteria for approximating the proximal operators. We provide an analysis of the convergence and rate-of-convergence properties of these methods under various stepsize rules, including both diminishing and constant stepsizes. For the case where one of the functions is smooth, we propose an inexact accelerated version of the proximal gradient method and prove that the optimal convergence rate for the function values can be achieved. Moreover, we provide numerical experiments comparing our algorithm with similar recent ones.
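For concreteness, the following Python sketch shows the shape of one iteration of a proximal \(\epsilon \)-subgradient scheme of the kind described above: a subgradient step on one summand followed by a proximal step on the other. The toy problem, the diminishing stepsize \(\alpha_k=\alpha_0/\sqrt{k+1}\), and the exact soft-thresholding prox are illustrative assumptions and do not reproduce the paper's precise error criteria.

```python
# Minimal sketch of a proximal subgradient iteration for min_x f(x) + g(x),
# with f handled by (epsilon-)subgradients and g by its proximal operator.
# The toy problem and stepsize rule below are illustrative assumptions.
import numpy as np

def soft_threshold(z, tau):
    # Exact prox of tau*||.||_1; the methods in the paper only require
    # an approximate evaluation of this operator.
    return np.sign(z) * np.maximum(np.abs(z) - tau, 0.0)

def prox_subgradient(A, b, lam, x0, n_iter=500, alpha0=1.0):
    # f(x) = ||Ax - b||_1 (nonsmooth, subgradient step),
    # g(x) = lam * ||x||_1 (proximal step).
    x = x0.copy()
    best_x, best_val = x.copy(), np.inf
    for k in range(n_iter):
        v = A.T @ np.sign(A @ x - b)          # a subgradient of f at x
        alpha = alpha0 / np.sqrt(k + 1)       # diminishing stepsize
        x = soft_threshold(x - alpha * v, alpha * lam)  # prox step on g
        val = np.linalg.norm(A @ x - b, 1) + lam * np.linalg.norm(x, 1)
        if val < best_val:                    # subgradient methods are not
            best_x, best_val = x.copy(), val  # monotone; track best iterate
    return best_x, best_val

rng = np.random.default_rng(0)
A, b = rng.standard_normal((30, 10)), rng.standard_normal(30)
x_best, val = prox_subgradient(A, b, lam=0.1, x0=np.zeros(10))
print(f"best objective value: {val:.4f}")
```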
Acknowledgements
This work was partially completed while M.P.M. was supported by a CAPES post-doctoral fellowship at the University of Campinas. M.P.M. is very grateful to IMECC at the University of Campinas and especially to Professor Sandra Augusta Santos for the warm hospitality.
Appendix
Proof of Proposition 3
We first define, for all \(k\ge 0\),
and prove that \(\eta _{k+1}-\eta _k\ge t_{k+1}(f+g)(\overline{x}^{k+1}) - t_k(f+g)(\overline{x}^k)\). To do this, we observe that
where the first equality above follows from the last relation in (39), and the second equality is a consequence of the definition of \(\mathfrak {L}_k\).
Next, we note that the definition of \(\eta _k\), together with the fact that the function in the minimization problem in (46) is quadratic, implies that
Therefore, combining this latter relation with (47), we obtain
where the inequality above is due to the first relation in (39).
Now, from the definition of \(t_{k+1}\) and because \(\ell _{k+1}\) is affine, we have
Moreover, denoting \(x = \dfrac{\beta _{k+1}}{t_{k+1}}x^{k+1} + \dfrac{t_k}{t_{k+1}}\overline{x}^k\) and using the definition of \(\tilde{x}^{k+1}\) in step 1 of Algorithm 3, we have \(x^{k+1} - x^k = \dfrac{t_{k+1}}{\beta _{k+1}}(x-\tilde{x}^{k+1})\). Therefore, combining these relations with Eq. (48) we obtain
where the equality above uses the definition of \(\beta _{k+1}\). By (40) and the assumption that \(\sigma ^2<1/2\), we conclude our claim.
Now, we observe that since the sequence \((\eta _k-t_k(f+g)(\overline{x}^k))_{k\in \mathbb {N}}\) is non-decreasing, we have, for all \(k\ge 1\), that
\[
\eta _k - t_k(f+g)(\overline{x}^k) \;\ge \; \eta _0 - t_0(f+g)(\overline{x}^0).
\]
Hence, using (46), we deduce the second inequality in (42).
To prove the first inequality in (42), we note that from the definitions of \(t_k\) and \(\beta _k\) it follows that
Thus, we conclude by using \(\alpha =\sigma ^2/L\) and an induction argument. \(\square \)
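Since the definitions of \(t_k\) and \(\beta _k\) are not reproduced here, the following display records the standard induction used in accelerated schemes of this type; the recursion shown (with \(t_0=0\) and \(\beta _{k+1}\) the positive root of \(\beta ^2=\alpha (t_k+\beta )\)) is an assumption made only to illustrate how \(\alpha =\sigma ^2/L\) yields quadratic growth of \(t_k\), and hence the \(O(1/k^2)\) rate:
\[
t_{k+1}=t_k+\beta _{k+1},\qquad \beta _{k+1}=\frac{\alpha +\sqrt{\alpha ^2+4\alpha t_k}}{2},
\]
so that, assuming inductively \(t_k\ge \alpha k^2/4\),
\[
t_{k+1}\;\ge \; t_k+\frac{\alpha }{2}+\sqrt{\alpha t_k}\;\ge \;\frac{\alpha k^2}{4}+\frac{\alpha }{2}+\frac{\alpha k}{2}\;\ge \;\frac{\alpha (k+1)^2}{4},
\qquad \text{i.e.}\qquad t_k\;\ge \;\frac{\sigma ^2 k^2}{4L}.
\]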
Cite this article
Millán, R.D., Machado, M.P. Inexact proximal \(\epsilon \)-subgradient methods for composite convex optimization problems. J Glob Optim 75, 1029–1060 (2019). https://doi.org/10.1007/s10898-019-00808-8
Keywords
- Splitting methods
- Optimization problem
- \(\epsilon \)-Subdifferential
- Inexact methods
- Hilbert space
- Accelerated methods