Polynomial Escape-Time from Saddle Points in Distributed Non-Convex Optimization

Abstract:

The diffusion strategy for distributed learning from streaming data employs local stochastic gradient updates along with exchange of iterates over neighborhoods. In this work, we establish that agents cluster around a network centroid in the mean-fourth sense and proceed to study the dynamics of this point. We establish expected descent in non-convex environments in the large-gradient regime and introduce a short-term model to examine the dynamics over finite-time horizons. Using this model, we establish that the diffusion strategy is able to escape from strict saddle-points in O(1/μ) iterations, where μ denotes the step-size; it is also able to return approximately second-order stationary points in a polynomial number of iterations. Relative to prior works on the polynomial escape from saddle-points, most of which focus on centralized perturbed or stochastic gradient descent, our approach requires less restrictive conditions on the gradient noise process.

Published in: 2019 IEEE 8th International Workshop on Computational Advances in Multi-Sensor Adaptive Processing (CAMSAP)

Date of Conference: 15-18 December 2019

Date Added to IEEE Xplore: 05 March 2020

DOI: 10.1109/CAMSAP45676.2019.9022458

Conference Location: Le Gosier, Guadeloupe

1. Introduction

We consider a network of $N$ agents. Each agent $k$ is equipped with a local, stochastic cost of the form $J_{k}(w)\triangleq \mathbb{E}\,Q(w;\boldsymbol{x}_{k})$, where $w$ denotes a parameter vector and $\boldsymbol{x}_{k}$ denotes random data. We consider a global optimization problem of the form:

$\begin{equation*} \min_{w}J(w),\quad \mathrm{where}\ J(w)\triangleq \sum_{k=1}^{N}p_{k}J_{k}(w) \tag{1} \end{equation*}$

where the weights $p_{k}$ are a function of the graph topology and will be specified further below in (4). Solutions to such problems via distributed strategies can be pursued through a variety of algorithms, including those of the consensus and diffusion type [3]–[9]. We study the diffusion strategy due to its proven enhanced performance in adaptive environments in response to streaming data and drifting conditions [4], [10]. The strategy takes the form:

$\begin{align*} \phi_{k,i}&=w_{k,i-1}-\mu\widehat{\nabla J}_{k}(w_{k,i-1})\tag{2a}\\ w_{k,i}&=\sum_{\ell=1}^{N}a_{\ell k}\phi_{\ell,i}\tag{2b} \end{align*}$
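
To make the recursion (2a)-(2b) concrete, the following is a minimal NumPy sketch rather than the authors' implementation. The function names `diffusion_step` and `run_toy_example`, the quadratic local costs, the additive Gaussian gradient noise, and the particular combination matrix are all assumptions chosen for illustration; none of them appear in the paper. The toy costs are convex, so the sketch only shows the adapt and combine mechanics and the clustering of agents, not the saddle-point analysis.

```python
import numpy as np


def diffusion_step(W, A, grad_est, mu):
    """One iteration of (2a)-(2b).

    W        : (N, M) array; row k holds agent k's iterate w_{k,i-1}.
    A        : (N, N) combination matrix with A[l, k] = a_{lk}
               (assumed here to have columns that sum to one).
    grad_est : hypothetical callable (k, w) -> stochastic gradient
               estimate of J_k at w.
    mu       : step-size.
    """
    num_agents = W.shape[0]
    # (2a) adapt: each agent takes a local stochastic gradient step.
    Phi = np.stack(
        [W[k] - mu * grad_est(k, W[k]) for k in range(num_agents)]
    )
    # (2b) combine: row k of A.T @ Phi equals sum_l a_{lk} * phi_l.
    return A.T @ Phi


def run_toy_example(num_iters=2000, mu=0.05, seed=0):
    """Run the recursion on assumed quadratic costs
    J_k(w) = 0.5 * ||w - t_k||^2 with noisy gradients."""
    rng = np.random.default_rng(seed)
    targets = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]])
    # Symmetric, doubly-stochastic matrix (an assumption), so every
    # agent is weighted equally in the combination step.
    A = np.array(
        [[0.50, 0.25, 0.25],
         [0.25, 0.50, 0.25],
         [0.25, 0.25, 0.50]]
    )

    def grad_est(k, w):
        # True gradient of the toy cost plus zero-mean noise.
        return (w - targets[k]) + 0.1 * rng.standard_normal(w.shape)

    W = rng.standard_normal(targets.shape)
    for _ in range(num_iters):
        W = diffusion_step(W, A, grad_est, mu)
    # For this toy setup the equally weighted aggregate cost is
    # minimized at the mean of the targets.
    return W, targets.mean(axis=0)


if __name__ == "__main__":
    W_final, w_star = run_toy_example()
    print("final iterates:\n", W_final)
    print("minimizer of the toy aggregate cost:", w_star)
```

In this sketch, all agents end up close to one another and close to the minimizer of the equally weighted toy cost, with a residual spread that shrinks as the step-size `mu` is reduced.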
