Dimension-free approximations of gradients

Matieyendou Lamboni

Abstract:

In high-dimensional settings and for time-demanding models, it is worth investigating efficient approaches for computing i) the traditional gradient of a smooth function ($\nabla f$), and ii) the dependent gradient of a function evaluated at non-independent variables ($\mathrm{grad} f$).

Complementing the adjoint methods, which provide exact traditional gradients for some classes of PDE/ODE-based models using only one run, this study relies on randomized schemes, i.e., the Monte Carlo approach, for computing both gradients. The proposed approach makes use of $\ell_p$-spherical distributions with $p \geq 1$ and Richardson's extrapolation to derive generalized stochastic surrogates of gradients based on $L$-point evaluations of functions with $L \geq 1$. Such $\ell_p$-spherical-based surrogates of gradients and the corresponding estimators benefit from:

i) Dimension-free upper-bound of the bias;
ii) Dimension-free upper-bounds of mean squared errors (MSEs) and rates of convergence of the form $d^{2/p} N^{-1}$, where $N$ is the sample size;
iii) Computational efficiency and accuracy.

As a consequence, with a proper choice of $p$, the proposed approach does not suffer from the curse of dimensionality. It improves the best known rate (i.e., $dN^{-1}$) and enables the computation of gradients using a number of function evaluations $N \ll d$.
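
To make the idea of a randomized gradient surrogate concrete, the following is a minimal sketch of the classical two-point Monte Carlo estimator with directions drawn uniformly on the $\ell_2$-sphere (the case $p = 2$), in the spirit of the two-point feedback scheme of [1]. It is not the generalized $L$-point surrogate with Richardson's extrapolation proposed here, and it targets only the traditional gradient $\nabla f$. The names `sphere_direction` and `mc_gradient`, the step size `h`, and the seed are illustrative assumptions.

```python
import math
import random


def sphere_direction(d, rng):
    """Draw a direction uniformly on the unit l2-sphere in R^d.

    A standard Gaussian vector is normalized; this is the p = 2 case only.
    """
    while True:
        g = [rng.gauss(0.0, 1.0) for _ in range(d)]
        norm = math.sqrt(sum(gi * gi for gi in g))
        if norm > 0.0:
            return [gi / norm for gi in g]


def mc_gradient(f, x, n_samples, h=1e-4, seed=0):
    """Two-point Monte Carlo surrogate of the gradient of f at x.

    Averages d * [f(x + h v) - f(x - h v)] / (2 h) * v over random
    directions v. Since E[v v^T] = I / d on the unit l2-sphere, the
    factor d makes the average unbiased for linear and quadratic f.
    Each sample costs two evaluations of f (assumed sketch, not the
    L-point scheme of the abstract).
    """
    rng = random.Random(seed)
    d = len(x)
    grad = [0.0] * d
    for _ in range(n_samples):
        v = sphere_direction(d, rng)
        x_plus = [xi + h * vi for xi, vi in zip(x, v)]
        x_minus = [xi - h * vi for xi, vi in zip(x, v)]
        # Symmetric finite difference along the random direction v.
        slope = (f(x_plus) - f(x_minus)) / (2.0 * h)
        for j in range(d):
            grad[j] += slope * v[j]
    scale = d / n_samples
    return [scale * gj for gj in grad]
```

For example, `mc_gradient(lambda x: sum(xi * xi for xi in x), [1.0, 2.0, 3.0], n_samples=5000)` returns a noisy approximation of the exact gradient `[2.0, 4.0, 6.0]`. The error of this baseline shrinks with the sample size but grows with the dimension, which is the dependence the $\ell_p$-spherical surrogates aim to remove.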

References

[1] O. Shamir, An optimal algorithm for bandit and zero-order convex optimization with two-point feedback, Journal of Machine Learning Research 18 (1) (2017) 1703–1713.

[2] M. Lamboni, Dimension-free estimators of gradients of functions with(out) non-independent variables, Axioms 15 (1) (2026).

[3] A. Akhavan, E. Chzhen, M. Pontil, A. B. Tsybakov, Gradient-free optimization of highly smooth functions: improved analysis and a new algorithm, Journal of Machine Learning Research 25 (370) (2024) 1–50.

[4] A. S. Berahas, L. Cao, K. Choromanski, K. Scheinberg, A theoretical and empirical comparison of gradient approximations in derivative-free optimization, Foundations of Computational Mathematics 22 (2) (2022) 507–560.