<P> For some of the above examples, gradient descent is relatively slow close to the minimum: technically, its asymptotic rate of convergence is inferior to that of many other methods. For poorly conditioned convex problems, gradient descent increasingly "zigzags" as the gradients point nearly orthogonally to the shortest direction to a minimum point. For more details, see the comments below. </P> <P> For non-differentiable functions, gradient methods are ill-defined. For locally Lipschitz problems, and especially for convex minimization problems, bundle methods of descent are well-defined. Non-descent methods, such as subgradient projection methods, may also be used; these methods are typically slower than gradient descent. Another alternative for non-differentiable functions is to "smooth" the function, or to bound the function by a smooth function. In this approach, the smooth problem is solved in the hope that its answer is close to the answer of the non-smooth problem (occasionally, this can be made rigorous). </P> <P> Gradient descent can be used to solve a system of linear equations, reformulated as a quadratic minimization problem, e.g., using linear least squares. The solution of </P>
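<P> As a minimal sketch of the linear-equations use just described, the system A x = b can be solved by running gradient descent on the least-squares objective f(x) = ||A x - b||^2, whose gradient is 2 Aᵀ(A x - b). The matrix, right-hand side, step size, and iteration count below are illustrative choices, not from the original text. </P>

```python
def grad_descent_least_squares(A, b, step=0.03, iters=2000):
    """Minimize ||A x - b||^2 by gradient descent (pure Python, no libraries)."""
    n = len(A[0])
    x = [0.0] * n
    for _ in range(iters):
        # residual r = A x - b
        r = [sum(A[i][j] * x[j] for j in range(n)) - b[i]
             for i in range(len(A))]
        # gradient g = 2 A^T r
        g = [2.0 * sum(A[i][j] * r[i] for i in range(len(A)))
             for j in range(n)]
        # one descent step
        x = [x[j] - step * g[j] for j in range(n)]
    return x

A = [[3.0, 1.0], [1.0, 2.0]]
b = [9.0, 8.0]
x = grad_descent_least_squares(A, b)  # exact solution is (2, 3)
```

<P> The fixed step size must stay below 1/L, where L is the largest eigenvalue of 2 AᵀA; otherwise the iteration diverges. </P>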

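<P> The "zigzag" behaviour on poorly conditioned problems mentioned above can be sketched on a quadratic f(x, y) = (100 x² + y²)/2, whose condition number is 100. The step size 0.018 is an illustrative choice close to the stability limit 2/100 imposed by the steep coordinate. </P>

```python
def run(step=0.018, iters=50):
    """Gradient descent on f(x, y) = (100 x^2 + y^2) / 2, recording the path."""
    x, y = 1.0, 1.0
    path = [(x, y)]
    for _ in range(iters):
        gx, gy = 100.0 * x, y        # gradient of f
        x, y = x - step * gx, y - step * gy
        path.append((x, y))
    return path

path = run()
# The steep x-coordinate flips sign on every step (the zigzag),
# while the shallow y-coordinate shrinks only slowly.
sign_flips = sum(1 for (x0, _), (x1, _) in zip(path, path[1:])
                 if x0 * x1 < 0)
```

<P> Each coordinate is scaled by 1 − γλ per step: here −0.8 for x (oscillation) and 0.982 for y (slow decay), which is the orthogonal zigzag pattern in miniature. </P>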
The step-size (learning-rate) parameter controls the magnitude of each step taken during gradient descent
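<P> The effect of this parameter can be sketched on the one-dimensional objective f(x) = x², with gradient 2x, so each step scales x by (1 − 2γ). The three step sizes below are illustrative values only. </P>

```python
def descend(step, iters=25, x0=1.0):
    """Run gradient descent on f(x) = x^2 from x0 with a fixed step size."""
    x = x0
    for _ in range(iters):
        x -= step * 2.0 * x   # x <- x - step * f'(x)
    return x

small = descend(0.05)   # converges slowly:  multiplier  0.9 per step
good  = descend(0.45)   # converges quickly: multiplier  0.1 per step
big   = descend(1.10)   # diverges:          multiplier -1.2 per step
```

<P> Too small a step makes progress slow; too large a step overshoots the minimum and the iterates grow without bound. </P>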