Optimization is the process of choosing the best element, with respect to some criterion, from a set of available alternatives. It is the final objective in many problem-solving situations, whether the problem belongs to computer science, mathematics, operational research, or real life. Optimization techniques allow a problem to use the available resources in the best possible way, saving time, space, and the overall execution cost of the problem. In machine learning problems, one does not know in advance exactly what new data will look like, or its precise nature.
So, in machine learning, it is suggested to optimize on the training data itself and cross-check the performance on separate validation data. Gradient descent is one of the most popular techniques for optimizing neural network algorithms. In simpler terms, the problem situation in the real world is as depicted below.
Consider that one is assigned the task of walking down from the top of a hill. The available sensory information is limited; it is as if the person were blind. The only way to walk down the hill is to feel out the next step and check whether it slopes downward toward the ground. If it does, go ahead. There are many assumptions in this case. This is a greedy way to find a feasible, locally optimal solution.
Technically speaking, to compute a local minimum of a mathematical function using the gradient descent technique, one takes steps proportional to the negative of the gradient of the function at the current point. In gradient ascent, by contrast, one takes steps proportional to the positive of the gradient, and thereby computes a local maximum of that function.
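The step rule above can be sketched in a few lines of code. This is a minimal illustration, not a production implementation: the function name, the quadratic example, and the fixed learning rate of 0.1 are all assumptions chosen for clarity.

```python
def gradient_descent(grad, x0, lr=0.1, steps=200):
    """Take steps proportional to the negative gradient to find a local minimum."""
    x = x0
    for _ in range(steps):
        x = x - lr * grad(x)  # move against the gradient; use +lr*grad(x) for ascent
    return x

# Example: minimize f(x) = (x - 3)^2, whose gradient is 2 * (x - 3).
# The minimum lies at x = 3, and the iterates converge toward it.
minimum = gradient_descent(lambda x: 2 * (x - 3), x0=0.0)
```

Flipping the sign of the update (adding `lr * grad(x)` instead of subtracting it) turns the same loop into gradient ascent, which climbs toward a local maximum.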
To read about the online/incremental/stochastic gradient descent technique, click here