Gradient descent is an optimization algorithm that minimizes a cost function by iteratively adjusting parameter values in the direction of steepest descent. At each step it computes the gradient of the cost function, which points toward steepest increase, and moves the parameters a small step in the opposite direction, scaled by a learning rate. This process repeats until the cost converges to a (local) minimum. Gradient descent is simple, but it requires that the gradient of the cost function be computable. Backpropagation extends gradient descent to neural networks: by propagating the error backwards from the output layer via the chain rule, it obtains the gradient of the cost with respect to every weight, which is then used to update the weights.
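To make the update rule concrete, here is a minimal sketch of the plain gradient-descent loop in Python. The cost function f(w) = (w - 3)^2, the learning rate, and the iteration count are all illustrative assumptions, not values from the text; the analytic derivative f'(w) = 2(w - 3) stands in for whatever gradient the real problem supplies.

```python
def cost(w):
    # Toy cost function with its minimum at w = 3 (illustrative choice).
    return (w - 3.0) ** 2

def gradient(w):
    # Analytic derivative: d/dw (w - 3)^2 = 2 * (w - 3).
    return 2.0 * (w - 3.0)

w = 0.0              # arbitrary starting point
learning_rate = 0.1  # step-size hyperparameter (assumed value)

for step in range(50):
    # Move against the gradient: the direction of steepest descent.
    w -= learning_rate * gradient(w)

print(w, cost(w))  # w approaches 3.0, cost approaches 0.0
```

Each iteration shrinks the distance to the minimum by a constant factor here, which is why a modest number of steps suffices on this convex toy problem.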
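Backpropagation can be sketched the same way. The code below trains a tiny one-hidden-layer network on a single toy example; the 2-2-1 architecture, sigmoid activation, squared-error cost, and all numeric values are assumptions made for illustration, not a prescribed implementation. The backward pass applies the chain rule layer by layer to recover the gradient of the cost with respect to each weight matrix, and then performs the same gradient-descent update as above.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: one input vector and a scalar target (illustrative values).
x = np.array([0.5, -1.0])
y = 1.0

# Randomly initialized weights for an assumed 2-2-1 network.
W1 = rng.normal(size=(2, 2))  # input -> hidden
W2 = rng.normal(size=(1, 2))  # hidden -> output

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

learning_rate = 0.5
for _ in range(100):
    # Forward pass: compute activations layer by layer.
    h = sigmoid(W1 @ x)        # hidden activations
    y_hat = (W2 @ h).item()    # linear output unit

    # Squared-error cost C = (y_hat - y)^2 / 2, so dC/dy_hat = y_hat - y.
    delta_out = y_hat - y

    # Backward pass: chain rule propagates the error toward the input.
    grad_W2 = delta_out * h[np.newaxis, :]            # dC/dW2
    delta_hidden = (W2.flatten() * delta_out) * h * (1.0 - h)
    grad_W1 = np.outer(delta_hidden, x)               # dC/dW1

    # Gradient-descent update on every weight.
    W2 -= learning_rate * grad_W2
    W1 -= learning_rate * grad_W1
```

The key point the sketch illustrates is that backpropagation is not a separate optimizer: it is only the gradient computation, and the final two lines are ordinary gradient descent applied to those gradients.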