This document compares first- and second-order training algorithms for artificial neural networks. Feedforward network training is presented as a special case of function minimization in which no explicit model of the data is assumed. Gradient descent is discussed as a first-order method, and conjugate gradient and quasi-Newton methods as second-order methods. In experiments on share rate data, the conjugate gradient and quasi-Newton methods are shown to outperform gradient descent. The backpropagation algorithm and its variations are described as the means of computing the gradient of the error function with respect to the network weights, and conjugate gradient techniques are discussed as a way to determine the search direction without explicitly computing the Hessian matrix.
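For orientation, the weight-update rules underlying these three method families can be sketched in their standard textbook forms (the equations below are supplied for illustration and are not quoted from the source; $E(w)$ denotes the error as a function of the weight vector $w$, $\eta$ a fixed learning rate, $\alpha_k$ a step length found by line search, $\beta_k$ a conjugacy coefficient such as the Fletcher-Reeves choice, and $H_k$ an approximation to the inverse Hessian built up iteratively, e.g. by the BFGS update):

Gradient descent (first order): $w_{k+1} = w_k - \eta\,\nabla E(w_k)$

Conjugate gradient: $d_k = -\nabla E(w_k) + \beta_k\, d_{k-1}$, with $w_{k+1} = w_k + \alpha_k d_k$ and, for Fletcher-Reeves, $\beta_k = \|\nabla E(w_k)\|^2 / \|\nabla E(w_{k-1})\|^2$

Quasi-Newton: $w_{k+1} = w_k - \alpha_k H_k \nabla E(w_k)$

Each conjugate gradient step needs only the current and previous gradients, which is why the search direction can be obtained without forming the Hessian explicitly; the quasi-Newton form instead maintains $H_k$ as an explicit matrix approximation.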