This document discusses a formulation of neural network training as infinite-dimensional gradient Langevin dynamics, which yields global optimality guarantees together with bounds on the generalization error and excess risk. It highlights a unified treatment of both finite- and infinite-width networks and addresses the challenges that nonconvexity and high dimensionality pose for neural network optimization. In contrast to existing frameworks such as the neural tangent kernel and mean-field analysis, the results offer guarantees on generalization error and learning rates.
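To make the core idea concrete, below is a minimal finite-dimensional sketch of a discretized gradient Langevin dynamics update (Euler–Maruyama / SGLD-style). It is an illustration only, not the paper's infinite-dimensional formulation; the step size `eta`, inverse temperature `beta`, and the toy quadratic loss are assumptions chosen for the example.

```python
import numpy as np

def gld_step(theta, grad_loss, eta, beta, rng):
    """One discretized gradient Langevin dynamics step:
    theta_{k+1} = theta_k - eta * grad L(theta_k) + sqrt(2 * eta / beta) * xi,
    where xi ~ N(0, I). (Illustrative finite-dimensional version.)"""
    noise = rng.standard_normal(theta.shape)
    return theta - eta * grad_loss(theta) + np.sqrt(2.0 * eta / beta) * noise

# Toy usage: Langevin dynamics on a simple quadratic "training loss".
rng = np.random.default_rng(0)
theta = rng.standard_normal(10)   # finite-width parameter vector (illustrative)
grad = lambda t: t                # gradient of L(t) = 0.5 * ||t||^2
for _ in range(1000):
    theta = gld_step(theta, grad, eta=1e-2, beta=100.0, rng=rng)
```

The injected Gaussian noise is what distinguishes Langevin dynamics from plain gradient descent: it lets the iterates escape poor local minima and concentrate on a Gibbs-type stationary distribution, which is the mechanism behind the global-optimality and generalization guarantees the document describes.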