The document discusses distributed linear classification on Apache Spark. It describes using Spark to train logistic regression and linear support vector machine models on large datasets. Spark improves on MapReduce for iterative training by caching data and performing communication in memory while still providing fault tolerance. The paper proposes a trust region Newton method to optimize the objective functions of logistic regression and linear SVM. Conjugate gradient is used to solve the Newton system approximately, accessing the Hessian only through Hessian-vector products so the large Hessian matrix never has to be explicitly formed or stored.
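To make the Hessian-free idea concrete, the following is a minimal, self-contained Scala sketch of a Newton step for L2-regularized logistic regression, where conjugate gradient solves H s = -g using only Hessian-vector products Hv = v + C * X^T (D (X v)). It is not the paper's Spark implementation: the toy data, the value of C, and all identifiers are illustrative assumptions, it runs on a single machine, and it uses plain Newton iterations without the trust-region safeguards described in the paper.

```scala
object HessianFreeNewtonSketch {
  // Toy data: rows of X and labels y in {-1, +1}. Purely illustrative values.
  val X: Array[Array[Double]] = Array(
    Array(1.0, 2.0), Array(2.0, 0.5), Array(-1.0, -1.5), Array(-2.0, -0.5))
  val y: Array[Double] = Array(1.0, 1.0, -1.0, -1.0)
  val C = 1.0 // regularization trade-off; an assumed value, not from the paper

  def dot(a: Array[Double], b: Array[Double]): Double =
    a.zip(b).map { case (u, v) => u * v }.sum

  def sigmoid(z: Double): Double = 1.0 / (1.0 + math.exp(-z))

  // Gradient of f(w) = 0.5 * ||w||^2 + C * sum_i log(1 + exp(-y_i * w.x_i))
  def gradient(w: Array[Double]): Array[Double] = {
    val g = w.clone()
    for (i <- X.indices) {
      val coeff = C * (sigmoid(y(i) * dot(w, X(i))) - 1.0) * y(i)
      for (j <- g.indices) g(j) += coeff * X(i)(j)
    }
    g
  }

  // Hessian-vector product Hv = v + C * X^T (D (X v)), D_ii = s_i * (1 - s_i).
  // The Hessian itself is never materialized.
  def hessianVector(w: Array[Double], v: Array[Double]): Array[Double] = {
    val hv = v.clone()
    for (i <- X.indices) {
      val s = sigmoid(y(i) * dot(w, X(i)))
      val coeff = C * s * (1.0 - s) * dot(X(i), v)
      for (j <- hv.indices) hv(j) += coeff * X(i)(j)
    }
    hv
  }

  // Conjugate gradient for H s = -g, using only Hessian-vector products.
  def newtonStep(w: Array[Double], maxIter: Int = 50, tol: Double = 1e-8): Array[Double] = {
    val g = gradient(w)
    var s = Array.fill(w.length)(0.0)
    var r = g.map(x => -x)          // residual r = -g - H s, with s = 0
    var d = r.clone()
    var rsOld = dot(r, r)
    var iter = 0
    while (iter < maxIter && math.sqrt(rsOld) > tol) {
      val hd = hessianVector(w, d)
      val alpha = rsOld / dot(d, hd)
      s = s.zip(d).map { case (si, di) => si + alpha * di }
      r = r.zip(hd).map { case (ri, hdi) => ri - alpha * hdi }
      val rsNew = dot(r, r)
      d = r.zip(d).map { case (ri, di) => ri + (rsNew / rsOld) * di }
      rsOld = rsNew
      iter += 1
    }
    s
  }

  def main(args: Array[String]): Unit = {
    var w = Array(0.0, 0.0)
    for (_ <- 0 until 5) {          // a few unguarded Newton iterations
      val step = newtonStep(w)
      w = w.zip(step).map { case (wi, si) => wi + si }
    }
    println(w.mkString("w = [", ", ", "]"))
  }
}
```

In a distributed setting such as the one the paper targets, the per-instance sums inside `gradient` and `hessianVector` are the parts that would be computed as map-reduce style aggregations over data partitions, while the CG iteration itself operates only on vectors of the model's dimension.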