This document surveys scalable machine learning techniques. It summarizes Spark MLlib, the Apache Spark library whose machine learning algorithms run in a distributed manner over large datasets. It also covers H2O, which provides fast machine learning algorithms and integrates with Spark through Sparkling Water, so that H2O models and algorithms can be used transparently from the Spark API. Examples of K-means clustering and logistic regression illustrate both MLlib and H2O.