This document discusses scaling big data with Apache Spark. It gives an overview of Spark's philosophy of a unified engine that supports end-to-end applications through high-level APIs. It outlines some of the new features in Apache Spark 2.0, including improvements to the structured APIs, Structured Streaming, and new deep learning and graph processing libraries. It also covers Databricks initiatives to grow the Spark community through massive open online courses and a free community edition of the Databricks platform.
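To illustrate the unified-engine idea the overview refers to, here is a minimal Structured Streaming sketch in Scala: the same DataFrame/Dataset API used for batch jobs expresses a continuously updated word count. The socket source, host, and port are assumptions for illustration, not details from the document.

```scala
import org.apache.spark.sql.SparkSession

object StructuredStreamingSketch {
  def main(args: Array[String]): Unit = {
    // Local session for demonstration purposes only (assumed configuration).
    val spark = SparkSession.builder()
      .appName("StructuredStreamingSketch")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // Read a stream of text lines from a socket source (hypothetical host/port).
    val lines = spark.readStream
      .format("socket")
      .option("host", "localhost")
      .option("port", 9999)
      .load()

    // The same high-level API as batch queries: split lines into words and count them.
    val wordCounts = lines.as[String]
      .flatMap(_.split("\\s+"))
      .groupBy("value")
      .count()

    // Continuously write the running counts to the console.
    val query = wordCounts.writeStream
      .outputMode("complete")
      .format("console")
      .start()

    query.awaitTermination()
  }
}
```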