This document provides an introduction to Apache Spark. It begins by explaining how Spark improves upon MapReduce by leveraging distributed memory for better performance and by supporting iterative algorithms. Spark is described as a general-purpose computational framework that retains the advantages of MapReduce, such as scalability and fault tolerance, while offering richer functionality through directed acyclic graph (DAG) execution and built-in libraries for machine learning. The document then discusses getting started with Spark and its execution modes: standalone, YARN client, and YARN cluster. Finally, it introduces core Spark concepts such as Resilient Distributed Datasets (RDDs), which are immutable collections of objects partitioned across the nodes of a cluster.
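
As a taste of the RDD concept the document introduces, here is a minimal Scala sketch using Spark's core API. The application name and the `local[*]` master are illustrative assumptions for running locally; in the standalone or YARN modes mentioned above, the master would instead be supplied at submit time.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Minimal sketch: create an RDD and run a simple computation on it.
// The app name and "local[*]" master are illustrative; real deployments
// pass the master via spark-submit (standalone, YARN client, or YARN cluster).
object RddExample {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("RddExample").setMaster("local[*]")
    val sc = new SparkContext(conf)

    // parallelize distributes a local collection across cluster partitions
    val numbers = sc.parallelize(1 to 100, numSlices = 4)

    // Transformations (map) are lazy; the action (reduce) triggers execution
    val sumOfSquares = numbers.map(n => n * n).reduce(_ + _)
    println(s"Sum of squares: $sumOfSquares")

    sc.stop()
  }
}
```

The lazy evaluation shown here is what lets Spark build a DAG of transformations and keep intermediate data in distributed memory, which is the source of its advantage over MapReduce for iterative workloads.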