Apache Spark is an open-source, general-purpose cluster computing framework that provides parallel, fault-tolerant data processing across clusters of commodity hardware. Originally developed at UC Berkeley, Spark is now maintained by the Apache Software Foundation. Spark uses Resilient Distributed Datasets (RDDs) as its main programming abstraction and includes components for streaming, SQL queries, machine learning, and graph processing, as illustrated by the sketch below.
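To make the RDD abstraction concrete, here is a minimal word-count sketch using Spark's core Scala API (SparkConf, SparkContext, parallelize, reduceByKey); the application name, master URL, and sample input are illustrative assumptions, not part of the original text.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object WordCount {
  def main(args: Array[String]): Unit = {
    // Run locally on all cores; on a real cluster the master URL would differ (assumption).
    val conf = new SparkConf().setAppName("WordCount").setMaster("local[*]")
    val sc = new SparkContext(conf)

    // Build an RDD from an in-memory collection; Spark partitions it across workers.
    val lines = sc.parallelize(Seq("spark is fast", "spark is fault tolerant"))

    val counts = lines
      .flatMap(_.split(" "))   // split each line into words
      .map(word => (word, 1))  // pair each word with a count of 1
      .reduceByKey(_ + _)      // sum counts per word across partitions

    counts.collect().foreach(println)
    sc.stop()
  }
}
```

The transformations (flatMap, map, reduceByKey) are evaluated lazily; work is only distributed across the cluster when an action such as collect() is called, and lost partitions can be recomputed from the RDD's lineage, which is how Spark provides fault tolerance without replicating intermediate data.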