This document discusses Pearson's use of Apache Blur for distributed search and indexing of data from Kafka streams into Blur. It provides an overview of Pearson's learning platform and data architecture, describes the benefits of using Blur including its scalability, fault tolerance and query support. It also outlines the challenges of integrating Kafka streams with Blur using Spark and the solution developed to provide a reliable, low-level Kafka consumer within Spark that indexes messages from Kafka into Blur in near real-time.