Ankit Kohli of Nearbuy describes the company's personalization work: building a data pipeline in Kafka for clickstream data, transforming that data with Spark Streaming, and storing it in HBase. Kohli then covers an ML pipeline in Spark that extracts features and runs algorithms such as ALS, k-means, and linear regression for recommendations and real-time analytics. He also addresses common challenges, including data volume, streaming versus batch processing, and Spark cluster issues.
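
For readers unfamiliar with ALS (alternating least squares), the recommendation algorithm named above, the following is a minimal sketch of the idea in plain Python. It is not Nearbuy's implementation and does not use Spark; it assumes a rank-1 factorization for brevity, and the function names (`als_rank1`, `predict`), the regularization value, and the ratings data are all hypothetical. Spark MLlib ships its own distributed ALS, which supports higher ranks.

```python
# Illustrative only: rank-1 ALS on a tiny, made-up ratings dictionary.
# All names and data here are assumptions for the sake of the sketch.

def als_rank1(ratings, n_users, n_items, reg=0.1, iters=20):
    """Fit ratings[(u, i)] ~ user[u] * item[i] by alternating least squares."""
    user = [1.0] * n_users
    item = [1.0] * n_items
    for _ in range(iters):
        # Hold item factors fixed; each user factor has a closed-form
        # regularized least-squares solution over that user's observed items.
        for u in range(n_users):
            num = sum(r * item[i] for (uu, i), r in ratings.items() if uu == u)
            den = reg + sum(item[i] ** 2 for (uu, i) in ratings if uu == u)
            user[u] = num / den
        # Hold user factors fixed; solve for each item factor the same way.
        for i in range(n_items):
            num = sum(r * user[u] for (u, ii), r in ratings.items() if ii == i)
            den = reg + sum(user[u] ** 2 for (u, ii) in ratings if ii == i)
            item[i] = num / den
    return user, item


def predict(user, item, u, i):
    """Predicted score for a (user, item) pair, observed or not."""
    return user[u] * item[i]


# Hypothetical click-derived scores: (user_id, item_id) -> score.
ratings = {
    (0, 0): 5.0, (0, 1): 4.0,
    (1, 0): 2.5, (1, 2): 1.0,
    (2, 1): 4.0, (2, 2): 2.0,
}
user, item = als_rank1(ratings, n_users=3, n_items=3)
```

Calling `predict(user, item, 1, 1)` then estimates a score for a user-item pair that never appears in `ratings`, which is the step a recommender would use to rank unseen items.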