Customer Personalization @ nearbuy

Personalization @ Nearbuy
Ankit Kohli

About Me
- Data Scientist at Nearbuy
- Creating a world of personalization for Nearbuy’s customers

AGENDA
- Creating Data Pipeline
- Kafka (real-time click stream)
- HBase (where the data sits)
- Data Transformation
- ML Pipeline
- Spark (data feature extraction)
- ML Algos (how to use Spark, and use cases of each algo)
- ALS

Data Pipeline @ Nearbuy
[Pipeline diagram; components shown: Schemaless DB, Click Stream, Kafka,
Spark Streaming, HBase, Spark ML, Recommendations, Real-Time Analytics]

KAFKA EVENTS
Deal view Event
{
  "customerId": "XXXXXXXXX",
  "timeStamp": "140983388484",
  "dealId": "YYYYY",
  "source": "APP",
  "os": "android",
  ...
}
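
As a rough illustration of consuming such an event, the sketch below parses one JSON message into a typed record. Only the five field names shown on the slide are used; the class name `DealViewEvent` and the helper `parse_deal_view` are hypothetical, and the unit of `timeStamp` is not stated in the deck, so the value is kept as a plain integer.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class DealViewEvent:
    """Hypothetical typed view of the deal-view event shown on the slide."""
    customer_id: str
    timestamp: int  # raw "timeStamp" value; its unit is not given in the deck
    deal_id: str
    source: str
    os: str


def parse_deal_view(raw: str) -> DealViewEvent:
    """Parse one JSON message; raises KeyError if a listed field is missing."""
    doc = json.loads(raw)
    return DealViewEvent(
        customer_id=doc["customerId"],
        timestamp=int(doc["timeStamp"]),
        deal_id=doc["dealId"],
        source=doc["source"],
        os=doc["os"],
    )


sample = (
    '{"customerId": "XXXXXXXXX", "timeStamp": "140983388484", '
    '"dealId": "YYYYY", "source": "APP", "os": "android"}'
)
event = parse_deal_view(sample)
```

Any fields elided on the slide are simply ignored by this parser.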

Spark Streaming
- How to implement Spark Streaming and set up jobs that run 24x7 to ingest
all click-stream data
- Implementing sessionization of customer activity on Nearbuy
- Transforming data and storing it in HBase
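
The deck does not say how sessions are defined. A common rule, assumed here, is an inactivity timeout: a new session starts when the gap between two consecutive clicks of the same customer exceeds a threshold. The plain-Python sketch below shows only that rule; the 30-minute value, the millisecond timestamps, and the function name `sessionize` are all assumptions, and a streaming job would additionally have to keep per-customer state between batches.

```python
from collections import defaultdict

SESSION_GAP_MS = 30 * 60 * 1000  # assumed inactivity timeout (30 minutes)


def sessionize(events, gap_ms=SESSION_GAP_MS):
    """Group (customer_id, timestamp_ms, deal_id) tuples into sessions.

    Returns {customer_id: [[deal_id, ...], ...]}, one inner list per session.
    """
    by_customer = defaultdict(list)
    for customer_id, ts, deal_id in events:
        by_customer[customer_id].append((ts, deal_id))

    sessions = {}
    for customer_id, clicks in by_customer.items():
        clicks.sort()  # order each customer's clicks by timestamp
        customer_sessions = []
        last_ts = None
        for ts, deal_id in clicks:
            # Start a new session on the first click or after a long gap.
            if last_ts is None or ts - last_ts > gap_ms:
                customer_sessions.append([])
            customer_sessions[-1].append(deal_id)
            last_ts = ts
        sessions[customer_id] = customer_sessions
    return sessions


clicks = [
    ("c1", 0, "d1"),
    ("c1", 5 * 60 * 1000, "d2"),       # 5 minutes later: same session
    ("c1", 2 * 60 * 60 * 1000, "d3"),  # 2 hours later: new session
    ("c2", 1000, "d1"),
]
result = sessionize(clicks)
```

With the sample clicks, customer `c1` ends up with two sessions and `c2` with one.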

Spark ML Pipeline
Data Feature Extraction - Categorical Data
The talk covers these Spark ML algorithms:
- Collaborative Filtering (implicit and explicit feedback)
- K-Means
- Linear Regression
Each algorithm is explained in depth, along with its use cases and how to
implement it using Spark.
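
To make "feature extraction for categorical data" concrete, here is a plain-Python illustration of indexing a categorical column and one-hot encoding it. It is not Spark's API, only the underlying idea; the helper names `fit_index` and `one_hot` are hypothetical, and the `ios` and `web` values are invented sample data (the deck shows only `android`).

```python
from collections import Counter


def fit_index(values):
    """Map each category to an index, most frequent first (ties alphabetical)."""
    counts = Counter(values)
    ordered = sorted(counts, key=lambda v: (-counts[v], v))
    return {value: i for i, value in enumerate(ordered)}


def one_hot(value, index):
    """Return a 0/1 vector with a single 1 at the category's index."""
    vec = [0] * len(index)
    vec[index[value]] = 1
    return vec


os_values = ["android", "ios", "android", "web"]  # made-up sample column
os_index = fit_index(os_values)
encoded = [one_hot(v, os_index) for v in os_values]
```

Each categorical field such as `source` or `os` becomes a numeric vector this way, which is the form that algorithms like K-Means and linear regression expect as input.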

Common Pitfalls
- Too much data
- Stream vs Batch Data
- Customer Sessionization
- Spark Cluster Mode Issues

Takeaways
- App personalization can take many forms; our use case (e-commerce) is one
of them
- Data visualization and ML model selection
- Anyone interested in getting started with data analytics will gain useful
insight
- Anyone starting out with ML will learn how to use Spark
- Anyone already doing this will see how other companies are implementing
Spark for ML
- Spark ML best practices
