
SAP HANA PAL - K-Means Algorithm or How To Do Cust... - SAP Community-A

The document describes how to use the K-Means algorithm in SAP HANA to perform customer segmentation on phone usage data. It involves first creating a table to store customer phone data with 30 rows of sample data representing 3 customer segments. Then it generates the PAL procedure by defining table types for the input data, output assignments and centers, and control parameters. This prepares the data and framework for running the K-Means algorithm to cluster the customers.

Uploaded by

jefferyleclerc

3/13/24, 9:25 PM SAP HANA PAL – K-Means Algorithm or How to do Cust...

- SAP Community

Prepare the Data

The first step is to create a table that holds information on customers' mobile phone usage habits, with the following structure:

CREATE COLUMN TABLE "TELCO" (
    "ID" INTEGER NOT NULL,              -- Customer ID
    "AVG_CALL_DURATION" DOUBLE,         -- Average Call Duration
    "AVG_NUMBER_CALLS_RCV_DAY" DOUBLE,  -- Average Calls Received per Day
    "AVG_NUMBER_CALLS_ORI_DAY" DOUBLE,  -- Average Calls Originated per Day
    "DAY_TIME_CALLS" DOUBLE,            -- Percentage of Calls made during day time hours (9 a.m. - 6 p.m.)
    "WEEK_DAY_CALLS" DOUBLE,            -- Percentage of Calls made during week days (Monday thru Friday)
    "CALLS_TO_MOBILE" DOUBLE,           -- Percentage of Calls made to mobile phones
    "SMS_RCV_DAY" DOUBLE,               -- Number of SMSs received per day
    "SMS_ORI_DAY" DOUBLE,               -- Number of SMSs sent per day
    PRIMARY KEY ("ID")
);

https://community.sap.com/t5/technology-blogs-by-members/sap-hana-pal-k-means-algorithm-or-how -to-do-customer-segmentation-for-the/ba-p/12976696/page/2 4/39



Each row in this table represents a unique customer. Now I need to fill it, but I do not have access to real data, so I built my own dataset. I created 30 different customers (30 rows) that can be grouped into 3 segments:

Segment 1: Customer IDs 1 thru 10. Customers in this segment usually have short calls and originate or receive a low number of calls. They call more in the evening, more often during the weekend, and mostly to mobile lines. They send and receive a fair number of SMSs. This segment could represent personal mobile users.
Segment 2: Customer IDs 10001 thru 10010. Customers in this segment have an average call duration and originate or receive an average number of calls. They usually call during business hours and on week days. They send or receive a small number of SMSs. This segment could represent small business users.
Segment 3: Customer IDs 20001 thru 20010. Customers in this segment usually have long calls. They usually call during business hours and on week days, mostly to mobile lines, and they use SMSs heavily. This segment could represent enterprise business users.
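
Once the table is populated, a quick sanity check on the segment sizes can be useful. The query below is only a sketch: it assumes the "TELCO" table has been filled using the ID ranges described above and that it is run in the schema where "TELCO" was created. The segment labels are illustrative names, not part of the original dataset.

```sql
-- Count customers per expected segment, using the ID ranges described above.
-- Any row outside those ranges is reported as 'Unexpected ID'.
SELECT "SEGMENT", COUNT(*) AS "CUSTOMERS"
FROM (
    SELECT CASE
               WHEN "ID" BETWEEN 1     AND 10    THEN 'Segment 1 (personal)'
               WHEN "ID" BETWEEN 10001 AND 10010 THEN 'Segment 2 (small business)'
               WHEN "ID" BETWEEN 20001 AND 20010 THEN 'Segment 3 (enterprise)'
               ELSE 'Unexpected ID'
           END AS "SEGMENT"
    FROM "TELCO"
) AS T
GROUP BY "SEGMENT"
ORDER BY "SEGMENT";
```

If the data matches the description, each of the three segments should report 10 customers and no 'Unexpected ID' row should appear.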

The resulting table looks like this:


Generate the PAL procedure

Now that I have my dataset, I'm ready to start coding. The first step is to generate the PAL procedure by calling the AFL Wrapper Generator. To do so, I need to create several table types that define the structure of the procedure's input and output parameters:

SET SCHEMA _SYS_AFL;

/* Table Type that will be used as the output parameter
   that will contain which cluster has been assigned to each
   customer and what is the distance to the mean of the cluster */
DROP TYPE PAL_KMEANS_RESASSIGN_TELCO;
CREATE TYPE PAL_KMEANS_RESASSIGN_TELCO AS TABLE(
    "ID" INT,
    "CENTER_ASSIGN" INT,
    "DISTANCE" DOUBLE
);
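
To make the purpose of this output structure more concrete, here is a hedged sketch of how the assignment results might be summarized after the algorithm has run. The table name PAL_KMEANS_RESASSIGN_TAB_TELCO is hypothetical: it stands for whatever table ends up holding rows of the type defined above.

```sql
-- Summarize the cluster assignments: how many customers fall into each
-- cluster and how far, on average and at most, they are from its center.
-- PAL_KMEANS_RESASSIGN_TAB_TELCO is a hypothetical results table name.
SELECT "CENTER_ASSIGN",
       COUNT(*)        AS "CUSTOMERS",
       AVG("DISTANCE") AS "AVG_DISTANCE",
       MAX("DISTANCE") AS "MAX_DISTANCE"
FROM PAL_KMEANS_RESASSIGN_TAB_TELCO
GROUP BY "CENTER_ASSIGN"
ORDER BY "CENTER_ASSIGN";
```

A cluster whose maximum distance is much larger than its average may contain customers that do not fit the segment well.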


/* Table Type that will be used as the input parameter
   that will contain the data that I would like to cluster */
DROP TYPE PAL_KMEANS_DATA_TELCO;
CREATE TYPE PAL_KMEANS_DATA_TELCO AS TABLE(
    "ID" INT,
    "AVG_CALL_DURATION" DOUBLE,
    "AVG_NUMBER_CALLS_RCV_DAY" DOUBLE,
    "AVG_NUMBER_CALLS_ORI_DAY" DOUBLE,
    "DAY_TIME_CALLS" DOUBLE,
    "WEEK_DAY_CALLS" DOUBLE,
    "CALLS_TO_MOBILE" DOUBLE,
    "SMS_RCV_DAY" DOUBLE,
    "SMS_ORI_DAY" DOUBLE,
    primary key("ID")
);

/* Table Type that will be used as the output parameter
   that will contain the centers for each cluster */
DROP TYPE PAL_KMEANS_CENTERS_TELCO;
CREATE TYPE PAL_KMEANS_CENTERS_TELCO AS TABLE(
    "CENTER_ID" INT,
    "V000" DOUBLE,
    "V001" DOUBLE,
    "V002" DOUBLE,
    "V003" DOUBLE,
    "V004" DOUBLE,
    "V005" DOUBLE,
    "V006" DOUBLE,
    "V007" DOUBLE
);
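
The generic column names V000 thru V007 are harder to read than the input column names. The sketch below shows one way the centers could be relabeled for inspection. It rests on two assumptions that are not confirmed by the text above: that the eight center coordinates follow the same order as the eight feature columns of the input type, and that the centers are stored in a table called PAL_KMEANS_CENTERS_TAB_TELCO (a hypothetical name).

```sql
-- Relabel the center coordinates with the input feature names.
-- Assumption: V000..V007 follow the column order of PAL_KMEANS_DATA_TELCO.
-- PAL_KMEANS_CENTERS_TAB_TELCO is a hypothetical results table name.
SELECT "CENTER_ID",
       "V000" AS "AVG_CALL_DURATION",
       "V001" AS "AVG_NUMBER_CALLS_RCV_DAY",
       "V002" AS "AVG_NUMBER_CALLS_ORI_DAY",
       "V003" AS "DAY_TIME_CALLS",
       "V004" AS "WEEK_DAY_CALLS",
       "V005" AS "CALLS_TO_MOBILE",
       "V006" AS "SMS_RCV_DAY",
       "V007" AS "SMS_ORI_DAY"
FROM PAL_KMEANS_CENTERS_TAB_TELCO
ORDER BY "CENTER_ID";
```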

/* Table Type that will be used to specify
   the different parameters to run the KMeans Algorithm */
DROP TYPE PAL_CONTROL_TELCO;
CREATE TYPE PAL_CONTROL_TELCO AS TABLE(
    "NAME" VARCHAR (50),
    "INTARGS" INTEGER,
    "DOUBLEARGS" DOUBLE,
    "STRINGARGS" VARCHAR (100)
);
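
To illustrate how a control structure of this shape is typically used, here is a minimal sketch of a parameter table with one row per parameter. The table name PAL_CONTROL_TAB_TELCO is hypothetical. GROUP_NUMBER is set to 3 to match the three segments in the dataset, while the MAX_ITERATION value is purely illustrative. The parameter names should be checked against the PAL reference for the installed HANA revision.

```sql
-- Hypothetical control table with the same columns as PAL_CONTROL_TELCO.
CREATE COLUMN TABLE PAL_CONTROL_TAB_TELCO (
    "NAME" VARCHAR (50),
    "INTARGS" INTEGER,
    "DOUBLEARGS" DOUBLE,
    "STRINGARGS" VARCHAR (100)
);

-- One row per parameter; only the column matching the value type is filled.
INSERT INTO PAL_CONTROL_TAB_TELCO VALUES ('GROUP_NUMBER', 3, NULL, NULL);    -- number of clusters (3 segments)
INSERT INTO PAL_CONTROL_TAB_TELCO VALUES ('MAX_ITERATION', 100, NULL, NULL); -- illustrative value
```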

/* This table is used to generate the KMeans procedure
