Target

The goal is to cluster customers in order to summarize customer segments.

Task

This notebook contains code for:

Analysis preparation

Prepare working directory

Importing libraries

Data Preparation

Create some new features in the dataset to define the customer personalities

Data cleaning

The last step of data preparation is to handle any outliers and missing values in the dataset.

Missing values

First I will count the number of missing values:
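A minimal sketch of this count, assuming the data is held in a pandas DataFrame. The tiny `df` below and its column names are hypothetical stand-ins, not the actual dataset:

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the customer dataset; column names are assumptions.
df = pd.DataFrame({
    "Income": [58138.0, np.nan, 71613.0, 26646.0, np.nan],
    "Age": [57, 60, 49, 30, 33],
})

# Count the missing values in each column.
missing_per_column = df.isnull().sum()
print(missing_per_column)

# Keep only the columns that actually contain missing values.
columns_with_missing = missing_per_column[missing_per_column > 0]
print(columns_with_missing)
```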

There are 24 null values, all in a single column, so the rows containing them can be deleted without affecting the valuable insights. The code below deletes the null values.
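One way to drop such rows, sketched on a hypothetical toy frame (the real dataset and its column names are not shown here, so `Income` and `Age` are assumptions):

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the customer dataset.
df = pd.DataFrame({
    "Income": [58138.0, np.nan, 71613.0, 26646.0, np.nan],
    "Age": [57, 60, 49, 30, 33],
})

rows_before = len(df)

# Drop every row that has at least one missing value, then rebuild the index.
df_clean = df.dropna().reset_index(drop=True)

rows_removed = rows_before - len(df_clean)
print(f"Removed {rows_removed} rows with missing values")
```

Comparing the row count before and after is a cheap check that only the expected number of rows was removed.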

Outliers

Finally, I'll detect and remove the outliers in the dataset.

I'll use box plots to find the outliers: an outlier is plotted as an individual point, while the rest of the population is grouped together and displayed as a box.
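By default, box plots draw as individual points the values lying more than 1.5 times the interquartile range (IQR) beyond the quartiles. The sketch below applies that same rule numerically, as one possible way to remove the points a box plot would flag. The `iqr_bounds` helper, the `Income` column, and the sample values are all hypothetical:

```python
import pandas as pd

def iqr_bounds(series, k=1.5):
    """Return the (lower, upper) fences used by the usual box-plot rule."""
    q1 = series.quantile(0.25)
    q3 = series.quantile(0.75)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

# Hypothetical income values with one obviously extreme entry.
df = pd.DataFrame({"Income": [30000, 42000, 51000, 58000, 63000, 70000, 666666]})

lower, upper = iqr_bounds(df["Income"])

# Flag values outside the fences, then keep only the rows inside them.
is_outlier = (df["Income"] < lower) | (df["Income"] > upper)
df_no_outliers = df[~is_outlier]

print(df[is_outlier])
```

The multiplier `k=1.5` is the conventional default; a larger value removes fewer points.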

The plots show:

Data Exploration

Modeling clusters

Customer segments

The next step is to look at the clustering of clients in the dataset by defining the segments of the clients.

I define 4 customer segments with 3 metrics:

The 4 customer segments are as follows:

First I'll normalize the data and then create customer clustering using the metrics above.
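A minimal sketch of this normalize-then-cluster step. The text does not name the clustering algorithm or list the three metrics here, so the choice of scikit-learn's `StandardScaler` and `KMeans`, the column names `Income`, `Seniority` and `Spending`, and the synthetic data are all assumptions made for illustration:

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Synthetic customers scattered around four hypothetical profiles.
rng = np.random.default_rng(0)
profiles = [(25000, 6, 100), (45000, 20, 500), (70000, 8, 1200), (90000, 22, 2000)]
rows = []
for income, seniority, spending in profiles:
    for _ in range(25):
        rows.append((
            income + rng.normal(0, 2000),
            seniority + rng.normal(0, 1),
            spending + rng.normal(0, 50),
        ))

metrics = ["Income", "Seniority", "Spending"]  # assumed metric names
df = pd.DataFrame(rows, columns=metrics)

# Normalize so that each metric has zero mean and unit variance;
# otherwise Income would dominate the distance computation.
scaled = StandardScaler().fit_transform(df[metrics])

# Assign each customer to one of 4 clusters (KMeans is an assumed choice).
kmeans = KMeans(n_clusters=4, n_init=10, random_state=0)
df["Cluster"] = kmeans.fit_predict(scaled)

print(df["Cluster"].value_counts().sort_index())
```

Scaling first matters because the metrics are on very different scales; without it, a distance-based method would effectively cluster on income alone.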

Now we have detailed income and spending data for each customer segment:
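A per-segment summary of this kind can be produced with a group-by. The sketch below assumes a `Cluster` label column plus `Income` and `Spending` columns; the names and values are hypothetical:

```python
import pandas as pd

# Hypothetical clustered customers.
df = pd.DataFrame({
    "Cluster": [0, 0, 1, 1, 2, 3],
    "Income": [20000.0, 30000.0, 50000.0, 60000.0, 80000.0, 95000.0],
    "Spending": [100.0, 200.0, 600.0, 800.0, 1500.0, 2100.0],
})

# Mean, minimum and maximum of each metric within every segment.
segment_summary = df.groupby("Cluster")[["Income", "Spending"]].agg(["mean", "min", "max"])

# Number of customers in each segment.
segment_sizes = df["Cluster"].value_counts().sort_index()

print(segment_summary)
print(segment_sizes)
```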

Next, I'll plot this data to display the clustering of customers:

Products segments

Now I will further determine which customers are the biggest spenders on each product.

First, I will define three segments of the customers according to age, income and seniority:
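One possible way to build such segments is to bin each attribute into quantile-based bands with `pd.qcut`. The number of bands, the labels, and the sample data below are assumptions for illustration, not the cut-offs actually used in the analysis:

```python
import pandas as pd

# Hypothetical customers; values are chosen only to illustrate the binning.
df = pd.DataFrame({
    "Age": [24, 31, 38, 45, 52, 59, 66, 73, 80],
    "Income": [18000, 25000, 32000, 41000, 50000, 58000, 67000, 79000, 92000],
    "Seniority": [2, 5, 8, 11, 14, 17, 20, 23, 26],
})

labels = ["Low", "Medium", "High"]  # assumed band names

# Split each attribute into three equally populated bands.
for column in ["Age", "Income", "Seniority"]:
    df[f"{column}_segment"] = pd.qcut(df[column], q=3, labels=labels)

print(df[["Age", "Age_segment", "Income_segment", "Seniority_segment"]])
```

`pd.cut` with explicit bin edges is the alternative when the bands should follow fixed thresholds rather than equal-sized groups.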

Second, I will define new segments according to the spending of customers on each product which will be based on:

Clusters interpretation

I'll export the data and interpret the results by visualizing the customer segments using Tableau.
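Tableau can read CSV files, so one simple export route is `DataFrame.to_csv`. The file name `customer_segments.csv`, the temporary output directory, and the columns below are hypothetical:

```python
import os
import tempfile

import pandas as pd

# Hypothetical segmented customers to export.
df = pd.DataFrame({
    "Cluster": [0, 1, 2, 3],
    "Income": [25000, 55000, 80000, 95000],
    "Spending": [150, 700, 1500, 2100],
})

# Write to a temporary directory here; point this at a real folder in practice.
output_dir = tempfile.mkdtemp()
output_path = os.path.join(output_dir, "customer_segments.csv")

# index=False keeps the pandas row index out of the exported file.
df.to_csv(output_path, index=False)

# Read the file back to confirm the export round-trips.
exported = pd.read_csv(output_path)
print(exported.head())
```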