Monday, January 19, 2026

Clustering



Step by step, we are getting a deeper understanding of Artificial Intelligence! “Clustering is an unsupervised machine learning technique used to group data points, objects, or observations into clusters based on their similarities or patterns. Unlike supervised learning, clustering works with unlabeled data, aiming to uncover hidden structures or relationships within the dataset”. The goal is for items that are similar to each other to end up in the same cluster, which makes patterns in the data easier to recognize. Consider a massive dataset with no labels: a clustering algorithm can identify patterns in it and split it into several groups.

The characteristics of each datapoint must be represented as numbers. Each characteristic (or dimension) gets a numeric value; even a color must be encoded as a number, so no text is involved. Each datapoint is then a point in an N-dimensional space, where N is the number of characteristics, and datapoints that lie close to each other form a cluster. Distance represents similarity: the closer two datapoints (or two clusters) are, the more alike they are. For many algorithms, such as k-means, the number of clusters must be defined upfront, and it should be chosen so that the clusters are clearly differentiated rather than sitting almost on top of each other.

Let us take the example of customer segmentation. Each characteristic, for example age, gender, location, or income, is given a numeric value. Based on these values, each customer (datapoint) is assigned a position in the N-dimensional space. Customers who are close to each other constitute a cluster. With this information, a personalized marketing campaign can be devised for each customer group (cluster).
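To make the customer segmentation example more concrete, here is a minimal sketch in plain Python. Everything in it is an assumption for illustration: the six customers are invented, the numeric codes for gender and location are arbitrary, and the grouping uses a simple k-means loop (one common clustering algorithm) with a deterministic "farthest point" start. The helper names `encode`, `rescale`, and `kmeans` are hypothetical, not taken from any library.

```python
import math

# Invented customers: (age, gender, location, annual income in thousands).
customers = [
    (23, "F", "north", 28),
    (25, "M", "north", 31),
    (27, "F", "north", 35),
    (45, "M", "south", 82),
    (48, "F", "south", 90),
    (51, "M", "south", 88),
]

# Every characteristic becomes a number, including the text-valued ones.
# These codes are an arbitrary choice made for this sketch.
GENDER = {"F": 0.0, "M": 1.0}
LOCATION = {"north": 0.0, "south": 1.0}


def encode(customer):
    """Turn one customer into a point with one number per characteristic."""
    age, gender, location, income = customer
    return [float(age), GENDER[gender], LOCATION[location], float(income)]


def rescale(points):
    """Min-max scale each dimension to [0, 1] so no single one dominates."""
    lows = [min(col) for col in zip(*points)]
    highs = [max(col) for col in zip(*points)]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(p, lows, highs)]
        for p in points
    ]


def kmeans(points, k, iterations=20):
    """Group points into k clusters; k has to be chosen upfront."""
    # Deterministic start: the first point, then the point farthest
    # from the centroids picked so far.
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        )
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assign each point to its nearest centroid (Euclidean distance).
        labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Move each centroid to the mean of the points assigned to it.
        for j in range(k):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


points = rescale([encode(c) for c in customers])
labels, centroids = kmeans(points, k=2)
for customer, label in zip(customers, labels):
    print(customer, "-> cluster", label)
```

On this toy data the three younger, lower-income customers should end up in one cluster and the three older, higher-income customers in the other. Rescaling matters here: without it, income (tens of thousands) would swamp gender and location (0 or 1) when distances are computed.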



Other examples of use cases include:

- Products: how many categories of products should an e-commerce site have? By identifying similarities between products, we can group them into clusters that become product categories.

- Recommendation systems: a streaming service such as Netflix can use viewing preferences to group users into categories and recommend movies or series to each group.

- Healthcare: patient stratification by symptoms or outcomes, where patients are grouped by type so that each group can receive personalized attention.
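The product question above ("how many categories?") comes down to choosing the number of clusters. One common heuristic, not described in this post and shown here only as a sketch, is to run the clustering for several values of k and watch how the total squared distance of the points to their cluster centers drops; the point where the drop flattens out is a reasonable candidate. The product data below is invented, and `kmeans_inertia` is a hypothetical helper written for this example.

```python
import math

# Invented products described by two already-scaled features
# (say, price and weight); three loose groups are built in on purpose.
products = [
    (0.10, 0.12), (0.12, 0.08), (0.08, 0.15),
    (0.50, 0.55), (0.55, 0.48), (0.48, 0.52),
    (0.90, 0.88), (0.92, 0.95), (0.88, 0.91),
]


def kmeans_inertia(points, k, iterations=20):
    """Run a simple k-means and return the total squared distance
    from each point to its nearest cluster center."""
    # Deterministic start: the first point, then the point farthest
    # from the centers picked so far.
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        )
    for _ in range(iterations):
        # Assign every point to its nearest center.
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: math.dist(p, centroids[j]))
            groups[nearest].append(p)
        # Move each center to the mean of its group (keep it if the group is empty).
        centroids = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else c
            for g, c in zip(groups, centroids)
        ]
    return sum(min(math.dist(p, c) for c in centroids) ** 2 for p in points)


inertia = {k: kmeans_inertia(products, k) for k in range(1, 6)}
for k, value in inertia.items():
    print("k =", k, "total squared distance =", round(value, 4))
```

With this toy data the total should fall sharply up to k = 3 and change little afterwards, which matches the three groups that were built into it. Real product data is rarely this tidy, so the result is a hint to weigh against business sense, not a verdict.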

Clustering is a major component of the Machine Learning toolkit: dividing a dataset into groups by recognizing patterns in the data is key to personalized attention. Now you have an idea of how a recommendation system like the one at Netflix can group its users! This is getting interesting!

 

Clustering in Machine Learning - GeeksforGeeks


