Clustering
Step by step, we are getting a deeper understanding of Artificial Intelligence!
“Clustering is an unsupervised machine learning technique used to group data
points, objects, or observations into clusters based on their similarities or
patterns. Unlike supervised learning, clustering works with unlabeled data,
aiming to uncover hidden structures or relationships within the dataset.” The
goal is for items that are similar to each other to end up in the same cluster,
which makes patterns in the data recognizable. Consider a massive dataset with
no labels: a clustering algorithm can identify patterns in it and sort the
datapoints into several groups.
The characteristics of each datapoint must be represented as numbers. Each
characteristic becomes one dimension, and even a non-numeric characteristic such
as a color must be encoded numerically, so no raw text is involved. Each
datapoint is then a point in an N-dimensional space, where N is the number of
characteristics. Datapoints that lie close to each other form a cluster, and the
distance between two datapoints (or between two clusters) measures how similar
they are: the smaller the distance, the greater the similarity. For some
algorithms, such as k-means, the number of clusters must be defined upfront. It
should be chosen so that the clusters are clearly distinct from one another
rather than overlapping.
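To make this concrete, here is a minimal Python sketch (an illustration, not a
definitive implementation) of how datapoints can be turned into numbers and
compared. The `encode` helper, the color list, and the sample items are all
hypothetical. The color is one-hot encoded (one 0/1 dimension per color), which
is just one common way of encoding text as numbers, and Euclidean distance is
assumed as the measure of similarity.

```python
import math

# Hypothetical color vocabulary, used only for this illustration.
COLORS = ["red", "green", "blue"]

def encode(height_cm, weight_kg, color):
    """Turn one item into a point (a list of numbers) in 5-dimensional space."""
    # One-hot encoding: one dimension per color, 1.0 for the matching color.
    one_hot = [1.0 if color == c else 0.0 for c in COLORS]
    return [float(height_cm), float(weight_kg)] + one_hot

a = encode(10, 2, "red")
b = encode(11, 2, "red")
c = encode(40, 9, "blue")

# Euclidean distance between points: smaller means more similar.
print("a vs b:", math.dist(a, b))  # nearly identical items, small distance
print("a vs c:", math.dist(a, c))  # very different items, large distance
```

Items `a` and `b` would land in the same cluster, while `c` sits far away from
both.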
Let us take the example of customer segmentation. Each characteristic, for
example age, gender, location, or income, is given a numeric value. Based on
these values, each customer (datapoint) is assigned a position in the
N-dimensional space. Customers who are close to each other constitute a
cluster. With this information, a personalized marketing campaign can be
devised for each customer group (cluster).
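As a rough illustration of customer segmentation, the sketch below runs a
minimal k-means on a handful of made-up customers described only by age and
income. k-means is one common clustering algorithm, chosen here as an
assumption; the `k_means` function, its parameters, and the data are
hypothetical and written from scratch for this example.

```python
import math
import random

def k_means(points, k, iterations=20, seed=0):
    """Minimal k-means: returns (centroids, cluster index for each point)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k distinct datapoints
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        assignments = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, a in zip(points, assignments) if a == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = [sum(dim) / len(members) for dim in zip(*members)]
    return centroids, assignments

# Made-up customers: [age in years, yearly income in thousands].
customers = [
    [22, 25], [25, 28], [24, 30],   # younger, lower income
    [48, 90], [52, 95], [50, 88],   # older, higher income
]

centroids, labels = k_means(customers, k=2)
for customer, label in zip(customers, labels):
    print(customer, "-> cluster", label)
```

The first three customers end up in one cluster and the last three in the
other. With real data, features on very different scales (age versus income,
for instance) are usually normalized first, and a library implementation would
normally be preferred over a hand-written loop.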
Other examples of use cases include:
- Products: how many categories of products should an e-commerce site have? By
  identifying similarities between products, we can group them into clusters
  that become the product categories.
- Recommendation systems: a streaming service such as Netflix can use user
  preferences to group users into categories and recommend movies or series to
  each group.
- Healthcare: patient stratification by symptoms or outcomes, grouping similar
  patients so that each group can receive personalized attention.
Clustering is a major component of the Machine Learning portfolio: dividing
datasets into groups by recognizing patterns in the data is key to personalized
attention. Now you have an idea of how a recommendation system like Netflix's
can group its users! This is getting interesting!
Clustering in Machine Learning - GeeksforGeeks