Organization attributes
Other attributes
Clustering, or cluster analysis, is a process of unsupervised learning in which similar data points are identified and grouped together in order to help profile the attributes of different groups. The general aim of clustering is to maximize intra-cluster similarity while minimizing inter-cluster similarity. In other words, finding clusters such that the data points within a given cluster are as similar to each other as possible while the clusters themselves are as different from each other as possible.
There are several popular clustering algorithms used by data scientists depending on the specific applications involved. These popular algorithms include:
- K-means
- K-means++
- Hierachical clustering
- Density clustering
- Spectral clustering
- Consensus clustering
- Expectation-maximization
- Mean-shift
Clustering is a prominent technique for data / statistical analysis and exploratory data mining, with applications in numerous fields. Some of the use cases for clustering include:
- Market segmentation
- Summarizing news (e.g. Google News algorithms)
- Pattern recognition
- Recommender systems (e.g. Netflix, giveable, etc.)
- Sequence analysis (genetics)
- Identifying extreme areas for various types of crimes
- Many more