Cluster Detection through a Model-Based Clustering Approach

Vikas Singh, Mukesh Kumar


Cluster analysis is an important technique for forming homogeneous groups of objects or items. Such groupings are often essential for the proper analysis of data, which makes clustering especially useful for large datasets. In clustering, the groups (clusters) are found on the basis of some characteristics of interest. In heuristic clustering techniques, similarity measures or distance matrices are computed from the chosen characteristics and the cluster analysis is performed on them. A more statistically principled alternative is the model-based clustering approach, which rests on distributional assumptions about the observations: the data points are assumed to be drawn from a mixture distribution, with each component of the mixture corresponding to a cluster. Mixtures with multivariate normal components give particularly interesting results compared with other component distributions. In this paper, we attempt to find clusters by fitting a mixture of multivariate normal components to a real dataset on student enrollment in higher education in India. The dataset is taken from the All India Survey on Higher Education (AISHE) report for the year 2017-18. The Bayesian Information Criterion (BIC) is used to determine the optimal number of clusters and to assess the adequacy of the model.


Model-based clustering, EM algorithm, Datasets, Bayesian Information Criterion.
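
For orientation, the model described in the abstract can be sketched in its standard textbook form. The notation here is an assumption introduced for illustration and is not taken from the paper: $G$ is the number of clusters, $\pi_k$ are mixing proportions, $\mu_k$ and $\Sigma_k$ are the component means and covariance matrices, $d$ is the data dimension, $n$ is the number of observations, and $m$ is the number of free parameters (counted here for unrestricted covariance matrices).

```latex
% Finite mixture of multivariate normal components
f(x \mid \theta) = \sum_{k=1}^{G} \pi_k \, \phi(x \mid \mu_k, \Sigma_k),
\qquad \pi_k \ge 0, \quad \sum_{k=1}^{G} \pi_k = 1,

% where \phi is the d-variate normal density
\phi(x \mid \mu_k, \Sigma_k)
  = (2\pi)^{-d/2} \, |\Sigma_k|^{-1/2}
    \exp\!\left\{ -\tfrac{1}{2} (x - \mu_k)^{\top} \Sigma_k^{-1} (x - \mu_k) \right\}.

% Log-likelihood maximised by the EM algorithm
\ell(\theta) = \sum_{i=1}^{n} \log \sum_{k=1}^{G} \pi_k \, \phi(x_i \mid \mu_k, \Sigma_k).

% E-step: posterior probability that observation i belongs to cluster k
z_{ik} = \frac{\pi_k \, \phi(x_i \mid \mu_k, \Sigma_k)}
              {\sum_{j=1}^{G} \pi_j \, \phi(x_i \mid \mu_j, \Sigma_j)}.

% BIC for a fitted model with G components
\mathrm{BIC}(G) = -2 \, \ell(\hat{\theta}_G) + m \log n,
\qquad m = (G - 1) + G d + G \, \frac{d(d+1)}{2}.
```

Under this sign convention the preferred number of clusters is the $G$ with the smallest BIC. Some software reports the negative of this quantity, in which case the largest value is preferred; which convention the paper follows is not stated in the abstract. Each observation is typically assigned to the cluster with the largest $z_{ik}$.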

