A Comparative Study of Fast and Accurate Clustering Algorithms in Multi-Sized Data Sets
DOI:
https://doi.org/10.14738/tecs.122.14317
Keywords:
Cluster Analysis, Data Mining, Algorithms
Abstract
Unsupervised learning, or clustering, in large data sets is a challenging problem. Most clustering algorithms are not both efficient and accurate on such data sets, so developing algorithms capable of solving clustering problems in large data sets is very important. In this paper, we present an overview of various algorithms and approaches recently used for clustering large data sets and e-documents. We use the squared Euclidean norm to define the similarity measure. We then carry out a comparative study, implemented in Python, of the performance of the following clustering algorithms: the global k-means algorithm (GKM), the multi-start modified global k-means algorithm (MSMGKM), the multi-start k-means algorithm (MS-KM), the difference of convex clustering algorithm (DCA), and the incremental clustering algorithm based on the difference of convex representation of the cluster function and non-smooth optimization (DC-L2). CCS Concepts: Information systems → Data mining; Information systems → Data cleaning; Information systems → Clustering.
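As a rough illustration of two ideas named in the abstract, the squared Euclidean similarity measure and the multi-start k-means strategy, the following is a minimal pure-Python sketch. It is not the authors' implementation: the function names (`sq_dist`, `cluster_cost`, `kmeans`, `multi_start_kmeans`), the random initialisation, and the defaults `n_starts=10` and `max_iter=100` are assumptions made here for illustration only.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points given as sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def cluster_cost(points, centers):
    """Mean squared distance from each point to its nearest center."""
    return sum(min(sq_dist(p, c) for c in centers) for p in points) / len(points)


def kmeans(points, k, rng, max_iter=100):
    """Plain Lloyd-style k-means from one random initialisation (assumed scheme)."""
    centers = [list(p) for p in rng.sample(points, k)]
    for _ in range(max_iter):
        # Assignment step: group points by their nearest center.
        groups = [[] for _ in range(k)]
        for p in points:
            j = min(range(k), key=lambda i: sq_dist(p, centers[i]))
            groups[j].append(p)
        # Update step: move each center to the mean of its group.
        new_centers = []
        for j, g in enumerate(groups):
            if g:
                new_centers.append([sum(col) / len(g) for col in zip(*g)])
            else:
                new_centers.append(centers[j])  # keep the center of an empty cluster
        if new_centers == centers:
            break
        centers = new_centers
    return centers


def multi_start_kmeans(points, k, n_starts=10, seed=0):
    """Run k-means from several random starts and keep the lowest-cost result."""
    rng = random.Random(seed)
    best_centers, best_cost = None, float("inf")
    for _ in range(n_starts):
        centers = kmeans(points, k, rng)
        cost = cluster_cost(points, centers)
        if cost < best_cost:
            best_centers, best_cost = centers, cost
    return best_centers, best_cost
```

A call such as `multi_start_kmeans(data, k=2)`, where `data` is a list of equal-length numeric tuples, returns the best centers found and their cost. Restarting from several initial points only reduces the chance of stopping at a poor local minimum; it does not guarantee a global one.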
License
Copyright (c) 2024 Syed Quddus, Adil Bagirov
This work is licensed under a Creative Commons Attribution 4.0 International License.