Amalgamation of Unsupervised and Supervised Approaches for Data Labeling
DOI:
https://doi.org/10.14738/tmlai.1305.19355
Abstract
Data abundance is inevitable when every human activity revolves around the Internet of Things (IoT). The data is extensive, but it lacks the labels that machine learning models need to identify patterns and characteristics for accurate prediction and automation. Data labeling is therefore essential for putting this abundant data to use in applications such as customer relationship management systems, recommendation systems, and pattern recognition. We propose a novel approach called Amalgamation of Unsupervised and Supervised Approaches for Data Labeling (AUSL), which integrates clustering and classification using rule-based refinement. Given unlabeled data, AUSL offers a robust and interpretable framework for uncovering meaningful data labels. Ensemble-based clustering and AdaBoost SVM improve the selection of important attributes for data labeling; these attributes are then processed by association rule mining to extract the significant underlying data characteristics from the reduced domain. Experiments are conducted on four data sets to demonstrate the robustness of the proposed method. Compared with an existing method, AUSL performs promisingly, achieving finer labels with an average hit rate exceeding 90% and confidence levels above 80%. These results indicate the robustness, adaptability, and superior label refinement ability of the proposed method. In conclusion, AUSL provides a scalable, interpretable, and effective solution for structured data labeling, with strong potential for real-world deployment in various data-driven applications.
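The confidence figure quoted in the abstract refers to association rules. As a minimal sketch, assuming the standard definitions of rule support and confidence (the abstract does not give the paper's exact formulation), the quality of a rule of the form "attribute values imply cluster label" can be computed as follows. The function name `rule_metrics`, the attribute names, and the toy records are hypothetical and serve only as illustration; the cluster ids stand in for the output of a clustering step.

```python
def rule_metrics(records, antecedent, label):
    """Return (support, confidence) of the rule `antecedent -> label`.

    records:    list of (attributes_dict, cluster_label) pairs
    antecedent: dict mapping attribute name -> required value
    label:      cluster label on the right-hand side of the rule
    """
    n = len(records)
    # Labels of all records whose attributes satisfy the antecedent.
    matched = [lab for attrs, lab in records
               if all(attrs.get(k) == v for k, v in antecedent.items())]
    # Records that satisfy the antecedent AND carry the target label.
    both = sum(1 for lab in matched if lab == label)
    support = both / n if n else 0.0
    confidence = both / len(matched) if matched else 0.0
    return support, confidence


# Hypothetical toy data, not taken from the paper.
records = [
    ({"income": "high", "age": "young"}, 0),
    ({"income": "high", "age": "old"}, 0),
    ({"income": "high", "age": "young"}, 0),
    ({"income": "low", "age": "young"}, 1),
    ({"income": "low", "age": "old"}, 1),
    ({"income": "high", "age": "old"}, 1),
]

support, confidence = rule_metrics(records, {"income": "high"}, 0)
print(f"support={support:.2f}, confidence={confidence:.2f}")
```

Under these assumptions, a rule whose confidence exceeds a chosen threshold could be kept as a candidate description of the cluster, while a low-confidence rule would be discarded.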
License
Copyright (c) 2025 Sharanjit Kaur, Meenu Mohil, Ansh Sharma, Hardik Bhaniya, Harshita Singh, Manju Bhardwaj

This work is licensed under a Creative Commons Attribution 4.0 International License.
