Classifying an Object using Class Differentiators

  • Seunghyun Im University of Pittsburgh at Johnstown
  • Li-Shiang Tsay North Carolina A&T University
Keywords: Classification algorithm, Supervised Classifier, Categorical Data, Reduct, Rough Set

Abstract

This paper presents a supervised classification method that classifies an object using class differentiators. The class differentiators of a class are the smallest set of values that effectively distinguishes that class from the others. Class membership is determined by the degree of homology between the test object and the class differentiators. Unlike many rule-based classifiers, the proposed algorithm requires no input parameters and always produces the same results from the same data set. The algorithm is designed to work with categorical data and is particularly useful when quantifying the data is infeasible. We present an experimental result to show the validity of the algorithm.
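
The idea can be illustrated with a minimal sketch. It is not the algorithm from the paper: it assumes, for simplicity, that a differentiator is a single (attribute, value) pair that occurs in exactly one class, and that the degree of homology is the number of such pairs a test object shares with a class. The function names `class_differentiators` and `classify`, and the toy data in the usage note, are hypothetical.

```python
from collections import defaultdict


def class_differentiators(rows, labels):
    """Map each class to the (attribute, value) pairs seen only in that class.

    Assumption: single pairs stand in for the paper's differentiators,
    which may be defined over sets of values.
    """
    seen = defaultdict(set)
    for row, label in zip(rows, labels):
        seen[label].update(row.items())

    diffs = {}
    for label, pairs in seen.items():
        # Collect every pair that appears in any other class.
        others = set()
        for other, other_pairs in seen.items():
            if other != label:
                others |= other_pairs
        diffs[label] = pairs - others
    return diffs


def classify(obj, diffs):
    """Return the class whose differentiators overlap most with obj.

    Assumption: overlap count approximates the degree of homology.
    Labels are sorted first, so ties always resolve to the same class.
    """
    pairs = set(obj.items())
    return max(sorted(diffs), key=lambda label: len(pairs & diffs[label]))
```

For example, with four toy training objects labeled `apple` or `banana`, `classify({"color": "yellow", "shape": "long", "size": "large"}, diffs)` picks `banana`, because that object shares two pairs found only in the banana class and none found only in the apple class. The sketch takes no tuning parameters and involves no randomness, which mirrors the two properties the abstract emphasizes.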

Published
2014-11-03