The Efficiency of Aggregation Methods in Ensemble Filter Feature Selection Models

Authors

  • Noureldien Noureldien UST
  • Saffa Mohmoud

DOI:

https://doi.org/10.14738/tmlai.94.10101

Keywords:

Filter feature selection methods, Ensemble models, Aggregation methods, Ranked lists.

Abstract

Ensemble feature selection is recommended because it has been shown to produce a more stable feature subset and better classification accuracy than individual feature selection methods. In this approach, the outputs of several feature selection methods, called base selectors, are combined using an aggregation method. For filter feature selection methods, a list aggregation method is needed to merge the ranked lists they output into a single list. Because many list aggregation methods have been proposed, which one to use to build the optimum ensemble model remains an open question.
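As a rough illustration of what a base selector's ranked list looks like, the sketch below turns per-feature filter scores into ranks. The function name `scores_to_ranks`, the selector names, and all score values are hypothetical and are not taken from the paper; it also assumes that a higher score means a more relevant feature.

```python
def scores_to_ranks(scores):
    """Map feature -> score to feature -> rank, where rank 1 is the highest score.

    Ties are broken by feature name so the result is deterministic
    (an assumption made for this sketch).
    """
    ordered = sorted(scores, key=lambda f: (-scores[f], f))
    return {feature: position for position, feature in enumerate(ordered, start=1)}


# Hypothetical scores from two base selectors over the same four features.
info_gain = {"f1": 0.42, "f2": 0.10, "f3": 0.35, "f4": 0.05}
chi_square = {"f1": 3.1, "f2": 7.8, "f3": 5.0, "f4": 0.9}

# One ranked list per base selector; these are the inputs to list aggregation.
ranked_lists = [scores_to_ranks(info_gain), scores_to_ranks(chi_square)]
```

Each base selector may order the features differently, which is why a separate aggregation step is needed to obtain one final list.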

In this paper, we investigate the efficiency of four aggregation methods, namely Min, Median, Arithmetic Mean, and Geometric Mean. The performance of the aggregation methods is evaluated using five datasets from different scientific fields with varying numbers of instances and features. In addition, the classifiers used in the evaluation are selected from three different families: Trees, Rules, and Bayes.
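The four aggregation methods named above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: each base selector is assumed to give every feature a rank (1 = best), the combined value for a feature is the chosen statistic of its ranks, and ties are broken by feature name. The names `AGGREGATORS` and `aggregate` and the example ranks are hypothetical.

```python
import statistics

# Each aggregator reduces one feature's ranks (one per base selector) to a single value.
AGGREGATORS = {
    "min": min,
    "median": statistics.median,
    "arithmetic_mean": statistics.fmean,
    "geometric_mean": statistics.geometric_mean,
}


def aggregate(ranked_lists, method):
    """Combine per-selector ranks into one list of features, best first.

    ranked_lists: list of dicts mapping feature -> rank (1 = best).
    Ties in the combined value are broken by feature name (an assumption).
    """
    combine = AGGREGATORS[method]
    features = ranked_lists[0].keys()
    combined = {f: combine([ranks[f] for ranks in ranked_lists]) for f in features}
    return sorted(features, key=lambda f: (combined[f], f))


# Hypothetical ranks from three base selectors over four features.
ranked_lists = [
    {"f1": 1, "f2": 3, "f3": 2, "f4": 4},
    {"f1": 3, "f2": 1, "f3": 2, "f4": 4},
    {"f1": 2, "f2": 4, "f3": 1, "f4": 3},
]

for name in AGGREGATORS:
    print(name, aggregate(ranked_lists, name))
```

On this toy input, Min leaves three features tied at rank 1, so the tie-breaking rule decides their order, while Median and the two means separate them. Differences of this kind are one reason the choice of aggregation method can matter.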

The experimental results show that 11 of the 15 best performance results correspond to ensemble models. Among these 11 ensemble models, the most efficient aggregation method is Median (5/11), followed by Arithmetic Mean (3/11) and Min (3/11). The results also show that as the number of features increases, the most efficient aggregation method changes from Min to Median to Arithmetic Mean, which may suggest that Arithmetic Mean is the efficient choice for a very high number of features. In general, no aggregation method is best in all cases.

Published

2021-08-17

How to Cite

Noureldien, N., & Mohmoud, S. (2021). The Efficiency of Aggregation Methods in Ensemble Filter Feature Selection Models. Transactions on Engineering and Computing Sciences, 9(4), 39–51. https://doi.org/10.14738/tmlai.94.10101

Section

Special Issue: 1st International Conference on Affective Computing, Machine Learning and Intelligent Systems