Feature Selection and an Ensemble Framework for Metagenomic Data

Zoltán Pödör; Máté Hekfusz

doi:10.14738/tmlai.1304.19266

Authors

Zoltán Pödör, Eötvös Loránd University, Faculty of Informatics, Budapest, H-1117, Hungary
Máté Hekfusz, Eötvös Loránd University, Faculty of Informatics, Budapest, H-1117, Hungary

DOI:

https://doi.org/10.14738/tmlai.1304.19266

Keywords:

Feature Selection, Classification, Ensemble Framework, Genome Data

Abstract

Genome data, characterized by its high dimensionality and complexity, presents significant challenges for computational analysis and biological interpretation. Feature selection (FS) plays a crucial role in reducing dimensionality, improving model interpretability, and enhancing predictive performance by identifying the most informative genomic attributes. In this study, we construct a robust, generalisable ensemble framework for feature selection and machine learning (ML) classification of metagenomic data. The framework combines six feature selection algorithms of different types in an ensemble. We comprehensively assess four ML classifiers to pair with them and three aggregation methods to combine their results, testing numerous configurations to find which perform best. Our results show that Random Forest is a general and reliable algorithm for metagenomic datasets. Consistent with the literature, we found that feature selection universally improves classification performance, though the size of the improvement varies by dataset and, for non-wrapper methods, depends on choosing the right subset size. Judged by their best scores, the six FS algorithms performed broadly similarly across the data; the largest differences appeared on the hardest-to-classify datasets, where mRMR and Boruta edged out the others.

