Binary Banyan Tree Growth Optimization: A Practical Approach to High-dimensional Feature Selection
High-dimensional feature spaces in Scientific and Technical Service Resources (STSR)
classification present significant challenges, including increased computational costs and
diminished accuracy. Identifying an optimal subset of features from raw text vectors is thus critical
for effective data classification. This paper introduces a novel metaheuristic algorithm called Binary
Banyan Tree Growth Optimization (BBTGO), specifically designed for high-dimensional feature
selection (FS). Inspired by the unique growth patterns of the banyan tree, BBTGO combines
several Boolean operators, namely the rooting, multi-trunk, and adjustment operators,
with a perturbation phase to improve search efficiency and reduce feature dimensionality.
These operators steer the search toward promising regions and reduce the number of features by
exploiting the optimal solutions clustered within subgroups. Furthermore, BBTGO incorporates a dynamic
adjustment mechanism that periodically activates different growth operators to meet the search
demands of high-dimensional space. We rigorously evaluate the exploration and exploitation
capabilities of BBTGO through comprehensive statistical analyses of various performance metrics.
The proposed method demonstrates superior results on 12 high-dimensional benchmark datasets
and is successfully applied to feature selection in STSR text classification tasks. Experimental
results show that BBTGO significantly outperforms existing methods in terms of classification
accuracy, number of selected features, convergence speed, and processing time. These results underscore the
potential of BBTGO as a robust and versatile solution for high-dimensional FS, with broad
applicability to real-world classification challenges.
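
The abstract does not spell out how a candidate solution is encoded or scored, so the following is only a minimal sketch of the generic binary feature-selection setup that methods of this kind usually build on. It is not the BBTGO algorithm itself. The function names, the weighted-sum fitness, the `alpha` weight, and the bit-flip perturbation are all illustrative assumptions rather than details taken from the paper.

```python
import random

def apply_mask(row, mask):
    """Keep only the features whose mask bit is True."""
    return [x for x, keep in zip(row, mask) if keep]

def fitness(error_rate, mask, alpha=0.99):
    """Assumed wrapper-style fitness (lower is better): a weighted sum of
    classification error and the fraction of features kept."""
    if not any(mask):
        return float("inf")  # an empty subset cannot be classified
    ratio = sum(mask) / len(mask)
    return alpha * error_rate + (1 - alpha) * ratio

def flip_bits(mask, rate, rng):
    """Generic perturbation (hypothetical, not the paper's operator):
    flip each bit independently with probability `rate`."""
    return [(not bit) if rng.random() < rate else bit for bit in mask]

# A candidate solution is a Boolean vector with one bit per feature.
mask = [True, False, True, False]
row = [0.2, 1.5, 3.1, 0.7]

reduced = apply_mask(row, mask)    # features 0 and 2 survive
score = fitness(0.10, mask)        # error rate of 0.10 is a made-up input
neighbor = flip_bits(mask, 0.25, random.Random(0))
```

In a full wrapper method, the `error_rate` argument would come from training and validating a classifier on the reduced feature vectors, and the perturbation would be replaced by the algorithm's own operators.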
Funding
This work is supported by the Shanghai Pujiang Program (No. 22PJ1403800), the National Natural Science Foundation of China (No. 62203290), the National Key Research and Development Program of China (No. 2019YFB1405500), and the 111 Project (No. D18003).
Author affiliation
College of Science & Engineering Comp' & Math' Sciences
Version
- AM (Accepted Manuscript)