This study compares feature selection methods for improving the predictive accuracy of macroeconomic forecasting models, using economic indicators for Iran drawn from World Bank data. Fourteen feature selection techniques, grouped into Filter, Wrapper, Embedded, and Similarity-based categories, were evaluated using Root Mean Square Error (RMSE) and Mean Absolute Error (MAE) under 10-fold cross-validation. Stepwise Selection, Tree-based approaches, and Similarity-based methods, especially those using Hausdorff and Euclidean distances, consistently outperformed the others, with average MAE values of 32.03 for Stepwise Selection and 62.69 for Hausdorff Distance. In contrast, Recursive Feature Elimination and Variance Thresholding performed worse, yielding significantly higher average MAE scores. Similarity-based approaches achieved an average rank of 9.125 across datasets, indicating robustness on high-dimensional macroeconomic data. These results suggest that combining similarity measures with traditional feature selection techniques can improve the efficiency and reliability of predictive models, offering useful guidance for researchers and policymakers in economic forecasting.
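As an illustration of the evaluation protocol described above, the following is a minimal sketch of how MAE and RMSE might be averaged over 10 folds. It is not the authors' implementation: the contiguous fold layout, the function names, and the `mean_baseline` predictor (a hypothetical stand-in for a model trained on a selected feature subset) are all assumptions made for the example, and the data are synthetic.

```python
import math

def mae(y_true, y_pred):
    """Mean Absolute Error: average of |y - y_hat|."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root Mean Square Error: square root of the mean squared residual."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def k_fold_indices(n_samples, k=10):
    """Yield (train_idx, test_idx) for k contiguous folds of near-equal size.

    Assumption: folds are contiguous and unshuffled; the source does not
    state how its folds were formed.
    """
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size

def mean_baseline(y_train, n_test):
    """Hypothetical predictor: always predicts the training mean."""
    mean_value = sum(y_train) / len(y_train)
    return [mean_value] * n_test

def cross_validate(y, predict_fn, k=10):
    """Return (average MAE, average RMSE) over k folds."""
    fold_mae, fold_rmse = [], []
    for train_idx, test_idx in k_fold_indices(len(y), k):
        y_train = [y[i] for i in train_idx]
        y_test = [y[i] for i in test_idx]
        y_pred = predict_fn(y_train, len(y_test))
        fold_mae.append(mae(y_test, y_pred))
        fold_rmse.append(rmse(y_test, y_pred))
    return sum(fold_mae) / k, sum(fold_rmse) / k

# Synthetic target series, used only to exercise the functions.
y = [float(i % 7) for i in range(50)]
avg_mae, avg_rmse = cross_validate(y, mean_baseline, k=10)
print(f"average MAE = {avg_mae:.3f}, average RMSE = {avg_rmse:.3f}")
```

In a comparison like the one summarized here, `predict_fn` would be replaced by a model fitted on the features chosen by each selection method, and the per-method averages would then be ranked. For serially dependent data, a forward-chaining split may be preferable to the plain k-fold layout assumed in this sketch.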