Comparison of machine learning algorithms in predictive classification and the selection of important features
Abstract
This study compared Machine Learning algorithms in terms of their predictive ability in classification and their identification of the features that contribute most to it. The algorithms, evaluated mainly on classification accuracy, were Support Vector Classification (SVC), multinomial Logistic Regression, Stochastic Gradient Descent (SGD), Decision Trees, K-Nearest Neighbors (K-NN), Gaussian Naive Bayes, and Neural Networks, as well as the ensemble methods Random Forest and Extra Trees. Optimal parameters for the algorithms were sought using grid search (GridSearch), while AdaBoost and cross-validation were applied to enhance the results. The dataset used was 'Forest Covertype' (n = 581,012) from the UCI Machine Learning Repository, which contains information on various forest cover types; the aim was to predict the type of forest cover. The algorithms were evaluated on both the original and the standardized data. The results showed that the K-NN algorithm had the highest accuracy on the original data, while the Random Forest and Extra Trees algorithms exhibited the highest accuracy in both cases. Standardization of the data had no effect on the accuracy of the Decision Trees, Random Forest, Extra Trees, and multinomial Logistic Regression algorithms, improved the accuracy of the SVC, Neural Networks, and SGD algorithms, and reduced the accuracy of the K-NN and Gaussian Naive Bayes algorithms. The feature importance analysis showed that elevation, soil type, and wilderness area contributed the most to the classification. Finally, the predicted class of a random data vector was the same across all algorithms applied to the standardized data, whereas on the original data it differed for the K-NN and Extra Trees algorithms.
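The contrast reported above between distance-based and tree-based methods under standardization can be illustrated with a minimal sketch. It is not the authors' code and does not use the Covertype data: the four training rows, the query vector, the feature names (elevation, slope), and the slope threshold of 18 are hypothetical values chosen for illustration, and a 1-nearest-neighbour rule and a single-threshold stump stand in for K-NN and Decision Trees.

```python
import math
import statistics


def fit_scaler(rows):
    """Return per-column means and population standard deviations."""
    cols = list(zip(*rows))
    means = [statistics.fmean(c) for c in cols]
    stds = [statistics.pstdev(c) or 1.0 for c in cols]  # guard constant columns
    return means, stds


def scale(row, means, stds):
    """Z-score one row: (x - mean) / std for each feature."""
    return [(x - m) / s for x, m, s in zip(row, means, stds)]


def predict_1nn(train_x, train_y, query):
    """Label of the training row closest to the query (Euclidean distance)."""
    dists = [math.dist(r, query) for r in train_x]
    return train_y[dists.index(min(dists))]


def predict_stump(row, feature, threshold, left, right):
    """One-split decision rule: `left` if row[feature] <= threshold, else `right`."""
    return left if row[feature] <= threshold else right


# Hypothetical toy data: [elevation in metres, slope in degrees].
# The label depends on slope only, but elevation has a much larger range.
train_x = [[2000, 5], [2100, 30], [3000, 6], [3100, 32]]
train_y = ["A", "B", "A", "B"]
query = [2090, 7]

means, stds = fit_scaler(train_x)
train_scaled = [scale(r, means, stds) for r in train_x]
query_scaled = scale(query, means, stds)

# Distance-based rule: the prediction depends on the feature scales.
knn_raw = predict_1nn(train_x, train_y, query)
knn_scaled = predict_1nn(train_scaled, train_y, query_scaled)

# Threshold rule on slope: z-scoring is monotonic per feature, so the same
# split expressed in scaled units sends the query to the same side.
slope_threshold = 18  # hypothetical split point
stump_raw = predict_stump(query, 1, slope_threshold, "A", "B")
scaled_threshold = (slope_threshold - means[1]) / stds[1]
stump_scaled = predict_stump(query_scaled, 1, scaled_threshold, "A", "B")

print("1-NN  raw:", knn_raw, "| standardized:", knn_scaled)
print("stump raw:", stump_raw, "| standardized:", stump_scaled)
```

In this toy case the 1-NN prediction changes after scaling because elevation no longer dominates the Euclidean distance, while the stump returns the same label. The sketch shows only that predictions can change; it says nothing about whether accuracy rises or falls on real data.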
Article Details
- How to Cite
ΠΑΠΑΦΙΛΙΠΠΟΥ Ν., Kyrana, Z., Pratsinakis, E., Dordas, C., Markos, A., & Menexes, G. (2026). Σύγκριση αλγορίθμων μηχανικής μάθησης στην προβλεπτική ταξινόμηση και την επιλογή των σημαντικών χαρακτηριστικών. Data Analysis Bulletin, 21(1). Retrieved from https://ejournals.epublishing.ekt.gr/index.php/dab/article/view/39468
- Section
- Empirical studies

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Authors who publish their work in the journal DATA ANALYSIS BULLETIN agree to the following terms:
1. Authors will not be charged any submission, processing or publication fees for their work. These costs are covered by the Greek Society of Data Analysis.
2. The copyright of papers published in the journal DATA ANALYSIS BULLETIN is protected by the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International license. The Authors retain the Copyright and grant the journal the right of first publication. This license allows third parties to use the work in any form for non-commercial purposes only. If third parties modify or adapt the material, they must license the modified material for non-commercial purposes only and under identical terms.
3. The above applies provided that the terms of the licence concerning the reference to the original author and the original publication in the journal DATA ANALYSIS BULLETIN are maintained.
4. Authors may enter into separate and additional contracts and agreements for the non-exclusive distribution of the work as published in the DATA ANALYSIS BULLETIN journal (e.g., deposit in academic repositories), provided that the first publication in the DATA ANALYSIS BULLETIN journal is acknowledged and cited.
5. The DATA ANALYSIS BULLETIN journal allows and encourages authors to deposit their work in institutional (e.g., the repository of the National Documentation Centre) or thematic repositories, after publication in DATA ANALYSIS BULLETIN and under Open Access conditions, as determined by their research funders and/or the institutions with which they collaborate, as appropriate. When depositing their work, authors should provide information on the publication of the work in the journal and the sources of funding for their research. Lists of institutional and thematic repositories by country are available at http://opendoar.org/countrylist.php. Authors can deposit their work free of charge in the repository www.zenodo.org, which is supported by OpenAIRE (www.openaire.eu), as part of the European Commission's policies to support Open Academic Research.