Optimization of XGBoost Algorithm Using Parameter Tuning in Retail Sales Prediction
DOI:
https://doi.org/10.23887/janapati.v13i3.82214
Keywords:
XGBoost, Retail, random search, grid search, Bayesian optimization
Abstract
Retail business owners need sales analysis to support decisions about their business processes. Previous studies have introduced forecasting techniques based on regression analysis and classification, but these approaches still need optimization. This article proposes an approach to sales prediction using XGBoost, tuned by comparing the performance of three hyperparameter optimization methods: random search, grid search, and Bayesian optimization. The aim is to identify the best-performing method and increase prediction accuracy. The novelty of the proposed model lies in determining the best value obtained by each optimization method with XGBoost. The evaluation shows that grid search gave the best result, raising the R^2 of the XGBoost model from 97.31 to 98.41. These results can help retail business owners make accurate sales predictions when planning the development of their business processes.
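The comparison described in the abstract can be illustrated with a minimal sketch of grid search and random search scored by R^2. This is not the authors' code: the toy residual-fitting model, the synthetic data, the parameter names `learning_rate` and `n_rounds`, and every helper function below are assumptions made for illustration, and a simple least-squares base learner stands in for XGBoost so the example runs with the Python standard library alone. Bayesian optimization is not sketched here.

```python
import itertools
import random


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def make_data(n, seed):
    """Synthetic (hypothetical) data: a noisy linear relationship."""
    rng = random.Random(seed)
    xs = [rng.uniform(0, 10) for _ in range(n)]
    ys = [3.0 * x + 5.0 + rng.gauss(0, 1.0) for x in xs]
    return xs, ys


def fit_line(xs, ys):
    """Least-squares line, used as a toy base learner (a stand-in, not XGBoost)."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, my - slope * mx


def boosted_r2(train, val, learning_rate, n_rounds):
    """Fit residuals for n_rounds with shrinkage, then return validation R^2."""
    (x_tr, y_tr), (x_val, y_val) = train, val
    pred_tr = [0.0] * len(x_tr)
    pred_val = [0.0] * len(x_val)
    for _ in range(n_rounds):
        residuals = [y - p for y, p in zip(y_tr, pred_tr)]
        slope, intercept = fit_line(x_tr, residuals)
        pred_tr = [p + learning_rate * (slope * x + intercept)
                   for p, x in zip(pred_tr, x_tr)]
        pred_val = [p + learning_rate * (slope * x + intercept)
                    for p, x in zip(pred_val, x_val)]
    return r2_score(y_val, pred_val)


def grid_search(evaluate, grid):
    """Score every combination in the grid; return (best_params, best_score)."""
    names = list(grid)
    best_params, best_score = None, float("-inf")
    for values in itertools.product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        score = evaluate(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


def random_search(evaluate, grid, n_iter, seed=0):
    """Score n_iter randomly sampled combinations; return (best_params, best_score)."""
    rng = random.Random(seed)
    best_params, best_score = None, float("-inf")
    for _ in range(n_iter):
        params = {name: rng.choice(values) for name, values in grid.items()}
        score = evaluate(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


# Hypothetical search space; the values are illustrative only.
GRID = {"learning_rate": [0.05, 0.1, 0.3], "n_rounds": [5, 20, 50]}

train, val = make_data(80, seed=1), make_data(40, seed=2)


def evaluate(params):
    return boosted_r2(train, val, **params)


grid_best = grid_search(evaluate, GRID)
random_best = random_search(evaluate, GRID, n_iter=4)
print("grid search  :", grid_best)
print("random search:", random_best)
```

Because grid search evaluates every combination while random search samples only a few, grid search can never score lower than random search over the same candidate values, at the cost of more model fits. With XGBoost itself, the same comparison would typically be run by passing `xgboost.XGBRegressor` to scikit-learn's `GridSearchCV` and `RandomizedSearchCV`.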
License
Copyright (c) 2024 Hendra Wijaya, Dandy Pramana Hostiadi, Evi Triandini
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
Authors who publish with Janapati agree to the following terms:
- Authors retain copyright and grant the journal the right of first publication with the work simultaneously licensed under a Creative Commons Attribution License (CC BY-SA 4.0) that allows others to share the work with an acknowledgment of the work's authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgment of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work. (See The Effect of Open Access)