Development of direct marketing strategy for banking industry: The use of a Chi-squared Automatic Interaction Detector (CHAID) in deposit subscription classification
DOI:
https://doi.org/10.31328/jsed.v5i1.3420
Keywords:
banking industry, classification, deposit subscriptions, marketing strategy
Abstract
Chi-squared Automatic Interaction Detector (CHAID) and logistic regression (LR) analyses were compared on a classification problem using bank direct marketing data. Performance was evaluated with two statistical measures, classification accuracy and sensitivity, because the data are class-imbalanced. Random over sampling (ROS) was then applied to address the class imbalance and improve the performance of the CHAID analysis. Segmentation analysis was also performed using the CHAID approach to further improve the results. CHAID outperforms LR, owing to its ability to perform segmentation modeling. The traits direct marketers should pay attention to are Duration, Month, Contact, and Housing. To achieve a higher subscription rate, the bank must extend the call duration. Based on these results, the banking industry needs to prepare regulations related to human resources, infrastructure, costs, and government support to achieve higher subscriptions.
JEL Classification A10; C10; G21
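As an illustration of why the abstract pairs accuracy with sensitivity and applies random over sampling, the following is a minimal Python sketch and not the authors' implementation. The toy data, the `duration` field values, the `"yes"`/`"no"` labels, and both function names are assumptions made for this example only.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen rows of each smaller class until every
    class has as many rows as the largest class."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


def accuracy_and_sensitivity(y_true, y_pred, positive="yes"):
    """Accuracy = share of correct predictions.
    Sensitivity = share of actual positives that were predicted positive."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    accuracy = correct / len(y_true)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, sensitivity


# Hypothetical toy data: 8 non-subscribers and 2 subscribers.
labels = ["no"] * 8 + ["yes"] * 2
rows = [{"duration": d} for d in (50, 60, 70, 80, 90, 100, 110, 120, 400, 500)]

# After over sampling, both classes have the same number of rows.
bal_rows, bal_labels = random_oversample(rows, labels)
balanced_counts = Counter(bal_labels)

# A classifier that always answers "no" looks accurate on imbalanced data
# but never finds a subscriber, which is what sensitivity exposes.
always_no = ["no"] * len(labels)
acc, sens = accuracy_and_sensitivity(labels, always_no)
```

In this toy case the always-"no" classifier reaches high accuracy with zero sensitivity, which is why accuracy alone can be misleading on imbalanced subscription data. Over sampling should be applied to the training split only, so that duplicated rows do not leak into the evaluation data.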