This study compares established statistical methods with machine learning techniques for predicting bank deposit behavior, using data from a series of direct marketing campaigns run by a bank in Portugal. The analysis begins with descriptive and inferential statistics that characterize consumer behavior and changing market conditions, which provides the basis for the predictive modeling that follows.
The study then applies a range of machine learning techniques, including logistic regression, decision trees, and support vector machines, together with ensemble methods such as random forests and gradient boosting. Neural networks are also used to model consumer behavior in greater depth.
Applying the Synthetic Minority Over-sampling Technique (SMOTE) to correct the imbalance in the class distribution substantially improved model performance. After SMOTE, the Decision Tree model's accuracy rose from 87.22% to 89.76%, and its ROC AUC rose from 70.19% to 89.74%. The Random Forest model benefited similarly: its accuracy rose from 90.06% to 93.68%, and its ROC AUC from 92.18% to 96.05%.
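The oversampling step described above can be illustrated with a minimal pure-Python sketch of the core SMOTE idea: each synthetic minority sample is placed at a random point on the line segment between a minority sample and one of its nearest minority neighbors. The function name `smote`, its parameters, and the toy data are assumptions made for illustration; this is not the study's implementation, which is not described here.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Return n_synthetic points interpolated between minority samples.

    Hypothetical helper for illustration; names and defaults are assumptions.
    """
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbors than other points
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base point.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbor = rng.choice(others[:k])
        # Interpolate: gap = 0 gives the base point, gap near 1 gives the neighbor.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbor)])
    return synthetic


# Toy example (made-up data): 4 minority samples against 12 majority samples.
minority = [[1.0, 1.0], [1.2, 0.9], [0.8, 1.1], [1.1, 1.3]]
majority_count = 12
new_points = smote(minority, majority_count - len(minority), k=3)
balanced_minority = minority + new_points
print(len(balanced_minority))  # 12, matching the majority class size
```

In practice, oversampling is normally applied to the training split only, so that synthetic points do not leak into the data used for evaluation.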
Bagging and stacking further improved the models' accuracy and stability. Bagging raised accuracy to 94% and ROC AUC to 98.6%, while stacking achieved an accuracy of 92% and a ROC AUC of 97%. Beyond improving predictive performance, these ensembles gave richer insight into the variables that influence commitments to bank deposits, which underscores the value of hybrid modeling strategies for financial decision-making.
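As a rough illustration of the bagging idea mentioned above (stacking is not shown), the sketch below trains several weak classifiers on bootstrap resamples of the training data and combines them by majority vote. The one-feature decision stump, the function names `fit_stump`, `bagging_fit`, and `bagging_predict`, and the toy data are all assumptions chosen for brevity; they do not reproduce the base learners or data used in the study.

```python
import random
from collections import Counter


def fit_stump(xs, ys):
    """Fit a one-feature threshold classifier that minimizes training errors."""
    best = None
    # The infinite threshold allows a constant prediction for one-class samples.
    for t in sorted(set(xs)) + [float("inf")]:
        for sign in (1, -1):
            preds = [1 if sign * (x - t) >= 0 else 0 for x in xs]
            errors = sum(p != y for p, y in zip(preds, ys))
            if best is None or errors < best[0]:
                best = (errors, t, sign)
    _, t, sign = best
    return lambda x: 1 if sign * (x - t) >= 0 else 0


def bagging_fit(xs, ys, n_models=25, seed=0):
    """Train n_models stumps, each on a bootstrap resample of the data."""
    rng = random.Random(seed)
    n = len(xs)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        models.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return models


def bagging_predict(models, x):
    """Combine the individual predictions by majority vote."""
    votes = Counter(m(x) for m in models)
    return votes.most_common(1)[0][0]


# Toy, made-up data: label 1 for larger feature values.
xs = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
ys = [0, 0, 0, 0, 1, 1, 1, 1]
models = bagging_fit(xs, ys)
print(bagging_predict(models, 0.05), bagging_predict(models, 0.95))
```

Because each stump sees a slightly different resample, individual thresholds vary, and the vote averages out that variation; this is the mechanism by which bagging tends to stabilize predictions.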