Introduction

This project addresses the challenge of accurately predicting which bank customers are likely to subscribe to new products. Traditional marketing approaches often rely on experience and general rules, which can lead to inefficient resource allocation. Using a real-world dataset of 30,000 bank clients, we applied machine learning to make marketing more precisely targeted.

Methods

  • Data Preprocessing:

    • Confirmed that the dataset contains no missing values and checked all features for consistency
    • Encoded categorical variables using OrdinalEncoder
    • Standardized numerical features with StandardScaler
    • Encoded the target variable (subscribe) as a binary label
  • Model Development and Optimization:

    • Explored four models: Decision Tree, Random Forest, XGBoost, and LightGBM
    • Tuned hyperparameters automatically with Optuna, either maximizing F1 score or minimizing log loss
    • Evaluated model performance with 5-fold cross-validation, using AUC and F1 score as metrics
    • Tested a stacking ensemble that combines Random Forest and LightGBM, with Logistic Regression as the meta-learner
  • Feature Importance Analysis:

    • Used LightGBM’s built-in tools to visualize and interpret key features influencing predictions

Results & Conclusion

  • Performance:

    • Random Forest and LightGBM achieved the best single-model AUC scores (0.8896 and 0.8871, respectively); the stacking ensemble provided only a marginal improvement (AUC 0.8917)
    • LightGBM was chosen for deployment due to its strong accuracy, high efficiency (much shorter training time than Random Forest), and model interpretability
  • Key Insights:

    • The most important driver of product subscription was the duration of the last client-bank interaction, followed by macroeconomic indicators and historical contact intervals
    • Results support more data-driven, cost-effective, and personalized marketing strategies
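
A feature ranking like the one described above could be produced along these lines. This is only a sketch: the data is synthetic, the feature names are hypothetical, and a scikit-learn Random Forest is used in place of LightGBM to keep the example dependency-light. LightGBM's scikit-learn wrapper (LGBMClassifier) exposes a feature_importances_ attribute as well, though its values are on a different scale by default.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Synthetic data with hypothetical feature names; only "duration" drives the target.
rng = np.random.default_rng(0)
n = 1000
feature_names = ["duration", "euribor3m", "pdays", "age"]
X = np.column_stack([
    rng.exponential(250.0, n),   # last contact duration (seconds)
    rng.normal(3.0, 1.5, n),     # a macroeconomic indicator
    rng.integers(0, 30, n),      # days since previous contact
    rng.integers(18, 90, n),     # client age
])
y = (X[:, 0] + rng.normal(0.0, 100.0, n) > 300.0).astype(int)

model = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)

# Impurity-based importances sum to 1; sort from most to least important.
ranking = sorted(
    zip(feature_names, model.feature_importances_),
    key=lambda pair: pair[1],
    reverse=True,
)
for name, score in ranking:
    print(f"{name:>10s}  {score:.3f}")
```

Built-in importances of this kind describe what the model relied on, not causal effects, so a ranking is best read alongside domain knowledge about the campaign.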

My Contributions to the Project

  • Led data preprocessing, feature engineering, and model selection
  • Performed model optimization and ensemble experiments
  • Visualized feature importances and interpreted business implications
  • Authored the final report and delivered actionable insights for bank marketing

Project code and results →