Climbing the Kaggle Leaderboard: Bank Marketing with XGBoost

I recently joined Kaggle Playground Series – Season 5, Episode 8, a competition based on the classic UCI Bank Marketing dataset. The goal is straightforward: given a client's demographics, account balance, and marketing contact history, predict whether they will subscribe to a term deposit. Straightforward, that is, until you try to build a model that ranks well against thousands of entries.

With more than 3,200 entries, this Playground has been both competitive and a great learning opportunity. I entered with the goal of sharpening my XGBoost workflow, experimenting with feature engineering, and building out a strong portfolio project to share with recruiters and peers.

Starting Point: A Simple Split

My first baseline used standard preprocessing and a single train/validation split. It worked, but the results weren’t impressive: a Validation AUC of ~0.967 and a leaderboard rank around 1600th place. Not terrible, but not the kind of score that gets you noticed.
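
For context, here is a minimal sketch of what that baseline looked like. The file name, the id/y column names, and the hyperparameters are illustrative stand-ins rather than my exact setup:

```python
import pandas as pd
import xgboost as xgb
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Illustrative file/column names -- adjust to the actual competition files.
train = pd.read_csv("train.csv")
X = pd.get_dummies(train.drop(columns=["id", "y"]), drop_first=True)
y = train["y"]

# Single stratified hold-out split: quick, but the resulting AUC shifts with random_state.
X_tr, X_val, y_tr, y_val = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

model = xgb.XGBClassifier(
    n_estimators=1000,
    learning_rate=0.05,
    max_depth=6,
    eval_metric="auc",
    early_stopping_rounds=50,
)
model.fit(X_tr, y_tr, eval_set=[(X_val, y_val)], verbose=False)

print("Validation AUC:", roc_auc_score(y_val, model.predict_proba(X_val)[:, 1]))
```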

The biggest problem? That single validation split gave me a noisy, sometimes misleading picture of how the model would generalize. AUC numbers bounced depending on how I shuffled the data. I knew I needed a better validation strategy.

Improvements: Feature Engineering and Validation Strategy

The next step was to refine the features and the training process.

  • Feature Engineering:

    • Transformed calendar fields (day, month) into cyclic sin/cos features (see the code sketch after this list).

    • Applied log scaling to skewed variables (balance, duration).

    • Created boolean flags for missing values (pdays == -1) and zero-contact cases (previous == 0).

  • Modeling: Used XGBoost with GPU acceleration for fast experimentation.

  • Validation: Most importantly, switched to Stratified 5-Fold Cross-Validation. This change alone stabilized my results, reduced noise, and gave me a fairer estimate of how the model would perform on unseen data.
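
Here is roughly what the feature-engineering steps look like in code. Column names follow the UCI Bank Marketing schema (month stored as a lowercase string), and the signed log for balance is my own choice since balances can be negative:

```python
import numpy as np
import pandas as pd

# The raw data stores month as a lowercase string (e.g. "may"), so map it to a number first.
MONTHS = ["jan", "feb", "mar", "apr", "may", "jun", "jul", "aug", "sep", "oct", "nov", "dec"]
MONTH_MAP = {m: i for i, m in enumerate(MONTHS, start=1)}

def add_features(df: pd.DataFrame) -> pd.DataFrame:
    out = df.copy()

    # Cyclic encoding: put day-of-month and month on a circle so Dec/Jan
    # and day 31/day 1 end up close together instead of far apart.
    month_num = out["month"].map(MONTH_MAP)
    out["month_sin"] = np.sin(2 * np.pi * month_num / 12)
    out["month_cos"] = np.cos(2 * np.pi * month_num / 12)
    out["day_sin"] = np.sin(2 * np.pi * out["day"] / 31)
    out["day_cos"] = np.cos(2 * np.pi * out["day"] / 31)

    # Log scaling for skewed variables; balance can be negative, so use a signed log.
    out["log_duration"] = np.log1p(out["duration"])
    out["log_balance"] = np.sign(out["balance"]) * np.log1p(out["balance"].abs())

    # Boolean flags for the "never contacted before" sentinels.
    out["never_contacted"] = (out["pdays"] == -1).astype(int)
    out["no_prev_contacts"] = (out["previous"] == 0).astype(int)
    return out
```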

I also tried Optuna hyperparameter tuning, which gave me some refinements, but surprisingly my ad-hoc parameter set was already quite strong. The real breakthrough came from the validation strategy.
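
For reference, the tuning loop had this general shape. The search space below is illustrative rather than my exact configuration, and it reuses X and y from the baseline sketch above:

```python
import optuna
import xgboost as xgb
from sklearn.model_selection import StratifiedKFold, cross_val_score

def objective(trial):
    # Illustrative ranges -- not the exact ones used in the notebook.
    params = {
        "n_estimators": trial.suggest_int("n_estimators", 300, 2000),
        "learning_rate": trial.suggest_float("learning_rate", 0.01, 0.2, log=True),
        "max_depth": trial.suggest_int("max_depth", 3, 10),
        "subsample": trial.suggest_float("subsample", 0.6, 1.0),
        "colsample_bytree": trial.suggest_float("colsample_bytree", 0.6, 1.0),
        "min_child_weight": trial.suggest_int("min_child_weight", 1, 20),
    }
    model = xgb.XGBClassifier(**params, eval_metric="auc", tree_method="hist")
    cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
    return cross_val_score(model, X, y, cv=cv, scoring="roc_auc").mean()

study = optuna.create_study(direction="maximize")
study.optimize(objective, n_trials=50)
print("Best AUC:", study.best_value, "Best params:", study.best_params)
```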

Results: Climbing the Leaderboard

The payoff was immediate. With 5-Fold CV and fold-bagging (averaging models across folds), my Validation AUC improved to ~0.9704. That might not sound like a huge jump from 0.967 — but in a Playground competition, small improvements matter.
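
A condensed sketch of the 5-fold CV plus fold-bagging loop is below. X, y, and X_test stand for the engineered train/test frames, the hyperparameters are illustrative, and device="cuda" assumes XGBoost 2.x with a GPU available:

```python
import numpy as np
import xgboost as xgb
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import StratifiedKFold

# X, y: engineered training features/target; X_test: engineered test features.
skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
oof = np.zeros(len(X))             # out-of-fold predictions -> honest CV AUC
test_pred = np.zeros(len(X_test))  # fold-bagged test predictions

for tr_idx, val_idx in skf.split(X, y):
    model = xgb.XGBClassifier(
        n_estimators=2000,
        learning_rate=0.05,
        max_depth=6,
        subsample=0.8,
        colsample_bytree=0.8,
        eval_metric="auc",
        early_stopping_rounds=100,
        tree_method="hist",
        device="cuda",  # GPU acceleration (XGBoost 2.x); drop this line to run on CPU
    )
    model.fit(
        X.iloc[tr_idx], y.iloc[tr_idx],
        eval_set=[(X.iloc[val_idx], y.iloc[val_idx])],
        verbose=False,
    )
    oof[val_idx] = model.predict_proba(X.iloc[val_idx])[:, 1]
    test_pred += model.predict_proba(X_test)[:, 1] / skf.n_splits

print("OOF AUC:", roc_auc_score(y, oof))
```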

That modest gain was enough to push my leaderboard ranking from ~1600th to ~800th out of ~3,200 entries, roughly halving my rank!

Here’s the quick summary:

Approach            | Validation AUC | Leaderboard Position
Single split        | ~0.967         | ~1600th
Ad-hoc full dataset | ~0.975         | – (offline only)
5-Fold CV ensemble  | ~0.970         | ~800th

Lessons Learned

A few key takeaways stood out:

  • Validation strategy matters more than tuning. Switching from a single split to K-Fold CV improved both performance and trust in the results.

  • Simple feature engineering pays off. Cyclic encodings, log transforms, and boolean flags added measurable gains.

  • Small AUC bumps mean big leaderboard jumps. Even a 0.003–0.004 improvement can move you hundreds of places in a crowded competition.

  • Iteration beats perfection. The notebook is still evolving — I plan to explore target mean encoding, feature ratios (like campaign/previous), and model ensembling (a target-encoding sketch follows this list) — but even interim results show meaningful progress.
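
As a preview of the target-encoding idea, here is a leakage-safe, out-of-fold version I'm considering. The smoothing value and the poutcome/y names in the usage comment are assumptions, not results from the notebook yet:

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import StratifiedKFold

def add_target_mean_encoding(train, test, col, target="y", n_splits=5, smoothing=10.0):
    """Out-of-fold target mean encoding: each training row is encoded by folds that exclude it."""
    train, test = train.copy(), test.copy()
    global_mean = train[target].mean()
    oof_enc = pd.Series(np.nan, index=train.index, dtype=float)

    skf = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=42)
    for tr_idx, val_idx in skf.split(train, train[target]):
        stats = train.iloc[tr_idx].groupby(col)[target].agg(["mean", "count"])
        smoothed = (stats["mean"] * stats["count"] + global_mean * smoothing) / (stats["count"] + smoothing)
        oof_enc.iloc[val_idx] = train.iloc[val_idx][col].map(smoothed).to_numpy()
    train[f"{col}_te"] = oof_enc.fillna(global_mean)

    # Test rows get an encoding fitted on the full training set.
    stats = train.groupby(col)[target].agg(["mean", "count"])
    smoothed = (stats["mean"] * stats["count"] + global_mean * smoothing) / (stats["count"] + smoothing)
    test[f"{col}_te"] = test[col].map(smoothed).fillna(global_mean)
    return train, test

# Hypothetical usage with a UCI-schema column:
# train_fe, test_fe = add_target_mean_encoding(train, test, col="poutcome", target="y")
# Ratio feature idea (+1 avoids dividing by zero when previous == 0):
# train_fe["campaign_per_previous"] = train_fe["campaign"] / (train_fe["previous"] + 1)
```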

Wrapping Up

Competitions like this are great practice for real-world machine learning: you deal with messy features, class imbalance, data leakage risks, and the need for robust validation. More importantly, you learn that discipline and iteration often matter more than chasing the fanciest hyperparameters.

I’ll continue to refine the notebook, but this milestone already shows what a solid workflow can achieve: climbing from 1600th to 800th in a competition of 3,200+ entries.
