Mounting evidence suggests that there are better ways to tackle imbalanced datasets that do not involve resampling.
For one, we can use strong classifiers like XGBoost and CatBoost.
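A minimal sketch of this first point, assuming xgboost and scikit-learn are installed; the simulated data and hyperparameters are illustrative placeholders:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from xgboost import XGBClassifier

# Simulated imbalanced data: roughly 95% negatives, 5% positives.
X, y = make_classification(n_samples=5000, weights=[0.95], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=0
)

# A strong gradient boosting classifier, trained without any resampling.
model = XGBClassifier(n_estimators=300, max_depth=4, random_state=0)
model.fit(X_train, y_train)

# Work with probabilities rather than hard 0/1 predictions.
proba = model.predict_proba(X_test)[:, 1]
```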
Then, we need to optimize the decision threshold for classification, instead of just using the default of 0.5.
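Continuing with `proba` and `y_test` from the sketch above, here is one possible way to search for the threshold that maximizes F1 (recent versions of scikit-learn, 1.5 and up, also ship TunedThresholdClassifierCV to automate this search):

```python
import numpy as np
from sklearn.metrics import precision_recall_curve

# proba and y_test come from the previous sketch.
precision, recall, thresholds = precision_recall_curve(y_test, proba)

# F1 at every candidate threshold (small epsilon avoids division by zero).
f1 = 2 * precision * recall / (precision + recall + 1e-12)

# The last precision/recall pair has no matching threshold, hence f1[:-1].
best_threshold = thresholds[np.argmax(f1[:-1])]
y_pred = (proba >= best_threshold).astype(int)
```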
There is also cost-sensitive learning, which optimizes the model at no extra cost: it comes baked into most model implementations.
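Most implementations expose this through a class-weight style parameter. A quick sketch; the weight value for XGBoost is illustrative:

```python
from sklearn.linear_model import LogisticRegression
from xgboost import XGBClassifier

# scikit-learn estimators: weight errors inversely to class frequency.
logit = LogisticRegression(class_weight="balanced", max_iter=1000)

# XGBoost: scale_pos_weight is commonly set to n_negatives / n_positives.
xgb = XGBClassifier(scale_pos_weight=19)  # illustrative value for a 95/5 split
```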
🤔 So then, when should we use resampling?
👉 Resampling can still be used when a strong classifier is not an option, say because of legacy systems, or for any other reason. It has been shown to improve the performance of weaker learners like random forests, AdaBoost, SVMs, and MLPs.
👍 Resampling can also be useful if the model outputs only a class label, not a probability (see the sketch below).
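If you do reach for resampling, a minimal sketch using the imbalanced-learn package might look like this (reusing X_train and y_train from the first sketch; SVC stands in for the weaker learners mentioned above):

```python
from imblearn.over_sampling import RandomOverSampler
from imblearn.pipeline import Pipeline
from sklearn.svm import SVC

# Oversampling happens only when fitting, never on the test data.
pipe = Pipeline([
    ("resample", RandomOverSampler(random_state=0)),
    ("svm", SVC()),  # a weaker learner that tends to benefit from resampling
])
pipe.fit(X_train, y_train)
```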
💡 So, as always, there is no one-size-fits-all solution. Depending on the project, the model, and the data, resampling may be a tool we can use, or it may be better to stay away from it.
I hope this information was useful!
Wishing you a successful week ahead - see you next Monday! 👋🏻
Sole