Pusat Penelitian, Pengabdian kepada Masyarakat dan Publikasi Internasional

Resampling Techniques for Imbalanced Data

Posted on March 14, 2025 (updated March 22, 2025) by Fachrur Rozi

Introduction

Resampling is a widely used way to handle class imbalance in machine learning. These methods modify the training data so that the majority and minority classes are more evenly represented. This helps models learn meaningful patterns from all classes instead of being biased toward the majority class.

Why Use Resampling?

In an imbalanced dataset, a model trained without resampling may achieve high accuracy by simply predicting the majority class. However, this would lead to poor performance in detecting the minority class. Resampling helps improve model learning by either increasing minority class examples (oversampling) or reducing majority class examples (undersampling).
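The accuracy trap described above can be checked with a few lines of arithmetic. The sketch below uses hypothetical toy labels (95 majority, 5 minority) and a stand-in "model" that always predicts the majority class; none of these numbers come from a real dataset.

```python
# Hypothetical toy labels: 95 majority (0) and 5 minority (1) samples.
y_true = [0] * 95 + [1] * 5

# A stand-in "model" that ignores its input and always predicts the majority class.
y_pred = [0] * len(y_true)

# Accuracy: share of predictions that match the true label.
correct = sum(t == p for t, p in zip(y_true, y_pred))
accuracy = correct / len(y_true)

# Minority recall: detected minority samples / all minority samples.
true_positives = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
minority_recall = true_positives / y_true.count(1)

print(f"accuracy = {accuracy:.2f}")                # accuracy = 0.95
print(f"minority recall = {minority_recall:.2f}")  # minority recall = 0.00
```

Accuracy looks strong while not a single minority sample is detected, which is why class-specific metrics such as recall are worth checking alongside accuracy.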

Types of Resampling Techniques

1. Oversampling (Increasing Minority Class Samples)

This technique involves adding more samples of the minority class to balance the dataset.

  • Random Oversampling
    • Simply duplicates random instances from the minority class.
    • Increases class balance but may cause overfitting since duplicated samples do not introduce new information.
  • SMOTE (Synthetic Minority Over-sampling Technique)
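    • Generates synthetic samples instead of duplicating existing ones.
    • Picks a minority sample and one of its k nearest minority neighbors, then creates a new point on the line segment between them.
    • Reduces the overfitting risk of exact duplication, although synthetic points can land in regions where the classes overlap.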
  • ADASYN (Adaptive Synthetic Sampling)
    • A variation of SMOTE that generates more synthetic samples for harder-to-classify minority instances, meaning those with many majority class samples among their nearest neighbors.
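The difference between duplicating and interpolating can be sketched in plain Python. This is a minimal illustration under stated assumptions, not a reference implementation: the function names `random_oversample` and `smote_like`, the toy points, and the parameter values are all hypothetical. ADASYN is not shown; it would differ mainly in choosing the base point more often when that point has many majority class neighbors.

```python
import math
import random

def random_oversample(minority, target_size, rng):
    """Duplicate randomly chosen minority samples until target_size is reached."""
    extra = [rng.choice(minority) for _ in range(target_size - len(minority))]
    return minority + extra

def smote_like(minority, n_new, k, rng):
    """Create n_new synthetic points by interpolating toward minority neighbors."""
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbors of the base point (excluding itself).
        others = [p for p in minority if p is not base]
        neighbors = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbor = rng.choice(neighbors)
        gap = rng.random()  # position along the segment base -> neighbor
        synthetic.append(tuple(b + gap * (n - b) for b, n in zip(base, neighbor)))
    return synthetic

rng = random.Random(0)  # fixed seed so the sketch is repeatable
minority = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3)]  # hypothetical toy data

duplicated = random_oversample(minority, target_size=10, rng=rng)
synthetic = smote_like(minority, n_new=6, k=2, rng=rng)

print(len(duplicated), len(set(duplicated)))  # 10 4
print(len(synthetic))                         # 6
```

Random oversampling yields 10 samples but only 4 distinct points, while the interpolated points are new values that stay between existing minority samples.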

2. Undersampling (Reducing Majority Class Samples)

This technique involves removing samples from the majority class to balance the dataset.

  • Random Undersampling
    • Randomly removes majority class samples to match the minority class count.
    • Can lead to loss of important data, potentially reducing model performance.
  • Tomek Links
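    • A Tomek link is a pair of samples from opposite classes that are each other's nearest neighbor. The majority class member of each pair is removed, which helps refine class boundaries.
    • Cleans up the decision boundary without excessive data loss, but usually removes too few samples to balance the classes on its own.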
  • NearMiss
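    • Keeps the majority class samples that lie closest to the minority class (for example, those with the smallest average distance to their nearest minority neighbors), since these are the hardest to classify.
    • The aim is for the remaining majority class samples to carry meaningful learning signals, although the method can be sensitive to noise and outliers.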
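The three undersampling ideas above can be sketched with the standard library alone. This is a simplified illustration on hypothetical 2-D toy data; the function names are made up for this example, and `near_miss_1` follows the "smallest average distance to the k nearest minority samples" variant, which is only one of several NearMiss versions.

```python
import math
import random

def random_undersample(majority, target_size, rng):
    """Keep a random subset of the majority class, without replacement."""
    return rng.sample(majority, target_size)

def nearest_index(i, points):
    """Index of the point closest to points[i], excluding i itself."""
    return min((j for j in range(len(points)) if j != i),
               key=lambda j: math.dist(points[i], points[j]))

def tomek_link_majority_indices(points, labels, majority_label):
    """Indices of majority samples that form a Tomek link with a minority sample."""
    to_remove = set()
    for i in range(len(points)):
        j = nearest_index(i, points)
        mutual = nearest_index(j, points) == i
        if mutual and labels[i] != labels[j]:
            to_remove.add(i if labels[i] == majority_label else j)
    return to_remove

def near_miss_1(majority, minority, n_keep, k):
    """Keep the n_keep majority samples with the smallest mean distance
    to their k nearest minority samples."""
    def mean_dist(p):
        dists = sorted(math.dist(p, m) for m in minority)[:k]
        return sum(dists) / len(dists)
    return sorted(majority, key=mean_dist)[:n_keep]

# Hypothetical toy data: label 0 is the majority class, label 1 the minority.
points = [(0.0, 0.0), (0.3, 0.0), (0.0, 0.4), (1.0, 1.0), (2.0, 0.0), (2.3, 0.0),
          (1.1, 1.0), (1.5, 1.6), (1.6, 1.5)]
labels = [0, 0, 0, 0, 0, 0, 1, 1, 1]
majority = [p for p, y in zip(points, labels) if y == 0]
minority = [p for p, y in zip(points, labels) if y == 1]

rng = random.Random(0)
kept_random = random_undersample(majority, target_size=len(minority), rng=rng)
tomek = sorted(tomek_link_majority_indices(points, labels, majority_label=0))
kept_near = near_miss_1(majority, minority, n_keep=3, k=2)

print(len(kept_random))  # 3
print(tomek)             # [3]
print(kept_near[0])      # (1.0, 1.0)
```

Only the majority point at (1.0, 1.0) sits in a Tomek link, so Tomek Links removes a single sample here, whereas random undersampling and NearMiss both cut the majority class down to the minority count.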

3. Hybrid Techniques (Combining Oversampling & Undersampling)

Combining oversampling with an undersampling-based cleaning step can give better results than either approach alone.

  • SMOTE + Tomek Links
    • SMOTE generates new minority class samples.
    • Tomek Links then removes majority class samples that form Tomek links with minority samples (some implementations remove both members of each pair).
    • Helps improve class balance while refining decision boundaries.
  • SMOTE + Edited Nearest Neighbors (ENN)
    • SMOTE generates new minority samples.
    • ENN then removes samples whose label disagrees with the majority of their nearest neighbors, reducing noise.
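The cleaning half of a SMOTE + ENN pipeline can be sketched as follows. This is a minimal, assumption-laden illustration: `edited_nearest_neighbors` is a hypothetical helper written for this example, the toy data stands in for a dataset that has already been oversampled, and the point at (0.05, 0.05) plays the role of a minority-labeled sample that ended up inside the majority region.

```python
import math
from collections import Counter

def edited_nearest_neighbors(points, labels, k=3):
    """Drop every sample whose label disagrees with the majority vote
    of its k nearest neighbors (votes use the original, unedited set)."""
    kept_points, kept_labels = [], []
    for i, (p, label) in enumerate(zip(points, labels)):
        neighbor_ids = sorted((j for j in range(len(points)) if j != i),
                              key=lambda j: math.dist(p, points[j]))[:k]
        votes = Counter(labels[j] for j in neighbor_ids)
        if votes.most_common(1)[0][0] == label:
            kept_points.append(p)
            kept_labels.append(label)
    return kept_points, kept_labels

# Hypothetical toy data: two tight clusters plus one noisy minority-labeled point.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1),   # class 0 cluster
          (1.0, 1.0), (1.1, 1.0), (1.0, 1.1), (1.1, 1.1),   # class 1 cluster
          (0.05, 0.05)]                                     # labeled 1, inside class 0
labels = [0, 0, 0, 0, 1, 1, 1, 1, 1]

clean_points, clean_labels = edited_nearest_neighbors(points, labels, k=3)

print(len(points), "->", len(clean_points))  # 9 -> 8
print((0.05, 0.05) in clean_points)          # False
```

An odd `k` avoids tied votes in the two-class case. Note that this sketch edits every class, so it can also drop majority samples that sit among minority neighbors.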

Choosing the Right Resampling Technique

Scenario                              | Recommended Technique
Small dataset with severe imbalance   | Random Oversampling or SMOTE
Large dataset with imbalance          | Random Undersampling or Tomek Links
High risk of overfitting              | SMOTE + Tomek Links
Dataset with noisy labels             | SMOTE + ENN

Conclusion

Resampling techniques are practical tools for addressing class imbalance in machine learning. Oversampling increases minority class representation, undersampling reduces the dominance of the majority class, and hybrid methods combine the two. The right choice depends on dataset size, the severity of the imbalance, label noise, and the risk of overfitting.
