University of Leicester

Emerging SMOTE and GAN Variants for Data Augmentation in Imbalance Machine Learning Tasks: A Review

journal contribution
posted on 2025-08-06, 13:43 authored by Amadi G Udu, Marwah T Salman, Maryam K Ghalati, Andrea Lecchini-Visintini, David Siddle, Hongbiao Dong
Class imbalance is a pervasive challenge in real-world machine learning (ML) applications, where the minority class, often the class of interest, is significantly underrepresented. This imbalance can negatively affect model performance, lead to misleading evaluation metrics, and introduce validation challenges. Two prominent data-augmentation techniques to address class imbalance are the Synthetic Minority Oversampling Technique (SMOTE) and Generative Adversarial Networks (GAN). However, both techniques have their inherent limitations, motivating the emergence of novel variants designed to overcome these challenges. While previous reviews have primarily focused on specific domains, traditional methodologies, or broad strategy overviews, this review presents a unified taxonomy that captures the causes, types, and implications of class imbalance across diverse ML tasks. It further explores emerging trends in SMOTE and GAN applications, limitations, and hybrid adaptations. By categorising imbalance types and examining models, metrics, datasets, and comparative approaches, this review provides actionable insights and future research directions for practitioners and researchers addressing class imbalance in real-world ML tasks.

Funding

Iraqi Prime Minister’s Office, the Higher Committee of Education and Development in Iraq (HCED), the Petroleum Technology Development Fund, Nigeria

10.13039/501100004227-NISCO U.K. Research Centre, School of Engineering, University of Leicester

History

Author affiliation

College of Science & Engineering, Engineering

Version

  • VoR (Version of Record)

Published in

IEEE Access

Volume

13

Pagination

113838

Publisher

Institute of Electrical and Electronics Engineers (IEEE)

eissn

2169-3536

Copyright date

2025

Available date

2025-08-06

Language

en

Deposited by

Dr David Siddle

Deposit date

2025-07-15
