University of Leicester

AneRBC dataset: a benchmark dataset for computer-aided anemia diagnosis using RBC images

journal contribution
posted on 2025-03-12, 09:22 authored by Muhammad Shahzad, Syed Hamad Shirazi, Muhammad Yaqoob, Zakir Khan, Assad Rasheed, Israr Ahmed Sheikh, Asad Hayat, Huiyu Zhou

Diagnosing the red blood cell (RBC) morphological deformities caused by anaemia requires visual analysis of peripheral blood smear slides using medical image analysis. The absence of a complete anaemic RBC dataset has hindered the training and testing of deep convolutional neural networks (CNNs) for computer-aided analysis of RBC morphology. To overcome this problem, we introduce a benchmark RBC image dataset named Anemic RBC (AneRBC). The dataset comes in two versions: AneRBC-I and AneRBC-II. AneRBC-I contains 1000 microscopic images (500 healthy and 500 anaemic) at a resolution of 1224 × 960 pixels, along with manually generated ground truth for each image. Each image contains approximately 1550 RBC elements, including normocytes, microcytes, macrocytes, elliptocytes, and target cells, giving a total of approximately 1 550 000 RBC elements. The dataset also includes the complete blood count (CBC) and morphology reports for each image, so that CNN model results can be validated against clinical data. The annotation, labeling, and ground truth for each image were generated under the supervision of a team of expert pathologists. Because of the high resolution, each image and its ground truth were divided into 12 subimages, which make up AneRBC-II. AneRBC-II therefore contains 12 000 images: 6000 healthy and 6000 anaemic RBC images. Four state-of-the-art CNN models were applied for segmentation and classification to validate the proposed dataset. Database URL: https://data.mendeley.com/preview/hms3sjzt7f?a=4d0ba42a-cc6f-4777-adc4-2552e80db22b
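
The abstract states that each 1224 × 960 image was divided into 12 subimages but does not give the grid layout. The following is a minimal sketch of one way such a split could be computed, assuming a grid of 4 columns by 3 rows (306 × 320 pixel tiles); the grid shape and the helper name `tile_boxes` are illustrative assumptions, not the authors' published procedure.

```python
def tile_boxes(width, height, cols, rows):
    """Return (left, upper, right, lower) crop boxes in row-major order."""
    if width % cols or height % rows:
        raise ValueError("grid must divide the image evenly")
    tile_w, tile_h = width // cols, height // rows
    return [
        (c * tile_w, r * tile_h, (c + 1) * tile_w, (r + 1) * tile_h)
        for r in range(rows)
        for c in range(cols)
    ]


# Resolution taken from the abstract; the 4 x 3 grid is an assumption.
boxes = tile_boxes(width=1224, height=960, cols=4, rows=3)

print(len(boxes))         # 12 subimages per source image
print(boxes[0])           # (0, 0, 306, 320)
print(1000 * len(boxes))  # 12000, matching the AneRBC-II image count
```

The boxes use the (left, upper, right, lower) order accepted by Pillow's `Image.crop`, so the same list could be applied to an image and to its ground-truth mask to keep the two aligned.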

History

Author affiliation

College of Science & Engineering, Computing & Mathematical Sciences

Version

  • VoR (Version of Record)

Published in

Database

Volume

2024

Pagination

baae120

Publisher

Oxford University Press (OUP)

issn

1758-0463

eissn

1758-0463

Copyright date

2024

Available date

2025-03-11

Spatial coverage

England

Language

en

Deposited by

Professor Huiyu Zhou

Deposit date

2025-02-05

Data Access Statement

The complete image data for this dataset is available at https://data.mendeley.com/datasets/hms3sjzt7f/1. The code used for segmentation and classification is available at https://github.com/shahzadmscs/AneRBC_Segmentation_Classification_code. The code for uploading the image data, along with the CBC and morphology reports, for training the CNN model is available at https://github.com/shahzadmscs/AneRBC_Segmentation_Classification_code/blob/main/upload_data_for_training.ipynb