University of Leicester

Estimating paddy rice yield using PlanetScope imagery and machine learning

journal contribution
posted on 2025-11-12, 15:18 authored by S Ibrahim, Heiko Balzter, MS Ozigis
Paddy rice yield prediction across spatial and temporal scales is important for enhancing precision agriculture, ensuring food security, and supporting climate mitigation, as paddy rice is a significant source of methane emissions. However, achieving accurate paddy rice yield estimations using remote sensing is challenging due to the complexities of rice plant phenology, soil background, and the need for sufficient field data. To address these research gaps, we extracted vegetation indices (VIs) from high-resolution PlanetScope imagery to reflect the optimal phenological period of rice growth stages, employed field observations of rice yield (collected in 2024), and adopted a novel framework of recursive feature elimination (RFE) and mutual information regression (MIR) for estimating rice yield. We further tested this framework on four machine learning (ML) algorithms, specifically random forest (RF), extreme gradient boosting (XGBoost), k-nearest neighbours (k-NN), and artificial neural networks (ANN), in predicting paddy rice yield at our study site in Nigeria. All ML algorithms performed best when feature selection was applied using RFE and MIR. k-NN outperformed all other models, with R² = 0.61 and a root mean square error (RMSE) of 578.43 kg/ha. ANN (R² = 0.58 and RMSE = 601.29 kg/ha) in turn outperformed RF (R² = 0.44 and RMSE = 694.48 kg/ha) and XGBoost (R² = 0.34 and RMSE = 756.48 kg/ha). While k-NN, RF, and XGBoost are quite sensitive to parameter optimisation, ANN performed optimally using only a few spectral features. The RF model explained 44% of the variance in the data, resulting in a 29.41% reduction in RMSE and a 67% decrease in bias. Similarly, the performance of XGBoost improved compared to the previous model, with the variance explained in the data increasing by 13.33% and RMSE decreasing by 3.25%. This study demonstrates how the feature selection method can be utilised for accurate yield estimation.
The mapping framework developed in this study will be useful in spatial planning applications to inform near-real-time crop yield estimates, precision agriculture, and climate-smart agriculture, thereby helping farmers, researchers, and policymakers make informed decisions about crop management, resource distribution, and food security.
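The abstract compares models by R² and RMSE and quotes percentage reductions in RMSE. The sketch below shows how those quantities are typically computed, using a tiny k-NN regressor on invented data. It is not the authors' pipeline: the plot values, the two vegetation-index features, the choice of k = 3, and all function names are assumptions made for illustration only, and no feature selection step (RFE or MIR) is included.

```python
import math


def rmse(observed, predicted):
    """Root mean square error, in the same units as the inputs (e.g. kg/ha)."""
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(observed))


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def knn_predict(train_x, train_y, query, k=3):
    """Predict yield as the mean of the k nearest training plots (Euclidean distance)."""
    ranked = sorted(zip(train_x, train_y), key=lambda pair: math.dist(pair[0], query))
    return sum(y for _, y in ranked[:k]) / k


def percent_reduction(before, after):
    """Relative decrease from `before` to `after`, in percent."""
    return 100.0 * (before - after) / before


# Hypothetical plots: two vegetation-index values per plot, yield in kg/ha.
# These numbers are made up and do not come from the study.
train_x = [(0.40, 0.30), (0.50, 0.35), (0.60, 0.45), (0.70, 0.50), (0.80, 0.60)]
train_y = [2000.0, 2500.0, 3000.0, 3500.0, 4000.0]
test_x = [(0.45, 0.32), (0.75, 0.55)]
test_y = [2300.0, 3700.0]

predictions = [knn_predict(train_x, train_y, q, k=3) for q in test_x]
model_rmse = rmse(test_y, predictions)
model_r2 = r_squared(test_y, predictions)

print(f"RMSE = {model_rmse:.2f} kg/ha, R2 = {model_r2:.2f}")
```

A statement such as "RMSE decreasing by 3.25%" corresponds to `percent_reduction(rmse_before, rmse_after)` applied to the RMSE of a model before and after a change such as feature selection. The baseline RMSE values are not given in the abstract, so they are not reproduced here.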

Funding

Institute for Environmental Futures at Space Park Leicester, University of Leicester, through the visiting postdoctoral research fellowship

Natural Environment Research Council of the UK through the National Centre for Earth Observation.

History

Author affiliation

University of Leicester College of Science & Engineering Geography, Geology & Environment

Version

  • VoR (Version of Record)

Published in

Smart Agricultural Technology

Volume

12

Pagination

101447

Publisher

Elsevier BV

issn

2772-3755

eissn

2772-3755

Copyright date

2025

Available date

2025-11-12

Language

en

Deposited by

Professor Heiko Balzter

Deposit date

2025-11-05

Data Access Statement

The field data are available from the corresponding author on request. We are unable to share the PlanetScope satellite data due to licensing restrictions. However, according to Planet, any university-affiliated student, faculty member, or researcher may apply through its Education and Research Program to access the data for non-commercial use. We encourage users interested in these data to check this site for more information about data accessibility: https://www.planet.com/industries/education-and-research/
