Predicting paddy rice yield across spatial and temporal scales is important for enhancing precision agriculture, ensuring food security, and supporting climate mitigation, as paddy rice is a significant source of methane emissions. However, accurate paddy rice yield estimation from remote sensing is challenging because of the complexity of rice plant phenology, soil background effects, and the need for sufficient field data. To address these challenges, we extracted vegetation indices (VIs) from high-resolution PlanetScope imagery to capture the optimal phenological period of rice growth, used field observations of rice yield collected in 2024, and adopted a novel framework combining recursive feature elimination (RFE) and mutual information regression (MIR) for estimating rice yield. We tested this framework with four machine learning (ML) algorithms, namely random forest (RF), extreme gradient boosting (XGBoost), k-nearest neighbours (k-NN), and artificial neural networks (ANN), to predict paddy rice yield at our study site in Nigeria. All four ML algorithms performed best when feature selection with RFE and MIR was applied. k-NN outperformed all other models (R² = 0.61, RMSE = 578.43 kg/ha), followed by ANN (R² = 0.58, RMSE = 601.29 kg/ha), RF (R² = 0.44, RMSE = 694.48 kg/ha), and XGBoost (R² = 0.34, RMSE = 756.48 kg/ha). While k-NN, RF, and XGBoost are quite sensitive to parameter optimisation, ANN performed optimally using only a few spectral features. The RF model explained 44% of the variance in the data, with a 29.41% reduction in root mean square error (RMSE) and a 67% decrease in bias. Similarly, the performance of XGBoost improved compared to the previous model, with the variance explained in the data increasing by 13.33% and RMSE decreasing by 3.25%. This study demonstrates how feature selection can be used for accurate yield estimation.
The mapping framework developed in this study will be useful in spatial planning applications to inform near-real-time crop yield estimates, precision agriculture, and climate-smart agriculture, thereby helping farmers, researchers, and policymakers make informed decisions about crop management, resource distribution, and food security.
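The accuracy metrics quoted above (R², RMSE, and percentage reductions in RMSE) can be computed from paired observed and predicted yields. The following is a minimal sketch, not the authors' code: it assumes R² is the coefficient of determination, 1 - SS_res / SS_tot (the abstract does not state which definition was used), and the yield values and function names are hypothetical, chosen only for illustration.

```python
import math


def rmse(observed, predicted):
    """Root mean square error, in the same units as the inputs (e.g. kg/ha)."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot (assumed definition)."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("observed values have zero variance")
    return 1.0 - ss_res / ss_tot


def percent_reduction(before, after):
    """Relative reduction from `before` to `after`, as a percentage."""
    return 100.0 * (before - after) / before


# Hypothetical plot-level yields in kg/ha, NOT data from the study.
observed = [3000.0, 3500.0, 4000.0, 4500.0, 5000.0]
predicted = [3200.0, 3400.0, 4100.0, 4300.0, 5100.0]

print(f"RMSE = {rmse(observed, predicted):.2f} kg/ha")
print(f"R2   = {r_squared(observed, predicted):.3f}")
```

Calling `percent_reduction(rmse_before, rmse_after)` on the RMSE of a model fitted without and with feature selection would give the kind of relative improvement figure reported in the abstract.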
Funding
Institute for Environmental Futures at Space Park Leicester, University of Leicester, through the visiting postdoctoral research fellowship
Natural Environment Research Council of the UK through the National Centre for Earth Observation.
Author affiliation
University of Leicester
College of Science & Engineering
Geography, Geology & Environment
The field data are available from the corresponding author on request. We are unable to share the PlanetScope satellite data due to licensing restrictions. However, according to Planet, any university-affiliated student, faculty member, or researcher may apply through its Education and Research Program to access the data for non-commercial use. We encourage users interested in these data to consult the following site for more information on data accessibility: https://www.planet.com/industries/education-and-research/