Evaluation of gridded precipitation and temperature datasets in Spain. A proposal for improving their accuracy using random forest multi-model ensembles

Authors

DOI:

https://doi.org/10.5311/JOSIS.2026.32.449

Keywords:

Climatic datasets, random forest, random forest spatial interpolation, multi-model ensembles, ANOVA with correction errors, Iberian Peninsula

Abstract

The generation of uniform, gridded data from spatially discontinuous station values and the assessment of their accuracy are essential for water resources assessment and management and for climate change studies, especially in semi-arid environments. Several spatial and temporal grids have been generated in recent years and serve as a basis for such studies. This work has two objectives: a) to evaluate the accuracy of the grids available for Spain by comparing their monthly values with observations from stations not used in their estimation or prediction, and b) to verify whether accuracy improves when Multi-Model Ensembles based on machine learning are used. A dual ensemble approach is presented: (i) multiple individual Random Forest (RF) ensembles, one per weather station, each using only the information from that station, and (ii) spatially distributed grid prediction with a single ensemble model that incorporates all the information from the nearest stations and their distances, using Random Forest Spatial Interpolation (RFSI). Both models were used to generate monthly grids of maximum, minimum and mean temperature and of total precipitation at high spatial resolution (5 km). Seven datasets, including Iberia01, STEAD, AEMET, SIMPA and EOBSv27, were used as predictors. Accuracy was estimated using the root mean square error, the percentage bias and the Nash-Sutcliffe efficiency index, obtained through buffered block cross-validation (LOOBUF-CV), which is robust to spatial autocorrelation. The significance of the differences was assessed using ANOVA with heteroscedasticity correction in the residuals. Preliminary results indicate that multi-model ensembles using RF outperform individual grids. Among other reasons, ensembles aggregate the different representations of meteorological processes included in each grid and reduce the uncertainty associated with each individual grid.
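As an illustration of the three accuracy measures named in the abstract, the following minimal sketch computes the root mean square error, the percentage bias and the Nash-Sutcliffe efficiency for paired observed and predicted series. It is not the authors' code: the function names (`rmse`, `pbias`, `nse`) are hypothetical, and the sign convention for the percentage bias (positive values mean overestimation) is an assumption, since the abstract does not state which convention is used.

```python
import numpy as np


def rmse(obs, sim):
    """Root mean square error between observed and predicted values."""
    obs = np.asarray(obs, dtype=float)
    sim = np.asarray(sim, dtype=float)
    return float(np.sqrt(np.mean((sim - obs) ** 2)))


def pbias(obs, sim):
    """Percentage bias. Assumed convention: positive means overestimation."""
    obs = np.asarray(obs, dtype=float)
    sim = np.asarray(sim, dtype=float)
    return float(100.0 * np.sum(sim - obs) / np.sum(obs))


def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the
    skill of always predicting the observed mean."""
    obs = np.asarray(obs, dtype=float)
    sim = np.asarray(sim, dtype=float)
    residual = np.sum((sim - obs) ** 2)
    variance = np.sum((obs - obs.mean()) ** 2)
    return float(1.0 - residual / variance)


if __name__ == "__main__":
    # Made-up monthly values for illustration only, not data from the study.
    observed = [10.0, 20.0, 30.0, 40.0]
    predicted = [12.0, 18.0, 33.0, 39.0]
    print("RMSE :", rmse(observed, predicted))
    print("PBIAS:", pbias(observed, predicted))
    print("NSE  :", nse(observed, predicted))
```

In a cross-validation setting such as the one described, these functions would be applied to the held-out station values and the corresponding grid or ensemble predictions; how the scores are aggregated across stations and months is not specified in the abstract.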

Published

2026-07-01

Issue

Section

Research Articles