Article

Modern Techniques to Modeling Reference Evapotranspiration in a Semiarid Area Based on ANN and GEP Models

1 Laboratory of Water & Environment, Faculty of Nature and Life Sciences, Hassiba Benbouali University of Chlef, Chlef 02180, Algeria
2 Research Institute of Engineering and Technology, Hanyang University, Ansan 15588, Korea
3 Department of Water Engineering, Faculty of Agriculture, University of Tabriz, Tabriz 51666, Iran
4 Construction and Project Management Research Institute, Housing and Building National Research Centre, Giza 12311, Egypt
5 Department of Sanitary Engineering and Water Management, University of Agriculture in Krakow, 30-059 Krakow, Poland
6 Department of Civil Engineering and NOAA-CREST, The City College of New York, New York, NY 10031, USA
7 Department of Civil and Environmental Engineering, Hanyang University, Ansan 15588, Korea
* Authors to whom correspondence should be addressed.
Water 2022, 14(8), 1210; https://doi.org/10.3390/w14081210
Submission received: 19 March 2022 / Revised: 3 April 2022 / Accepted: 6 April 2022 / Published: 9 April 2022
(This article belongs to the Section Hydrology)

Abstract

Evapotranspiration (ET) is a significant component of the hydrologic cycle, notably in irrigated agriculture. Direct approaches for estimating reference evapotranspiration (ET0) are either difficult to apply or require a large number of inputs that are not always available from meteorological stations. Using six years of data (2006–2011), this study compares Feed Forward Neural Network (FFNN), Radial Basis Function Neural Network (RBFNN), and Gene Expression Programming (GEP) machine learning approaches for estimating daily ET0 at a meteorological station in the Lower Cheliff Plain, northwest Algeria. ET0 was first computed from observed meteorological data using the FAO-56 Penman–Monteith (FAO56PM) equation. The FAO56PM values were then used as the target output for the machine learning models, while the observed meteorological data were used as the model inputs. Based on the coefficient of determination (R²), root mean square error (RMSE), and Nash–Sutcliffe efficiency (EF), the RBFNN and GEP models showed promising performance. However, the FFNN model performed best in both the training (R² = 0.9903, RMSE = 0.2332, and EF = 0.9902) and testing (R² = 0.9921, RMSE = 0.2342, and EF = 0.9902) phases in estimating the Penman–Monteith evapotranspiration.

1. Introduction

Food systems are under pressure to boost yields due to rising global food demand despite water resource constraints. As a result, there is a need to shift to more sustainable farming techniques and optimized operations that allow for more efficient use of water resources [1]. Appropriate irrigation management, which depends on accurate predictions of crop water requirements, is a critical component of efficient agricultural techniques [2]. Evapotranspiration (ET) is a measure of crop water requirements that comprises the transport of water vapor from the land to the atmosphere by evaporation from the soil and transpiration from the plants [3]. ET is one of the most important components of the hydrological cycle and global climate system [4,5]. Accurate estimation of ET is necessary for water resource management, irrigation planning, watershed management, and the design of drainage systems [6]. To calculate ET for an agricultural system, the reference evapotranspiration (ET0) is calculated first. However, estimating ET0 is known to be very complex. ET0 is either measured directly (e.g., by lysimeter or pan setups) or computed from complex, experimentally validated physics-based equations. Direct measurements are very costly and time-consuming. Many commonly used physics-based equations [7,8,9,10,11], including the FAO-56 Penman–Monteith equation, involve multiple parameters that may not all be known from local observations [3]. Nevertheless, the FAO-56 Penman–Monteith method has been accepted as a standard and used by scientists in different climates. A high correlation is observed between the ET0 values obtained from the FAO-56 Penman–Monteith method and direct measurements, even in different climatic conditions.
Therefore, scientists have considered the values computed using the FAO-56 Penman–Monteith method as the desired output of data-based artificial intelligence methods and different combinations of meteorological variables as inputs for such methods for accurately estimating ET0 [12,13,14,15,16].
The development of computing, software, informatics, and networking has greatly facilitated the measurement and computational estimation of meteorological variables. As a result of these developments, data-based models are frequently used in modeling stochastic and complex non-linear dynamics in water resources engineering [17,18,19,20]. For example, de Oliveira Ferreira Silva et al. [21] presented the R package “agriwater” for the spatial modeling of actual evapotranspiration and radiation balance. Thorp et al. [22] developed a methodology for unbiased evaluation and comparison of three ET algorithms in the Cotton2K agroecosystem model. Guven et al. [23] successfully estimated daily ET0 in California, USA, with Genetic Programming (GP). Rahimikhoob [24] predicted ET0 values with Artificial Neural Networks (ANN) using temperature and relative humidity parameters in an eight-station region of Iran with a subtropical climate. Ozkan et al. [25] successfully estimated daily ET0 using a hybrid of ANN and the artificial bee colony algorithm with the meteorological data of two stations in California, USA. Cobaner [26] estimated ET0 in the USA using wavelet regression (WR) and class A pan evaporation data. The WR model was found to give better results than the FAO-56 Penman–Monteith equation. Ladlani et al. [27] applied Adaptive Neuro-Fuzzy Inference System (ANFIS) and multiple linear regression models for daily ET0 estimation in the north of Algeria. According to the results of the study, ANFIS yielded better results.
Wen et al. [28] calculated daily ET0 using the Support Vector Machine (SVM) method in an extremely arid region of China. The authors used limited meteorological variables as model input. The model was found to be sufficient for estimating daily ET0 based on maximum and minimum temperature. Gocić et al. [29] used GP, ANN, SVM-firefly optimization algorithm, and SVM-wavelet models for ET0 prediction in Serbia. That study took the FAO-56 Penman–Monteith equation as the reference method. The results identified SVM-wavelet as the best performing methodology for the estimation of ET0 under the given conditions. Petković et al. [30] estimated ET0 in Serbia between 1980 and 2010 using a Radial Basis Function Neural Network (RBFNN) coupled with particle swarm optimization and a backpropagation RBFNN. Pandey et al. [31] estimated daily ET0 with ANN, support vector regression, and non-linear regression. In that study, limited climatic parameters were used as model input, and daily ET0 values calculated with the FAO56PM method were compared to the model output. The results showed that the ANN model estimations were acceptable. Fan et al. [32] estimated daily ET0 using SVM, extreme learning machine models, and four tree-based ensemble methods in different climatic conditions in China. The results showed that tree-based ensemble methods can yield appropriate results in different climates. Wu et al. [33] used cross-station and synthetic climate data to estimate ET0. They also found that machine learning methods could perform successfully in the prediction process.
The major objective of this study is to model reference evapotranspiration in a semi-arid region. This study investigates the potential of RBFNN, Feed Forward Neural Network (FFNN), and Gene Expression Programming (GEP) models, as relatively new tools, for the estimation of daily ET0 values using different combinations of climatic variables. The models are applied in a semi-arid farmland area, namely the Lower Cheliff Plain in northwest Algeria. The well-known FAO-56 Penman–Monteith (FAO56PM) equation is used as the reference method. The role of the individual climatic parameters in ET0 estimation in this semi-arid region is also determined.

2. Materials and Methods

2.1. Study Area and Meteorological Data Acquisition

The study area was the Lower Cheliff Plain in northwest Algeria (Figure 1), which is located between latitudes 34°03′12″ and 36°05′57″ N and longitudes 0°40′ and 01°06′08″ E and covers 40,000 hectares [34]. The climate in this region is classed as semi-arid. The average yearly rainfall is between 250 and 320 mm. Temperatures are highest in July and August and lowest in January. The average annual temperature varies from 19.5 degrees Celsius in the north to 25.3 degrees Celsius in the south. The Hmadna station (SYNMET Automatic Station), located at latitude 35°55′31″ N and longitude 00°45′04″ E, supplied historical data for this investigation. Table 1 lists the units of measurement and the sensor’s measuring range.

2.2. Description of Data

The meteorological data include daily observations of maximum, minimum, and mean air temperatures (Tmax, Tmin, and Tmean), daily mean relative humidity (RH), wind speed (WS), sunshine duration (SD), and global radiation (GR). Days with inadequate data were excluded from the data set. The statistical parameters of the daily climatic data are given in Table 2, in which Xmean, Xmax, Xmin, Sx, and CV stand for the mean, maximum, minimum, standard deviation, and coefficient of variation, respectively.

2.3. Evapotranspiration Estimation Method

The FAO-56 Penman–Monteith method to calculate ET0 was implemented following the formulation in [3] as a function of daily mean net radiation, temperature, water vapor pressure, and wind speed. The procedure used was that outlined in Chapter 3 of FAO-56 [3].
$$\mathrm{ET}_0 = \frac{0.408\,\Delta\,(R_n - G) + \gamma\,\dfrac{900}{T_{mean} + 273}\,U_2\,(e_s - e_a)}{\Delta + \gamma\,(1 + 0.34\,U_2)} \tag{1}$$
where $\mathrm{ET}_0$ is the reference crop evapotranspiration (mm day−1), $R_n$ is the net radiation (MJ m−2 day−1), $G$ is the soil heat flux (MJ m−2 day−1), $\gamma$ is the psychrometric constant (kPa °C−1), $e_s$ is the saturation vapor pressure (kPa), $e_a$ is the actual vapor pressure (kPa), $\Delta$ is the slope of the saturation vapor pressure–temperature curve (kPa °C−1), $T_{mean}$ is the average daily air temperature (°C), and $U_2$ is the mean daily wind speed at 2 m (m s−1).
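The FAO-56 Penman–Monteith calculation can be sketched in a few lines of Python. This is a minimal illustration, not the authors' MATLAB implementation: the function and argument names are hypothetical, the helper formulas for saturation vapor pressure and its slope follow the standard FAO-56 expressions, and the input values in the usage line are illustrative only.

```python
import math


def saturation_vapor_pressure(t_celsius):
    """Saturation vapor pressure (kPa) at air temperature t_celsius (deg C)."""
    return 0.6108 * math.exp(17.27 * t_celsius / (t_celsius + 237.3))


def slope_vapor_pressure_curve(t_celsius):
    """Slope of the saturation vapor pressure curve (kPa per deg C)."""
    return 4098.0 * saturation_vapor_pressure(t_celsius) / (t_celsius + 237.3) ** 2


def et0_fao56(delta, rn, g, gamma, t_mean, u2, es, ea):
    """Daily reference evapotranspiration (mm/day), FAO-56 Penman-Monteith.

    delta, gamma in kPa/degC; rn, g in MJ/m2/day; t_mean in deg C;
    u2 in m/s (wind speed at 2 m); es, ea in kPa.
    """
    radiation_term = 0.408 * delta * (rn - g)
    aerodynamic_term = gamma * 900.0 / (t_mean + 273.0) * u2 * (es - ea)
    return (radiation_term + aerodynamic_term) / (delta + gamma * (1.0 + 0.34 * u2))


# Illustrative inputs (assumed values, not taken from the Hmadna station record)
et0 = et0_fao56(delta=0.122, rn=13.28, g=0.0, gamma=0.0666,
                t_mean=16.9, u2=2.078, es=1.997, ea=1.409)
print(round(et0, 2))  # 3.88
```

Splitting the numerator into a radiation term and an aerodynamic term makes it easier to see which meteorological inputs drive the estimate on a given day.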

2.4. Multilayer Perceptron Artificial Neural Network

ANNs are non-linear mathematical models based on ideas about the behavior of biological neural networks. An ANN consists of layers of interconnected nodes or neurons. Each neuron receives a linear combination of the outputs of the previous layer (∑wijxj) or, for the first layer, of the network inputs, and returns a non-linear transformation of this quantity.
The weights (wij) are the parameters that define this linear combination; they typically also include an intercept term called the activation threshold [35]. A non-linear activation function is then applied to the linear combination (f(∑wijxj)). This activation function can be, for example, a sigmoid function, which constrains each neuron’s output between two asymptotes. Once the activation function is applied, each neuron’s output feeds into the inputs of the next layer. The most frequently used ANN architecture consists of an input layer in which the data are introduced into the ANN, one or more hidden layers in which the data are processed, and an output layer that generates the predicted output value(s) [35].
The literature contains many kinds of neural networks that have been put to many uses. The Multilayer Perceptron (MLP) is an ANN configuration used regularly in hydrological modeling [36,37] (Figure 2). This study assesses the usefulness of MLP networks for the estimation of ET0. The MLP is the most frequently used and simplest neural network architecture [38].
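The forward pass of a single-hidden-layer MLP, as described above, can be sketched as follows. This is a minimal illustration under stated assumptions: the hyperbolic tangent hidden activation and linear output are one common choice, and the function name, argument names, and weight values are hypothetical rather than taken from the trained models.

```python
import math


def mlp_forward(inputs, hidden_weights, hidden_biases, output_weights, output_bias):
    """Forward pass of an MLP with one hidden layer and a single output.

    Each hidden neuron computes tanh(sum_j w_ij * x_j + b_i); the output
    neuron returns a linear combination of the hidden activations.
    """
    hidden = [
        math.tanh(sum(w * x for w, x in zip(weights, inputs)) + bias)
        for weights, bias in zip(hidden_weights, hidden_biases)
    ]
    return sum(w * h for w, h in zip(output_weights, hidden)) + output_bias


# Two normalized inputs, two hidden neurons (arbitrary illustrative weights)
prediction = mlp_forward(
    inputs=[0.5, -0.5],
    hidden_weights=[[1.0, 1.0], [1.0, -1.0]],
    hidden_biases=[0.0, 0.0],
    output_weights=[1.0, 1.0],
    output_bias=0.1,
)
print(prediction)
```

Training consists of adjusting the weights and biases so that the output matches the target values; only the prediction step is shown here.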

2.5. Radial Basis Function

Another architecture that is commonly used in ANNs is the RBF network. Multilayer, feed-forward RBF networks are often used for multi-dimensional spatial interpolation. The term “feed-forward” means that the neurons are arranged in layers and that signals travel in one direction only, from the input layer to the output layer [39]. The underlying architecture of a neural network with three layers is presented in Figure 3, with one hidden layer between the input and output layers. The activation function of each hidden neuron has the form of an RBF, generating a significant response only if the inputs are close to a central value determined for that particular neuron.
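The localized response of an RBF neuron can be illustrated with a short Python sketch. It assumes a Gaussian basis function, which is a common choice but is not confirmed by the text; the names `rbf_forward`, `centers`, and `spread` are hypothetical, and the spread parameterization may differ from the one used by a particular toolbox.

```python
import math


def gaussian_rbf(inputs, center, spread):
    """Response of one RBF neuron: close to 1 near its center, close to 0 far away."""
    squared_distance = sum((x - c) ** 2 for x, c in zip(inputs, center))
    return math.exp(-squared_distance / (2.0 * spread ** 2))


def rbf_forward(inputs, centers, spread, output_weights, output_bias):
    """Output of an RBF network: weighted sum of the hidden-layer responses."""
    activations = [gaussian_rbf(inputs, center, spread) for center in centers]
    return sum(w * a for w, a in zip(output_weights, activations)) + output_bias


centers = [[0.0, 0.0], [1.0, 1.0]]
near = gaussian_rbf([0.0, 0.0], centers[0], spread=0.5)   # input sits on the center
far = gaussian_rbf([3.0, 3.0], centers[0], spread=0.5)    # input far from the center
print(near, far)
```

A larger spread widens the region over which each neuron responds, which is why the spread has to be tuned by trial and error.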

2.6. Gene Expression Programming

While ANNs are complicated models that typically do not capture the physical relationships between different process components in an interpretable way, GEP models can express the relationship between dependent and independent variables explicitly [40]. The procedure for modeling daily evapotranspiration (the dependent variable) based on weather variables (the independent variables) involves the following steps: selecting the fitness function; selecting the set of terminals T and the set of functions F for creating chromosomes; selecting the chromosome architecture; and selecting the link function and genetic operators (Figure 4) [35].
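To show how a GEP chromosome yields an explicit formula, the sketch below decodes and evaluates a single gene written as a K-expression, in which symbols are read level by level to build an expression tree. It is a simplified illustration: the function set, the terminal names, and the numeric values are hypothetical, and fitness evaluation, multi-gene linking, and genetic operators are left out.

```python
import operator

# Hypothetical function set F: symbol -> (callable, arity)
FUNCTIONS = {"+": (operator.add, 2), "-": (operator.sub, 2), "*": (operator.mul, 2)}


def evaluate_k_expression(symbols, terminals):
    """Evaluate a gene given as a K-expression (breadth-first symbol list).

    symbols   : list of function symbols and terminal names
    terminals : dict mapping terminal names to numeric values
    """
    children = [[] for _ in symbols]
    next_free = 1  # index of the next symbol not yet attached to a parent
    for position, symbol in enumerate(symbols):
        if position >= next_free:
            break  # remaining symbols belong to the non-coding part of the gene
        arity = FUNCTIONS[symbol][1] if symbol in FUNCTIONS else 0
        children[position] = list(range(next_free, next_free + arity))
        next_free += arity

    def evaluate(position):
        symbol = symbols[position]
        if symbol in FUNCTIONS:
            function, _ = FUNCTIONS[symbol]
            return function(*(evaluate(child) for child in children[position]))
        return terminals[symbol]

    return evaluate(0)


# The gene below decodes to (Tmean * RH) + GR; the values are arbitrary placeholders
gene = ["+", "*", "GR", "Tmean", "RH"]
result = evaluate_k_expression(gene, {"Tmean": 2.0, "RH": 3.0, "GR": 4.0})
print(result)  # 10.0
```

Because the decoded tree is an ordinary algebraic expression, the best chromosome found by the search can be written out as a closed-form equation in the input variables.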

2.7. Evaluation Criteria

The performance of the models used in this study was evaluated using standard statistical performance criteria: the coefficient of determination (R²), root mean square error (RMSE), and Nash–Sutcliffe efficiency coefficient (EF) [41,42,43]. The three criteria were calculated according to Equations (2)–(4).
$$R^2 = 1 - \frac{\sum_{i=1}^{N} \left( ET_{i(observed)} - ET_{i(model)} \right)^2}{\sum_{i=1}^{N} \left( ET_{i(observed)} - ET_{mean} \right)^2} \tag{2}$$
$$RMSE = \sqrt{\frac{1}{N} \sum_{i=1}^{N} \left( ET_{i(observed)} - ET_{i(model)} \right)^2} \tag{3}$$
$$EF = 1 - \frac{\sum_{i=1}^{N} \left( ET_{i(observed)} - ET_{i(model)} \right)^2}{\sum_{i=1}^{N} \left( ET_{i(observed)} - ET_{mean} \right)^2} \tag{4}$$
where $N$ is the number of observed ET data, $ET_{i(observed)}$ and $ET_{i(model)}$ are the observed and model-estimated ET values, respectively, and $ET_{mean}$ is the mean of the observed ET.
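As a worked illustration of two of these criteria, the following Python sketch computes RMSE and the Nash–Sutcliffe efficiency (EF) for paired observed and modeled series. The function names and the sample values are hypothetical and serve only to show the arithmetic.

```python
import math


def rmse(observed, modeled):
    """Root mean square error between observed and modeled values."""
    n = len(observed)
    return math.sqrt(sum((o - m) ** 2 for o, m in zip(observed, modeled)) / n)


def nash_sutcliffe(observed, modeled):
    """Nash-Sutcliffe efficiency: 1 minus residual variance over observed variance."""
    mean_observed = sum(observed) / len(observed)
    residual = sum((o - m) ** 2 for o, m in zip(observed, modeled))
    total = sum((o - mean_observed) ** 2 for o in observed)
    return 1.0 - residual / total


observed = [1.0, 2.0, 3.0, 4.0]   # placeholder values, not station data
modeled = [2.0, 2.0, 2.0, 2.0]
print(rmse(observed, modeled), nash_sutcliffe(observed, modeled))
```

An EF of 1 indicates a perfect match, while a value below 0, as in this deliberately poor example, means the model performs worse than simply predicting the observed mean.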

3. Results and Discussion

In this study, firstly, ET0 values were computed by the Penman–Monteith method using climatic data. Then the following equation was used to normalize the input (meteorological data) and output (calculated ET0 by Penman–Monteith):
$$X_n = 2 \cdot \frac{X_o - X_{min}}{X_{max} - X_{min}} - 1 \tag{5}$$
where $X_n$ and $X_o$ stand for the normalized and original data, while $X_{min}$ and $X_{max}$ represent the minimum and maximum values in the original data. Approximately 70% of the available data period (from around 2006 to 2010) was selected for the training phase; the remaining 30% belonged to the year 2011 and was used for the testing process. MATLAB was used for the modeling process.
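The normalization to the range [−1, 1] and the chronological split can be sketched in Python as follows. This is an illustrative sketch only (the study itself used MATLAB): the function names and the `(year, value)` record layout are assumptions made for the example.

```python
def normalize(values):
    """Scale values linearly so the minimum maps to -1 and the maximum to +1."""
    x_min, x_max = min(values), max(values)
    return [2.0 * (x - x_min) / (x_max - x_min) - 1.0 for x in values]


def denormalize(scaled, x_min, x_max):
    """Invert the scaling to recover values in the original units."""
    return [(s + 1.0) / 2.0 * (x_max - x_min) + x_min for s in scaled]


def split_by_year(records, test_year=2011):
    """Chronological split: records before test_year train, the rest test.

    records is assumed to be a list of (year, value) tuples.
    """
    train = [r for r in records if r[0] < test_year]
    test = [r for r in records if r[0] >= test_year]
    return train, test


scaled = normalize([0.0, 5.0, 10.0])
print(scaled)  # [-1.0, 0.0, 1.0]
train, test = split_by_year([(2006, 3.1), (2010, 4.2), (2011, 5.0)])
print(len(train), len(test))  # 2 1
```

Splitting by calendar year rather than at random keeps the test period strictly after the training period, so the test scores reflect performance on unseen conditions.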

3.1. Application of MLP

In this study, the FFNN algorithm was used with a single hidden layer. The parameters used for the FFNN model are listed in Table 3. Because the input data play a considerable role in model performance, several input combinations were tested. The performance of all MLP-based input combinations is listed in Table 4 for the training and testing stages. MLP-based model development is a trial-and-error process. In this study, the tangent sigmoid transfer function was used in the hidden layer, and the linear transfer function was used in the output layer. To achieve the best performance with MLP models, the number of neurons in the hidden layer has to be optimized. The results in Table 4 suggest that the FFNN2 model, with Tmax, Tmean, (Tmax − Tmin), RH, I, WS, and GR as inputs, performed better than the other FFNN-based input combinations, with R² values of 0.9903 and 0.9921, RMSE values of 0.2332 and 0.2342, and EF values of 0.9902 and 0.9902 for the training and testing stages, respectively. Nineteen neurons were used in the hidden layer to achieve this performance. The agreement plot of actual and predicted values of the FFNN2 model for the training and testing stages is shown in Figure 5, which shows that most values lie very close to the 45° line and follow the same pattern as the actual values in both stages. If all values lay on the 45° line, the model would be ideal, with predictions identical to the actual values.
The performance evaluation results suggest that the FFNN2 model performed better than the other input combination-based models. However, the results in Table 4 also indicate that several other models are comparable in performance to the best model (FFNN2) while requiring fewer input meteorological variables. Overall, according to Table 4, the FFNN11 model (Tmean, RH, WS, and GR) is suitable for predicting ET0, with R² values of 0.9875 and 0.9892, RMSE values of 0.2656 and 0.2623, and EF values of 0.9873 and 0.9877 for the training and testing stages, respectively. As in the FFNN2 model, 19 neurons were used in the single hidden layer to achieve this performance. The agreement plot of actual and predicted values of the FFNN11 model for the training and testing stages is shown in Figure 5, which shows that most values lie very close to the line of perfect agreement and follow the same pattern as the actual values in both stages.

3.2. Application of RBF

For the RBF method as well, several input combinations were used for model development. The performance of all input combination-based RBFNN models is listed in Table 5 for the training and testing stages. RBFNN model development is a trial-and-error process similar to FFNN model development. In this study, the RBF models had a single hidden layer. To achieve the best performance with RBFNN models, the value of the spread must be found through trial and error. The results in Table 5 suggest that the RBFNN5 model, with Tmin, Tmax, Tmean, RH, I, WS, and GR as inputs, performs better than the other RBFNN-based input combinations, with R² values of 0.9907 and 0.9911, RMSE values of 0.2270 and 0.2374, and EF values of 0.9907 and 0.9899 for the training and testing stages, respectively. The agreement plot of actual and predicted values of the RBFNN5 model for the training and testing stages is shown in Figure 6, which shows that most values lie very close to the 45° line and follow the same pattern as the actual values in both stages.
The performance evaluation results suggest that the RBFNN5 model performs better than the other input combination-based models. However, the results in Table 5 also indicate that several other models performed comparably to the best model (RBFNN5) with fewer inputs. Overall, Table 5 shows that the RBFNN11 model (Tmean, RH, WS, and GR) is suitable for predicting ET0, with R² values of 0.9886 and 0.9892, RMSE values of 0.2514 and 0.2551, and EF values of 0.9886 and 0.9884 for the training and testing stages, respectively. A lower spread value (Table 5) was used in the development of this model than for the RBFNN5 model. The agreement plot of actual and predicted values of the RBFNN11 model for the training and testing stages is shown in Figure 6, which shows that most values lie very close to the line of perfect agreement and follow the same pattern as the actual values in both stages.

3.3. Application of GEP

The parameters used in the GEP model are listed in Table 6. The performance of all input combination-based GEP models is listed in Table 7 for the training and testing stages. GEP-based model development is also a trial-and-error process, similar to the development of the FFNN and RBFNN models. Across the different input combinations, for the training phase, R² ranged between 0.6973 and 0.9664, RMSE ranged between 0.4830 and 1.3112 mm day−1, and EF ranged between 0.6895 and 0.9579. For the testing phase, R² ranged between 0.8057 and 0.9775, RMSE ranged between 0.3701 and 1.1224 mm day−1, and EF ranged between 0.7744 and 0.9755 (Table 7). The presence or absence of critical meteorological variables in the input combinations significantly affected GEP model performance. The results in Table 7 suggest that the GEP11 model, with Tmean, RH, WS, and GR as inputs, performed better than the other GEP-based input combinations, with R² values of 0.9606 and 0.9775, RMSE values of 0.4830 and 0.3701, and EF values of 0.9579 and 0.9755 for the training and testing stages, respectively. The agreement plot of actual and predicted values of the GEP11 model for the training and testing stages is shown in Figure 7, which indicates that most values lie very close to the 45° line and follow the same pattern as the actual values in both stages. Table 7 thus indicates that GEP11 is the best performing GEP model and uses the optimum input combination.

3.4. Inter Comparison among Best and Optimum Input Combination Based Models

Table 8 shows that the FFNN2 based model works better than the RBFNN and GEP based models. Figure 8 indicates that predicted values using the FFNN2 model lie closer to the line of perfect agreement than the values predicted by the RBFNN and GEP based models.
The overall performance of the FFNN2-based model is reliable and suitable for the prediction of ET0. As such, the FFNN model based on the Tmax, Tmean, (Tmax − Tmin), RH, I, WS, and GR input combination could be used for the prediction of ET0. However, the single-factor ANOVA results in Table 9 suggest that there is no significant difference between observed and predicted values for the best combination-based FFNN, RBFNN, and GEP models.
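The F statistic behind a single-factor ANOVA can be sketched as follows. This is a minimal illustration of the general technique, not a reproduction of the analysis in Table 9: the function name and the sample groups are hypothetical, and obtaining the p-value additionally requires the F distribution, which is not part of the Python standard library.

```python
def one_way_anova_f(groups):
    """F statistic of a single-factor ANOVA for a list of sample groups."""
    n_total = sum(len(group) for group in groups)
    grand_mean = sum(sum(group) for group in groups) / n_total
    group_means = [sum(group) / len(group) for group in groups]

    # Variation of the group means around the grand mean
    ss_between = sum(
        len(group) * (mean - grand_mean) ** 2
        for group, mean in zip(groups, group_means)
    )
    # Variation of the individual values around their own group mean
    ss_within = sum(
        (value - mean) ** 2
        for group, mean in zip(groups, group_means)
        for value in group
    )
    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


# Placeholder groups standing in for observed and predicted series
f_value = one_way_anova_f([[1.0, 2.0, 3.0], [2.0, 3.0, 4.0]])
print(f_value)  # 1.5
```

A small F value relative to the critical value for the given degrees of freedom indicates that the group means do not differ significantly.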
Figure 9 displays box plots of the prediction errors of the best input combination-based models for the test period. The descriptive statistics of the prediction errors for the best input combinations are listed in Table 10. According to Table 10 and Figure 9, the FFNN2 model followed the corresponding observed values with a lower minimum error (−0.8840), a lower maximum error (1.4199), and a narrower first quartile than the other best input combination-based models.
The Taylor diagram of the observed and predicted ET0 for the best input combination-based models over the test period is depicted in Figure 10. The representative points of all the applied models have nearly the same position. The FFNN2 model is located nearest to the observed point, with lower RMSE and SD values and a higher correlation coefficient, which identifies it as the superior model.
Table 11 shows that the RBFNN11-based model works better than the FFNN- and GEP-based models. Figure 11 indicates that the values predicted by the RBFNN11-based model lie closer to the line of perfect agreement than the values predicted by the FFNN- and GEP-based models. The overall performance of the RBFNN11-based model is reliable and suitable for the prediction of ET0, which suggests that the RBFNN model based on the Tmean, RH, WS, and GR input combination could be used for the prediction of ET0. The single-factor ANOVA results in Table 12 suggest that there is no significant difference between observed and predicted values for the optimum input combination-based FFNN, RBFNN, and GEP models.
Figure 12 displays box plots of the prediction errors of the optimum input combination-based models for the test period. The descriptive statistics of the prediction errors for the optimum input combinations are listed in Table 13. According to Table 13 and Figure 12, the RBFNN11 model followed the corresponding observed values with a lower maximum error (1.3700) and a narrower first quartile (−0.0952) than the other optimum input combination-based models.
The Taylor diagram of the observed and predicted ET0 for the optimum input combination-based models over the test period is depicted in Figure 13. The representative points of all the applied models have nearly the same position. The RBFNN11 model is located nearest to the observed point, with lower RMSE and SD values and a higher correlation coefficient, which makes it the superior model with the optimum number of input parameters.

4. Conclusions

This study investigated the potential of FFNN, RBFNN, and GEP to estimate daily evapotranspiration in a semi-arid region in Algeria using different combinations of input meteorological variables. The results showed that both the neural network (i.e., FFNN and RBFNN) and GEP models agree closely with the ET0 obtained by the FAO-56 Penman–Monteith method and yield reliable estimates for the semi-arid area in question. The study also found that modeling ET0 with the ANN techniques leads to better estimates than the GEP model.
The results suggest that the FFNN-based model 2 outperformed all other applied models. Another major conclusion is that the RBFNN model 11 performed better than the other applied models that use a smaller number of meteorological inputs. The ANN- and GEP-based models suggest that Tmean, RH, WS, and GR are the optimum input parameters for the estimation of daily evapotranspiration in the semi-arid region of Algeria. The overall performance of all applied models is satisfactory, as there is no significant difference between actual and predicted values when the optimum number of input parameters is used.

Author Contributions

Conceptualization, M.A.; methodology, M.A. and M.T.S.; software, M.A., M.T.S. and A.K.T.; validation, M.A., M.T.S., M.J., N.E., A.W. and N.K.; formal analysis, N.E.; investigation, M.A., M.T.S. and M.J.; resources, M.A., M.J. and N.E.; data curation, M.A.; writing—original draft preparation, M.A., M.T.S., A.W. and N.K.; writing—review and editing, M.A., M.J., N.E., A.W., N.K. and T.-W.K.; visualization, M.A., N.E. and A.K.T.; supervision, M.A., M.J., and T.-W.K.; project administration, M.J. and J.-Y.Y.; funding acquisition, J.-Y.Y. and T.-W.K. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korean government (MSIT) (NRF-2020R1C1C1014636).

Data Availability Statement

Not applicable.

Acknowledgments

Thanks to peer reviewers who improved this manuscript. We also thank the ANRH agency for the collected data and the General Directorate of Scientific Research and Technological Development of Algeria (DGRSDT).

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Gilbert, M.E.; Hernandez, M.I. How should crop water-use efficiency be analyzed? A warning about spurious correlations. Field Crops Res. 2019, 235, 59–67. [Google Scholar] [CrossRef]
  2. Ghiat, I.; Mackey, H.R.; Al-Ansari, T. A review of evapotranspiration measurement models, techniques and methods for open and closed agricultural field applications. Water 2021, 13, 2523. [Google Scholar] [CrossRef]
  3. Allen, R.; Peirera, L.S.; Raes, D.; Smith, M. Crop Evapotranspiration—Guidelines for Computing Crop Water Requirements; FAO—Food and Agriculture Organization of the United Nations: Rome, Italy, 1998; ISBN 92-5-104219-5. [Google Scholar]
  4. Sun, G.; Alstad, K.; Chen, J.; Chen, S.; Ford, C.R.; Lin, G.; Zhang, Z. A general predictive model for estimating monthly ecosystem evapotranspiration. Ecohydrology 2011, 4, 245–255. [Google Scholar] [CrossRef]
  5. Amatya, D.M.; Muwamba, A.; Panda, S.; Callahan, T.J.; Harder, S.; Pellett, C.A. Assessment of spatial and temporal variation of potential evapotranspiration estimated by four methods for South Carolina, USA. JSCWR 2018, 5, 3–24. [Google Scholar] [CrossRef] [Green Version]
  6. Yu, H.; Cao, C.; Zhang, Q.; Bao, Y. Construction of an evapotranspiration model and analysis of spatiotemporal variation in Xilin River Basin, China. PLoS ONE 2021, 16, e0256981. [Google Scholar] [CrossRef]
  7. Thornthwaite, C.W. An approach toward a national classification of climate. Geogr. Rev. 1948, 38, 55–94. [Google Scholar] [CrossRef]
  8. Blaney, H.F.; Criddle, W.D. Determining Water Requirements in Irrigated Areas from Climatological and Irrigation Data; US Soil Conservation Service: Washington, DC, USA, 1950.
  9. Makkink, G.F. Testing the Penman formula by means of lysimeters. J. Inst. Water Eng. 1957, 11, 277–288. [Google Scholar]
  10. Priestley, C.H.B.; Taylor, R.J. On the assessment of surface heat flux and evaporation using large-scale parameters. Mon. Weather Rev. 1972, 100, 81–92. [Google Scholar] [CrossRef]
  11. Hargreaves, G.H.; Samani, Z.A. Reference crop evapotranspiration from temperature. Appl. Eng. Agric. 1985, 1, 96–99. [Google Scholar] [CrossRef]
  12. Kumar, M.; Bandyopadhyay, A.; Raghuwanshi, N.S.; Singh, R. Comparative study of conventional and artificial neural network-based ET0 estimation models. Irrig. Sci. 2008, 26, 531–545. [Google Scholar] [CrossRef]
  13. Mohawesh, O.E. Artificial neural network for estimating monthly reference evapotranspiration under arid and semi-arid environments. Arch. Agron. Soil Sci. 2011, 59, 105–117. [Google Scholar] [CrossRef]
  14. Eslamian, S.S.; Gohari, S.A.; Zareian, M.J.; Firoozfar, A. Estimating Penman–Monteith reference evapotranspiration using artificial neural networks and genetic algorithm: A case study. Arab. J. Sci. Eng. 2012, 37, 935–944. [Google Scholar] [CrossRef]
  15. Citakoglu, H.; Cobaner, M.; Haktanir, T.; Kisi, O. Estimation of monthly mean reference evapotranspiration in Turkey. Water Resour. Manag. 2014, 28, 99–113. [Google Scholar] [CrossRef]
  16. Sattari, M.T.; Apaydin, H.; Shamshirband, S. Performance evaluation of deep learning-based gated recurrent units (GRUs) and tree-based models for estimating ET0 by using limited meteorological variables. Mathematics 2020, 8, 972. [Google Scholar] [CrossRef]
  17. Sattari, M.T.; Mirabbasi, R.; Sushab, R.S.; Abraham, J. Prediction of level in Ardebil plain using support vector regression and M5 tree model. Groundwater 2018, 56, 636–646. [Google Scholar] [CrossRef] [PubMed]
  18. Rouzegari, N.; Hassanzadeh, Y.; Sattari, M.T. Using the hybrid simulated annealing-M5 tree algorithms to extract the If-Then operation rules in a single reservoir. Water Resour. Manag. 2019, 33, 3655–3672.
  19. Apaydin, H.; Feizi, H.; Sattari, M.T.; Colak, M.S.; Shamshirband, S.; Chau, K.-W. Comparative analysis of recurrent neural network architectures for reservoir inflow forecasting. Water 2020, 12, 1500.
  20. Shabani, S.; Samadianfard, S.; Sattari, M.T.; Mosavi, A.; Shamshirband, S.; Kmet, T.; Várkonyi-Kóczy, A.R. Modeling pan evaporation using Gaussian process regression K-nearest neighbors random forest and support vector machines; Comparative analysis. Atmosphere 2020, 11, 66.
  21. de Oliveira Ferreira Silva, C.; de Castro Teixeira, A.H.; Manzione, R.L. Agriwater: An R package for spatial modelling of energy balance and actual evapotranspiration using satellite images and agrometeorological data. Environ. Model. Softw. 2019, 120, 104497.
  22. Thorp, K.R.; Marek, G.W.; DeJonge, K.C.; Evett, S.R.; Lascano, R.J. Novel methodology to evaluate and compare evapotranspiration algorithms in an agroecosystem model. Environ. Model. Softw. 2019, 119, 214–227.
  23. Guven, A.; Aytek, A.; Yuce, M.I.; Aksoy, H. Genetic programming-based empirical model for daily reference evapotranspiration estimation. Clean Soil Air Water 2008, 36, 905–912.
  24. Rahimikhoob, A. Estimation of evapotranspiration based on only air temperature data using artificial neural networks for a subtropical climate in Iran. Theor. Appl. Climatol. 2010, 101, 83–91.
  25. Ozkan, C.; Kisi, O.; Akay, B. Neural networks with artificial bee colony algorithm for modeling daily reference evapotranspiration. Irrig. Sci. 2011, 29, 431–441.
  26. Cobaner, M. Reference evapotranspiration based on Class A pan evaporation via wavelet regression technique. Irrig. Sci. 2013, 31, 119–134.
  27. Ladlani, I.; Houichi, L.; Djemili, L.; Heddam, S.; Belouz, K. Estimation of daily reference evapotranspiration (ET0) in the North of Algeria using adaptive neuro-fuzzy inference system (ANFIS) and multiple linear regression (MLR) models: A comparative study. Arab. J. Sci. Eng. 2014, 39, 5959–5969.
  28. Wen, X.; Si, J.; He, Z.; Wu, J.; Shao, H.; Yu, H. Support-vector-machine-based models for modeling daily reference evapotranspiration with limited climatic data in extreme arid regions. Water Resour. Manag. 2015, 29, 3195–3209.
  29. Gocić, M.; Motamedi, S.; Shamshirband, S.; Petković, D.; Ch, S.; Hashim, R.; Arif, M. Soft computing approaches for forecasting reference evapotranspiration. Comput. Electron. Agric. 2015, 113, 164–173.
  30. Petković, D.; Gocic, M.; Shamshirband, S.; Qasem, S.N.; Trajkovic, S. Particle swarm optimization-based radial basis function network for estimation of reference evapotranspiration. Theor. Appl. Climatol. 2016, 125, 555–563.
  31. Pandey, P.K.; Nyori, T.; Pandey, V. Estimation of reference evapotranspiration using data driven techniques under limited data conditions. Model. Earth Syst. Environ. 2017, 3, 1449–1461.
  32. Fan, J.; Yue, W.; Wu, L.; Zhang, F.; Cai, H.; Wang, X.; Lu, X.; Xiang, Y. Evaluation of SVM, ELM and four tree-based ensemble models for predicting daily reference evapotranspiration using limited meteorological data in different climates of China. Agric. For. Meteorol. 2018, 263, 225–241.
  33. Wu, L.; Peng, Y.; Fan, J.; Wang, Y. Machine learning models for the estimation of monthly mean daily reference evapotranspiration based on cross-station and synthetic data. Hydrol. Res. 2019, 50, 1730–1750.
  34. Douaoui, A.E.K.; Nicolas, H.; Walter, C. Detecting salinity hazards within a semiarid context by means of combining soil and remote-sensing data. Geoderma 2006, 134, 217–230.
  35. Shiri, J.; Kisi, Ö.; Landeras, G.; López, J.J.; Nazemi, A.H.; Stuyt, L.C.P.M. Daily reference evapotranspiration modeling by using genetic programming approach in the Basque Country (Northern Spain). J. Hydrol. 2012, 414, 302–316.
  36. Leahy, P.; Kiely, G.; Corcoran, G. Structural optimization and input selection of an artificial neural network for river level prediction. J. Hydrol. 2008, 355, 192–201.
  37. Ioannou, K.; Myronidis, D.; Lefakis, P.; Stathis, D. The use of artificial neural networks (ANNs) for the forecast of precipitation levels of Lake Doirani (N. Greece). Fresenius Environ. Bull. 2010, 19, 1921–1927.
  38. Haciismailoglu, M.C.; Kucuk, I.; Derebasi, N. Prediction of dynamic hysteresis loops of nano-crystalline cores. Expert Syst. Appl. 2009, 36, 2225–2227.
  39. Lin, F.; Chen, L.H. A non-linear rainfall-runoff model using radial basis function network. J. Hydrol. 2004, 289, 1–8.
  40. Wang, S.; Fu, Z.; Chen, H.; Nie, Y.; Wang, K. Modeling daily reference ET in the karst area of northwest Guangxi (China) using gene expression programming (GEP) and artificial neural network (ANN). Theor. Appl. Climatol. 2015, 126, 493–504.
  41. Dubovský, V.; Dlouhá, D.; Pospíšil, L. The calibration of evaporation models against the Penman–Monteith equation on Lake Most. Sustainability 2020, 13, 313.
  42. Myronidis, D.; Ivanova, E. Generating regional models for estimating the peak flows and environmental flows magnitude for the Bulgarian-Greek Rhodope mountain range torrential watersheds. Water 2020, 12, 784.
  43. Han, X.; Wei, Z.; Zhang, B.; Li, Y.; Du, T.; Chen, H. Crop evapotranspiration prediction by considering dynamic change of crop coefficient and the precipitation effect in back-propagation neural network model. J. Hydrol. 2021, 596, 126104.
Figure 1. Location of the Lower Cheliff Plain.
Figure 2. MLP architecture.
Figure 3. RBF architecture [39].
Figure 4. General GEP model implementation and general structure [35].
Figure 5. Performance of the best performing (a) FFNN M2 and (b) FFNN M11 models for both training and testing stages.
Figure 6. Performance of the best performing (a) RBFNN M5 and (b) RBFNN M11 models for both training and testing stages.
Figure 7. Performance of the GEP M11 model, which is both the best and the optimum input combination-based GEP model, for both training and testing stages.
Figure 8. Scatter plot of observed versus predicted values for the best input combination-based models on the testing data set.
Figure 9. Box plot for best performing models.
Figure 10. Taylor diagram for best performing models.
Figure 11. Scatter plot of observed versus predicted values for the optimum input combination-based models on the testing data set.
Figure 12. Box plot for the best-performing models based on the optimum input combination.
Figure 13. Taylor diagram for the best-performing models based on the optimum input combination.
Table 1. Unit and measuring range of the sensors.

| Name of Sensor | Unit / Measuring Range |
|---|---|
| Psychrometer | % |
| Heliograph | Minute |
| Anemometer | 0.3 to 50 m/s |
| Wind direction | 0° to 360° |
| Pyranometer | 0…1400 W/m² (Max 2000) |
| Albedometer | −2000 to 2000 W/m² |
| Air temperature | −30 °C to 70 °C |
| Soil temperature | −50 °C to 50 °C |
| Evaporation pan | mm of water |
| Rain gauge | mm of water (resolution 0.1 mm) |
Table 2. Daily statistical parameters of data set.

| Data Set | Unit | Xmin | Xmax | Xmean | Sx | CV (Sx/Xmean) |
|---|---|---|---|---|---|---|
| Tmin | °C | −4.30 | 26.29 | 11.68 | 6.87 | 0.59 |
| Tmax | °C | 6.98 | 48.16 | 27.28 | 8.89 | 0.33 |
| Tmean | °C | 3.87 | 37.23 | 19.48 | 7.53 | 0.39 |
| RH | % | 21.50 | 95.66 | 59.69 | 14.39 | 0.24 |
| WS | m/s | 0.00 | 28.94 | 6.66 | 3.81 | 0.57 |
| SD | h | 0.00 | 14.10 | 7.21 | 4.14 | 0.57 |
| GR | mm | 9.72 | 1791.04 | 969.52 | 446.08 | 0.46 |
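The last column of Table 2 is defined in its header as CV = Sx/Xmean. The short check below, a minimal sketch that only uses the means, standard deviations, and CV values printed in the table, recomputes that ratio; the function name `coefficient_of_variation` is an illustrative choice, not something taken from the article.

```python
# Rows copied from Table 2: (variable, Xmean, Sx, reported CV).
rows = [
    ("Tmin", 11.68, 6.87, 0.59),
    ("Tmax", 27.28, 8.89, 0.33),
    ("Tmean", 19.48, 7.53, 0.39),
    ("RH", 59.69, 14.39, 0.24),
    ("WS", 6.66, 3.81, 0.57),
    ("SD", 7.21, 4.14, 0.57),
    ("GR", 969.52, 446.08, 0.46),
]


def coefficient_of_variation(mean, std):
    """CV as defined in the table header: Sx / Xmean."""
    return std / mean


# Recompute CV to two decimals and pair it with the value printed in the table.
computed = {
    name: round(coefficient_of_variation(mean, std), 2)
    for name, mean, std, _ in rows
}
reported = {name: cv for name, _, _, cv in rows}

for name in computed:
    print(f"{name}: computed CV = {computed[name]:.2f}, reported CV = {reported[name]:.2f}")
```

A CV above 0.5 (Tmin, WS, SD) indicates a spread that is more than half the mean, which is one way to read which inputs vary most from day to day.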
Table 3. Parameters used for FFNN with one hidden layer.

| Parameter | Value |
|---|---|
| Hidden layer transfer function | Tangent sigmoid transfer function (tansig) |
| Output layer transfer function | Linear transfer function (purelin) |
| Training function | Levenberg-Marquardt |
| Maximum number of epochs to train | 1000 |
| Maximum validation failures | 6 |
| Minimum performance gradient | 1 × 10⁻⁷ |
| Initial mu | 0.001 |
| mu decrease factor | 0.1 |
| mu increase factor | 10 |
| Maximum mu | 1 × 10¹⁰ |
| Maximum time to train in seconds | Inf |
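To make the settings in Table 3 more concrete, the following is a minimal sketch, not the authors' implementation, of (i) a forward pass through one tansig hidden layer and a purelin output, and (ii) the usual Levenberg-Marquardt damping schedule implied by the mu parameters. The function names, the example weights, and the assumption that mu is multiplied by the decrease factor after a successful step and by the increase factor otherwise (with training stopping once mu exceeds its maximum) are illustrative.

```python
import math


def tansig(n):
    # The tangent sigmoid 2 / (1 + exp(-2n)) - 1 equals the hyperbolic tangent.
    return math.tanh(n)


def ffnn_forward(x, w_hidden, b_hidden, w_out, b_out):
    """One tansig hidden layer followed by a single linear (purelin) output."""
    hidden = [
        tansig(sum(w * xi for w, xi in zip(weights, x)) + bias)
        for weights, bias in zip(w_hidden, b_hidden)
    ]
    return sum(w * h for w, h in zip(w_out, hidden)) + b_out


def update_mu(mu, improved, mu_dec=0.1, mu_inc=10.0, mu_max=1e10):
    """Return the new damping factor and whether the mu_max stop criterion is hit.

    Assumed rule: shrink mu after a step that lowers the error, grow it otherwise.
    Defaults mirror Table 3.
    """
    new_mu = mu * mu_dec if improved else mu * mu_inc
    return new_mu, new_mu > mu_max


# Hypothetical weights for a 2-input, 2-neuron network (illustration only).
y = ffnn_forward(
    x=[0.5, -1.0],
    w_hidden=[[1.0, 0.0], [0.0, 1.0]],
    b_hidden=[0.0, 0.0],
    w_out=[2.0, 1.0],
    b_out=0.1,
)
print(y)
print(update_mu(0.001, improved=True))
```

Starting from the initial mu of 0.001, a run of successful steps drives the update toward Gauss-Newton behaviour, while repeated failures push it toward small gradient-descent steps until the maximum mu is exceeded.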
Table 4. Statistical criteria for an estimation of ET0 using different input variables for FFNN. The bold part shows that this model is superior to others.

| Model | Input | Neurons | Training R² | Training RMSE | Training EF | Testing R² | Testing RMSE | Testing EF |
|---|---|---|---|---|---|---|---|---|
| FFNN1 | Tmin, Tmax, Tmean, (Tmax − Tmin), RH, I, WS, GR | 18 | 0.9903 | 0.2338 | 0.9901 | 0.9918 | 0.2389 | 0.9898 |
| FFNN2 | Tmax, Tmean, (Tmax − Tmin), RH, I, WS, GR | 19 | 0.9903 | 0.2332 | 0.9902 | 0.9921 | 0.2342 | 0.9902 |
| FFNN3 | Tmin, Tmean, (Tmax − Tmin), RH, I, WS, GR | 13 | 0.9905 | 0.2308 | 0.9904 | 0.9917 | 0.2368 | 0.9900 |
| FFNN4 | Tmin, Tmax, (Tmax − Tmin), RH, I, WS, GR | 19 | 0.9903 | 0.2336 | 0.9901 | 0.9920 | 0.2378 | 0.9899 |
| FFNN5 | Tmin, Tmax, Tmean, RH, I, WS, GR | 11 | 0.9899 | 0.2376 | 0.9898 | 0.9916 | 0.2393 | 0.9897 |
| FFNN6 | Tmin, Tmax, Tmean, (Tmax − Tmin), I, WS, GR | 12 | 0.9782 | 0.3481 | 0.9781 | 0.9859 | 0.3102 | 0.9828 |
| FFNN7 | Tmin, Tmax, Tmean, (Tmax − Tmin), RH, WS, GR | 19 | 0.9883 | 0.2566 | 0.9881 | 0.9900 | 0.2536 | 0.9885 |
| FFNN8 | Tmin, Tmax, Tmean, (Tmax − Tmin), RH, I, GR | 14 | 0.9399 | 0.5799 | 0.9393 | 0.9603 | 0.5046 | 0.9544 |
| FFNN9 | Tmin, Tmax, Tmean, (Tmax − Tmin), RH, I, WS | 14 | 0.9676 | 0.4245 | 0.9675 | 0.9748 | 0.4032 | 0.9709 |
| FFNN10 | Tmean, RH, I, WS, GR | 16 | 0.9895 | 0.2435 | 0.9893 | 0.9907 | 0.2513 | 0.9887 |
| FFNN11 | Tmean, RH, WS, GR | 19 | 0.9875 | 0.2656 | 0.9873 | 0.9892 | 0.2623 | 0.9877 |
| FFNN12 | Tmean, RH, I, WS | 11 | 0.9672 | 0.4265 | 0.9671 | 0.9716 | 0.4124 | 0.9695 |
| FFNN13 | RH, I, WS, GR | 8 | 0.9217 | 0.6593 | 0.9215 | 0.9465 | 0.5533 | 0.9452 |
| FFNN14 | Tmean, RH, WS | 19 | 0.9172 | 0.6770 | 0.9172 | 0.9165 | 0.6845 | 0.9161 |
| FFNN15 | Tmean, RH | 14 | 0.8528 | 0.9047 | 0.8522 | 0.8966 | 0.7745 | 0.8926 |
| FFNN16 | Tmean, WS | 7 | 0.8326 | 0.9633 | 0.8324 | 0.8520 | 0.9224 | 0.8477 |
| FFNN17 | RH, WS | 20 | 0.7771 | 1.1120 | 0.7767 | 0.8439 | 0.9488 | 0.8388 |
Table 5. Statistical criteria for an estimation of ET0 using different input variables for RBF. The bold part shows that this model is superior to others.

| Model | Input Combination | | Spread | Training R² | Training RMSE | Training EF | Testing R² | Testing RMSE | Testing EF |
|---|---|---|---|---|---|---|---|---|---|
| RBF1 | Tmin, Tmax, Tmean, (Tmax − Tmin), RH, I, WS, GR | 11 | 87.55 | 0.9911 | 0.2215 | 0.9911 | 0.9909 | 0.2406 | 0.9896 |
| RBF2 | Tmax, Tmean, (Tmax − Tmin), RH, I, WS, GR | 11 | 87.55 | 0.9910 | 0.2238 | 0.9910 | 0.9910 | 0.2382 | 0.9898 |
| RBF3 | Tmin, Tmean, (Tmax − Tmin), RH, I, WS, GR | 13 | 85.47 | 0.9906 | 0.2279 | 0.9906 | 0.9910 | 0.2377 | 0.9899 |
| RBF4 | Tmin, Tmax, (Tmax − Tmin), RH, I, WS, GR | 13 | 85.47 | 0.9907 | 0.2265 | 0.9907 | 0.9910 | 0.2378 | 0.9899 |
| RBF5 | Tmin, Tmax, Tmean, RH, I, WS, GR | 13 | 85.47 | 0.9907 | 0.2270 | 0.9907 | 0.9911 | 0.2374 | 0.9899 |
| RBF6 | Tmin, Tmax, Tmean, (Tmax − Tmin), I, WS, GR | 7 | 91.70 | 0.9805 | 0.3284 | 0.9805 | 0.9842 | 0.3216 | 0.9815 |
| RBF7 | Tmin, Tmax, Tmean, (Tmax − Tmin), RH, WS, GR | 11 | 87.55 | 0.9890 | 0.2466 | 0.9890 | 0.9901 | 0.2445 | 0.9893 |
| RBF8 | Tmin, Tmax, Tmean, (Tmax − Tmin), RH, I, GR | 5 | 93.77 | 0.9456 | 0.5489 | 0.9456 | 0.9549 | 0.5076 | 0.9539 |
| RBF9 | Tmin, Tmax, Tmean, (Tmax − Tmin), RH, I, WS | 17 | 81.32 | 0.9753 | 0.3696 | 0.9753 | 0.9616 | 0.4729 | 0.9600 |
| RBF10 | Tmean, RH, I, WS, GR | 7 | 91.70 | 0.9907 | 0.2267 | 0.9907 | 0.9901 | 0.2530 | 0.9885 |
| RBF11 | Tmean, RH, WS, GR | 5 | 93.77 | 0.9886 | 0.2514 | 0.9886 | 0.9892 | 0.2551 | 0.9884 |
| RBF12 | Tmean, RH, I, WS | 15 | 83.40 | 0.9704 | 0.4047 | 0.9704 | 0.9699 | 0.4298 | 0.9669 |
| RBF13 | RH, I, WS, GR | 5 | 93.77 | 0.9300 | 0.6224 | 0.9300 | 0.9400 | 0.5873 | 0.9382 |
| RBF14 | Tmean, RH, WS | 13 | 85.47 | 0.9214 | 0.6599 | 0.9214 | 0.9140 | 0.6941 | 0.9137 |
| RBF15 | Tmean, RH | 7 | 91.70 | 0.8569 | 0.8902 | 0.8569 | 0.8915 | 0.7834 | 0.8901 |
| RBF16 | Tmean, WS | 7 | 91.70 | 0.8400 | 0.9413 | 0.8400 | 0.8480 | 0.9279 | 0.8458 |
| RBF17 | RH, WS | 7 | 91.70 | 0.7779 | 1.1089 | 0.7779 | 0.8434 | 0.9441 | 0.8404 |
Table 6. Parameters used in gene expression programming (GEP).

| Parameter | Value |
|---|---|
| Number of chromosomes | 30 |
| Head size | 8 |
| Number of genes | 3 |
| Linking function | Addition |
| Fitness function error type | RMSE |
| Mutation rate | 0.044 |
| Inversion rate | 0.1 |
| IS transposition | 0.1 |
| RIS transposition | 0.1 |
| One-point recombination rate | 0.3 |
| Two-point recombination rate | 0.3 |
| Gene recombination rate | 0.1 |
| Gene transposition rate | 0.1 |
Table 7. Statistical criteria for an estimation of ET0 using different input variables for GEP.

| Model | Input Combination | Training R² | Training RMSE | Training EF | Testing R² | Testing RMSE | Testing EF |
|---|---|---|---|---|---|---|---|
| GEP1 | Tmin, Tmax, Tmean, (Tmax − Tmin), RH, I, WS, GR | 0.8959 | 0.7732 | 0.8920 | 0.9190 | 0.6945 | 0.9136 |
| GEP2 | Tmax, Tmean, (Tmax − Tmin), RH, I, WS, GR | 0.9075 | 0.7227 | 0.9057 | 0.9323 | 0.6251 | 0.9300 |
| GEP3 | Tmin, Tmean, (Tmax − Tmin), RH, I, WS, GR | 0.9026 | 0.7361 | 0.9021 | 0.9300 | 0.6652 | 0.9208 |
| GEP4 | Tmin, Tmax, (Tmax − Tmin), RH, I, WS, GR | 0.8355 | 0.9692 | 0.8303 | 0.8627 | 0.9407 | 0.8416 |
| GEP5 | Tmin, Tmax, Tmean, RH, I, WS, GR | 0.8415 | 0.9721 | 0.8294 | 0.9033 | 0.8151 | 0.8810 |
| GEP6 | Tmin, Tmax, Tmean, (Tmax − Tmin), I, WS, GR | 0.9393 | 0.5804 | 0.9392 | 0.9629 | 0.4687 | 0.9607 |
| GEP7 | Tmin, Tmax, Tmean, (Tmax − Tmin), RH, WS, GR | 0.9664 | 0.4323 | 0.9663 | 0.9762 | 0.3795 | 0.9742 |
| GEP8 | Tmin, Tmax, Tmean, (Tmax − Tmin), RH, I, GR | 0.8636 | 0.8695 | 0.8635 | 0.9194 | 0.6925 | 0.9141 |
| GEP9 | Tmin, Tmax, Tmean, (Tmax − Tmin), RH, I, WS | 0.9353 | 0.6081 | 0.9332 | 0.9537 | 0.5604 | 0.9438 |
| GEP10 | Tmean, RH, I, WS, GR | 0.9085 | 0.7138 | 0.9080 | 0.9275 | 0.6435 | 0.9258 |
| GEP11 | Tmean, RH, WS, GR | 0.9606 | 0.4830 | 0.9579 | 0.9775 | 0.3701 | 0.9755 |
| GEP12 | Tmean, RH, I, WS | 0.9420 | 0.5885 | 0.9374 | 0.9597 | 0.5045 | 0.9544 |
| GEP13 | RH, I, WS, GR | 0.8560 | 0.9004 | 0.8536 | 0.9236 | 0.7069 | 0.9105 |
| GEP14 | Tmean, RH, WS | 0.8388 | 0.9563 | 0.8349 | 0.8797 | 0.8406 | 0.8735 |
| GEP15 | Tmean, RH | 0.8136 | 1.0598 | 0.7972 | 0.8635 | 0.9331 | 0.8441 |
| GEP16 | Tmean, WS | 0.7769 | 1.1312 | 0.7689 | 0.8057 | 1.0588 | 0.7993 |
| GEP17 | RH, WS | 0.6973 | 1.3112 | 0.6895 | 0.8062 | 1.1224 | 0.7744 |
Table 8. Statistical criteria for the best combination of inputs.

| Model | Training R² | Training RMSE | Training EF | Testing R² | Testing RMSE | Testing EF |
|---|---|---|---|---|---|---|
| FFNN2 | 0.9903 | 0.2332 | 0.9902 | 0.9921 | 0.2342 | 0.9902 |
| RBFNN5 | 0.9907 | 0.2270 | 0.9907 | 0.9911 | 0.2374 | 0.9899 |
| GEP11 | 0.9606 | 0.4830 | 0.9579 | 0.9775 | 0.3701 | 0.9755 |
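The three criteria reported in these tables (R2, RMSE, and the Nash–Sutcliffe efficiency EF) can be sketched in a few lines. This is a minimal illustration under stated assumptions: R2 is taken here as the squared Pearson correlation between observed and predicted values, which may differ from the exact definition used by the authors, and the sample series below are invented purely for demonstration.

```python
import math


def rmse(obs, pred):
    """Root mean square error between observed and predicted values."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))


def nash_sutcliffe(obs, pred):
    """EF = 1 - sum((o - p)^2) / sum((o - mean(o))^2)."""
    mean_obs = sum(obs) / len(obs)
    sse = sum((o - p) ** 2 for o, p in zip(obs, pred))
    sst = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - sse / sst


def r_squared(obs, pred):
    """Squared Pearson correlation (an assumed definition of R2)."""
    mean_obs = sum(obs) / len(obs)
    mean_pred = sum(pred) / len(pred)
    cov = sum((o - mean_obs) * (p - mean_pred) for o, p in zip(obs, pred))
    var_obs = sum((o - mean_obs) ** 2 for o in obs)
    var_pred = sum((p - mean_pred) ** 2 for p in pred)
    return cov ** 2 / (var_obs * var_pred)


# Invented example: predictions with a constant bias of +1.
observed = [1.0, 2.0, 3.0, 4.0]
biased = [2.0, 3.0, 4.0, 5.0]
print(r_squared(observed, biased), rmse(observed, biased), nash_sutcliffe(observed, biased))
```

In the invented example the correlation is perfect while EF drops sharply, which illustrates why EF and RMSE are reported alongside R2: a correlation-based R2 is insensitive to a systematic bias, whereas the other two penalize it.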
Table 9. Single-factor ANOVA results for the best combination of inputs.

| Source of Variation | F | p-Value | Fcrit | Variation among Groups |
|---|---|---|---|---|
| Actual-FFNN2 | 0.171751 | 0.678682 | 3.854264 | Insignificant |
| Actual-RBFNN5 | 0.101036 | 0.750681 | 3.854264 | Insignificant |
| Actual-GEP11 | 0.126406 | 0.72229 | 3.854264 | Insignificant |
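The decision rule behind the single-factor ANOVA table can be sketched as follows: the F statistic is the ratio of between-group to within-group mean squares, and the difference between groups is treated as insignificant when F is below Fcrit. The implementation below is a generic textbook version, not the authors' code; the sample groups are invented, and only the F and Fcrit values in the final loop come from the table.

```python
def one_way_anova_f(*groups):
    """F statistic of a single-factor ANOVA for two or more groups."""
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]

    # Between-group and within-group sums of squares.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum(sum((x - m) ** 2 for x in g) for g, m in zip(groups, means))

    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


def is_significant(f_value, f_crit):
    """Groups differ significantly only if F exceeds the critical value."""
    return f_value > f_crit


# Invented toy groups, for illustration only.
f_toy = one_way_anova_f([1.0, 2.0, 3.0], [2.0, 3.0, 4.0])
print(f_toy)

# F values from the table, all compared with Fcrit = 3.854264.
for label, f_value in [("Actual-FFNN2", 0.171751), ("Actual-RBFNN5", 0.101036), ("Actual-GEP11", 0.126406)]:
    print(label, "significant" if is_significant(f_value, 3.854264) else "insignificant")
```

All three tabulated F values fall well below Fcrit, consistent with the "Insignificant" entries in the last column.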
Table 10. Descriptive statistics of prediction errors for the best combination of inputs.

| Statistic | FFNN2 | RBFNN5 | GEP11 |
|---|---|---|---|
| Minimum | −0.8840 | −1.2231 | −0.8671 |
| Maximum | 1.4199 | 1.5204 | 1.9343 |
| 1st quartile | −0.0681 | −0.0726 | −0.1503 |
| Median | 0.0606 | 0.0250 | −0.0055 |
| 3rd quartile | 0.2091 | 0.1742 | 0.2176 |
| Mean | 0.0713 | 0.0548 | 0.0630 |
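Summary statistics of this kind (minimum, maximum, quartiles, median, mean) are what a box plot of the model errors displays. A minimal sketch using Python's standard `statistics` module is given below; the error values are invented, the sign convention of the errors is not specified here, and the quartile interpolation method (`"inclusive"`) is an assumption, since the article does not state which convention was used.

```python
import statistics


def describe_errors(errors):
    """Five-number summary plus mean for a list of prediction errors."""
    q1, median, q3 = statistics.quantiles(errors, n=4, method="inclusive")
    return {
        "Minimum": min(errors),
        "Maximum": max(errors),
        "1st quartile": q1,
        "Median": median,
        "3rd quartile": q3,
        "Mean": statistics.mean(errors),
    }


# Invented errors for illustration only.
summary = describe_errors([-1.0, 0.0, 1.0, 2.0, 3.0])
for name, value in summary.items():
    print(f"{name}: {value:.4f}")
```

A median and mean close to zero with a narrow interquartile range, as in the tabulated values, indicates errors that are small and roughly centred, while the minimum and maximum show the size of the largest individual misses.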
Table 11. Statistical criteria for the optimum combination of inputs.

| Model | Training R² | Training RMSE | Training EF | Testing R² | Testing RMSE | Testing EF |
|---|---|---|---|---|---|---|
| FFNN | 0.9875 | 0.2656 | 0.9873 | 0.9892 | 0.2623 | 0.9877 |
| RBF | 0.9886 | 0.2514 | 0.9886 | 0.9892 | 0.2551 | 0.9884 |
| GEP | 0.9606 | 0.4830 | 0.9579 | 0.9775 | 0.3701 | 0.9755 |
Table 12. Single-factor ANOVA results for the optimum combination of inputs.

| Source of Variation | F | p-Value | Fcrit | Variation among Groups |
|---|---|---|---|---|
| Observed-FFNN11 | 0.101466 | 0.750169 | 3.854264 | Insignificant |
| Observed-RBFNN11 | 0.119424 | 0.72976 | 3.854264 | Insignificant |
| Observed-GEP11 | 0.126406 | 0.72229 | 3.854264 | Insignificant |
Table 13. Descriptive statistics of prediction errors for the optimum combination of inputs.

| Statistic | FFNN11 | RBFNN11 | GEP11 |
|---|---|---|---|
| Minimum | −0.6918 | −0.7073 | −0.8671 |
| Maximum | 1.5230 | 1.3700 | 1.9343 |
| 1st quartile | −0.1227 | −0.0952 | −0.1503 |
| Median | 0.0119 | 0.0221 | −0.0055 |
| 3rd quartile | 0.2228 | 0.2111 | 0.2176 |
| Mean | 0.0548 | 0.0599 | 0.0630 |
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Share and Cite

MDPI and ACS Style

Achite, M.; Jehanzaib, M.; Sattari, M.T.; Toubal, A.K.; Elshaboury, N.; Wałęga, A.; Krakauer, N.; Yoo, J.-Y.; Kim, T.-W. Modern Techniques to Modeling Reference Evapotranspiration in a Semiarid Area Based on ANN and GEP Models. Water 2022, 14, 1210. https://doi.org/10.3390/w14081210

AMA Style

Achite M, Jehanzaib M, Sattari MT, Toubal AK, Elshaboury N, Wałęga A, Krakauer N, Yoo J-Y, Kim T-W. Modern Techniques to Modeling Reference Evapotranspiration in a Semiarid Area Based on ANN and GEP Models. Water. 2022; 14(8):1210. https://doi.org/10.3390/w14081210

Chicago/Turabian Style

Achite, Mohammed, Muhammad Jehanzaib, Mohammad Taghi Sattari, Abderrezak Kamel Toubal, Nehal Elshaboury, Andrzej Wałęga, Nir Krakauer, Ji-Young Yoo, and Tae-Woong Kim. 2022. "Modern Techniques to Modeling Reference Evapotranspiration in a Semiarid Area Based on ANN and GEP Models" Water 14, no. 8: 1210. https://doi.org/10.3390/w14081210

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers.