Soil Moisture Mapping in Vegetated Area Using Landsat and Envisat ASAR Data

Physical models for estimating soil moisture content are often complicated, whereas machine learning algorithms offer potential advantages in retrieving information from remote sensing data. This paper takes the middle stream of the Heihe River Basin in China as the study area. A neural network, one of the most common machine learning algorithms, is used to retrieve soil moisture from active microwave and optical data. Landsat and Envisat ASAR data covering the study area were acquired in July 2008. The neural networks were trained with ground truth data and input parameters extracted from the remote sensing data, including band information, the Normalized Difference Vegetation Index (NDVI), the Brightness Index (BI), the dual polarizations (HH and VV) and their ratio (HH/VV). Compared with an existing result obtained from an empirical model using only Envisat data in the same area, this study showed a slightly better correlation between measured and estimated soil moisture (R² = 0.75). It also revealed that the model with multi-source data performed better than the one with single-source data. Finally, the verified model was applied to the whole study area, demonstrating that the method has operational potential for estimating soil moisture in vegetated areas in the middle stream of the Heihe River Basin.


Introduction
Soil moisture plays a key role in various applications such as precision agriculture, soil erosion and hydrological modeling [1][2][3]. Conventional in situ methods can estimate soil moisture accurately, but they are time-consuming and cannot easily meet the need to monitor soil moisture over large areas. Remote sensing is becoming an increasingly important tool for retrieving soil moisture. While research on extracting soil moisture from optical or microwave data has achieved fruitful results, combining these two data sources is likely to enhance the accuracy of soil moisture estimation.
Machine learning has been successfully applied in many areas. Among the numerous machine learning algorithms, neural networks have shown great potential for processing different kinds of remotely sensed data [4]. They can be easily adapted to different types of data and input configurations. In this study, Landsat TM data, Envisat ASAR data and corresponding field data from the Heihe River Basin were collected, and a soil moisture estimation model was built using a neural network with input parameters derived from the optical and microwave data. The model's accuracy (R²) is 0.75. Finally, the model was applied to the whole research area.
The study area (Figure 1) is located in Linze County, Zhangye City, in the middle stream of the Heihe River Basin (100°04′E, 39°15′N), which is the second largest inland river basin in an arid region of northwestern China [5]. Land cover types are diverse in this region, with wetland, grassland, and farmland distributed in the vicinity. The soil is composed of about 16.7% sand, 74.8% silt, and 8.5% clay. In the summer of 2008, an arid zone hydrology experiment was carried out in the Heihe River Basin [6]. Its objective was to provide a dataset for developing and validating soil moisture inversion algorithms.

Satellite Data
The satellite data used in this study include Advanced Synthetic Aperture Radar (ASAR) dual-polarized data and Landsat TM data. ASAR operates in the C band with a wavelength of 5.6 cm. In this study, a VV/VH polarized Level 1B image (in Alternating Polarization (AP) mode with a spatial resolution of 30 m) of the middle stream of the Heihe River Basin was selected. The image was captured on 11 July 2008. The Next ESA SAR Toolbox (NEST) was used to pre-process the data. NEST is open source software developed for ESA and made available via its website. The Range-Doppler method was used to orthorectify the data with the SRTM 90 m void-filled Digital Elevation Model (DEM). Then, radiometric normalization and a 5 × 5 enhanced Lee filter were applied, and finally the backscatter values of the study area were extracted.
Landsat TM images contain 7 bands that simultaneously record reflected or emitted radiation from the Earth's surface in the blue-green, green, red, near-infrared, mid-infrared, and far-infrared portions of the electromagnetic spectrum. A Landsat TM scene has an instantaneous field of view (IFOV) of 30 m by 30 m in bands 1 through 5 and band 7, and an IFOV of 120 m on the ground in band 6. The Landsat 5 TM data used in this paper were obtained from the USGS archive website and were acquired on 7 July 2008, four days before the ASAR data. There was no precipitation during these days. According to the automatic monitoring instrument at the station, the variation of soil moisture over this period was less than 0.02 g/cm³.

Field Data
Concurrently with the radar overpass on 11 July 2008, ground measurements were carried out at two test sites, planted with alfalfa and barley, respectively. The moisture content and soil conductivity were measured by time domain reflectometry (TDR). At each site, the elementary sampling plots covered an area of approximately 360 m × 360 m in a grid pattern at 60 m spacing. The measured soil moisture content may contain outliers, that is, values that are unusually large or small compared with the others. The standard deviation (SD) method was adopted to eliminate these outliers.
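The SD screening step can be illustrated with a minimal sketch. The text does not state the threshold that was used, so the multiplier `k` below is an assumption, as are the function name and the sample readings, which are hypothetical and not taken from the field campaign.

```python
from statistics import mean, stdev

def remove_outliers_sd(values, k=2.0):
    """Keep values within k sample standard deviations of the mean.

    k is an assumed threshold; the paper does not state the one it used.
    """
    if len(values) < 2:
        return list(values)
    mu = mean(values)
    sigma = stdev(values)
    if sigma == 0:
        return list(values)
    return [v for v in values if abs(v - mu) <= k * sigma]

# Hypothetical TDR readings (cm^3 cm^-3); the last one is an obvious outlier.
readings = [0.21, 0.24, 0.22, 0.26, 0.23, 0.25, 0.22, 0.24, 0.23, 0.80]
cleaned = remove_outliers_sd(readings, k=2.0)
print(len(readings), len(cleaned))
```

A single extreme value inflates both the mean and the standard deviation, so a stricter or looser `k`, or a repeated pass, may be preferable depending on the data.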

Methodology
Neural networks have been applied to a wide range of problems in many disciplines, and they can be trained to extract surface parameters from remotely sensed data. They have the advantage of identifying subtle and nonlinear patterns, which is not always the case with traditional statistical methods. Neural networks can be easily adapted to different types of data and can readily incorporate ancillary data, which would be difficult or impossible with conventional techniques. The multilayer perceptron (MLP) is a feed-forward, supervised learning neural network. The MLP network is a function of one or more predictors that minimizes the prediction error of one or more targets. The training of a multilayer perceptron uses a method called back propagation of error, based on the generalized delta rule. For each record presented to the network during training, information (in the form of input fields) feeds forward through the network to generate a prediction from the output layer. This prediction is compared with the recorded output value for the training record, and the difference between the predicted and actual output(s) is propagated backward through the network to adjust the connection weights and improve the prediction for similar patterns. In this paper, a neural network model (Figure 2) with three layers was built using SPSS Modeler software.
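The feed-forward and back-propagation cycle described above can be sketched for a single training record. This is an illustrative sketch only, not the SPSS Modeler implementation used in the paper: the sigmoid hidden units, the linear output unit, the squared-error loss, the learning rate, the layer sizes and the sample record are all assumptions.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(x, W1, b1, W2, b2):
    """Feed one record forward: inputs -> sigmoid hidden layer -> linear output."""
    hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(W1, b1)]
    output = sum(w * h for w, h in zip(W2, hidden)) + b2
    return hidden, output

def train_step(x, target, W1, b1, W2, b2, lr=0.1):
    """One back-propagation update for a single record (squared-error loss).

    W1, b1 and W2 are updated in place; b2 is a float, so its new value
    is returned together with the loss measured before the update.
    """
    hidden, output = forward(x, W1, b1, W2, b2)
    delta_out = output - target  # error at the linear output unit
    # Propagate the error backward through the sigmoid hidden units,
    # using the output weights as they were before this update.
    delta_hidden = [delta_out * w * h * (1.0 - h) for w, h in zip(W2, hidden)]
    for j, h in enumerate(hidden):
        W2[j] -= lr * delta_out * h
    b2 -= lr * delta_out
    for j, d in enumerate(delta_hidden):
        for i, xi in enumerate(x):
            W1[j][i] -= lr * d * xi
        b1[j] -= lr * d
    return b2, 0.5 * delta_out ** 2

rng = random.Random(0)
n_in, n_hidden = 3, 4  # assumed sizes, for illustration only
W1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
b1 = [0.0] * n_hidden
W2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
b2 = 0.0

x, target = [0.2, 0.7, 0.5], 0.3  # hypothetical normalized record
losses = []
for _ in range(50):
    b2, loss = train_step(x, target, W1, b1, W2, b2)
    losses.append(loss)
print(losses[0], losses[-1])
```

Repeating the update on the same record drives its error toward zero; real training would cycle through all training records and monitor the error on held-out data.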

Figure 2. Neural network model
Apart from the optical and SAR features, the Normalized Difference Vegetation Index (NDVI) and the Brightness Index (BI) were also calculated as input parameters.
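As an illustration of these two indices, the sketch below computes them for a single pixel. NDVI follows its standard definition from red and near-infrared reflectance (TM bands 3 and 4). The paper does not give its BI formula and several definitions are in use, so the form sqrt((red² + nir²) / 2) is an assumption, as are the function names and the sample reflectances.

```python
import math

def ndvi(red, nir):
    """Normalized Difference Vegetation Index: (NIR - red) / (NIR + red)."""
    total = nir + red
    # Returning 0.0 for an empty pixel is a convention chosen for this sketch.
    return 0.0 if total == 0 else (nir - red) / total

def brightness_index(red, nir):
    """Brightness Index, ASSUMED here to be sqrt((red^2 + nir^2) / 2)."""
    return math.sqrt((red ** 2 + nir ** 2) / 2.0)

# Hypothetical surface reflectances for one vegetated pixel.
red, nir = 0.05, 0.45
pixel_ndvi = ndvi(red, nir)
pixel_bi = brightness_index(red, nir)
print(round(pixel_ndvi, 3), round(pixel_bi, 3))
```

Dense vegetation gives NDVI values close to 1, while bare soil typically gives low positive values, which is why the index is useful for describing the vegetation layer above the soil.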
There is a great deal of variability in the scale of these features. Therefore, the data should be normalized before being used with neural networks, which is crucial both for obtaining good results and for speeding up the calculations significantly [7]. The neural network model has three layers, and the hidden layer has four nodes. The 96 field sampling points were separated into two datasets in a proportion of 2:1 to train and test the model.
The results show that the coefficient of determination (R²), the root mean square error (RMSE) and the maximum error are 0.75, 0.059 cm³ cm⁻³ and 0.126 cm³ cm⁻³, respectively. The comparison between soil moisture estimated by the neural network model and the in situ measurements can be seen in Figure 3. The accuracy is slightly better than that achieved in [8] and [9] using only microwave data in the middle stream of the Heihe River Basin, for which R² is 0.49 and 0.71, respectively. Moreover, the improved model proposed in [9] requires surface roughness parameters, and it may not be applicable when the surface roughness is comparatively large. In our case, the neural network model achieved a convincing result using parameters derived from Landsat TM and Envisat ASAR images without considering surface roughness. In addition, the like polarization (VV) has a higher weight than the cross polarization (VH) in the built model, because the backscattering coefficient is more sensitive to soil moisture at like polarization than at cross polarization.
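The normalization, the 2:1 split and the three accuracy measures can be sketched as follows. The paper does not specify the normalization scheme, how the split was drawn, or which definition of R² was used, so min-max scaling, a seeded random split and R² = 1 − SS_res / SS_tot are assumptions here. The function names and sample values are hypothetical.

```python
import math
import random

def min_max_scale(column):
    """Rescale one feature column to [0, 1] (assumed normalization scheme)."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0 for _ in column]
    return [(v - lo) / (hi - lo) for v in column]

def split_two_to_one(n_samples, seed=0):
    """Shuffle sample indices and split them 2:1 into training and test sets."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_train = (2 * n_samples) // 3
    return indices[:n_train], indices[n_train:]

def evaluate(measured, estimated):
    """Return (R^2, RMSE, maximum absolute error) for paired values."""
    errors = [e - m for m, e in zip(measured, estimated)]
    ss_res = sum(err ** 2 for err in errors)
    rmse = math.sqrt(ss_res / len(errors))
    max_error = max(abs(err) for err in errors)
    mean_measured = sum(measured) / len(measured)
    ss_tot = sum((m - mean_measured) ** 2 for m in measured)
    # R^2 is undefined when all measured values are identical.
    r2 = float("nan") if ss_tot == 0 else 1.0 - ss_res / ss_tot
    return r2, rmse, max_error

train_idx, test_idx = split_two_to_one(96)
print(len(train_idx), len(test_idx))  # 64 32

# Hypothetical measured and estimated soil moisture (cm^3 cm^-3).
measured = [0.10, 0.20, 0.30, 0.40]
estimated = [0.12, 0.18, 0.33, 0.38]
r2, rmse, max_error = evaluate(measured, estimated)
print(round(r2, 3), round(rmse, 4), round(max_error, 3))
```

In practice the scaling limits would be computed on the training set only and then reused for the test set, so that no information from the test samples leaks into training.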

Conclusions
Microwave remote sensing techniques have been developed to retrieve soil moisture in bare soil areas, but retrieval remains challenging in vegetated areas. When a vegetation layer is present over the soil surface, it attenuates the radiation emitted by the soil and affects radar backscattering. Optical data can be used to derive information on vegetation properties, and this information can be integrated with microwave data to retrieve soil moisture. A neural network model was built to estimate soil moisture in the middle stream of the Heihe River Basin. The model was tested using field sampling data, and the results are encouraging. They demonstrate that information derived from optical and SAR data can be integrated to estimate soil moisture in vegetated areas.

Figure 1. Study area and the distribution of sampling points

Figure 3. Comparison between soil moisture estimated from neural network model and in situ measurements