– Remote Sensing Application of Kernel Density Estimation for Mapping of Solar Potential in French Guiana

French Guiana located close to the equator, is a huge asset in the production of solar electrical power. The main focus of this study is to determine the solar energy potential in French Guiana using the Kernel density estimation. The knowledge of surface solar irradiance from satellite sensor may be obtained using atmospheric or empirical models that link satellite reflectance to surface solar irradiance (SSI). However, due to the multiple steps calculation, it is difficult to develop an accurate and reliable real-time estimation model. In this work we used a conditional Kernel density estimation (CKDE) method with a learning dataset of ground irradiance and satellite reflectance to directly estimate SSI for the years 2010-2013. The predicted solar irradiance values from the CKDE were given in the form of hourly, daily means and annual maps. The annual average maps obtained highlighted interesting trends: the western side of French Guiana receives the highest irradiance values. The mean bias percentage error and mean RMSE values were respectively found to be less than 5% and 13% for the testing stations on a daily basis. The developed method shows great accuracies for evaluating solar resource potential and for real-time map building of SSI in French Guiana.


Introduction
The use of renewable sources for electricity generation is rapidly increasing in the context of a changing planet.Solar power generation is a prominent feature of sustainable energy development and its successful exploitation requires reliable real-time data and real-time information on the state and dynamics of the solar radiation.This is particularly useful to ensure the management of the production of photovoltaic plants or to manage the energy consumption of buildings.SSI measurements are generally provided by ground-based radiometric networks.But, ground radiometric stations measuring SSI are relatively sparse in most regions of our planet, particularly in tropical areas, because of the installation and maintenance costs.Consequently the spatial coverage of SSI data in these areas is insufficient.Although interpolation of ground-based data is possible, studies carried out by Pérez et al. (1997) and Zelenka et al. (1999) [1,2] have pointed out a threshold distance between stations from which the accuracy is better using SSI estimates from satellite images.This threshold distance is 50 km for daily SSI and approximatively 30 km for hourly SSI.Earth observation data from meteorological satellites are often used to provide an alternative solution for solar resource assessment because they offer a better spatial resolution (up to 1km) with a good temporal resolution (up to 15 min) at a global scale.There are several methods for deriving SSI from satellite imagery which can be gathered into two main groups: physical and empirical models.Physical models are usually based on radiative transfert modelisation for determining atmospheric extinction effects on the solar irradiation and require information on the composition of the atmosphere (cloud fraction, optical depth, aerosol optical depth, aerosol type, precipitable water, ozone, etc.) [3][4][5][6].Empirical methods rely on a linear regression between irradiance at the top of the atmosphere and SSI.The coefficient of linear proportionality between these two parameters, the factor K, was estimated by using pyranometric measurements as test dataset [7][8].Several authors [9][10] found that the K factor was correlated to a parameter called cloud index, n, which represents the cloudiness of each pixel.Originally, n was calculated from digital counts of the pixels [11][12][13][14][15] but these methods required empirical adjustments and new settings at each satellite radiometer's change which occurs periodically in operational meteorological programs.Improvements have been made by Rigollier et al. [16] that involves calibration of digital counts pixels with information provided by satellite agencies to produce reflectance values observed by the satellite sensors.The SSI retrieval method proceed as follow: first, the calculation of the cloud index is achieved by normalization of the satellite observed reflectance of a pixel to the difference between the maximum reflectance of the satellite image (corresponding to the brightest pixel) and the minimum reflectance of the pixel (this is the lowest value of the pixel considering a period of several days which corresponds to the clear sky condition); after that, SSI is obtained by the combination of the cloud index, n, with a clear sky model.Then one can says that the knowledge of satellite observed reflectance associated with a pixel provides the SSI of the pixel.
However the above methods are limited to meet the growing need of real-time data and real-time information on the state and dynamics of the solar radiation because of the number of calculation steps necessary to achieve the estimation of the SSI.These methods require either solving radiative transfer equations (sometimes look-up tables are used but it is necessary to proceed to a dynamic parameterization with the use of precise information from various satellites and in situ sensors which is also time consuming) for the first category, or require the calculation of a cloud index via the dynamic parameterization of the surface's reflectance and the dynamic extraction of the cloud's reflectance, for the second category.These steps are time consuming as these operations must be ensured for each satellite image.
In order to improve the computation time for real time applications, we aim to obtain a direct relationship between the reflectance seen by the satellite over a pixel and the SSI.To do this we propose to determine the distribution of the SSI conditioned on each reflectance value.Subsequently, estimate of SSI is derived based on this estimated conditional distribution.The estimation of the distribution of a target variable conditional on the value of an explanatory variable can be achieved via the use of a two dimensional conditional Kernel density estimation (CKDE) [17].The CKDE method makes no parametric assumption regarding the form of the relationship between target and explanatory variables.Since SSI is a non-linear and stochastic process, two dimensional CKDE seems to be suitable for estimating SSI densities conditional on reflectance values.Several experiments using CKDE as estimation method and utilizing an historical dataset have been performed for dynamic estimation of target variables.Ruan [18] has shown that real-time prediction of the respiratory response with a CKD estimator is feasible.The CKD estimator was tested and validated with observations from ten patients.The results showed that the estimations for respiratory response from the CKD estimator yielded better predictions compared with the case with adaptive linear filter.In Bessa et al. [19], a recursive time-adaptive CKD estimator was used to dynamically adapt with the arrival of new observations and produces better results when benchmarked against a splines quantile regression model and evaluated with suitable quality metrics.Hyndman et al. [20] verified the effects of using CKD estimator to evaluate its abilities in the estimation of daily temperature, by conditioning current temperature on lagged temperature observations and a seasonal variable.Using a CKD estimator Jeon and Taylor [21] predict the density of wind power conditional on wind velocity densities.
In this work we used a two dimensional conditional Kernel density estimation method with a learning dataset of SSI and reflectance data seen by the satellite sensor at some fixed stations to estimate SSI on the entire territory of French Guiana.A previous study has already been conducted on the same area with a particle filter method and joint distribution [22].In order to build a real time application we reviewed the application of the method without using the particle filter.This allows us to consider a simpler process and verify the performance and the stability of the method on the spatial and temporal dimensions.The key idea is to estimate the joint probability distribution of the covariate and response variables using an efficient kernel density estimation method.Then, the problem of identifying the distribution of the SSI reduces to identifying the section in the joint pdf based on the observed covariate.We evaluated the prediction performance of the CKDE method, using the root mean squared difference and bias.
Section 2 describes the data used, Section 3 describes the method used, and Section 4 the validation of the CKD model.

Data
We used six ground weather stations (Figure 1) of the French national weather service (Meteo France) for the study.The stations measure hourly means of global irradiance on a horizontal plane.
All the weather stations (Table 1) are equipped with "Kipp and Zonen pyranometers of type CM6B and CMP11, both of which are equipped with a ventilation fan".The CM11 instruments and the CM6B fulfill the accuracy requirements of a secondary standard pyranometer defined in WMO [23], which are specified to be 3%.Meteo France provides preventive maintenance every two months (e.g., cleaning of the air filter and the glass dish, desiccant exchange).Standard exchange of the pyranometers is systematically carried out every two years.Each pyranometer is calibrated in the Radiometry National Center of Meteo France located in Carpentras, in France.Once installed, the coefficients of the new pyranometer are then entered into the data acquisition system of the in situ station.
The locations of these stations are shown in figure 1.Three stations are located on the Atlantic coast (Rochambeau, Ile Royale, Kourou stations) and three stations are located in the interior of the country (Saint-George, Saint-Laurent and Maripasoula stations).All of the stations are located in low-relief areas.The Rochambeau station is located at an altitude of 4 meters above the sea level and in the surroundings there is the Félix Eboué airport.The Ile Royale station is located at an altitude of 48 meters on the island of the same name located in the Atlantic Ocean, 14 km away from the coast of French Guiana.The St. Laurent and St. George stations are both located at an altitude of 6 meters in the municipalities of the same name.The Maripasoula Station is located at an altitude of 104 meters on a flat area inside the territory, which is 230 km from the ocean, and has houses and forest in the vicinity.In situ measurements were collected from the year 2010 to 2013.In situ observations have an hourly temporal resolution and are issued at a fixed time (UTC).
The satellite measurements used in this study correspond to a time series of GOES-13 satellite images from the year 2010 to 2013.The meteorological geostationary satellite GOES-13 is a part of the US National Oceanic and Atmospheric Administration's Geostationary Operational Environmental Satellite system and has been launched in April 14, 2010.The nadir spatial resolution of GOES-13 is 1 km ×1 km.We used satellite images from the visible channel (0.55 µm -0.75 µm) of GOES 13 acquired at the instants for which the sun elevation was greater than 15°.GOES-13 satellite images are taken every half hour: UTC + 15 min and UTC + 45 min while ground measurements are taken every UTC hour.So we managed the difference between temporal resolution of in situ data and satellite data.as follows: the satellite images were correlated with the ground data by considering the nearest UTC hour, ie: GOES satellite images taken at UTC + 15 min (resp.UTC + 45 min.)were correlated with in situ data taken at UTC (resp.UTC + 1H).

Method
The CKD estimator produces an estimate of the conditional density f(x|z) of a variable x, conditional on the variable z.The essential idea is that, for a given z, the density function at the value x is constructed by applying kernel density estimation to the set of observations {X 1 ; X 2 ; …; X n }, with each X t value weighted in accordance with the closeness of the corresponding Z t to the conditioning value z.Let f(x|z) be the conditional density function of X t , given Z t = z.The Rosenblatt CKD estimator [17] of f(x|z) is written as : In this work, we define the dependent variable X t as the clearness index and explanatory variable Z t as the reflectance observed by the satellite sensor.The solar radiation signal cannot be regarded as a stationary process due to the diurnal and annual variation related to the sun's changing angle.To remove these effects and obtain a weekly stationary stochastic process, a ratio called "clearness index" is often used.The clearness index is the global solar irradiance on the earth surface, SSI, divided by the corresponding extra-terrestrial irradiance at the top of the atmosphere.The extra-terrestrial solar irradiance is determined using the following equation: where I sc =1367 W/m² is the solar constant, the extra-terrestrial irradiance normal to the solar beam;  is the eccentricity correction factor, and s is the sun zenithal angle. and s depend on astronomical relationships and have been determined for each instant t.
The knowledge of the clearness index permits the calculation of solar radiation at the earth's surface.Let X t (i, j) denote the clearness index at time t for the pixel (i, j), the SSI, denoted by G t (i,j), is given by : where G 0t (i, j) is the horizontal irradiance outside the atmosphere for time t and pixel (i, j).Both are expressed in W.m -2 .The explanatory variable Z t is the reflectance Z t (i, j) observed by the satellite sensor for time k and for each cell has no unit and is equal to the bidirectional reflectance: where I sc is the solar constant,  is the eccentricity correction factor, and  s (i, j) is the sun zenithal angle.L(i, j) is the observed radiance averaged over the 25 x 25 pixels composing a cell and is calculated with the calibration formula given by NOAA (17).To convert a GOES-13 visible Imager "count" value to radiance, L, in Watt/(m².sr.µm), we applied the following formula: where m and b are coefficients given by NOAA.L is the radiance of the pixel whose numerical value is "count".Because the GOES 13 visible sensor degrades with time once in orbit, we applied successive correction factors given by NOAA to the coefficients m and b that take into account the degradation of the sensor over the time range of the study.The calculation of reflectance is done on cells of 25 x 25 pixels which are centered on each pixel (i, j) containing the ground stations.
Another observed variable was extracted from digital satellite images, it is the standard deviation sd(i, j) of the radiances L of the 25 x 25 pixels from each cell.We created C classes in which we spread the sd values.C is the number of classes of the standard deviation values.
We implemented the CKD-based method by using a bivariate kernel density estimator.We consider a learning set consisting of M available data pairs (X, Z) for each class of standard deviation values.The X t , (t = 1, 2,…, M), is the value of the clearness index obtained from an in situ station at time t associated with reflectance value Z t extracted from an satellite image over the in situ station at time t.The CKD estimator generate an approximation of the bivariate kernel density p(x, z) as the superposition of (equally weighted) local kernel densities centred about each sample (x i , z i ), drawn from the learning set.A common choice of Kernel density is the Gaussian Kernel.Each kernel can be propagated by using a local linearization (Fig. 1.) yielding a continuous output distribution p(x|z).We created a bivariate kernel density with the data pairs (x, z) by using the CKD estimator for each class of standard deviation values and there are as many joint distributions as there are classes.The two bandwidth parameters of the CKD estimator are chosen optimally without ever using a parametric model for the data.The CKD based estimation method of SSI is decomposed in five subtasks: 1. Obtain the bivariate kernel density p(x, z) between the historical clearness index data and historical reflectance data for each class of standard deviation value by using the CKD-based approach 2. Observe reflectance z t and standard deviation sd t for pixel (i, j) at time t 3. Identify the conditional distribution p(x|z t ) for reflectance z t by editing the bivariate kernel density p(x, z) corresponding to the class of the standard deviation sd t 4. Obtain an estimation of clearness index, x t , from the calculation of the median of the conditional distribution p(x|z t ).
5. Obtain an estimation of the SSI for pixel (i, j) at time t, G t (i, j), by multiplying the estimated clearnes index by the horizontal irradiance outside the atmosphere for the time t.

Validation
The CKD estimation method of SSI has an hourly resolution, it has been validated by comparison between SSI estimates derived from particle filter and SSI measurements performed at six ground radiometric stations by the French National Meteorological Service of French Guiana.Statistical performances of hourly estimated irradiance and daily averaged estimated irradiance were conducted using Root Mean Square Deviation (RMSD), Bias and correlation coefficient (CC).The formulation of RMSD and BIAS are given below: where R m is the measured ground solar radiation value, R est is the estimated solar radiation value, and n is the number of points.Relative BIAS and RMSE are also given as a percentage of the daily averaged measured irradiance.
In order to investigate how accurately the CKDE based method will perform in practice, we have analyzed its performance by using a cross-validation scheme.The whole data set has been separated into complementary subsets and we conducted a four-fold cross-validation: four simulations were performed with different training dataset.For example: the year 2010 was first used for training dataset, and the rest of the available time series was used for testing, and so on until the last year (year 2013).The results from the four folds are very similar, there is no noticeable change in the accuracy of each station estimates whatever the training year.We averaged the results of the four-fold crossvalidation obtained for each station to produce a single estimation and we presented the results in Table 2.The average relative bias and the average relative RMSD obtained with the CKDE method are located in the same range as the relative bias and relative RMSD obtained with other SSI estimation methods performed by different authors in several stations worldwide [2,11,15,24] Table 2. Comparisons between estimated and measured hourly means of SS obtained from the fourfold cross validation.The result of the cross-validation highlights the temporal insensitivity of the model over the studied period.The results of the four-fold cross-validation prove the model's generalization ability over temporal dimension.In order to investigate the performance of the model all simulations conducted in the remainder of the study were performed when the year 2011 was used for training and the rest of the available data was used for testing.We didn't choose the year 2010 because the GOES-13 dataset related to this year is incomplete.Tables 3. reports the statistical performances of daily SSI obtained for each station.The relative bias varies between -7% and 6%, and the relative RMSD varies between 25% and 29%.Two stations have a negative bias: Ile Royale and Kourou.They also have the same correlation coefficient and nearly the same RMSE.The similarity in performances of these two close to each other stations indicates the stability as well as the identity of the CKDE method.Concerning the negative bias of Ile Royale and Kourou stations, the analysis of meteorological data of the six stations over a period of thirteen years [25] indicates that the St-Georges, Saint-Laurent and Maripasoula stations receive less global solar radiation than Kourou and Ile Royale stations.One may suggest that the construction of a joint probability distribution formed by combining the probability distributions of the six stations will promote the average irradiance values of the six stations.Then the stations with a high solar potential will tend to be underestimated, while the stations with the lowest solar potential will tend to be overestimated.This could be an explanation of the bias differences between the stations.Figure 2 shows scatter plots between measurements and estimates from the CKD estimator to the daily average of SSI for the six stations.The scatter plots are drawn with a colorscale representing the data density.The dashed line with a slope of 1 is the ideal case, where the measures are equal to the estimates.The solid line represents the line of least squares; the parameters are written on the graph.There is an overall good fit between measurements and estimates: the points are well aligned along the 1:1 line indicating a low dispersion of deviations.There is a slight overestimation by the CKDE method during overcast days.One may also observe a slight underestimation of SSI for bright days.Table 4 reports the bias, RMSD and the correlation coefficient between daily measurements and estimates for all stations.The absolute bias for all stations is equal to 5%.The relative bias for individual station is small and ranges from -6% to 6%.The relative RMSE ranges between 11% and 13% and the correlation coefficient varies between 0.93 and 0.95.These statistical results are slightly better than those obtained on a daily basis by Marie-Joseph et al. [26] for the same stations when comparing ground measurements with Helioclim-3 database.This validation study shows a close agreement between CKDE data and ground measurements.Then these data may be used for mapping the distribution of global solar radiation over French Guiana.

Spatial distribution
The SSI time series of all pixels covering the French Guiana have been computed by using the CKDE method and GOES images for the years 2012 and 2013.The geographical distribution of monthly mean and annual mean of global solar-irradiation data obtained from CKDE method were examined.Figure 3. represents the geographical distribution of solar irradiation for the year 2012.
Annual mean values vary between 3.47 and 3.75 kW h m−2, and had an average of 3.6 kW h m−2.The extreme months detected were May, with an average of 2.2 kW h m −2 and October, with an average of 5.3 kW h m −2 .Geographical distribution of annual global solar-irradiation in Figure 3 shows longitudinal patterns: solar irradiation increases from eastern towards western longitudes (similar patterns are observed for the year 2013).The distribution of solar irradiation is closely related to atmospheric circulation between the subtropical high-pressure zones over the South Atlantic and South Pacific.These high-pressure zones determine the location of the inter-tropical convergence zone (ITCZ), a latitudinal belt of cloud and relatively low pressure [27].The ITCZ moves over the Guianas twice a year.It moves north during May-July to a position approximately 7°N and southwards during November-January to a position around 15 S placing it well over the Amazon [28].The seasonal movement of the ITCZ brings periods of prolonged and abundant precipitation.The average daily precipitation observed for 1958-2001 shows that the distribution of precipitations over French Guiana is higher on the eastern part and drier in western part.As a result of this, distribution of precipitations and associate clouds explains the low solar radiation over the eastern region.Spatial distributions of monthly solar radiations are similar to the annual distribution.

Conclusions
This study demonstrates that it is possible to obtain good quality estimations of irradiance over French Guiana when applying conditionnal Kernel density estimator.The non-linear and highly variable nature of SSI motivates nonparametric probability density estimation methods.The joint probability distribution between SSI index and reflectances is not restricted by any prior assumption and gives a probabilistic perspective based on conditional distribution estimates.
A new method for estimating the global solar radiation has been developed and tested.The developed method avoids multiple calculation steps while keeping estimation results in accordance with those of existing methods.The validation of the CKDE method by comparison between estimates and in situ measurements has been conducted.The daily root mean square difference of daily means ranges between 23 W/m² (11%) and 27 W/m² (15%).The daily means yield a bias ranging from -12 W/m² (-6% of the mean of measurements) to 12 W/m² (6%) depending on the stations.The correlation coefficient is close to 0.93.
The validation provides evidence that the method is sufficiently accurate for real-time map building of spatial variations of solar radiation in French Guiana.These maps are useful for dynamic managing of solar energy plants, but also for dynamic energy regulation of buildings.Because of its simplicity of implementation, this method opens new prospects for SSI real-time estimation and SSI forecasting.Future work is planned to exploit satellites images on a larger scale.

Figure 1 .
Figure 1.: Location of ground radiometric stations in French Guiana (South America)

Figure 3 .
Figure 3. : Geographical distribution of annual global solar irradiation for the year 2012 based on CKD estimation method

Table 1 .
Latitude, longitude and altitude of ground meteorological stations in FrenchGuiana

Table 3 .
: Comparisons between estimated and measured hourly means of surface solar irradiance obtained with year 2011 used as calibration set

Table 4 .
: Comparisons between estimated and measured daily means of surface solar irradiance with year 2011 used as calibration set