Asymptotic behaviour of the weighted Shannon differential entropy in a Bayesian problem

Consider a Bayesian problem of estimating the success probability in a series of conditionally independent trials with binary outcomes. We study the asymptotic behaviour of the weighted differential entropy of the posterior probability density function, conditional on x successes in n conditionally independent trials, as n → ∞. Suppose that one wants to know, with high precision, whether the coin is approximately fair, and for large n is interested in the true frequency. In other words, the statistical decision is particularly sensitive in a small neighbourhood of the particular value γ = 1/2. For this purpose we use the concept of weighted differential entropy introduced in [1], which allows the frequency γ to be emphasized. It is shown that when x is a proportion of n, after an appropriate normalization the limiting distribution is Gaussian, and the standard differential entropy of the standardized random variable (RV) converges to the differential entropy of the standard Gaussian RV. We also find that a weight of the suggested form does not change the asymptotic form of the Shannon and Renyi differential entropies, but changes the constants.


Introduction
Let $U$ be a random variable (RV) uniformly distributed on the interval $[0, 1]$. Given a realization $p$ of this RV, consider a sequence of conditionally independent identically distributed RVs $\xi_i$, where $\xi_i = 1$ with probability $p$ and $\xi_i = 0$ with probability $1 - p$. Let $x_i$, each 0 or 1, be the outcome of trial $i$. Denote $S_n = \xi_1 + \dots + \xi_n$, $\mathbf{x} = (x_i,\ i = 1, \dots, n)$ and $x = x(n) = \sum_{i=1}^{n} x_i$. The posterior PDF given the information that after $n$ throws we observe $x$ heads takes the form

$$f^{(n)}(p) = \frac{(n+1)!}{x!\,(n-x)!}\, p^{x} (1-p)^{n-x}, \qquad 0 \le p \le 1. \tag{1}$$

Note that the conditional distribution given in (1) is the Beta distribution $B(x + 1, n - x + 1)$. The RV $Z^{(n)}$ with PDF (1) has the following conditional variance:

$$\mathbb{V}\big[Z^{(n)} \mid x\big] = \frac{(x+1)(n-x+1)}{(n+3)(n+2)^{2}}.$$

Recall the definition of the differential entropy $h(f)$ of a RV $Z$ with PDF $f$:

$$h(f) = -\int_{\mathbb{R}} f(z) \log f(z)\, dz,$$

with the convention $0 \log 0 = 0$. When referring to the differential entropy of a RV $Z$ we mean the entropy of its PDF $f$. Consider a linear transformation $X = b_1 Z + b_2$ with $b_1 \neq 0$; then [3,7]:

$$h(g) = h(f) + \log |b_1|,$$

where $g$ is the PDF of the RV $X$. Let $Z$ be the standard Gaussian RV with PDF $\phi$; then the differential entropy of $Z$ equals [7]:

$$h(\phi) = \frac{1}{2} \log (2 \pi e).$$

In our previous paper [8] the standard Shannon entropy of (1) was studied in three particular cases: $x = \alpha n$; $x \sim n^{\beta}$, where $0 < \alpha, \beta < 1$; and either $x$ or $n - x$ constant. We demonstrated that the limiting distributions as $n \to \infty$ in the first two cases are Gaussian. However, asymptotic normality does not automatically imply the limiting form of the differential entropy. In general, the problem of taking limits under the sign of entropy is rather delicate and has been extensively studied in the literature, cf., e.g., [4,6]. In the stated problem, it was proved that in the first and second cases the differential entropy is asymptotically Gaussian with the corresponding variances. In the third case the limiting distribution is not Gaussian, but the asymptotics of the differential entropy can still be found explicitly.

We would like to extend the theory of the Shannon differential entropy. For this reason, let us consider the following statistical experiment with a twofold goal: at the initial stage the experimenter is mainly concerned with whether the coin is approximately fair (i.e. $p \approx 1/2$) with high precision. As the size of the sample grows, the experimenter proceeds to estimate the true value of the parameter anyway. We would like to quantify the differential entropy of this experiment, taking into account its two-sided objective. A quantitative measure of the information gain of this experiment appears to be provided by the concept of weighted differential entropy [1,2].
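The Beta form of the posterior (1) and its variance can be sanity-checked numerically. The following is a minimal sketch, assuming the uniform prior described above, so that the posterior is Beta(x + 1, n − x + 1). The function names and the midpoint-rule integrator are illustrative choices, not part of the original text.

```python
import math


def log_posterior(p: float, x: int, n: int) -> float:
    """Log-density of Beta(x + 1, n - x + 1) at a point 0 < p < 1."""
    log_norm = math.lgamma(n + 2) - math.lgamma(x + 1) - math.lgamma(n - x + 1)
    return log_norm + x * math.log(p) + (n - x) * math.log1p(-p)


def posterior_moments(x: int, n: int, steps: int = 20000):
    """Total mass, mean and variance of the posterior by the midpoint rule."""
    h = 1.0 / steps
    mass = first = second = 0.0
    for k in range(steps):
        p = (k + 0.5) * h                      # midpoint of the k-th cell
        w = math.exp(log_posterior(p, x, n)) * h
        mass += w
        first += w * p
        second += w * p * p
    mean = first / mass
    return mass, mean, second / mass - mean * mean


def beta_variance(x: int, n: int) -> float:
    """Closed-form variance of Beta(x + 1, n - x + 1)."""
    return (x + 1) * (n - x + 1) / ((n + 2) ** 2 * (n + 3))


if __name__ == "__main__":
    mass, mean, var = posterior_moments(x=7, n=20)
    print(mass, mean, var, beta_variance(7, 20))
```

Working with the log-density and `math.lgamma` avoids overflow in the factorials when n is large.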
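Let $\varphi^{(n)} \equiv \varphi^{(n)}(\alpha, \gamma, p)$ be a weight function that emphasizes the importance of some particular value $\gamma$ ($\gamma = 1/2$ in the problem stated above). The goal of this work is to study the asymptotic behaviour of the weighted Shannon differential entropy [3,8] of the RV $Z^{(n)}$ with PDF $f^{(n)}$ given in (1), and in particular of the RV $Z_{\alpha}$ given in (1) with $x = \alpha n$, where $0 < \alpha < 1$:

$$h^{\varphi}(f) = -\int_{\mathbb{R}} \varphi(z) f(z) \log f(z)\, dz, \tag{5}$$

and of a generalisation of the weighted Shannon entropy, the weighted Renyi differential entropy

$$H^{\varphi}_{\nu}(f) = \frac{1}{1-\nu} \log \int_{\mathbb{R}} \varphi(z) \big(f(z)\big)^{\nu}\, dz, \qquad \nu \neq 1. \tag{6}$$

When the weight function is uniform ($\varphi \equiv 1$) we omit the superscript $\varphi$. Moreover, we would like to compare the asymptotics of the weighted differential entropy with those of the standard differential entropy. Thus, two special cases are considered: the uniform weight $\varphi^{(n)} \equiv 1$, and a weight function $\varphi^{(n)}$ that depends both on $n$ and $p$.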
We assume that $\varphi^{(n)}(x) \ge 0$ for all $x$. In choosing the weight function we adopt the following normalization rule:

$$\int_{\mathbb{R}} \varphi^{(n)}(x) f^{(n)}(x)\, dx = 1. \tag{7}$$

It is easily checked that if the weight function $\varphi^{(n)}$ satisfies (7), then the weighted Renyi entropy (6) tends to the weighted Shannon entropy as $\nu \to 1$. In this paper we consider a weight function of the form (8), where $\Lambda^{(n)}(\alpha, \gamma, p)$ is found from the normalizing condition (7) and is given explicitly in (16). This weight function is selected as a model example with a twofold goal: to emphasize a particular value $\gamma$ for moderate $n$, while keeping the estimate asymptotically unbiased.
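The passage from the weighted Renyi entropy to the weighted Shannon entropy as ν → 1 can be written out in a few lines. The sketch below assumes the usual definitions $h^{\varphi}(f) = -\int \varphi f \log f$ and $H^{\varphi}_{\nu}(f) = (1-\nu)^{-1} \log \int \varphi f^{\nu}$, the normalization $\int \varphi f = 1$, and that differentiation under the integral sign is permitted.

```latex
\begin{aligned}
\lim_{\nu \to 1} H^{\varphi}_{\nu}(f)
  &= \lim_{\nu \to 1} \frac{\log \int \varphi(z)\, f(z)^{\nu}\, dz}{1 - \nu}
  && \text{(form $0/0$, since $\textstyle\int \varphi f\, dz = 1$)} \\
  &= \lim_{\nu \to 1}
     \frac{\int \varphi(z)\, f(z)^{\nu} \log f(z)\, dz}
          {-\int \varphi(z)\, f(z)^{\nu}\, dz}
  && \text{(L'Hopital's rule in $\nu$)} \\
  &= -\int \varphi(z)\, f(z) \log f(z)\, dz \;=\; h^{\varphi}(f).
\end{aligned}
```

Without the normalization the numerator would not vanish at ν = 1 and the limit would not exist in general, which is why the normalization rule is imposed on the weight.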

Main results
Theorem 1. (c) The Kullback-Leibler divergence of $\phi$ from $f^{(n)}_{\alpha}$ tends to 0 as $n \to \infty$.

Theorem 2. For the weighted Shannon differential entropy of the RV $Z$ and the weight function $\varphi^{(n)}$ given in (8), the following limit exists, where $h(f^{(n)}_{\alpha})$ is the standard ($\varphi \equiv 1$) Shannon differential entropy.
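Theorem 3. (a) When $\varphi^{(n)} \equiv 1$ and both $x$ and $n - x$ tend to infinity as $n \to \infty$, the following limit holds, and for any fixed $n$

$$\lim_{\nu \to 1} H_{\nu}(f^{(n)}) = h(f^{(n)}).$$

(b) When the weight function $\varphi^{(n)}$ is given in (8), the following limit holds for the weighted Renyi entropy of $f^{(n)}_{\alpha}$, and for any fixed $n$

$$\lim_{\nu \to 1} H^{\varphi}_{\nu}(f^{(n)}_{\alpha}) = h^{\varphi}(f^{(n)}_{\alpha}).$$

Proposition 1. For any continuous random variable $X$ with PDF $f$ and for any non-negative weight function $\varphi(x)$ which satisfies condition (7) and such that the weighted Renyi differential entropy $H^{\varphi}_{\nu}(f)$ is a non-increasing function of $\nu$ and where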

Proofs
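The normalizing constant in the weight function (8) is found from the condition (7). We obtain the explicit expression (16), in which $\Gamma(x)$ is the Gamma function and $B(x, y)$ is the Beta function. We denote by $\psi^{(0)}(x) = \psi(x)$ and by $\psi^{(1)}(x)$ the digamma function and its derivative respectively:

$$\psi^{(j)}(x) = \frac{d^{\,j+1}}{dx^{\,j+1}} \log \Gamma(x).$$

In further calculations we will need the asymptotics of the digamma functions in the two particular cases $j = 0$ and $j = 1$ only:

$$\psi(x) = \log x - \frac{1}{2x} + O\!\left(\frac{1}{x^{2}}\right), \qquad \psi^{(1)}(x) = \frac{1}{x} + \frac{1}{2x^{2}} + O\!\left(\frac{1}{x^{3}}\right), \qquad x \to \infty.$$

Recall also the Stirling formula [5]:

$$n! = \sqrt{2 \pi n} \left(\frac{n}{e}\right)^{n} \left(1 + O\!\left(\frac{1}{n}\right)\right) \quad \text{as } n \to \infty. \tag{18}$$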

Theorem 1
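Proof. The proof can be found in [8].
We will use the following result of Theorem 1, the asymptotics of the differential entropy of the unstandardized RV $Z^{(n)}_{\alpha}$:

$$\lim_{n \to \infty} \left[ h(f^{(n)}_{\alpha}) - \frac{1}{2} \log \frac{2 \pi e\, \alpha (1 - \alpha)}{n} \right] = 0.$$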
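The Gaussian behaviour of the entropy of the unstandardized posterior can be checked numerically. The following is a minimal sketch, assuming x = αn and comparing the entropy of Beta(x + 1, n − x + 1) with the entropy of a Gaussian of variance α(1 − α)/n. The function names, the midpoint rule and the number of integration steps are illustrative choices.

```python
import math


def posterior_entropy(alpha: float, n: int, steps: int = 200000) -> float:
    """Differential entropy -int f log f of Beta(x + 1, n - x + 1) with x = alpha * n."""
    x = alpha * n
    log_norm = math.lgamma(n + 2) - math.lgamma(x + 1) - math.lgamma(n - x + 1)
    h = 1.0 / steps
    total = 0.0
    for k in range(steps):
        p = (k + 0.5) * h                      # midpoints avoid p = 0 and p = 1
        log_f = log_norm + x * math.log(p) + (n - x) * math.log1p(-p)
        total -= math.exp(log_f) * log_f * h   # exp underflows to 0 in the tails
    return total


def gaussian_entropy(variance: float) -> float:
    """Differential entropy of a Gaussian RV with the given variance."""
    return 0.5 * math.log(2.0 * math.pi * math.e * variance)


if __name__ == "__main__":
    alpha = 0.3
    for n in (100, 1000, 10000):
        gap = posterior_entropy(alpha, n) - gaussian_entropy(alpha * (1 - alpha) / n)
        print(n, gap)
```

The printed gap should shrink as n grows, in line with the limit used from Theorem 1.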

Theorem 2
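The Shannon differential entropy of the PDF $f^{(n)}_{\alpha}$ given in (1) with the weight function $\varphi^{(n)}$ given in (8) is expressed through integrals that can be computed explicitly [5] (4.253.1):

$$\int_{0}^{1} x^{a-1} (1-x)^{b-1} \log x\, dx = B(a, b) \left[ \psi(a) - \psi(a + b) \right], \qquad a, b > 0.$$

Applying this formula, we get an explicit expression in which $z = \gamma \sqrt{n}$.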
Applying Stirling's formula (18) and using the asymptotics of the digamma function, we obtain the expansion (20). The leading term in (20) is the Shannon differential entropy of a Gaussian RV with weight function $\varphi^{(n)} \equiv 1$. Moreover, note that the asymptotics of the weighted differential entropy exceeds that of the classical differential entropy studied in [8]. The only difference is a constant, which tends to zero as $\gamma \to \alpha$.
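The statement that the weighted and the standard entropies differ by a constant that vanishes as γ → α can be probed numerically. The sketch below rests on a loudly stated assumption: the weight is taken proportional to $p^{\gamma \sqrt{n}} (1-p)^{(1-\gamma) \sqrt{n}}$, a hypothetical form chosen only to be consistent with $z = \gamma \sqrt{n}$. With it, the product of weight and posterior is again a Beta density, so both entropies reduce to digamma expressions. The helper names and the series-based digamma are illustrative as well.

```python
import math


def digamma(x: float) -> float:
    """Digamma function for x > 0: upward recurrence plus asymptotic series."""
    acc = 0.0
    while x < 6.0:                  # psi(x) = psi(x + 1) - 1/x
        acc -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    # psi(x) ~ log x - 1/(2x) - 1/(12 x^2) + 1/(120 x^4) - 1/(252 x^6)
    series = inv2 * (1.0 / 12.0 - inv2 * (1.0 / 120.0 - inv2 / 252.0))
    return acc + math.log(x) - 0.5 / x - series


def cross_entropy(a_w: float, b_w: float, x: float, n: float) -> float:
    """-E_g[log f] with g = Beta(a_w, b_w) and f = Beta(x + 1, n - x + 1)."""
    log_c = math.lgamma(n + 2) - math.lgamma(x + 1) - math.lgamma(n - x + 1)
    e_log_p = digamma(a_w) - digamma(a_w + b_w)    # E_g[log p]
    e_log_q = digamma(b_w) - digamma(a_w + b_w)    # E_g[log(1 - p)]
    return -(log_c + x * e_log_p + (n - x) * e_log_q)


def entropy_gap(alpha: float, gamma: float, n: float) -> float:
    """Weighted minus standard Shannon entropy under the ASSUMED weight form."""
    x = alpha * n
    s = math.sqrt(n)
    # ASSUMPTION: weight * posterior = Beta(x + gamma*s + 1, n - x + (1-gamma)*s + 1)
    weighted = cross_entropy(x + gamma * s + 1, n - x + (1 - gamma) * s + 1, x, n)
    standard = cross_entropy(x + 1, n - x + 1, x, n)
    return weighted - standard


if __name__ == "__main__":
    for gamma in (0.5, 0.4, 0.3):
        print(gamma, entropy_gap(alpha=0.3, gamma=gamma, n=10**6))
```

Under this assumed weight, a Gaussian heuristic suggests a gap close to $(\alpha - \gamma)^2 / (2 \alpha (1 - \alpha))$ for large n, which is zero at γ = α.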

Theorem 3 (a)
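In the case $\varphi^{(n)}(p) \equiv 1$ the Renyi entropy has the form

$$H_{\nu}(f^{(n)}) = \frac{1}{1-\nu} \log \int_{0}^{1} \big(f^{(n)}(p)\big)^{\nu}\, dp.$$

Consider the integral:

$$\int_{0}^{1} \big(f^{(n)}(p)\big)^{\nu}\, dp = \left( \frac{(n+1)!}{x!\,(n-x)!} \right)^{\nu} \frac{\Gamma(\nu x + 1)\, \Gamma(\nu (n-x) + 1)}{\Gamma(\nu n + 2)}.$$

Applying Stirling's formula, we obtain its asymptotics and hence the expansion (22) of the Renyi entropy. Note that the leading term in (22) has the form of the Renyi differential entropy of a Gaussian RV with variance $x(n-x)/n^{3}$.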
Taking the limit as $\nu \to 1$ and applying L'Hopital's rule, we obtain the limiting expression. For example, when $x = \alpha n$, $0 < \alpha < 1$, the leading term of the Renyi entropy is the Shannon differential entropy of a Gaussian RV with the corresponding variance.
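The comparison with a Gaussian of variance x(n − x)/n³ can be illustrated numerically. This is a minimal sketch for the unweighted case, assuming the posterior is Beta(x + 1, n − x + 1), so that the integral of its ν-th power is a ratio of Gamma functions. The Gaussian Renyi entropy used for comparison is the standard closed form; the function names are illustrative.

```python
import math


def renyi_entropy_posterior(x: float, n: float, nu: float) -> float:
    """Renyi entropy of order nu != 1 of Beta(x + 1, n - x + 1), in closed form."""
    log_c = math.lgamma(n + 2) - math.lgamma(x + 1) - math.lgamma(n - x + 1)
    # log of int_0^1 f(p)^nu dp = nu * log C + log B(nu*x + 1, nu*(n - x) + 1)
    log_integral = (nu * log_c
                    + math.lgamma(nu * x + 1)
                    + math.lgamma(nu * (n - x) + 1)
                    - math.lgamma(nu * n + 2))
    return log_integral / (1.0 - nu)


def renyi_entropy_gaussian(variance: float, nu: float) -> float:
    """Renyi entropy of order nu != 1 of a Gaussian RV with the given variance."""
    return 0.5 * math.log(2.0 * math.pi * variance) + math.log(nu) / (2.0 * (nu - 1.0))


if __name__ == "__main__":
    n, x = 10**5, 3 * 10**4
    variance = x * (n - x) / n**3
    for nu in (0.5, 0.99, 1.01, 2.0):
        print(nu, renyi_entropy_posterior(x, n, nu), renyi_entropy_gaussian(variance, nu))
```

For ν close to 1 the Gaussian value approaches $\tfrac{1}{2} \log(2 \pi e \sigma^{2})$, the Shannon entropy, since $\log \nu / (\nu - 1) \to 1$.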
(b) When $\varphi^{(n)}$ is given in (8) and $x = \alpha n$, the weighted Renyi differential entropy of the PDF $f^{(n)}_{\alpha}$ takes an explicit form, where $z = \gamma \sqrt{n}$ as before. Applying Stirling's formula to each term and collecting all parts together, we obtain its asymptotics. Taking the limit as $\nu \to 1$ and applying L'Hopital's rule, we conclude that, for any fixed $n$, the weighted Renyi differential entropy tends to the weighted Shannon differential entropy as $\nu \to 1$.

Proposition 1
We need to show the monotonicity in $\nu$ stated in Proposition 1. Note that $z(x) \ge 0$ for any $x$ and $\int_{\mathbb{R}} z(x)\, dx = 1$.

Conclusion
The behaviour of the weighted and standard differential entropies was studied in this paper. We have shown that, for trials with binary outcomes, the difference between the weighted differential entropy of (1) with the weight function (8) and the standard differential entropy of (1) tends to a constant as n → ∞. This constant depends on the distance between the point of special interest γ and the true parameter α. The same conclusion holds for the weighted Renyi differential entropy. The problem of sensitive estimation is quite common, so both of these facts may find a variety of applications. Moreover, all results can be straightforwardly generalized to a larger class of weight functions, and work in this direction should be continued.