Fisher Information Geometry for Shape Analysis

Abstract: The aim of this study is to model shapes from complex systems using Information Geometry tools. It is well known that the Fisher information endows the statistical manifold, defined by a family of probability distributions, with a Riemannian metric, called the Fisher–Rao metric. With respect to this metric, geodesic paths are determined, minimizing information in the Fisher sense. Under the hypothesis that it is possible to extract from the shape a finite number of representing points, called landmarks, we propose to model each of them with a probability distribution, for example a multivariate Gaussian distribution. Then, using the geodesic distance induced by the Fisher–Rao metric, we can define a shape metric which enables us to quantify differences between shapes. The discriminative power of the proposed shape metric is tested by performing a cluster analysis on the shapes of three different groups of specimens corresponding to three species of flatfish. The results show that the proposed metric recovers the true cluster structure better than other existing shape distances.


Introduction
Shape analysis is a timely and interesting research field. It has applications in various areas such as morphometry, image analysis, biology and database retrieval. In all these fields the aim may be an unsupervised classification of the objects into clusters such that objects within a group are more similar in shape to each other than to objects in other groups. The clustering of shapes is a longstanding challenge in the framework of geometric morphometrics [1], since the recognition of groups of similar morphologies, and then of the differences among these groups, is a key step of the analysis when geometric morphometrics protocols are applied. Shapes must be invariant to rotation, scale and translation, so a straightforward way to proceed is first to align the objects by using Procrustes analysis and then to apply standard clustering algorithms that minimize a given distance or dissimilarity measure evaluated within each cluster [2,3]. Similarly, [4] proposed to use a dissimilarity measure based on the inter-landmark distances and then to apply standard statistical procedures such as hierarchical clustering or k-means clustering.
Under the hypothesis that it is possible to extract from the shape a finite number of representing points, called landmarks, we propose a new statistical model of 2-dimensional shapes in which each landmark is represented by a bivariate Gaussian random variable. The mean and variance parameters become the coordinates of the statistical manifold. In particular, the variances reflect the uncertainty in the landmark's placement and the variability across a family of shapes. Within this framework, we derive a distance between two shapes using tools from Information Geometry [5,6], which considers statistical models as Riemannian manifolds with the Fisher-Rao metric. However, the associated geodesic distance has not yet been derived for the family of general multivariate normal distributions. Closed-form expressions have been obtained only for isotropic and diagonal Gaussian distributions [7].
The paper is organized as follows: Section 2 recalls the main notions of Information Geometry. Section 3 reviews the statistical modeling of 2-dimensional shapes provided by Information Geometry. In Section 4, a new shape distance derived from the Fisher-Rao metric is proposed, and its discriminative power is evaluated through an application to real data.

Information Geometry
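Let $I = [0, 1]$ and let $P = \{p(x \mid \theta) : \theta = (\theta_1, \ldots, \theta_k)\}$ be a k-dimensional family of positive probability density functions. In classical Information Geometry, the Fisher information matrix $g$, with generic $(i, j)$-entry

$$g_{ij}(\theta) = E_\theta\!\left[\frac{\partial \log p(x \mid \theta)}{\partial \theta_i}\,\frac{\partial \log p(x \mid \theta)}{\partial \theta_j}\right],$$

is regarded as the most natural Riemannian structure on the parameter space [5,6].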
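As a concrete check of the definition of the Fisher information matrix, the sketch below approximates it numerically for the univariate normal family in the coordinates (µ, σ), where the known result is diag(1/σ², 2/σ²). The function names and the quadrature settings (`half_width`, `steps`) are illustrative assumptions and are not taken from the paper.

```python
import math


def normal_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) evaluated at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def score(x, mu, sigma):
    """Gradient of log p(x | mu, sigma) with respect to (mu, sigma)."""
    z = (x - mu) / sigma
    return (z / sigma, (z * z - 1.0) / sigma)


def fisher_matrix(mu, sigma, half_width=12.0, steps=20001):
    """Approximate g_ij = E[score_i * score_j] with the trapezoidal rule.

    The integration range mu +/- half_width * sigma and the number of
    grid points are arbitrary choices made for this illustration.
    """
    lo = mu - half_width * sigma
    h = 2.0 * half_width * sigma / (steps - 1)
    g = [[0.0, 0.0], [0.0, 0.0]]
    for n in range(steps):
        x = lo + n * h
        weight = h * (0.5 if n in (0, steps - 1) else 1.0)
        p = normal_pdf(x, mu, sigma)
        s = score(x, mu, sigma)
        for i in range(2):
            for j in range(2):
                g[i][j] += weight * p * s[i] * s[j]
    return g
```

Calling `fisher_matrix(1.0, 2.0)` should return a matrix close to diag(1/4, 1/2), with off-diagonal entries that vanish up to rounding.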
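From differential geometry we know that a metric matrix $g$ defines an inner product on the tangent space of the manifold as follows: $\langle u, v \rangle = u^T g\, v$, with associated norm $\|u\| = \sqrt{\langle u, u \rangle}$. The distance between two points P, Q of the manifold is then given by the minimum of the lengths of all the piecewise smooth paths $\gamma$ joining these two points. Precisely, the length of a path $\gamma(t)$, $t \in [0, 1]$, is calculated by using the inner product:

$$L(\gamma) = \int_0^1 \|\gamma'(t)\|\, dt = \int_0^1 \sqrt{\gamma'(t)^T\, g(\gamma(t))\, \gamma'(t)}\; dt.$$

A curve that realizes this shortest path is called a geodesic. In particular, we consider the family of p-variate normal distributions:

$$p(x \mid \mu, \Sigma) = (2\pi)^{-p/2} (\det \Sigma)^{-1/2} \exp\!\left\{-\tfrac{1}{2}(x - \mu)^T \Sigma^{-1} (x - \mu)\right\},$$

where $x = (x_1, x_2, \ldots, x_p)^T$, $\mu = (\mu_1, \mu_2, \ldots, \mu_p)^T$ is the mean vector and $\Sigma$ the covariance matrix. Note that the parameter space has dimension $k = p + \frac{p(p+1)}{2}$. We have three sub-cases [7]:

(i) Round Gaussian distributions: $\Sigma = \sigma^2 I_p$. In this case the family can be identified with the (p + 1)-dimensional half space parameterized by $(\mu_1, \mu_2, \ldots, \mu_p, \sigma)$, $\sigma > 0$, and the Fisher information matrix is:

$$G_r = \operatorname{diag}\!\left(\frac{1}{\sigma^2}, \ldots, \frac{1}{\sigma^2}, \frac{2p}{\sigma^2}\right).$$

Using a similarity transformation with the following matrix of the hyperbolic metric in the (p + 1)-dimensional half space,

$$G_H = \operatorname{diag}\!\left(\frac{1}{\sigma^2}, \ldots, \frac{1}{\sigma^2}, \frac{1}{\sigma^2}\right),$$

the closed form for the distance is given by

$$d\big((\boldsymbol{\mu}_1, \sigma_1), (\boldsymbol{\mu}_2, \sigma_2)\big) = \sqrt{2p}\, \ln \frac{\left|\left(\frac{\boldsymbol{\mu}_1}{\sqrt{2p}}, \sigma_1\right) - \left(\frac{\boldsymbol{\mu}_2}{\sqrt{2p}}, -\sigma_2\right)\right| + \left|\left(\frac{\boldsymbol{\mu}_1}{\sqrt{2p}}, \sigma_1\right) - \left(\frac{\boldsymbol{\mu}_2}{\sqrt{2p}}, \sigma_2\right)\right|}{\left|\left(\frac{\boldsymbol{\mu}_1}{\sqrt{2p}}, \sigma_1\right) - \left(\frac{\boldsymbol{\mu}_2}{\sqrt{2p}}, -\sigma_2\right)\right| - \left|\left(\frac{\boldsymbol{\mu}_1}{\sqrt{2p}}, \sigma_1\right) - \left(\frac{\boldsymbol{\mu}_2}{\sqrt{2p}}, \sigma_2\right)\right|} \tag{2}$$

where $(\boldsymbol{\mu}_1, \sigma_1) = (\mu_{11}, \ldots, \mu_{1p}, \sigma_1)$, $(\boldsymbol{\mu}_2, \sigma_2) = (\mu_{21}, \ldots, \mu_{2p}, \sigma_2)$ and $|\cdot|$ is the usual Euclidean norm. The geodesics in the parameter space are contained in planes orthogonal to the hyperplane $\sigma = 0$ and are either lines or half ellipses centered at this hyperplane. The family has constant negative curvature.

(ii) Diagonal Gaussian distributions: $\Sigma = \operatorname{diag}(\sigma_1^2, \ldots, \sigma_p^2)$. The family of all independent multivariate normal distributions is the intersection of half-spaces parameterized by $(\mu_1, \sigma_1, \mu_2, \sigma_2, \ldots, \mu_p, \sigma_p)$, $\sigma_i > 0$, so the Fisher information matrix is:

$$G_d = \operatorname{diag}\!\left(\frac{1}{\sigma_1^2}, \frac{2}{\sigma_1^2}, \ldots, \frac{1}{\sigma_p^2}, \frac{2}{\sigma_p^2}\right).$$

In this case the metric is a product metric, the curvature of the family is $\frac{-1}{2(2p-1)}$, and, using again the similarity with the hyperbolic metric, we have the following closed form for the distance:

$$d\big((\mu_{11}, \sigma_{11}, \ldots, \mu_{1p}, \sigma_{1p}), (\mu_{21}, \sigma_{21}, \ldots, \mu_{2p}, \sigma_{2p})\big) = \sqrt{2 \sum_{i=1}^{p} \left[\ln \frac{\left|\left(\frac{\mu_{1i}}{\sqrt{2}}, \sigma_{1i}\right) - \left(\frac{\mu_{2i}}{\sqrt{2}}, -\sigma_{2i}\right)\right| + \left|\left(\frac{\mu_{1i}}{\sqrt{2}}, \sigma_{1i}\right) - \left(\frac{\mu_{2i}}{\sqrt{2}}, \sigma_{2i}\right)\right|}{\left|\left(\frac{\mu_{1i}}{\sqrt{2}}, \sigma_{1i}\right) - \left(\frac{\mu_{2i}}{\sqrt{2}}, -\sigma_{2i}\right)\right| - \left|\left(\frac{\mu_{1i}}{\sqrt{2}}, \sigma_{1i}\right) - \left(\frac{\mu_{2i}}{\sqrt{2}}, \sigma_{2i}\right)\right|}\right]^2} \tag{3}$$

(iii) General Gaussian distributions: $\Sigma$ any symmetric positive definite covariance matrix. The analysis is much more difficult and no closed form is known for the associated distance.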
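The two closed-form distances for the round and diagonal cases can be evaluated directly. The following is a minimal sketch, assuming the half-space form of the hyperbolic distance reported in [7]; the function names and the argument layout (means as sequences, standard deviations as positive floats) are hypothetical choices made for illustration.

```python
import math


def round_gaussian_distance(mu1, sigma1, mu2, sigma2):
    """Fisher-Rao distance between N(mu1, sigma1^2 I) and N(mu2, sigma2^2 I).

    mu1, mu2: sequences of p means; sigma1, sigma2: positive floats.
    """
    p = len(mu1)
    scale = math.sqrt(2.0 * p)
    # Squared Euclidean distance between the rescaled means mu / sqrt(2p).
    sq = sum((a - b) ** 2 for a, b in zip(mu1, mu2)) / (2.0 * p)
    far = math.sqrt(sq + (sigma1 + sigma2) ** 2)   # point vs. reflected point
    near = math.sqrt(sq + (sigma1 - sigma2) ** 2)  # point vs. point
    return scale * math.log((far + near) / (far - near))


def diagonal_gaussian_distance(mu1, sigma1, mu2, sigma2):
    """Fisher-Rao distance between two Gaussians with diagonal covariance.

    All four arguments are sequences of length p. The product metric makes
    the squared distance the sum of squared univariate distances.
    """
    total = 0.0
    for a, sa, b, sb in zip(mu1, sigma1, mu2, sigma2):
        total += round_gaussian_distance([a], sa, [b], sb) ** 2
    return math.sqrt(total)
```

With equal means the round distance reduces to sqrt(2p) |ln(σ2/σ1)|, which gives a quick sanity check of an implementation.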

Modeling of 2-Dimensional Shapes
We will consider only planar objects, for example a section of the skull. The shape of the object consists of all information invariant under similarity transformations, that is, translations, rotations and scaling [8]. Data from a shape are often realized as a set of points. Many methods allow us to extract a finite number of points, which are representative of the shape and are called landmarks. One way to compare shapes of different objects is first to register them on some common coordinate system in order to remove the similarity transformations [9,10]. Alternatively, Procrustes methods [11] may be used, in which objects are scaled, rotated and translated so that their landmarks lie as close as possible to each other. Suppose we have a sample of n planar shapes. Let us denote the shape coordinates of the j-th configuration $S_j$, $j = 1, \ldots, n$, via its K landmarks $\mu^j = (\mu^j_1, \mu^j_2, \ldots, \mu^j_K)$ with generic element $\mu^j_k = (\mu^j_{k1}, \mu^j_{k2})$, for $k = 1, \ldots, K$. For the k-th landmark an estimate of the coordinates covariance matrix $\Sigma_k$ is given by

$$\hat{\Sigma}_k = \frac{1}{n} \sum_{j=1}^{n} \operatorname{vec}\!\left(\mu^j_k - \bar{\mu}_k\right) \operatorname{vec}\!\left(\mu^j_k - \bar{\mu}_k\right)^T \tag{4}$$

where $\bar{\mu}_k$ denotes the k-th landmark coordinates of the mean shape $\bar{\mu} = \frac{1}{n} \sum_{j=1}^{n} \mu^j$ and vec is the vectorization operator.
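The per-landmark covariance estimate can be sketched as follows. The data layout (a list of n configurations, each a list of K coordinate pairs) and the function names are hypothetical, and the 1/n normalization is an assumption that mirrors the definition of the mean shape; a 1/(n − 1) factor would be an equally plausible choice.

```python
def mean_shape(shapes):
    """Landmark-wise average of n configurations of K planar landmarks."""
    n = len(shapes)
    n_landmarks = len(shapes[0])
    return [
        (
            sum(shape[k][0] for shape in shapes) / n,
            sum(shape[k][1] for shape in shapes) / n,
        )
        for k in range(n_landmarks)
    ]


def landmark_covariances(shapes):
    """2x2 covariance matrix of each landmark around the mean shape.

    Assumption: the sums of squares and cross-products are divided by n.
    """
    n = len(shapes)
    covariances = []
    for k, (mean_x, mean_y) in enumerate(mean_shape(shapes)):
        dx = [shape[k][0] - mean_x for shape in shapes]
        dy = [shape[k][1] - mean_y for shape in shapes]
        s_xx = sum(a * a for a in dx) / n
        s_yy = sum(b * b for b in dy) / n
        s_xy = sum(a * b for a, b in zip(dx, dy)) / n
        covariances.append(((s_xx, s_xy), (s_xy, s_yy)))
    return covariances
```

The diagonal entries of each matrix provide the coordinate-wise variances that a varying-variance landmark model would use.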
Our proposal is to model each landmark k, having shape coordinates $\mu_k = \{\mu_{k1}, \mu_{k2}\}$, with a bivariate Gaussian density. Assuming a round Gaussian distribution, case (i) of Section 2, we have $\Sigma_k = \Sigma = \sigma^2 I_2$, obtaining the following representation for the k-th landmark, for $k = 1, \ldots, K$:

$$p(x \mid \mu_k, \sigma^2) = \frac{1}{2\pi\sigma^2} \exp\!\left\{-\frac{1}{2\sigma^2}(x - \mu_k)^T (x - \mu_k)\right\} \tag{5}$$

where x is a generic 2-dimensional vector. In the landmark representation (5), $\sigma^2$ is a free parameter, isotropic across all the K landmarks. Therefore, only the means are used as coordinates of the statistical manifold. We can relax the isotropic hypothesis by assuming a diagonal Gaussian distribution, case (ii) of Section 2. To this end, the new model for the representation of the k-th landmark, for $k = 1, \ldots, K$, is given by

$$p(x \mid \mu_k, \sigma_k^2) = \frac{1}{2\pi\,\sigma_{k1}\sigma_{k2}} \exp\!\left\{-\frac{1}{2}(x - \mu_k)^T \operatorname{diag}(\sigma_k^2)^{-1} (x - \mu_k)\right\} \tag{6}$$

where $\sigma_k^2 = \{\sigma_{k1}^2, \sigma_{k2}^2\}$ is the vector containing the variances of the k-th landmark coordinates. Representation (6) allows us to express the k-th landmark coordinates as $\theta_k = (\mu_k, \sigma_k)$ on a 4-dimensional manifold which is the Cartesian product of two half-planes.

Shape Metrics Based on Geodesic Distance
Representing the landmarks as probability distributions makes various types of analysis possible, for example quantifying the difference between shapes. Let $S_1$ and $S_2$ be two planar shapes. Denote the length of the geodesic path connecting the k-th landmarks of the two configurations by $d(\theta_k^{S_1}, \theta_k^{S_2})$. A shape metric for measuring the difference between $S_1$ and $S_2$ can be obtained by evaluating the geodesic distances between the corresponding landmarks of the two configurations as follows:

$$D(S_1, S_2) = \sum_{k=1}^{K} d\!\left(\theta_k^{S_1}, \theta_k^{S_2}\right).$$

In Section 2 we provided closed forms for the geodesic distance $d(\theta_k^{S_1}, \theta_k^{S_2})$ according to the type of landmark representation adopted. Under the isotropic variance assumption (landmark representation (5)) the geodesic distance is computed with (2), while with the varying variance model (landmark representation (6)) the geodesic distance is computed with (3). In order to apply (3) we need the covariance matrix in (6) to be diagonal. For this purpose we can apply an orthogonal transformation to the original landmark coordinates; this does not affect the analysis because it induces a rotation of the plane, which leaves the shape of a configuration invariant.
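A shape metric of this kind can be sketched in a few lines. The code below assumes that the per-landmark geodesic distances are aggregated by a plain sum over the K landmarks, and that each landmark is stored as a pair of coordinate means plus either a shared standard deviation (fixed variance model) or two coordinate-wise standard deviations (varying variance model). The function names and data layout are hypothetical.

```python
import math


def _half_space_log(sq_mean_gap, s1, s2):
    """ln((far + near) / (far - near)) for two points of the upper half space."""
    far = math.sqrt(sq_mean_gap + (s1 + s2) ** 2)
    near = math.sqrt(sq_mean_gap + (s1 - s2) ** 2)
    return math.log((far + near) / (far - near))


def shape_distance_fixed(means1, means2, sigma):
    """Sum of round-Gaussian geodesic distances with a shared sigma (p = 2)."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(means1, means2):
        sq = ((x1 - x2) ** 2 + (y1 - y2) ** 2) / 4.0  # means rescaled by sqrt(2p)
        total += 2.0 * _half_space_log(sq, sigma, sigma)
    return total


def shape_distance_varying(theta1, theta2):
    """Sum of diagonal-Gaussian geodesic distances.

    theta1, theta2: lists of K landmarks, each ((mu_x, mu_y), (sigma_x, sigma_y)).
    """
    total = 0.0
    for (m1, s1), (m2, s2) in zip(theta1, theta2):
        squared = 0.0
        for a, sa, b, sb in zip(m1, s1, m2, s2):
            squared += 2.0 * _half_space_log((a - b) ** 2 / 2.0, sa, sb) ** 2
        total += math.sqrt(squared)
    return total
```

Evaluating either function on every pair of configurations in a sample yields a symmetric distance matrix that can be passed to a clustering routine.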
The discriminative power of the proposed shape metrics is tested by performing a cluster analysis on the shapes of three different groups of specimens corresponding to three species of flatfish. Namely, 60 individuals of plaice (Pleuronectes platessa L.) and 63 of flounder (Platichthys flesus L.) were collected in North Bull Island [12]. In addition, a group of 14 individuals of the species S. solea was used. These individuals were collected during the activities for the SOLEMON survey carried out in the Adriatic Sea (Mediterranean Sea) during autumn 2014. This last group serves as a kind of out-group with respect to the other two, since it belongs to a different and phylogenetically distant family (Soleidae) [13]. Each fish was photographed in lateral aspect; the digital images are described extensively in [12]. In summary, the scheme of 21 landmarks described in Figure 1 was digitized for each individual of the three species. Three different shape metrics are computed:
• shape metric under the varying variance model (6) (dG)
• shape metric under the fixed variance model (5) (rG)
• Procrustes distance
The Procrustes distance is one of the main measures of difference between shapes. It is calculated by minimizing the Euclidean sum of squares between the landmark configurations using translation, rotation and scale. The computation of the geodesic distance within the model (5) requires the choice of the free parameter $\sigma^2$. In order to test the sensitivity of the final clustering to changes in the value of $\sigma^2$, we adopted three different values of $\sigma^2$ given by the first ($rG_{Q_1}$), the second ($rG_{Q_2}$) and the third quartile ($rG_{Q_3}$) of the variances computed for each landmark and each dimension. For each metric we computed the pairwise distances between all shapes. The resulting distance matrices are then used in a hierarchical clustering algorithm. The quality of the clustering has been evaluated by means of the adjusted Rand index (a-Rand) [14]. Results for the solutions with three clusters are reported in Table 1. They show that the shape metric with the varying variance representation leads to the best performance in recovering the true cluster structure. The fixed variance model results in a worse cluster recovery, with different behaviours depending on the value of the free parameter $\sigma^2$. For this particular data set, the Procrustes distance turns out to perform very poorly in terms of cluster recovery.
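The adjusted Rand index used to score a clustering against the known species labels can be computed from the contingency table of the two partitions. The sketch below follows the standard pair-counting formula; the function name and the handling of the degenerate case are illustrative choices, not the authors' code.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(true_labels, predicted_labels):
    """Adjusted Rand index between two partitions of the same objects."""
    n = len(true_labels)
    cells = Counter(zip(true_labels, predicted_labels))  # contingency table
    rows = Counter(true_labels)
    cols = Counter(predicted_labels)

    # Number of object pairs placed together in both, or in each, partition.
    together_both = sum(comb(count, 2) for count in cells.values())
    together_true = sum(comb(count, 2) for count in rows.values())
    together_pred = sum(comb(count, 2) for count in cols.values())

    expected = together_true * together_pred / comb(n, 2)
    maximum = 0.5 * (together_true + together_pred)
    if maximum == expected:
        # Degenerate case (e.g., a single cluster in both partitions).
        return 1.0
    return (together_both - expected) / (maximum - expected)
```

The index equals 1 when the two partitions coincide up to a relabeling of the clusters and is close to 0 for a random assignment.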
In conclusion, we remark only that the proposed shape representation also makes it possible to study and predict the evolution of a shape over time [15].

Table 1. a-Rand index and number of misclassified fish.