An Information-theoretic Approach to Unsupervised Feature Selection for High-Dimensional Data

Shao-Lun Huang

doi:10.3390/ecea-5-06697

¹ TBSI

Published: 18 November 2019 by MDPI in 5th International Electronic Conference on Entropy and Its Applications session Information Theory, Probability, Statistics, and Artificial Intelligence

https://doi.org/10.3390/ecea-5-06697

Abstract:

In this talk, we propose an information-theoretic approach to designing functional representations that extract the hidden common structure shared by a set of random variables. The main idea is to measure the common information between the random variables by Watanabe's total correlation, and then to find the hidden attributes of these random variables such that, given these attributes, the common information is reduced the most. We show that these hidden attributes can be characterized by an exponential family specified by the eigen-decomposition of a pairwise joint distribution matrix. We then adopt the log-likelihood functions for estimating these hidden attributes as the desired functional representations of the random variables, and show that these representations are informative about the common structure. Moreover, we design both the multivariate alternating conditional expectation (MACE) algorithm, which computes the proposed functional representations for discrete data, and a novel neural network training scheme for continuous or high-dimensional data. Finally, the performance of our algorithms is validated by numerical simulations on MNIST digit recognition.

Keywords: Information theory; unsupervised learning
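
As a minimal illustration of the quantity the abstract uses to measure common information, the sketch below computes Watanabe's total correlation (the sum of the marginal entropies minus the joint entropy) for a small discrete joint distribution. It is not the MACE algorithm or the author's implementation; the function names `entropy` and `total_correlation` and the two toy distributions are assumptions introduced only for this example.

```python
import numpy as np


def entropy(p):
    """Shannon entropy in bits of a pmf given as an array of any shape."""
    p = np.asarray(p, dtype=float).ravel()
    p = p[p > 0]  # terms with zero probability contribute nothing
    return float(-(p * np.log2(p)).sum())


def total_correlation(joint):
    """Total correlation: sum of marginal entropies minus joint entropy.

    joint is a d-dimensional array with
    joint[x1, ..., xd] = P(X1 = x1, ..., Xd = xd).
    (Hypothetical helper, written for illustration only.)
    """
    joint = np.asarray(joint, dtype=float)
    d = joint.ndim
    marginal_entropies = 0.0
    for i in range(d):
        # Marginalize out every axis except axis i
        other_axes = tuple(a for a in range(d) if a != i)
        marginal_entropies += entropy(joint.sum(axis=other_axes))
    return marginal_entropies - entropy(joint)


# Toy case 1: three binary variables that are exact copies of one fair coin
copies = np.zeros((2, 2, 2))
copies[0, 0, 0] = copies[1, 1, 1] = 0.5

# Toy case 2: three independent fair coins
independent = np.full((2, 2, 2), 1 / 8)

print(total_correlation(copies))
print(total_correlation(independent))
```

For the three copied coins the total correlation is 2 bits (three marginal entropies of 1 bit each, minus a joint entropy of 1 bit), while for the independent coins it is 0. In the first case, conditioning on the shared coin would remove all of the total correlation, which is the kind of reduction the hidden attributes in the abstract are chosen to maximize.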

Comments on this paper

Geert Verdoolaege

19 November 2019

Relation to other methods

Dear author,

This appears to be a powerful method for feature extraction. Which other state-of-the-art methods could your method be compared to in terms of performance of extraction of informative features? I am thinking for instance of manifold learning techniques.

Thank you for your reply.

Shao-Lun Huang

20 November 2019

This is a very nice comment. Indeed, more experiments on real data are important to demonstrate the effectiveness of this approach; some experiments applying the proposed neural network architecture to the MNIST problem can be found at https://arxiv.org/abs/1910.03196. Unfortunately, we do not yet have much experimental comparison with other approaches such as manifold learning. We are still working on further experiments to compare with related approaches such as the information sieve, which is a direction for future work.

Shao-Lun Huang