Variational AutoEncoder (VAE) and Lie-SVM Approaches in capturing Static and Dynamic Generative-Discriminative Features of Visual Datasets
1  Department of Mathematics, The Chinese University of Hong Kong, Hong Kong, China
2  Department of Mathematics, The Hong Kong University of Science and Technology, Hong Kong, China
Academic Editor: Marjan Mernik

Abstract:

Quantifying static and dynamic texture features and analyzing visual content in videos and animations increasingly requires a combination of geometry, probabilistic formulation, and recent machine learning and artificial intelligence techniques. Because Variational AutoEncoder (VAE) models are well suited to image generation and dimensionality reduction, this study unites their mathematical underpinnings for producing video frames with the geometric and algebraic framework of Lie group manifolds for dynamic video texture classification via Support Vector Machines (SVMs). For the VAE models, we explore the theoretical foundations of variational inference in the encoding and decoding processes, together with the decomposition of the VAE loss function into a latent KL divergence term, which regularizes the latent space distribution, and a reconstruction loss, which preserves input feature fidelity. Case studies that generate new interfaces and validate the effectiveness of data clustering mechanisms on the MNIST database are used for performance assessment.
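As an illustration of the loss decomposition described above, the following minimal sketch computes the two terms separately. It assumes a diagonal Gaussian encoder, a standard normal prior, and a Bernoulli (binary cross-entropy) reconstruction term, which are common choices for MNIST but are not confirmed by this abstract; the function name `vae_loss` and its arguments are hypothetical.

```python
import numpy as np

def vae_loss(x, x_recon, mu, log_var):
    """Per-sample negative ELBO, split into its two terms.

    Assumptions (not taken from the paper): pixel values in [0, 1],
    a Bernoulli decoder, and an encoder q(z|x) = N(mu, diag(exp(log_var)))
    with prior p(z) = N(0, I).
    """
    eps = 1e-7
    x_recon = np.clip(x_recon, eps, 1.0 - eps)  # avoid log(0)

    # Reconstruction loss: binary cross-entropy summed over input features.
    recon = -np.sum(x * np.log(x_recon) + (1.0 - x) * np.log(1.0 - x_recon), axis=1)

    # Closed-form KL(q(z|x) || N(0, I)), summed over latent dimensions.
    kl = -0.5 * np.sum(1.0 + log_var - mu**2 - np.exp(log_var), axis=1)

    return recon + kl, recon, kl
```

When `mu` and `log_var` are both zero the encoder matches the prior and the KL term vanishes, so the total reduces to the reconstruction loss; this is the sense in which the KL term acts only as a regularizer on the latent distribution.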

Although VAEs are effective at disentangling latent variables and categorizing latent spaces, videos are continuously moving, so their dynamic textures must also be classified. We therefore adopt a geometric approach that combines an autoregressive moving average (ARMA) model, a Lie group manifold, and matrix shape of Gaussian (SOG) descriptors to formalize video dynamic textures. A kernel function based on the Riemannian distance on the Lie group manifold is then incorporated into the traditional SVM model to capture the non-Euclidean manifold structure and to implement the Lie-SVM multi-classifier with rigorous geometric regularization. Empirical validation on image sequences from a dynamic video confirms that the proposed algorithm yields superior classification performance while preserving the geometric invariants of the manifold in kernel space.
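The idea of a distance-based kernel on a matrix manifold can be sketched as follows. This is not the paper's Lie-SVM kernel: as a stand-in for SOG descriptors on a Lie group, the sketch uses symmetric positive-definite (SPD) matrices with the affine-invariant Riemannian distance, and a Gaussian-type kernel with a hypothetical bandwidth parameter `gamma`. The function names are likewise hypothetical.

```python
import numpy as np

def riemannian_distance(X, Y):
    """Affine-invariant distance between SPD matrices X and Y.

    Equals sqrt(sum_i log(lambda_i)^2), where lambda_i are the eigenvalues
    of X^{-1} Y. A Cholesky factor of X is used so that a symmetric
    eigenvalue solver can be applied.
    """
    L = np.linalg.cholesky(X)      # X = L @ L.T
    L_inv = np.linalg.inv(L)
    M = L_inv @ Y @ L_inv.T        # same eigenvalues as X^{-1} Y
    M = 0.5 * (M + M.T)            # remove rounding asymmetry
    lam = np.linalg.eigvalsh(M)
    return np.sqrt(np.sum(np.log(lam) ** 2))

def riemannian_rbf_kernel(mats, gamma=1.0):
    """Gram matrix K[i, j] = exp(-gamma * d(mats[i], mats[j])^2).

    Assumption: a Gaussian form is used for illustration only; such a
    kernel is not guaranteed to be positive definite for every gamma.
    """
    n = len(mats)
    K = np.ones((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            d = riemannian_distance(mats[i], mats[j])
            K[i, j] = K[j, i] = np.exp(-gamma * d**2)
    return K
```

A Gram matrix of this kind could be passed to any SVM implementation that accepts a precomputed kernel, for example scikit-learn's `SVC(kernel="precomputed")`, so the classifier itself never needs to treat the descriptors as Euclidean vectors.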

Together, these mathematical and artificial intelligence methodologies can handle and analyze both static generative visual content and dynamic texture videos, bridging the gap between latent space design and discriminative, manifold-aware feature classification for future interactive user interface applications.

Keywords: Variational AutoEncoder (VAE); Lie-SVM; Image and Video generation; Generator and Discriminator; Dynamic Texture Classification; SOG Descriptors; Data Clustering; Geometric Regularization; User Interface

 
 