Invariance of semantic projections under changes in word universes in NLP

Ana Coronado Ferrer

¹ Instituto Universitario de Matemática Pura y Aplicada, Universitat Politècnica de València, Valencia 46022, Spain

Academic Editor: Marjan Mernik

Published: 04 June 2026 by MDPI in The 2nd International Online Conference on Mathematics and Applications session Mathematics, Computer Science and Artificial Intelligence

Abstract:

This communication presents a foundational framework for semantic projections in Natural Language Processing (NLP), focusing on word embeddings and projection-based semantic indices. We formalize two complementary perspectives: (i) geometric embeddings, where terms are represented in $\mathbb{R}^d$ and semantic proximity is measured through a metric; and (ii) set-based representations, where meaning is modeled in a finite measure space and projections arise as normalized overlap ratios.
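As a minimal sketch of the set-based perspective, the snippet below computes a projection as a normalized overlap ratio under the counting measure. The normalization by the measure of the first set, the function name `overlap_projection`, and the document identifiers are all assumptions made for illustration; the abstract does not state the exact normalization used.

```python
def overlap_projection(term_set, universe_set):
    """Normalized overlap ratio mu(A & B) / mu(A) under the counting measure.

    Assumption: the overlap is normalized by the size of the first set.
    The abstract does not fix this choice; it is one plausible reading.
    """
    if not term_set:
        return 0.0  # convention chosen here for an empty set
    return len(term_set & universe_set) / len(term_set)


# Hypothetical document identifiers, not data from the paper.
docs_with_term = {1, 2, 3, 4}
docs_with_universe_word = {3, 4, 5}
ratio = overlap_projection(docs_with_term, docs_with_universe_word)
```

With any finite measure in place of the counting measure, the same ratio would use summed weights instead of set sizes.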

Under Lipschitz regularity assumptions, we show that projection estimators admit explicit error bounds, ensuring stability and consistency across representations. Building on these foundations, we address two practical stability questions in NLP pipelines: \emph{projection coherence} (consistency of projections across sources) and \emph{universe transfer} (estimation of projections when the semantic universe changes). To quantify and control instability in both settings, we combine these Lipschitz assumptions with metric-based estimators, including McShane–Whitney-type extensions.

We introduce computable metric-based estimators and prove that they preserve Lipschitz constants, yielding controlled transfer errors. In particular, we show that any Lipschitz semantic projection defined on a universe $U_0$ can be extended to a larger universe $U_1 \supset U_0$ through the McShane–Whitney formulas and interpolation, which preserves the Lipschitz constant and enables stable semantic transfer.
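The extension step described above can be sketched as follows. This is an illustrative implementation, not the authors' code: it assumes the Euclidean metric on embedding vectors, and the function names, the toy points, and the interpolation weight `alpha` are hypothetical. The lower envelope $\sup_{u \in U_0} [f(u) - L\,d(x,u)]$ and the upper envelope $\inf_{u \in U_0} [f(u) + L\,d(x,u)]$ are both $L$-Lipschitz extensions of $f$, and so is any convex combination of them.

```python
import numpy as np


def lipschitz_constant(points, values):
    """Smallest L with |f(u) - f(v)| <= L * d(u, v) over all pairs in U0."""
    diffs = np.abs(values[:, None] - values[None, :])
    dists = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    mask = dists > 0
    return float(np.max(diffs[mask] / dists[mask])) if mask.any() else 0.0


def mcshane_whitney_extend(points, values, queries, L, alpha=0.5):
    """Extend f from U0 (points) to new terms (queries).

    alpha = 0 gives the lower envelope, alpha = 1 the upper envelope;
    the midpoint default is an assumption made for this sketch.
    """
    dists = np.linalg.norm(queries[:, None, :] - points[None, :, :], axis=-1)
    lower = np.max(values[None, :] - L * dists, axis=1)
    upper = np.min(values[None, :] + L * dists, axis=1)
    return (1 - alpha) * lower + alpha * upper


# Toy universe U0 with made-up projection values (not data from the paper).
U0 = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
f0 = np.array([0.2, 0.8, 0.5])
L = lipschitz_constant(U0, f0)

# New terms that belong to U1 but not to U0.
new_terms = np.array([[0.5, 0.5], [2.0, 1.0], [-1.0, 0.3]])
f_new = mcshane_whitney_extend(U0, f0, new_terms, L)
```

On the original universe the two envelopes coincide with $f$, so the extension reproduces the known projections up to floating-point rounding and only fills in values for the new terms.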

Empirically, we validate the framework on multi-source semantic projections (DOAJ, Scholar, Google, and arXiv), showing that coherence can be quantified using correlation and clustering, and that universe transfer can be evaluated through RMSE-type errors. These results demonstrate that modeling choices (data source, universe design, and estimator selection) have measurable effects on semantic conclusions, providing both theoretical guarantees and practical diagnostics for robust NLP pipelines. This work extends our previous study published in \emph{Axioms} (14(5), 389).
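The two diagnostics mentioned above can be illustrated with a short sketch. It assumes that each source yields one projection value per term, that coherence is read off a Pearson correlation matrix between sources, and that transfer error is a plain root-mean-square error. The function names and all numbers are placeholders, not results from the study, and the clustering step is omitted.

```python
import numpy as np


def coherence_matrix(projections):
    """Pearson correlation between sources (rows = sources, columns = terms)."""
    return np.corrcoef(projections)


def rmse(estimated, reference):
    """Root-mean-square error between transferred and reference projections."""
    estimated = np.asarray(estimated, dtype=float)
    reference = np.asarray(reference, dtype=float)
    return float(np.sqrt(np.mean((estimated - reference) ** 2)))


# Placeholder values: three hypothetical sources, four terms each.
toy_projections = np.array([
    [0.1, 0.4, 0.7, 0.9],
    [0.2, 0.5, 0.6, 0.8],
    [0.9, 0.3, 0.5, 0.1],
])
C = coherence_matrix(toy_projections)

# Placeholder transfer check: estimated versus reference projections.
transfer_error = rmse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])
```

Off-diagonal entries of `C` close to 1 would indicate sources that rank terms similarly, while a small `transfer_error` would indicate that projections estimated on the new universe stay close to reference values.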

Keywords: Semantic projections; Word embeddings; Lipschitz regularity; Semantic stability; Metric measure spaces
