
List of accepted submissions

 
 
 
  • Open access
  • 7 Reads
Large Deviations and Applications

This study examines rare event behavior using the principle of large deviations, offering a more nuanced probabilistic framework than the law of large numbers and the central limit theorem. Although the classical results characterize typical fluctuations around the mean, large deviation theory is concerned with the probabilities of atypical events and their asymptotic decay rates. This method is applied to two models originating from applied probability. The first pertains to the sojourn time within an M/M/1 queueing system, characterized by exponentially distributed service times that are crucial for assessing congestion and system performance. The second model deals with the count of erroneous seconds seen in telecommunications systems, which can be modeled by a Poisson distribution. We investigate the empirical mean for both contexts and produce theoretical large deviation estimates that are grounded in the respective log-Laplace transforms. To compare empirical probabilities with their theoretical counterparts, numerical simulations are conducted. The outcomes demonstrate a distinct exponential reduction of deviation probabilities with the growth of sample size, which aligns closely with large deviation predictions. The findings demonstrate the importance of large deviation principles for quantifying rare yet critical events and show their applicability in analyzing the performance and reliability of queueing and communication systems.
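The exponential decay described above can be illustrated with Cramér's theorem. The sketch below is not the authors' code: it assumes the standard rate functions for exponential and Poisson sample means, and the function names and parameter values (lam = 1, threshold a = 1.5) are chosen here for illustration only. For the Poisson case the tail probability is computed exactly, since a sum of n Poisson(lam) counts is Poisson(n * lam), so no simulation noise enters the comparison.

```python
import math

def rate_exponential(x, theta):
    """Cramer rate function for the mean of i.i.d. Exp(theta) variables, x > 0."""
    return theta * x - 1.0 - math.log(theta * x)

def rate_poisson(x, lam):
    """Cramer rate function for the mean of i.i.d. Poisson(lam) variables, x > 0."""
    return x * math.log(x / lam) - x + lam

def poisson_upper_tail(k, mu, extra_terms=5000):
    """P(N >= k) for N ~ Poisson(mu); each term is evaluated in log space."""
    total = 0.0
    for j in range(k, k + extra_terms):
        total += math.exp(-mu + j * math.log(mu) - math.lgamma(j + 1))
    return total

def empirical_decay_rate(n, lam, a):
    """-(1/n) * log P(sample mean >= a) for n i.i.d. Poisson(lam) counts."""
    k = math.ceil(n * a)
    return -math.log(poisson_upper_tail(k, n * lam)) / n

if __name__ == "__main__":
    lam, a = 1.0, 1.5  # illustrative values, not taken from the study
    for n in (50, 200, 800):
        print(n, empirical_decay_rate(n, lam, a), rate_poisson(a, lam))
```

As n grows, the finite-sample decay rate approaches the rate function value from above, which is the behavior the abstract reports for its simulations. The exponential rate function is the relevant one for the queueing model, because the stationary sojourn time in a first-come-first-served M/M/1 queue is exponentially distributed.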

  • Open access
  • 9 Reads
A Study on the Development of a Defective Gompertz-G Family of Distributions

This study develops an advanced lifetime modeling framework through the formulation of the defective Gompertz-G family of distributions. Classical survival and reliability models often assume that all experimental units will eventually experience the event of interest. This assumption is frequently violated in biomedical, demographic, and industrial reliability studies, where a proportion of individuals may remain event-free indefinitely, representing a cured or long-term survivor group. To address this limitation, the defective distribution concept is integrated into the flexible Gompertz-G family, thereby extending its applicability to lifetime data characterized by incomplete failure.

The methodology employed in developing the proposed model is based on the T-X transformation technique. In this approach, a baseline random variable X with a specified distribution is transformed through a generator variable T following the Gompertz distribution. The T-X framework provides a systematic mechanism for constructing generalized families of distributions by compounding the cumulative distribution function of the baseline model within the generator structure. To introduce defectiveness, a defect parameter is incorporated into the transformed distribution, modifying the cumulative distribution function so that the total probability mass is less than unity, thereby capturing cure-fraction behavior.

Closed-form expressions for the probability density function, cumulative distribution function, survival function, and hazard rate function are derived. Additional statistical properties, including quantile function formulation and reliability measures, are established. Parameter estimation is developed to support inferential analysis and future empirical applications.
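To make the idea of a defective distribution concrete, here is a minimal sketch using the classical defective Gompertz model rather than the Gompertz-G family proposed in the abstract, whose exact form is not given here. The parametrization S(t) = exp(-(b/a)(exp(a t) - 1)) and the function names are assumptions made for this illustration. When the shape parameter a is negative, the survival function levels off at exp(b/a) instead of decaying to zero, and that limit is the cure fraction.

```python
import math

def gompertz_survival(t, a, b):
    """Gompertz survival S(t) = exp(-(b/a) * (exp(a*t) - 1)), with a != 0, b > 0.

    The distribution is proper for a > 0 and defective for a < 0.
    """
    return math.exp(-(b / a) * (math.exp(a * t) - 1.0))

def cure_fraction(a, b):
    """Limit of S(t) as t grows: exp(b/a) when a < 0, and 0 for a proper model."""
    if a >= 0:
        return 0.0
    return math.exp(b / a)

if __name__ == "__main__":
    a, b = -0.5, 0.3  # hypothetical parameter values
    for t in (0.0, 1.0, 5.0, 50.0):
        print(t, gompertz_survival(t, a, b))
    print("cure fraction:", cure_fraction(a, b))
```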

  • Open access
  • 6 Reads
Closed-Form Expressions and the Asymptotics to Moments for the Excess Gompertz–Makeham Distribution

Gompertz's law, published 200 years ago, became the principal demographic model. It postulates an exponential increase in human mortality risk with age. Since then, the model and its extensions have been used successfully in biology, actuarial science, and other fields, and they also apply to problems in reliability theory. This article derives exact formulas for the Gompertz–Makeham residual life expectancy and residual variance that are convenient for computation. The moments are expressed in closed form using a generalized integro-exponential function or the Meijer G-function, which allows direct calculation with standard software. In addition, data analysis of complex systems shows that datasets can often be characterized by the behavior of probability distributions at advanced ages. Asymptotic expansions of the first and second residual moments, together with error estimates, are obtained for large argument values. The results can serve as a tool for applications in the theory of risk, reliability, and extreme events.
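As a numerical point of comparison for closed-form results of this kind, the residual life expectancy can be approximated by direct quadrature. The sketch below assumes the common Gompertz–Makeham hazard h(x) = lam + alpha * exp(beta * x); this parametrization, the function names, and the parameter values are assumptions of the illustration and are not taken from the article. It integrates the conditional survival S(x + t) / S(x) over t with the composite Simpson rule on a truncated range.

```python
import math

def conditional_survival(t, x, lam, alpha, beta):
    """S(x + t) / S(x) for the hazard h(x) = lam + alpha * exp(beta * x)."""
    exponent = min(beta * t, 700.0)  # guard against overflow in expm1
    growth = (alpha / beta) * math.exp(beta * x) * math.expm1(exponent)
    return math.exp(-lam * t - growth)

def residual_life_expectancy(x, lam, alpha, beta, horizon=200.0, steps=20000):
    """Approximate the integral of S(x + t) / S(x) over [0, horizon] by Simpson's rule.

    `steps` must be even; `horizon` should be large enough that the tail is negligible.
    """
    h = horizon / steps
    total = conditional_survival(0.0, x, lam, alpha, beta)
    total += conditional_survival(horizon, x, lam, alpha, beta)
    for i in range(1, steps):
        weight = 4.0 if i % 2 else 2.0
        total += weight * conditional_survival(i * h, x, lam, alpha, beta)
    return total * h / 3.0

if __name__ == "__main__":
    lam, alpha, beta = 0.0005, 0.00003, 0.1  # hypothetical values
    for age in (40.0, 60.0, 80.0):
        print(age, residual_life_expectancy(age, lam, alpha, beta))
```

With alpha set to zero the hazard is constant and the routine should return approximately 1 / lam, which gives a simple sanity check.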

  • Open access
  • 9 Reads
Sharp Null Hypothesis Testing and Jeffreys-Lindley Paradox

Introduction

Frequentist and Bayesian hypothesis testing frameworks are used extensively in empirical research to draw scientific conclusions. However, the two frameworks sometimes disagree; the Jeffreys-Lindley paradox is a case in which they contradict each other. This has caused confusion among data analysts choosing a methodology for statistical inference. Although the paradox dates back to the 1950s, no satisfactory resolution has been offered so far, especially for the empirical researcher.

Method

We show that the paradox arises mainly because the frequentist approach permits type-I errors and assesses the difference between the hypothesized parameter value and its observed estimate in units of the estimate's standard error, regardless of how large the actual numerical difference is or how small the standard error is, whereas in the Bayesian methodology this has no effect because of how Bayes factors are defined. In fact, the paradox is an instance of the conflict between statistical and practical significance. It can be seen as a consequence of using a sharp null hypothesis to approximate an acceptably small range of parameter values. We also show how frequentist null hypothesis testing should be modified so that its conclusions no longer conflict with those of the Bayesian method, and why and how any uncertainty in p-values can be addressed.

Results

We have shown how to resolve the Jeffreys-Lindley paradox through mathematical analysis. We have also shown how to perform sharp null hypothesis tests so that undesirable rejections of null hypotheses are avoided.

Conclusion

The Jeffreys-Lindley paradox admits a mathematical explanation that resolves it, and the frequentist null hypothesis testing methodology can be modified so that it no longer conflicts with Bayesian hypothesis testing.
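The paradox itself can be reproduced with a short calculation. The sketch below is a textbook setting and not the authors' analysis: it assumes normal data with known standard deviation sigma, a sharp null theta = 0, and a N(0, tau^2) prior on theta under the alternative. The function names and default values are chosen for illustration. Holding the z-statistic fixed at 1.96 keeps the two-sided p-value near 0.05 for every sample size, while the Bayes factor in favor of the null keeps growing.

```python
import math

def two_sided_p_value(z):
    """Two-sided p-value of a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2.0))

def bayes_factor_01(z, n, sigma=1.0, tau=1.0):
    """Bayes factor of H0: theta = 0 against H1: theta ~ N(0, tau^2).

    Uses the marginal densities of the sample mean under each hypothesis,
    written in terms of z = sqrt(n) * mean / sigma.
    """
    ratio = n * tau ** 2 / sigma ** 2
    return math.sqrt(1.0 + ratio) * math.exp(-0.5 * z ** 2 * ratio / (1.0 + ratio))

if __name__ == "__main__":
    z = 1.96
    for n in (10, 1000, 100000):
        observed_mean = z / math.sqrt(n)  # effect size implied by a fixed z (sigma = 1)
        print(n, observed_mean, two_sided_p_value(z), bayes_factor_01(z, n))
```

The printed effect size shrinks like 1 / sqrt(n) at fixed z, which is one way to see the conflict between statistical and practical significance described in the abstract.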

  • Open access
  • 5 Reads
Kernel-Based Nonparametric Tests for Exponentiality Against Decreasing or Increasing Residual Entropy Alternatives

Introduction. Testing for exponentiality plays a central role in reliability theory and survival analysis, since the exponential distribution is uniquely characterized by a constant hazard rate and memoryless property. Classical goodness-of-fit procedures often rely on moment-based methods or distribution-function approaches. More recent developments emphasize information-theoretic measures such as entropy. In particular, residual entropy has proven useful for detecting departures from exponentiality under decreasing or increasing residual life uncertainty alternatives (see Ebrahimi, 1997; Benaoudia and Aissani, 2023), since under exponentiality, the residual entropy is constant.

Methods. In this paper, we propose a nonparametric test for exponentiality based on kernel estimators of Shannon entropy and residual entropy. The approach follows the framework introduced by Belzunce, Navarro, and Guillamon (2001), replacing histogram-based estimators with kernel smoothing techniques. This improves the smoothness and convergence of the estimators and reduces dependence on bin selection. A test statistic is constructed to test exponentiality, under which residual uncertainty is constant, against alternatives with decreasing or increasing residual uncertainty. Theoretical properties of the statistic are established under regularity assumptions, including almost-sure convergence of the empirical statistic to its theoretical counterpart under the null hypothesis. The proof applies the standard result on convergence of Stieltjes integrals together with the almost-sure convergence of the empirical kernel distribution to the theoretical distribution.

Results. Critical values of the proposed test are obtained via Monte Carlo simulations for various sample sizes and significance levels. Under regularity conditions, the test statistic converges almost surely. Power studies under Weibull and Gamma alternatives show that the procedure achieves high sensitivity against increasing residual life uncertainty alternatives. Moreover, Pitman asymptotic efficiency comparisons indicate that the proposed kernel-based statistic consistently outperforms several competing entropy-based tests. It is not only effective in finite samples but also asymptotically more efficient in detecting departures from exponentiality toward monotone residual entropy alternatives.

Conclusions. Overall, the proposed test provides a robust and efficient tool for assessing exponentiality. Its strong finite-sample behavior, superior asymptotic efficiency, and stability induced by kernel estimation make it especially suitable for applications in reliability and survival analysis. Future work may extend the methodology to multivariate lifetime models.
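The property that the test exploits, namely that residual entropy is constant under exponentiality and monotone under the alternatives, can be checked numerically. The sketch below does not implement the kernel-based test statistic. It only evaluates the residual entropy H(t) = -∫ g(x) log g(x) dx over x > t, with g(x) = f(x) / S(t), by Simpson's rule for known densities. The function names and the chosen distributions (an exponential with rate 2 and a Weibull with shape 2) are assumptions for illustration.

```python
import math

def residual_entropy(density, survival, t, horizon=40.0, steps=40000):
    """Residual entropy at age t, integrating over [t, t + horizon] by Simpson's rule."""
    s_t = survival(t)
    h = horizon / steps

    def integrand(x):
        g = density(x) / s_t
        return -g * math.log(g) if g > 0.0 else 0.0

    total = integrand(t) + integrand(t + horizon)
    for i in range(1, steps):
        total += (4.0 if i % 2 else 2.0) * integrand(t + i * h)
    return total * h / 3.0

# Exponential with rate 2: residual entropy equals 1 - log(2) at every age.
def exp_density(x):
    return 2.0 * math.exp(-2.0 * x)

def exp_survival(x):
    return math.exp(-2.0 * x)

# Weibull with shape 2 and scale 1: increasing hazard, so residual entropy decreases.
def weibull_density(x):
    return 2.0 * x * math.exp(-x * x)

def weibull_survival(x):
    return math.exp(-x * x)

if __name__ == "__main__":
    for t in (0.5, 1.0, 2.0):
        print(t,
              residual_entropy(exp_density, exp_survival, t),
              residual_entropy(weibull_density, weibull_survival, t))
```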

  • Open access
  • 6 Reads
A Stochastic Epidemic Model with Deep Learning Correction for Real-Time Estimation of COVID-19 Under-Reporting

Accurate real-time estimation of key epidemic parameters, particularly the effective reproduction number (Rₜ), is critically undermined by the pervasive and variable issue of case under-reporting in public health surveillance data, leading to biased models and flawed policy insights. To address this fundamental problem of unobserved true incidence, we develop a novel hybrid computational framework that synergistically integrates a mechanistic stochastic epidemiological model with a data-driven deep learning corrector. Our methodology first constructs a stochastic Susceptible–Exposed–Infectious–Removed (SEIR) model where the time-varying transmission rate is modeled as a flexible latent function using a Gaussian Process. The core innovation is the seamless coupling of this model to a Temporal Convolutional Network (TCN) module, which is trained jointly via amortized variational inference to learn the complex, non-stationary mapping between the model's simulated true incidence and the officially reported cases, thereby explicitly correcting for biases stemming from fluctuating testing capacity, healthcare access, and reporting behavior. Applied to COVID-19 case and mortality data from Italy, Germany, and France, our hybrid model significantly improved the accuracy and stability of Rₜ estimation, reducing 14-day-ahead forecast error for hospital admissions by 40% compared to a pure stochastic SEIR model and outperforming standard Bayesian filtering techniques. The TCN module provided interpretable, time-varying reporting probabilities, identifying periods of severe under-reporting (up to 70% correction) that correlated strongly with independent indicators of testing coverage. 
This paradigm successfully merges principled epidemiological theory with flexible machine learning, creating a robust tool for real-time situational awareness that provides reliable estimates of true transmission dynamics from incomplete data, with direct applicability to managing future outbreaks of emerging infectious diseases.

  • Open access
  • 4 Reads
A Bayesian Structural Time Series Model with Graph-Based Regularization for Forecasting Urban Traffic Flow

Accurate multi-step traffic forecasting in urban networks is essential for intelligent transportation systems, yet a key challenge remains: standard models often treat road segments independently, ignoring the critical spatial dependencies imposed by the road network's topology. To address this, we introduce a novel forecasting framework that integrates network structure directly into a probabilistic time series model. Our method develops a Bayesian Structural Time Series (BSTS) model for each key road segment, incorporating local trend, daily seasonality, and dynamic regressors. The central innovation is the application of a graph Laplacian prior on the contemporaneous coefficients across the network. This regularization, informed by the actual connectivity graph, facilitates information sharing between neighboring segments and penalizes implausible spatial discontinuities in the learned parameters. Inference is performed via Markov Chain Monte Carlo (MCMC) sampling. Applied to high-frequency data from a 50-node subnetwork of a major European city, our graph-regularized model reduced the mean absolute percentage error (MAPE) for 60-minute forecasts by 18% on average compared to independent BSTS models and significantly outperformed vector autoregression (VAR) and LSTM benchmarks, especially during non-recurrent congestion. This demonstrates that formally incorporating network science principles via a graph Laplacian prior into a state-space statistical framework yields substantially improved and spatially coherent forecasts, a methodology generalizable to other networked time series problems such as those in economics or epidemiology.
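The effect of a graph Laplacian prior can be seen on a toy network. The sketch below assumes the usual Gaussian form, in which the log prior of a coefficient vector beta is -0.5 * strength * beta' L beta up to a constant. The abstract does not spell out its exact prior, so this form, the function names, and the four-node path graph are assumptions for illustration. The quadratic form equals the sum of squared coefficient differences over connected segments, so coefficients that vary smoothly along the network receive a higher prior density.

```python
def graph_laplacian(num_nodes, edges):
    """Unweighted graph Laplacian L = D - A as a list of lists."""
    lap = [[0.0] * num_nodes for _ in range(num_nodes)]
    for i, j in edges:
        lap[i][i] += 1.0
        lap[j][j] += 1.0
        lap[i][j] -= 1.0
        lap[j][i] -= 1.0
    return lap

def laplacian_penalty(beta, lap):
    """Quadratic form beta' L beta, equal to the sum over edges of (beta_i - beta_j)^2."""
    n = len(beta)
    return sum(beta[i] * lap[i][j] * beta[j] for i in range(n) for j in range(n))

def log_prior(beta, lap, strength):
    """Unnormalized log density of the Laplacian prior (assumed Gaussian form)."""
    return -0.5 * strength * laplacian_penalty(beta, lap)

if __name__ == "__main__":
    edges = [(0, 1), (1, 2), (2, 3)]  # hypothetical chain of four road segments
    lap = graph_laplacian(4, edges)
    smooth = [1.0, 1.1, 1.2, 1.3]
    rough = [1.0, -1.0, 1.0, -1.0]
    print(log_prior(smooth, lap, strength=10.0), log_prior(rough, lap, strength=10.0))
```

Because the Laplacian has the constant vector in its null space, this prior constrains only differences between neighbors and leaves the overall level of the coefficients free.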

  • Open access
  • 4 Reads
Efficient Computation of Complementary Set Partitions with Applications to Generalized Cumulants

This talk presents a new combinatorial framework for the efficient computation of complementary set partitions, which play a central role in the theory of generalized cumulants. Generalized cumulants are used to express joint cumulants of polynomial functions of random variables and arise in several areas of statistics, including likelihood theory, bootstrap methods, and time series analysis. Despite their broad applicability, the practical use of generalized cumulants is often limited by the high computational complexity involved in enumerating complementary set partitions.

Most existing approaches rely on connected graph representations, Laplacian matrix calculations, or symbolic algebra systems. Although theoretically sound, these methods quickly become impractical as the number of indices increases and are difficult to implement efficiently in non-symbolic programming environments. In this talk, we propose a novel and purely combinatorial algorithm that overcomes these limitations. The key idea is to characterize all non-complementary partitions using only two-block partitions of the block index set. Complementary set partitions are then obtained by exclusion, avoiding graph-based constructions altogether.
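The exclusion idea can be sketched by brute force. Two partitions are complementary when their join is the one-block partition, so a partition tau fails to be complementary to pi exactly when tau refines some two-block partition obtained by merging the blocks of pi into two groups. The code below is an illustrative enumeration under that characterization and is not the authors' algorithm. All function names are hypothetical, and a union-find connectivity check is included only to cross-check the result.

```python
from itertools import combinations

def set_partitions(elements):
    """Yield every partition of the list `elements` as a list of blocks."""
    if not elements:
        yield []
        return
    first, rest = elements[0], elements[1:]
    for partition in set_partitions(rest):
        for i in range(len(partition)):  # add `first` to an existing block
            yield partition[:i] + [[first] + partition[i]] + partition[i + 1:]
        yield [[first]] + partition      # or open a new block

def is_refinement(fine, coarse):
    """True if every block of `fine` is contained in some block of `coarse`."""
    coarse_sets = [set(block) for block in coarse]
    return all(any(set(block) <= c for c in coarse_sets) for block in fine)

def two_block_coarsenings(pi):
    """Merge the blocks of `pi` into two non-empty groups (block 0 stays in the first)."""
    k = len(pi)
    for r in range(k - 1):
        for chosen in combinations(range(1, k), r):
            first_group = {0, *chosen}
            left = [x for i, b in enumerate(pi) if i in first_group for x in b]
            right = [x for i, b in enumerate(pi) if i not in first_group for x in b]
            yield [left, right]

def complementary_by_exclusion(pi):
    """Partitions complementary to `pi`, found by excluding the non-complementary ones."""
    elements = sorted(x for block in pi for x in block)
    coarse = list(two_block_coarsenings(pi))
    return [tau for tau in set_partitions(elements)
            if not any(is_refinement(tau, sigma) for sigma in coarse)]

def is_complementary_by_join(pi, tau):
    """Cross-check: the join of `pi` and `tau` has a single block (union-find)."""
    parent = {x: x for block in pi for x in block}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for block in list(pi) + list(tau):
        for x in block[1:]:
            parent[find(x)] = find(block[0])
    return len({find(x) for x in parent}) == 1

if __name__ == "__main__":
    pi = [[0, 1], [2, 3]]
    for tau in complementary_by_exclusion(pi):
        print(tau)
```

This enumeration visits every set partition and is therefore only practical for small index sets; it is meant to show the characterization, not the efficiency gains reported in the talk.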

The resulting algorithm is conceptually simple, computationally efficient, and well suited for implementation in open-source, non-symbolic languages such as R. Numerical comparisons with existing methods show a substantial reduction in computation time for moderate and large partition sizes, confirming the scalability of the proposed approach.

From a statistical perspective, the talk also introduces an extension of generalized cumulants to settings involving repeated random variables. This extension is based on multiset subdivisions and their representation via multi-index partitions, allowing the inclusion of powers of random variables and more complex dependence structures. By exploiting a suitable labeling rule and a representation of set partitions as binary vector partitions, we derive closed-form expressions for generalized multivariate cumulants in terms of products of multivariate cumulants, together with a combinatorial interpretation of the associated coefficients.

  • Open access
  • 10 Reads
A Probabilistic Fuzzy Framework for Decision Making Under Uncertainty in Complex Systems

Decision-making problems in real-world systems are often affected by ambiguity, incomplete information, and uncertainty that cannot be adequately represented using classical deterministic or purely probabilistic models. While probabilistic approaches are effective in modeling random variability, they are of limited use in situations where vagueness and subjectivity play a dominant role. This paper proposes a new probabilistic fuzzy decision-making framework to address such uncertainty in complex decision environments.

The proposed approach combines fuzzy set theory with probabilistic concepts and operations research principles, allowing the simultaneous treatment of stochastic uncertainty and linguistic imprecision. A generalized aggregation mechanism is developed to integrate fuzzy evaluations with probabilistic criterion importance, resulting in a comprehensive decision score for each alternative. Key theoretical properties of the framework, including consistency, boundedness, and stability with respect to uncertainty variations, are analytically investigated.

The applicability of the proposed framework is illustrated through representative multi-criteria decision-making scenarios involving conflicting criteria and imprecise information. The results demonstrate that the proposed approach provides more flexible and reliable decision outcomes compared to traditional crisp and purely probabilistic methods.

This study contributes to the advancement of decision theory and fuzzy systems, offering a mathematically sound and adaptable framework with potential applications in operations research, data analysis, and interdisciplinary decision-support problems.

  • Open access
  • 4 Reads
A Beta-New XLindley Lifetime Model with Applications


Lifetime data arise in reliability engineering, medical studies, and biological sciences, where flexible probability models are required to describe skewed and positive observations. Classical lifetime distributions are often too restrictive and may fail to capture the diverse hazard rate behaviors observed in practice. In this paper, we introduce a new three-parameter lifetime model constructed by applying the beta-generated mechanism to the polynomial exponential distribution. The proposed beta-new XLindley distribution extends the baseline family by incorporating additional shape parameters that enhance modeling flexibility. Several fundamental properties of the new model are derived, including the probability density function, cumulative distribution function, reliability function, and hazard rate function. Parameter estimation is carried out using the maximum likelihood method, and numerical optimization techniques are employed to obtain the estimates. The practical performance of the proposed distribution is illustrated using a real data set consisting of luteinizing hormone measurements. Model adequacy is assessed through the log-likelihood and standard information criteria such as AIC, BIC, and AICc. The results demonstrate that the beta-new XLindley distribution provides a better fit than several well-known competing lifetime models. Overall, the proposed distribution offers a flexible and effective tool for modeling positive lifetime data and is particularly suitable for applications in medical statistics and reliability analysis.
