
List of accepted submissions

 
 
 
  • Open access
  • 13 Reads
Hybrid Federated Learning with Client-Side Personalization for Privacy-Preserving and Scalable Medical Imaging Analytics

The rapid growth of healthcare data across hospitals, imaging centers, and wearable devices creates opportunities for data-driven clinical decision support, yet strict privacy regulations prevent centralized aggregation of sensitive records. Federated Learning (FL) enables decentralized model training without sharing raw data; however, traditional FL degrades under heterogeneous client distributions and adapts poorly to local environments. This feasibility study presents a Hybrid Federated Learning Framework that combines global model aggregation with local, client-side personalization to improve scalability and local accuracy in multi-institutional settings. A ResNet-18 backbone was trained on the NIH ChestX-ray14 dataset with patient-level data partitioning. The hybrid FL model achieved mean AUROC scores between 0.74 and 0.87, closely approximating a centrally trained model. Local personalization also improved holdout AUROC by 2-6% compared with a global-only baseline. To strengthen privacy, we applied client-level differential privacy and showed that moderate privacy budgets (ε ≈ 2-5) preserved accuracy comparable to the global model, with minimal utility loss. The results suggest that personalized hybrid FL is a privacy-preserving and scalable framework for healthcare analytics that achieves near-centralized performance while protecting patient information. Future work will extend this framework to real-time Internet of Things (IoT) medical devices and explore communication-efficient aggregation and compression techniques, as well as blockchain-assisted secure protocols.
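The combination of global aggregation and client-side personalization described above can be illustrated with a minimal sketch. The abstract does not state which aggregation rule or personalization scheme was used, so the size-weighted average (FedAvg-style) and the simple interpolation between global and local parameters below are assumptions, and the function names `fed_avg` and `personalize` are hypothetical.

```python
from typing import List


def fed_avg(client_weights: List[List[float]], client_sizes: List[int]) -> List[float]:
    """Average client parameter vectors, weighting each client by its sample count."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(dim)
    ]


def personalize(global_w: List[float], local_w: List[float], alpha: float) -> List[float]:
    """Blend global and local parameters (assumed scheme, not taken from the abstract).

    alpha = 0 keeps the global model; alpha = 1 keeps the purely local model.
    """
    return [alpha * l + (1.0 - alpha) * g for g, l in zip(global_w, local_w)]


# Toy example: two clients with 1 and 3 samples, two parameters each.
global_model = fed_avg([[1.0, 2.0], [3.0, 4.0]], [1, 3])
client_model = personalize(global_model, [1.0, 2.0], alpha=0.5)
```

In a real deployment each client would fine-tune on its own data rather than interpolate fixed vectors; the sketch only shows where the global and local steps sit relative to each other.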

  • Open access
  • 13 Reads
Efficient Glaucoma Detection through a Custom CNN Architecture on Retinal Fundus Datasets

Glaucoma is a progressive optic neuropathy and one of the major causes of irreversible blindness worldwide. Early and accurate detection is essential to prevent vision loss, and retinal fundus imaging provides a non-invasive modality for clinical screening. In this paper, a customized convolutional neural network (CNN) is proposed for automated glaucoma classification using two publicly available datasets: Drishti-GS1 and ACRIMA. The proposed CNN consists of multiple convolutional and pooling layers with dropout regularization, designed to extract hierarchical features while avoiding overfitting. Data augmentation techniques such as rotation, shear, zoom, and brightness adjustment were applied to increase dataset diversity, and hyperparameters including batch size and learning rate were systematically tuned for optimal performance. Experimental results demonstrate that the proposed CNN achieved an accuracy of 90.32% on the Drishti-GS1 dataset and 96.45% on the ACRIMA dataset under an 80:20 split. The model also showed particularly high sensitivity, reaching 100% on Drishti-GS1 and 96.20% on ACRIMA, which is critical for minimizing false negatives in clinical screening. Furthermore, the proposed network outperformed widely used pre-trained CNNs such as AlexNet and ResNet-50, surpassing them in accuracy, sensitivity, and AUC. On the ACRIMA dataset, the model achieved an AUC exceeding 0.99, demonstrating its robustness and effectiveness as a reliable tool for automated glaucoma detection and screening.
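For readers less familiar with the screening metrics quoted above, the following sketch shows how accuracy, sensitivity, and specificity follow from a binary confusion matrix. The function name `screening_metrics` and the counts in the example are hypothetical and are not the study's data.

```python
def screening_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute standard binary screening metrics from confusion-matrix counts."""
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        # Sensitivity (recall): share of diseased eyes that are correctly flagged.
        "sensitivity": tp / (tp + fn),
        # Specificity: share of healthy eyes that are correctly cleared.
        "specificity": tn / (tn + fp),
    }


# Illustrative counts only.
metrics = screening_metrics(tp=8, fp=2, tn=9, fn=1)
```

A sensitivity of 100% corresponds to `fn == 0`, which is why it is the metric of interest when missed cases are the costliest error.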

  • Open access
  • 6 Reads
Enhancing Candidate Generation in Recommendation Systems through LLM-Powered Semantic Enrichment in a Distributed Environment

Introduction
Effective candidate generation is critical for two-stage recommender systems; however, traditional methods, such as TF-IDF, often fail to capture the deep semantic context. This limitation leads to suboptimal recall, particularly for new or niche items (the "cold start" problem), negatively impacting the overall quality of the recommendations and user experience. This study addresses the need for a more semantically aware approach to the initial recall phase.
Methods
We propose a methodology that integrates a Large Language Model (LLM) into a distributed Apache Spark pipeline for large-scale content enrichment. The pipeline generates high-quality vector embeddings and concise, context-aware summaries for each content item in the feed. The enriched items are then indexed into Elasticsearch to support efficient, semantically aware vector-based retrieval during the candidate generation phase.
Results
Our quantitative analysis compared the LLM-enriched method against a traditional TF-IDF baseline using the Recall@10 metric. The proposed method achieved a Recall@10 of 62%, representing a 37% relative improvement over the baseline's 45%. This demonstrates a substantial increase in the relevance of generated candidates. Furthermore, the resulting candidate pool showed a marked improvement in semantic diversity, better covering niche user interests and improving the quality of items passed to the ranking stage.
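As a rough illustration of the metric used here, the sketch below computes Recall@k for one user and the relative improvement between two recall values. The abstract does not give the exact Recall@10 definition, so dividing hits by the number of relevant items is an assumption (some evaluations divide by `min(k, number of relevant items)` instead); the function names are hypothetical.

```python
from typing import Iterable, Sequence


def recall_at_k(ranked: Sequence[str], relevant: Iterable[str], k: int = 10) -> float:
    """Fraction of relevant items that appear among the top-k retrieved candidates."""
    relevant_set = set(relevant)
    if not relevant_set:
        return 0.0
    hits = len(set(ranked[:k]) & relevant_set)
    return hits / len(relevant_set)


def relative_improvement(new: float, baseline: float) -> float:
    """Relative gain of `new` over `baseline`, as a fraction of the baseline."""
    return (new - baseline) / baseline


# Using the reported figures: 62% versus a 45% baseline.
gain = relative_improvement(0.62, 0.45)
```

With the reported values, `gain` is about 0.378, in line with the relative improvement of roughly 37% stated above.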
Conclusions
Leveraging LLMs for semantic enrichment in a distributed environment provides a powerful solution for enhancing the recall stage of recommender systems. This method provides a richer, more contextually aware input for downstream ranking models and effectively mitigates the cold-start problem, paving the way for more accurate and personalized content discovery.

  • Open access
  • 3 Reads
Explainable Artificial Intelligence for Social Sciences and Humanities: A Systematic Review

This systematic review examines the integration of Explainable Artificial Intelligence (XAI) methodologies within social sciences and humanities research, focusing on three principal approaches (feature attribution, counterfactual analysis, and model-agnostic visualization) and their application across diverse empirical domains. Feature attribution techniques, such as SHAP and LIME, have been adopted in archival text studies to quantify the contribution of individual lexical elements to topic model outputs, thereby elucidating latent thematic structures. Counterfactual analysis has proven instrumental in social media sentiment research, wherein minimally perturbed inputs expose classifier decision boundaries and reveal embedded biases. Model-agnostic visualization tools further enable scholars to interactively explore decision surfaces in network models of historical social structures, facilitating critical interrogation of community detection and relational dynamics. By synthesizing documented methodological workflows and available open-source toolkits, we identify best practices for harmonizing disciplinary expertise with computational frameworks, including guidelines for model selection, XAI implementation, and domain-expert validation. Evaluation metrics (explanation fidelity, coherence, and end-user interpretability) are extracted from empirical studies to benchmark transparency and reproducibility. The proposed workflow begins with the articulation of a precise research question, proceeds through iterative model development and explanation generation, and culminates in collaborative validation with subject-matter experts. This integrated approach advances robust, accountable, and contextually informed computational inquiry, thereby fostering the maturation of XAI as an indispensable instrument in social sciences and humanities scholarship.

  • Open access
  • 10 Reads
AutoML with Explainable AI Analysis: Optimization and Interpretation of Machine Learning Models for the Prediction of Hysteresis Behavior in Shape Memory Alloys

Shape memory alloys (SMAs) belong to the class of smart materials characterized by two unique properties: the shape memory effect and superelasticity. Owing to superelasticity, the material can withstand significant strains (up to 8-10%) and completely recover its original shape after the load is removed. Under cyclic loading and unloading, a characteristic hysteresis loop forms due to reversible phase transformations between martensite and austenite, reflecting the nonlinear behavior of the material. Reproducing and predicting this behavior is crucial for assessing the durability of structures, but traditional analytical models often fail to provide adequate accuracy. This research employs an automated machine learning (AutoML) approach to build models for predicting the hysteretic behavior of SMAs. AutoML offers a systematic and reproducible way to select algorithms and hyperparameters, reducing the need for manual intervention. Training and testing were performed on experimental data from 150 loading and unloading cycles of a NiTi alloy. For a frequency of 1 Hz, the model showed high prediction accuracy, with an MSE of 0.0012, an MAE of 0.0282, an R² of 0.9975, and a MAPE of 0.0124, confirming its consistency with experimental data. Machine learning models were also built for other load frequencies. Interpretation of the models' results was supported by Explainable AI tools, specifically the SHAP method, which enabled us to evaluate feature contributions on both a global and a local scale. The results confirm the effectiveness of combining AutoML and Explainable AI for accurate and explainable prediction of SMA hysteresis behavior.
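The four error metrics reported above have standard definitions, sketched below in plain Python. The function name `regression_metrics` and the sample values are illustrative only; MAPE is returned as a fraction (not a percentage), which appears to match the scale of the reported 0.0124, though the abstract does not confirm this.

```python
from typing import Dict, Sequence


def regression_metrics(y_true: Sequence[float], y_pred: Sequence[float]) -> Dict[str, float]:
    """Compute MSE, MAE, R^2, and MAPE (as a fraction) for paired observations."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return {
        "mse": ss_res / n,
        "mae": sum(abs(e) for e in errors) / n,
        # R^2 compares residual variance with the variance of the observations.
        "r2": 1.0 - ss_res / ss_tot,
        # MAPE is undefined when any true value is zero.
        "mape": sum(abs(e / t) for e, t in zip(errors, y_true)) / n,
    }


# Illustrative values only, not the NiTi measurements.
scores = regression_metrics([1.0, 2.0, 3.0, 4.0], [1.1, 1.9, 3.2, 3.8])
```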

  • Open access
  • 8 Reads
Analyzing Seasonal Vegetation Variations in Southwestern Madagascar with Unsupervised Classification of Long-Term MODIS Data

Satellite data have become an essential tool in environmental monitoring and ecosystem assessment. This study investigates the application of unsupervised classification to characterize the spatio-temporal dynamics of vegetation in southwestern Madagascar, a region highly vulnerable to climatic variability. MODIS Collection MOD13Q1 products were selected despite their relatively coarse spatial resolution, due to their dense temporal coverage, enabling the analysis of a long time series from 2001 to 2024. The methodological framework is based on clustering pixels according to their monthly growth profiles derived from the Normalized Difference Vegetation Index (NDVI). Seasonal variations, including the wet and dry seasons, were explicitly considered. To ensure robustness, results from K-means clustering were cross-validated with Hierarchical Ascendant Classification (HAC), allowing us to compare and consolidate class stability. The classification identified seven distinct profile classes, reflecting both seasonal phenological patterns and dominant vegetation cover types. These results provide crucial insights for spatio-temporal monitoring and mapping of ecosystems, contributing to improved environmental surveillance in the region. Overall, the study demonstrates the effectiveness of unsupervised classification in extracting meaningful information from satellite time series. By offering a detailed understanding of vegetation dynamics over two decades, this approach highlights valuable opportunities for sustainable management and conservation of natural resources in southwestern Madagascar.
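The clustering step can be sketched as a plain K-means loop over per-pixel NDVI profiles. This is a minimal illustration, not the authors' pipeline: the initial centroids are supplied by the caller, the three-value profiles stand in for twelve monthly means, and the names `kmeans` and `assign` are hypothetical.

```python
from typing import List, Tuple

Profile = List[float]


def sq_dist(a: Profile, b: Profile) -> float:
    """Squared Euclidean distance between two NDVI profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign(profiles: List[Profile], centroids: List[Profile]) -> List[int]:
    """Label each pixel profile with the index of its nearest centroid."""
    return [
        min(range(len(centroids)), key=lambda c: sq_dist(p, centroids[c]))
        for p in profiles
    ]


def kmeans(profiles: List[Profile], centroids: List[Profile],
           iterations: int = 50) -> Tuple[List[int], List[Profile]]:
    """Lloyd's algorithm: alternate assignment and centroid update until stable."""
    for _ in range(iterations):
        labels = assign(profiles, centroids)
        updated = []
        for c, old in enumerate(centroids):
            members = [p for p, lab in zip(profiles, labels) if lab == c]
            # Keep the old centroid if a cluster ends up empty.
            updated.append([sum(col) / len(members) for col in zip(*members)]
                           if members else old)
        if updated == centroids:
            break
        centroids = updated
    return assign(profiles, centroids), centroids


# Toy profiles: two sparse-vegetation pixels and two dense-vegetation pixels.
pixels = [[0.20, 0.30, 0.20], [0.25, 0.30, 0.20],
          [0.70, 0.80, 0.60], [0.75, 0.80, 0.65]]
labels, centers = kmeans(pixels, [pixels[0], pixels[2]])
```

Cross-checking against a hierarchical method, as the study does, would amount to comparing these labels with those from an agglomerative clustering cut at the same number of classes.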

  • Open access
  • 12 Reads
Phase-Aware and Sensor-Level Interpretability in Human Activity Recognition via Consistency-Regularized CNN-LSTM-SE Networks

Human Activity Recognition (HAR) has applications in healthcare, assistive technology, and security, where both interpretability and accuracy are necessary for real-world implementation. Existing deep learning methods such as CNN-LSTM hybrids have a tendency to behave as uninterpretable "black boxes," lowering the confidence of users in real deployments. In this work, we present a novel HAR framework with explainability built directly into the architecture and training process. Our approach integrates Squeeze-and-Excitation (SE) attention into a CNN-LSTM backbone for recalibrating feature importance, and introduces a hierarchical interpretability strategy that uncovers both sensor-level and temporal phase-level relevance for activity recognition. To render explanations reliable, we design a consistency-based regularization objective that fosters stable and sparse attention patterns across samples, making interpretability intrinsic to the learning process rather than an afterthought. Furthermore, we present a phase-aware visualization method that maps attention weights to sensor modalities and activity phases, offering intuitive and actionable insights to domain experts. Experimental evaluation on a real-life HAR dataset demonstrates that the proposed framework achieves above 96% classification accuracy, outperforming conventional multiheaded CNN-LSTM, while offering robust and interpretable explanations of activity patterns. This work takes HAR to the next level by integrating high predictive power with intrinsic trust and interpretability, paving the way for deployment in safety-critical domains.
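To make the Squeeze-and-Excitation step concrete, here is a minimal sketch of the generic SE recalibration (squeeze by averaging over time, excitation through a small bottleneck, then channel-wise rescaling). It is not the authors' network: biases are omitted, the weights `w1` and `w2` are hypothetical placeholders, and the returned gates are what a sensor-level relevance plot would visualize.

```python
import math
from typing import List, Tuple

Matrix = List[List[float]]


def matvec(m: Matrix, v: List[float]) -> List[float]:
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def se_block(features: Matrix, w1: Matrix, w2: Matrix) -> Tuple[Matrix, List[float]]:
    """Apply SE recalibration to `features` (channels x time steps).

    w1 has shape (reduced x channels); w2 has shape (channels x reduced).
    """
    # Squeeze: one summary value per channel (global average over time).
    squeezed = [sum(ch) / len(ch) for ch in features]
    # Excitation: bottleneck with ReLU, then a sigmoid gate per channel.
    hidden = [max(0.0, h) for h in matvec(w1, squeezed)]
    gates = [1.0 / (1.0 + math.exp(-g)) for g in matvec(w2, hidden)]
    # Scale: reweight every time step of a channel by that channel's gate.
    scaled = [[g * x for x in ch] for g, ch in zip(gates, features)]
    return scaled, gates


# Two sensor channels, two time steps; placeholder weights.
out, gates = se_block([[1.0, 3.0], [2.0, 2.0]], w1=[[1.0, 0.0]], w2=[[1.0], [-1.0]])
```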

  • Open access
  • 7 Reads
Challenges Associated With the Composition and Nutritional Value of Oatmeal Products

Oatmeal is widely recognised for its health benefits and is often used in breakfast cereals because of its favourable nutritional profile and because oats are naturally gluten-free. Quick-to-prepare cereals with a variety of flavours are particularly popular among consumers. Instant oatmeal is often enriched with ingredients such as freeze-dried raspberries, apple, chocolate, or coconut that enhance taste and palatability. However, nutritional labelling can reveal unexpectedly high sugar content and, in some cases, the presence of gluten. This study compared the declared nutritional values of oatmeal (10 packs) with those of instant oatmeal (65 packs) available in Belgrade, Serbia, in July 2025. Sugar content was significantly higher in instant cereals (median 16 g/100 g) than in oatmeal (median 0.85 g/100 g). Protein content was higher in oatmeal (median 14 g/100 g) than in instant oatmeal products (median 11 g/100 g). The quantities of total fat (median 7.8 g/100 g vs. 6.85 g/100 g) and saturated fatty acids (median 2.1 g/100 g vs. 1.0 g/100 g) were slightly higher in instant cereals. Carbohydrate content was slightly lower in oatmeal (median 56 g/100 g) than in instant oatmeal products (median 63 g/100 g), and energy values followed the same trend (median 1624 kJ/100 g versus 1534.5 kJ/100 g). Although oats are the main ingredient in both products, the added ingredients significantly affect the nutritional profile and consumer perception of instant cereals compared with plain oatmeal. These findings are relevant both for consumers who are concerned about sugar content and for manufacturers developing instant oatmeal formulations.

  • Open access
  • 9 Reads
Safety Boundary of Driving Force for Electric Trailers: Stability Analysis of Articulated Vehicles via Co-Simulation

Introduction

Electric trailers enhance the tractive performance of conventional articulated vehicles, yet pose significant instability risks (e.g., jack-knifing) during high-torque maneuvers due to inappropriate driving force intervention. This study systematically quantifies the impact of electric trailer propulsion on vehicle stability through dynamic co-simulation and defines its safety-critical operational boundaries to inform real-time control strategies.

Methods

A high-fidelity vehicle model integrating a tractor and electric trailer was developed in TruckSim, incorporating suspension dynamics and Pacejka tire models. Co-simulation with Simulink enabled bidirectional data exchange: TruckSim provided real-time vehicle states, while Simulink implemented driving force allocation algorithms. Stability criteria included a steering angle threshold and a yaw rate deviation. Critical scenarios (e.g., cornering at 0.4g lateral acceleration, µ-split braking) were tested.

Results

  1. Electric trailers improved tractive performance by 18% in straight-line acceleration but increased jack-knifing risk by 120% during low-friction cornering when driving torque exceeded 1,200 N·m.
  2. The safety boundary was characterized by dynamic constraints on articulation angle and yaw rate error. Model Predictive Control (MPC) enforcing these boundaries reduced instability incidents by 67% in emergency maneuvers.

Conclusions

Electric trailers require strict driving force constraints to mitigate instability. The proposed safety boundary, validated through TruckSim-Simulink co-simulation, provides a foundational framework for real-time control systems. Future work should address sensor latency and road uncertainty.

  • Open access
  • 13 Reads
A Machine Learning-Integrated Decision Tree and AHP Multi-Criteria Decision-Making Approach for High-Temperature Thermochemical Energy Storage Materials

High-temperature thermochemical energy storage (HT-TCES) materials are essential to enable efficient industrial waste heat recovery and widespread deployment of renewable energy. However, determining the most appropriate material requires addressing several interrelated criteria, including thermal stability, reaction enthalpy, cycling behavior, cost, and environmental impact. To manage this complexity, this study proposes a hybrid framework that combines a Decision Tree (DT) with the Analytic Hierarchy Process (AHP), a multi-criteria decision-making (MCDM) methodology, for systematic, machine-learning-assisted HT-TCES material selection. The framework begins with a DT-based feature selection stage, which automatically determines the relative importance of evaluation criteria and filters out less significant attributes from a large initial set. DTs are chosen for their high interpretability and ability to provide explicit, rule-based explanations, allowing decision makers to understand why certain criteria are prioritized. The refined set of critical criteria is then analyzed through AHP, which structures pairwise comparisons and calculates consistent priority weights to rank candidate materials. Applied to high-temperature industrial waste-heat recovery (above 500 °C), the integrated DT–AHP model identifies the most suitable materials while simplifying the decision process and making the basis for each choice explicit. Sensitivity analysis shows that the material rankings remain consistent even when input conditions vary. This interpretable ML+MCDM approach offers a scalable decision-support tool for energy planners and policymakers, facilitating the sustainable deployment of thermochemical storage technologies and supporting global decarbonization objectives.
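The AHP stage can be illustrated with a short sketch that derives priority weights from a pairwise comparison matrix and checks its consistency. The abstract does not say how the weights were computed, so the row geometric-mean approximation used here is an assumption; the three-criterion matrix is made up for illustration, and the random-index values are Saaty's commonly tabulated ones.

```python
import math
from typing import List

Matrix = List[List[float]]

# Saaty's random consistency index by matrix size.
RANDOM_INDEX = {3: 0.58, 4: 0.90, 5: 1.12, 6: 1.24, 7: 1.32, 8: 1.41, 9: 1.45, 10: 1.49}


def ahp_weights(matrix: Matrix) -> List[float]:
    """Approximate AHP priority weights with the row geometric-mean method."""
    n = len(matrix)
    geo_means = [math.prod(row) ** (1.0 / n) for row in matrix]
    total = sum(geo_means)
    return [g / total for g in geo_means]


def consistency_ratio(matrix: Matrix, weights: List[float]) -> float:
    """Consistency ratio CR = CI / RI; values below 0.1 are conventionally acceptable."""
    n = len(matrix)
    if n < 3:
        return 0.0  # 1x1 and 2x2 reciprocal matrices are always consistent.
    weighted = [sum(a * w for a, w in zip(row, weights)) for row in matrix]
    lambda_max = sum(aw / w for aw, w in zip(weighted, weights)) / n
    consistency_index = (lambda_max - n) / (n - 1)
    return consistency_index / RANDOM_INDEX[n]


# Hypothetical comparisons among three criteria (e.g. stability, enthalpy, cost).
pairwise = [[1.0, 2.0, 4.0],
            [0.5, 1.0, 2.0],
            [0.25, 0.5, 1.0]]
weights = ahp_weights(pairwise)
cr = consistency_ratio(pairwise, weights)
```

Because this example matrix is perfectly consistent, the weights come out in the ratio 4:2:1 and the consistency ratio is effectively zero; real expert judgments usually give a small positive value.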
