Developing predictive models for drug efficacy is complicated by the complexity and heterogeneity of bioassay data. Here, we present LIFE.PTML, a methodology integrating drug Lifecycle (L), Information Fusion (IF), Encoding (E), Perturbation Theory (PT), and Machine Learning (ML) to predict compound activity across diverse experimental conditions. Using a dataset of 3748 molecule-assay combinations targeting calmodulin (CaM) and related proteins, LIFE.PTML combines chemical and protein descriptors, quantifies experimental variability via perturbation operators, and trains non-linear classifiers, including XGBoost and Gradient Boosting. XGBoost achieved the best performance, with 88.9% test accuracy and a ROC AUC of 0.959, and feature-importance analysis highlighted contributions from both drug- and protein-level descriptors. The results demonstrate that LIFE.PTML provides a robust, flexible, and interpretable framework for predictive chemoinformatics, facilitating the integration of multi-source data for drug discovery applications.
LIFE.PTML Model Development Targeting Calmodulin Pathway Proteins
Published: 13 November 2025
by MDPI
in The 29th International Electronic Conference on Synthetic Organic Chemistry
session Computational Chemistry
https://doi.org/10.3390/ecsoc-29-26890
(registering DOI)
Keywords: drug discovery; calmodulin; chemoinformatics; machine learning; LIFE.PTML
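
The perturbation-operator step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the operator's exact form, so this assumes one common choice in PTML-style work: the deviation of a descriptor from its mean over all records that share an experimental condition. The function name, the field names (`logp`, `assay`), and the toy values are hypothetical and are not taken from the study's dataset.

```python
from collections import defaultdict


def moving_average_deviation(rows, descriptor, condition):
    """Return, for each row, the descriptor value minus the mean of that
    descriptor over all rows sharing the same condition label.

    Assumption: this deviation-from-condition-mean form is an illustrative
    stand-in for a perturbation operator, not the authors' confirmed formula.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for row in rows:
        sums[row[condition]] += row[descriptor]
        counts[row[condition]] += 1
    # Mean descriptor value per experimental condition
    means = {label: sums[label] / counts[label] for label in sums}
    return [row[descriptor] - means[row[condition]] for row in rows]


# Hypothetical toy records: one descriptor, one assay-condition label
rows = [
    {"logp": 1.0, "assay": "binding"},
    {"logp": 3.0, "assay": "binding"},
    {"logp": 2.0, "assay": "functional"},
]
deviations = moving_average_deviation(rows, "logp", "assay")
print(deviations)
```

In a pipeline of the kind the abstract describes, deviations like these would be computed for drug and protein descriptors under each experimental condition and then passed as input features to a classifier such as XGBoost.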
