Predicting the activities of chemical compounds with in silico methods has been shown to be a cost- and time-effective way of helping chemists synthesize new biologically active compounds. MCF-7 is a commonly used breast cancer cell line that has been propagated for many years by multiple groups. In this study, a quantitative structure–activity relationship (QSAR) model [1-4] was developed to predict the anticancer activity of a diverse set of organic compounds. Several models were developed, of which a seventeen-variable model showed the best predictive performance, with r2 = 0.887 and q2LOO = 0.828. The robustness and predictive ability of the best model were validated using leave-one-out cross-validation, an external test set, and y-scrambling. The external set confirmed the predictive ability of the model, giving r2ext = 0.817. The developed model can be used to predict the anticancer activity of new and untested organic compounds.
Materials and methods
The dataset for the present work was collected from several published experimental studies [5-7] reporting anticancer activity (AA). All original activity data were converted to molar units and expressed as the response variable 1/log(AA).
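The response transformation can be illustrated with a minimal Python sketch. It assumes that the activity is already expressed in mol/L and that "log" denotes the base-10 logarithm; neither is stated explicitly above, and the function name to_response is hypothetical.

```python
import math


def to_response(aa_molar: float) -> float:
    """Return 1/log10(AA) for an activity value AA given in mol/L.

    Assumption: base-10 logarithm and molar units (not confirmed by the text).
    """
    if aa_molar <= 0:
        raise ValueError("activity must be a positive concentration")
    log_aa = math.log10(aa_molar)
    if log_aa == 0:
        # AA = 1 mol/L gives log10(AA) = 0, so the reciprocal is undefined.
        raise ValueError("1/log(AA) is undefined for AA = 1 mol/L")
    return 1.0 / log_aa


if __name__ == "__main__":
    # Illustrative value only: 10 micromolar expressed in mol/L.
    print(to_response(1e-5))
```

For sub-molar activities log10(AA) is negative, so the response values produced this way are negative as well.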
Results and discussion
The whole set of 105 compounds was divided into a training set of 84 compounds and a test (prediction) set of 21 compounds. The GA-MLRA technique identified several models, and statistical characteristics were obtained for the seventeen-descriptor models.
The following equation represents the developed model for AA:
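The 84/21 split and the statistics used to characterize a model can be sketched in Python as follows. This is an illustration under stated assumptions, not the procedure used in the study: the split is taken to be random (the splitting scheme is not given above), the data are synthetic placeholders standing in for the real descriptors, the GA variable-selection step is omitted, and the helper names (fit_mlr, predict, r2, q2_loo) are hypothetical. q2LOO is computed as 1 - PRESS/SS_tot and r2ext as 1 - SS_res/SS_tot on the test set; other definitions exist.

```python
import numpy as np


def fit_mlr(X, y):
    """Ordinary least-squares fit; the intercept is the last coefficient."""
    A = np.column_stack([X, np.ones(len(X))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def predict(coef, X):
    """Apply fitted coefficients (intercept last) to a descriptor matrix."""
    return np.column_stack([X, np.ones(len(X))]) @ coef


def r2(y_obs, y_pred):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    ss_res = np.sum((y_obs - y_pred) ** 2)
    ss_tot = np.sum((y_obs - np.mean(y_obs)) ** 2)
    return 1.0 - ss_res / ss_tot


def q2_loo(X, y):
    """Leave-one-out q2: refit with each compound held out, then predict it."""
    preds = np.empty(len(y))
    for i in range(len(y)):
        keep = np.arange(len(y)) != i
        preds[i] = predict(fit_mlr(X[keep], y[keep]), X[i:i + 1])[0]
    return r2(y, preds)


# Synthetic placeholder data: 105 "compounds", 17 "descriptors".
rng = np.random.default_rng(0)
X = rng.normal(size=(105, 17))
y = X @ rng.normal(size=17) + rng.normal(scale=0.5, size=105)

# Assumed random split into 84 training and 21 test compounds.
idx = rng.permutation(105)
train_idx, test_idx = idx[:84], idx[84:]

coef = fit_mlr(X[train_idx], y[train_idx])
r2_train = r2(y[train_idx], predict(coef, X[train_idx]))
q2 = q2_loo(X[train_idx], y[train_idx])
r2_ext = r2(y[test_idx], predict(coef, X[test_idx]))
print(f"r2 = {r2_train:.3f}, q2LOO = {q2:.3f}, r2ext = {r2_ext:.3f}")
```

With a least-squares fit, q2LOO computed this way can never exceed the training r2, so a small gap between the two indicates a model that is not dominated by individual compounds.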
1/log(AA) = 0.001(±0.0005) T(N..F) + 6.858(±9.022) X2A + 9.937(±3.472) BELm1 + 1.955(±1.510) BELv3 + 0.029(±0.018) RDF080m + 0.264(±0.211) Mor18u - 0.343(±0.211) Mor21u - 0.030(±0.097) Mor07m - 0.155(±0.055) Mor09m + 21.201(±18.218) G2e - 10.253(±7.711) ISH - 3.915(±2.455) HATS3m - 57.537(±31.154) R7u+ - 3.630(±1.665) R1e - 0.081(±0.054) n=CR2 - 0.099(±0.126) nCOOR + 0.087(±0.261) nNHR - 8.969(±12.499)
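Applying the equation to a new compound amounts to a weighted sum of its descriptor values plus the intercept. The Python sketch below copies the coefficients from the equation above; the function name predict_response and the all-zero input are hypothetical placeholders, and real use would require descriptor values computed in the same way as in the study.

```python
# Regression coefficients copied from the model equation (uncertainties omitted).
COEFFICIENTS = {
    "T(N..F)": 0.001,
    "X2A": 6.858,
    "BELm1": 9.937,
    "BELv3": 1.955,
    "RDF080m": 0.029,
    "Mor18u": 0.264,
    "Mor21u": -0.343,
    "Mor07m": -0.030,
    "Mor09m": -0.155,
    "G2e": 21.201,
    "ISH": -10.253,
    "HATS3m": -3.915,
    "R7u+": -57.537,
    "R1e": -3.630,
    "n=CR2": -0.081,
    "nCOOR": -0.099,
    "nNHR": 0.087,
}
INTERCEPT = -8.969


def predict_response(descriptors: dict) -> float:
    """Return the predicted 1/log(AA) for one compound.

    `descriptors` maps each of the seventeen descriptor names to its value.
    """
    missing = [name for name in COEFFICIENTS if name not in descriptors]
    if missing:
        raise KeyError(f"missing descriptor values: {missing}")
    weighted_sum = sum(c * descriptors[name] for name, c in COEFFICIENTS.items())
    return weighted_sum + INTERCEPT


if __name__ == "__main__":
    # Placeholder input, not a real compound: with every descriptor set to
    # zero the prediction reduces to the intercept.
    placeholder = {name: 0.0 for name in COEFFICIENTS}
    print(predict_response(placeholder))
```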
Of the models obtained, this one shows the highest r2 and q2 values for the training set and the best predictive performance for AA on the test set.
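The y-scrambling check listed among the validation methods can be sketched in Python as follows. The idea is to refit the regression after randomly permuting the response values: if the original r2 is far above the r2 values obtained for scrambled responses, the model is unlikely to be a chance correlation. The data below are synthetic placeholders, the descriptor set is kept fixed (variable selection is not repeated), the number of rounds is an arbitrary choice, and the function names are hypothetical.

```python
import numpy as np


def mlr_r2(X, y):
    """Fit ordinary least squares with an intercept and return the training r2."""
    A = np.column_stack([X, np.ones(len(X))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    residuals = y - A @ coef
    return 1.0 - np.sum(residuals ** 2) / np.sum((y - y.mean()) ** 2)


def y_scramble(X, y, n_rounds=100, seed=0):
    """Return the r2 values obtained after permuting y in each round."""
    rng = np.random.default_rng(seed)
    return np.array([mlr_r2(X, rng.permutation(y)) for _ in range(n_rounds)])


# Synthetic placeholder data with the training-set dimensions (84 x 17).
rng = np.random.default_rng(1)
X = rng.normal(size=(84, 17))
y = X @ rng.normal(size=17) + rng.normal(scale=0.5, size=84)

r2_real = mlr_r2(X, y)
r2_scrambled = y_scramble(X, y)
print(f"r2 (original y)      = {r2_real:.3f}")
print(f"r2 (scrambled, mean) = {r2_scrambled.mean():.3f}")
print(f"r2 (scrambled, max)  = {r2_scrambled.max():.3f}")
```

With seventeen descriptors and 84 compounds, scrambled models still reach a non-zero r2 by chance, which is why the comparison is made against the distribution of scrambled values and not against zero.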
Conclusion
A QSAR study was performed on a set of 105 organic compounds to analyze and predict IC50 values related to their anticancer activity. The analysis combined machine learning methods: a genetic algorithm (GA) for variable selection and multiple linear regression analysis (MLRA).
As a result, a transparent, mechanistic model for predicting IC50 values related to anticancer activity is proposed. The best overall performance is achieved by the seventeen-variable QSAR model, with r2 values of 0.887 and 0.817 for the training and test sets, respectively. The significant molecular descriptors related to the compounds with anticancer activity are T(N..F), X2A, BELm1, BELv3, RDF080m, Mor18u, Mor21u, Mor07m, Mor09m, G2e, ISH, HATS3m, R7u+, R1e, n=CR2, nCOOR, and nNHR. The obtained model can be used to estimate anticancer activities.
REFERENCES
[1] Puzyn T., Rasulev B., Gajewicz A., Hu X., Dasari T.P., Michalkova A., Hwang H.M., Toropov A., Leszczynska D., Leszczynski J. Using Nano-QSAR to predict the cytotoxicity of metal oxide nanoparticles, Nature Nanotechnology, 2011, 6, 175-178
[2] Turabekova, M.A., Rasulev, B., Dzhakhangirov, F.N., Leszczynska, D., Leszczynski, J., Aconitum and Delphinium alkaloids of curare-like activity. QSAR analysis and molecular docking of alkaloids into AChBP, European Journal of Medicinal Chemistry, 2010, 45 (9), 3885-3894
[3] Gajewicz A., Rasulev B., Dinadayalane T., Urbaszek P., Puzyn T., Leszczynska D., Leszczynski J. Advancing risk assessment of engineered nanomaterials: Application of computational approaches, Advanced Drug Delivery Reviews, 2012, 64 (15), 1663-1693
[4] Patnode K., Demchuk Z., Johnson S., Voronov A., Rasulev B. Combined Computational Protein-ligand Docking and Experimental Study of Bioplastic Films from Soybean Protein, Zein and Natural Modifiers, ACS Sustainable Chemistry and Engineering, 2021, 9, 10740-10748
[5] Abdulrahman, H.L., Uzairu, A. & Uba, S. QSAR, Ligand Based Design and Pharmacokinetic Studies of Parviflorons Derivatives as Anti-Breast Cancer Drug Compounds Against MCF-7 Cell Line. Chemistry Africa, 2021, 4, 175–187. doi.org/10.1007/s42250-020-00207-7
[6] Bohari, M. H., Srivastava, H. K., & Sastry, G. N. Analogue-based approaches in anti-cancer compound modelling: the relevance of QSAR models. Organic and Medicinal Chemistry Letters, 2011, 1(1), 3. doi.org/10.1186/2191-2858-1-3
[7] Xu-Yan Wang, Chuang-Jun Li, Jie Ma, Chuan Li, Fang-You Chen, Nan Wang, Cang-Jie Shen, Dong-Ming Zhang. Cytotoxic 9,19-cycloartane type triterpenoid glycosides from the roots of Actaea dahurica. Phytochemistry, 2019, 160, 48-55. doi.org/10.1016/j.phytochem.2019.01.004