Prediction of Neurological Enzyme Targets for Known and New Compounds with a Model using Galvez's Topological Indices

Alzheimer's disease (AD), Parkinson's disease, and other neurodegenerative diseases are a major health problem. However, in many cases, current therapies are merely palliative and only temporarily slow cognitive decline. The discovery of new drugs for the treatment of neurodegenerative diseases is therefore a goal of major importance. Public databases such as ChEMBL contain a large amount of data from multiplexing assays of inhibitors of a group of enzymes with special relevance in the central nervous system: monoamine oxidases (MAOs), acetylcholinesterase (AChE), glycogen synthase kinase-3 (GSK-3), and 5α-reductases (5αRs). These data constitute an important source of information for the application of multi-target computational models. However, almost all known computational models focus on only one target. In this work, we developed linear multi-target QSAR (mt-QSAR) models for inhibitors of 8 different enzymes that are promising targets in the treatment of different neurodegenerative diseases. In so doing, we combined, for the first time, the DRAGON software with moving average parameters for this purpose. The best DRAGON model found predicts with very high accuracy, specificity, and sensitivity (>90%) a very large data set (>10,000 cases) in training and validation series.


INTRODUCTION
The discovery of new compounds for the treatment of neurodegenerative diseases is a goal of major importance for medicinal chemistry and the biopharmaceutical industry. Neurodegenerative diseases have a high negative impact on personal and public health. For instance, Alzheimer's disease [1] is a serious degenerative disorder that causes a gradual loss of neurons and, despite the efforts made by large pharmaceutical companies worldwide, the origin of this pathology is still not clear. β-amyloid (Aβ) is an important protein implicated in the pathogenesis of AD, but the mechanism by which it causes neurotoxicity is still unknown [2,3]. Recent research efforts have led to several hypotheses to explain AD. Amyloid β toxicity is believed to play a primary role in the development of AD [4].
A group of neuronal enzymes stands out among the potential targets of drugs useful in the treatment of these diseases. In fact, the functions of neuronal enzymes and their implication in various human diseases have triggered an active search for potent and selective neuronal enzyme inhibitors [5] in recent years. Monoamine oxidases (MAOs), acetylcholinesterase (AChE), glycogen synthase kinase-3 (GSK-3), and 5α-reductases (5αRs) stand out among the more promising enzymes in this sense. GSK-3 has two isoforms, GSK-3α and GSK-3β [6]. In particular, GSK-3β is well known to play critical roles in oxidative stress-induced neurodegenerative diseases such as AD [7,8]. A more comprehensive understanding of the mechanistic basis for GSK-3 isoform-specific functions could lead to the development of isoform-specific inhibitors [9]. MAOs are important flavoenzymes with two isoforms (MAO-A and MAO-B), which are responsible for the oxidative deamination of neurotransmitters and dietary amines and are thus involved in neurodegenerative diseases. MAO-A has a higher affinity for serotonin and norepinephrine, whereas MAO-B preferentially deaminates phenylethylamine and benzylamine. Therefore, selective MAO-A inhibitors (e.g. clorgyline) are used in the treatment of neurological disorders such as depression, whereas MAO-B inhibitors (e.g. selegiline) are useful in the treatment of AD and Parkinson's disease. All of these aspects have led to an intensive search for novel MAO inhibitors (MAOIs), and this effort has increased considerably in recent years. On the other hand, AChE activity is associated with neuritic plaques (NPs) and neurofibrillary tangles (NFTs) in AD brains. AChE inhibitors (AChIs) are promising drugs in clinical practice for the treatment of AD. Recently, Bond et al. reviewed the effectiveness and cost-effectiveness of donepezil, galantamine, rivastigmine, and memantine for the treatment of AD [10]. Last, 5αRs are a family of isozymes expressed in several tissues, including the central nervous system. Very recently, Traish [11] performed a comprehensive literature search from 1970 to 2011 via PubMed and summarized the relevant information with special emphasis on the central nervous system.
Thus, the rational design of new enzyme inhibitors for the treatment of neurodegenerative diseases constitutes a major goal. In our opinion, Quantitative Structure-Activity Relationships (QSAR) techniques may be of help in this sense. Regrettably, almost all current QSAR techniques are able to predict new outcomes only for one specific assay. In our opinion, this problem can be avoided by developing new Multi-target/Multiplexing QSAR models (mt-QSAR/mx-QSAR). These methods are especially powerful when we need to process very large collections of compounds assayed against multiple molecular or cellular targets under different assay conditions (m_j), as is the case of ChEMBL [12,13]. This step may be of major relevance for the future of QSAR. Notably, mt-QSAR models are able to predict the results of the assay of different drugs against multiple targets. However, mt-QSAR models are unable to predict different results for a given series of targets when we change the set of specific assay conditions for each target. Fortunately, the new class of mx-QSAR models applies not only to different targets but also to different multiplexing assay conditions (m_j) for all targets. Specifically, we have reported the first mx-QSAR model for multiplexing assays of the anti-Alzheimer, anti-parasitic, anti-fungal, and anti-bacterial activity of GSK-3 inhibitors in vitro, in vivo, and in different cell lines [14].
QSAR techniques are based on the use of molecular descriptors, which are numerical series that codify useful chemical information and enable correlations between statistical and biological properties [15,16]. We can use different topological indices (TIs) of molecular graphs (G) to speed up the codification of the molecular structure of drugs in QSAR studies. In particular, Galvez's charge transfer indices and their different variants (G-like indices) have been demonstrated to be very useful in QSAR/QSPR studies of small molecules. For instance, the HIV-1 RT inhibitory activity of thiazolidinones has been analyzed with different TIs, and G-like values together with other TIs have been found to predict the activity of these compounds accurately [17]. G-like indices also outperformed 11 kinds of molecular descriptors in one study of flavonoid inhibitors of the aldose reductase enzyme [18]. For a detailed review of many QSAR models using TIs (including G_k indices), see the recent in-depth review published in Chemical Reviews by Galvez's group [19]. In addition, G-like indices have been found to be useful for the study of complex bio-molecular networks and not only chemical structure. For instance, González-Díaz et al. [20] used G-like indices to study different classes of networks found in drug research, nature, technology, and social-legal sciences.
Important QSAR studies of the enzymes considered in this work have been published before. For instance, Santana and Uriarte et al. [21][22][23] have published several QSAR and experimental studies of new MAO inhibitors. However, there are no mt-QSAR models based on G-like indices for all these enzymes together. In this work, we used for the first time molecular descriptors developed by Galvez et al. to develop one mt-QSAR model for inhibitors of 8 different enzymes. The best model found uses moving average operators as input and predicts with very high accuracy, specificity, and sensitivity (>90%) a very large data set (>10,000 cases) in training and validation series.

Computational methods
Regrettably, almost all current Quantitative Structure-Activity Relationships (QSAR) techniques are able to predict new outcomes only for one specific assay. In our opinion, this problem can be avoided by developing new Multi-target/Multiplexing QSAR models (mt-QSAR/mx-QSAR). These methods are especially powerful when we need to process very large collections of compounds assayed against multiple molecular or cellular targets under the different j-th assay conditions (c_j), as is the case of ChEMBL [12,13]. This step may be of major relevance for the future of QSAR. Notably, mt-QSAR models are able to predict the results of the assay of different drugs against multiple targets. However, mt-QSAR models are unable to predict different results for a given series of targets when we change the set of specific assay conditions for each target. Fortunately, the new class of mx-QSAR models applies not only to different targets but also to different multiplexing assay conditions (c_j) for all targets. Specifically, we have reported the first mx-QSAR model for multiplexing assays of the anti-Alzheimer, anti-parasitic, anti-fungal, and anti-bacterial activity of GSK-3 inhibitors in vitro, in vivo, and in different cell lines [14].
In a first step, we calculate the molecular descriptors D_i of a given i-th compound using one or more software packages for the generation of molecular descriptors. In a second step, we expand the raw dataset of molecular descriptors by adding the new variables ΔD_ij = D_i - <D_ij>. Next, we upload the preprocessed data to a statistics or machine learning software package to seek the model. The linear mt-QSAR model based on moving averages and LDA is a linear function of these variables. Its output, S_ij, is a numerical score of the biological activity of the i-th compound measured under the j-th assay defined by the set of conditions c_j. In these models, the average <D_ij> = <D_i(c_j)>, used to calculate the ΔD_ij values, is the average of D_i over different compounds; it does not run over a time domain but over the set of molecular descriptors that obey a given boundary condition c_j. These deviation-like parameters ΔD_ij are inspired by the idea of moving averages used in time series analysis [24]. The idea of using moving average operators comes from the seminal works on time series analysis published by Box and Jenkins [25]. More recently, González-Díaz et al. [26,27] have used moving average operators to construct mt-QSAR models. See also the excellent works published by Speck-Planche and Cordeiro et al. [28][29][30][31][32][33].
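The second step above, the expansion of the descriptor table with the deviation terms ΔD_ij = D_i - <D_ij>, can be illustrated with a minimal sketch. It assumes a single descriptor per compound and identifies each condition c_j by a plain label; the function name, the record layout, and all numerical values are hypothetical and are not taken from the dataset used in this work.

```python
from collections import defaultdict


def add_deviation_terms(cases):
    """Return a copy of each case extended with delta_d = d - <d>(c_j).

    `cases` is a list of dicts with keys "compound", "condition", and "d".
    The average <d>(c_j) runs over all cases that share the same condition
    c_j, not over a time domain.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for case in cases:
        sums[case["condition"]] += case["d"]
        counts[case["condition"]] += 1
    means = {cond: sums[cond] / counts[cond] for cond in sums}
    return [dict(case, delta_d=case["d"] - means[case["condition"]])
            for case in cases]


# Hypothetical toy data: descriptor values are invented for illustration.
cases = [
    {"compound": "cpd1", "condition": "MAO-B", "d": 2.0},
    {"compound": "cpd2", "condition": "MAO-B", "d": 4.0},
    {"compound": "cpd1", "condition": "AChE", "d": 2.0},
    {"compound": "cpd3", "condition": "AChE", "d": 6.0},
    {"compound": "cpd4", "condition": "AChE", "d": 7.0},
]

expanded = add_deviation_terms(cases)
for row in expanded:
    print(row["compound"], row["condition"], row["delta_d"])
```

In this toy example the same compound (cpd1) has the same raw descriptor value in both conditions but different deviation terms, which is what allows a single linear model to return different scores for different assay conditions.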

Multi-target DRAGON model of drug-neuroenzyme interaction
The outcome of multiplexing neural enzyme inhibition assays depends both on drug structure and on the set of assay conditions selected (m_j) [34]. In this work, we report the first mx-QSAR model capable of predicting whether a drug with a given molecular structure may or may not give a positive result under different multiplexing assay conditions m_j. These models are expected to give different classification probabilities for the compound for different organisms (o_t), biological assays (a_u), and molecular or cellular targets (t_e). The output of the best mt-QSAR model found using DRAGON is a real-valued variable that scores the propensity of the drug d_i to be active in multiplex pharmacological assays carried out under the selected conditions m_j => assay = a_u, organism = o_t, and target enzyme = t_e. The statistical parameters reported for this equation in training are the number of cases used to train the model (N), the canonical regression coefficient (Rc), sensitivity (Sn), specificity (Sp), and accuracy (Ac) [24]. The probability cut-off for this LDA model is p_1(m_j) > 0.5 => C_i(m_j) = 1. This means that the i-th drug (d_i), when predicted by the model with probability > 0.5, is expected to inhibit the enzyme present in the j-th assay carried out under the given set of conditions m_j. In Table 1, we explain in detail the different terms of the model equation. Online supplementary material files contain detailed lists of results for all cases analyzed.
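The classification rule p_1(m_j) > 0.5 => C_i(m_j) = 1 can be sketched as follows. This is only an illustration of the cut-off step: the probabilities, the assay label, and the function name are hypothetical, and the sketch does not reproduce how the LDA model derives the probabilities from the descriptors.

```python
def classify(probability, cutoff=0.5):
    """Return 1 (predicted active) if probability > cutoff, otherwise 0."""
    return 1 if probability > cutoff else 0


# Hypothetical probabilities p_1(m_j) for one drug d_i under three sets of
# conditions m_j = (assay a_u, organism o_t, target enzyme t_e).
probabilities = {
    ("a1", "Homo sapiens", "MAO-B"): 0.91,
    ("a1", "Homo sapiens", "AChE"): 0.50,
    ("a1", "Homo sapiens", "GSK-3"): 0.12,
}

predicted_class = {m_j: classify(p) for m_j, p in probabilities.items()}
for m_j, c in predicted_class.items():
    print(m_j, c)
```

Note that the inequality is strict, so a probability of exactly 0.5 is assigned to the non-active class in this sketch.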
Table 1 comes about here
This linear equation gave good results in both training and external validation series, with an overall accuracy in the training series above 90% (see Table 2). According to previous reports [35][36][37][38][39][40][41][42][43], accuracy values higher than 75% are acceptable for LDA-QSAR models. The reader should be aware that N here is not the number of compounds but the number of statistical cases. One compound may lead to one or more statistical cases because it may give different outcomes in alternative biological assays carried out in different organisms and using different enzymes as targets [44].
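The statistics used to judge the model (Sn = positive correct / positive total, Sp = negative correct / negative total, Ac = total correct / overall total) can be computed from observed and predicted classes as in the sketch below. The function name and the toy class vectors are hypothetical; each entry represents one statistical case, not one compound.

```python
def classification_metrics(observed, predicted):
    """Compute sensitivity (Sn), specificity (Sp), and accuracy (Ac).

    `observed` and `predicted` are equal-length sequences of 0/1 classes,
    one entry per statistical case.
    """
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    positives = sum(1 for o in observed if o == 1)
    negatives = len(observed) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("both classes must be present")
    true_pos = sum(1 for o, p in zip(observed, predicted) if o == 1 and p == 1)
    true_neg = sum(1 for o, p in zip(observed, predicted) if o == 0 and p == 0)
    return {
        "Sn": true_pos / positives,
        "Sp": true_neg / negatives,
        "Ac": (true_pos + true_neg) / len(observed),
    }


# Hypothetical toy example with 4 active and 6 non-active cases.
observed = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]

metrics = classification_metrics(observed, predicted)
print(metrics)
```

In this toy example 3 of the 4 active cases and 5 of the 6 non-active cases are classified correctly, so Sn = 0.75, Sp is about 0.83, and Ac = 0.80.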
Table 2 comes about here
3.1.1 Prediction of interaction with other neuro-enzymes
An additional use of the mt-DRAGON model was to carry out the "in silico" or virtual screening of new interactions with respect to all the other enzymes used in this model. This may help to find new interactions for these drugs or to rule out possible toxicological effects, depending on the other interactions predicted and/or discarded for these compounds. This type of experiment is of major importance because of the cost, in terms of animal sacrifice, time, materials, and human resources, of the experimental assay of all compounds against all these targets; see recent reviews by Duardo-Sanchez et al. [45][46][47][48]. Using this model, we can predict the different relationships between the drug-protein interactions [49,50]. We can reach this goal because the model classifies every compound used in the model as non-active or moderately active with respect to each of the neuro-enzymes. Another important point is that the model predicted all of the above results for the organism Homo sapiens. In Table 3, we list all the labels of the experimental parameters, target enzymes, and organisms.
Table 3 comes about here
CONCLUSION
The functions of neural enzymes and their implication in various human diseases have triggered an active search for potent and selective neural enzyme inhibitors. Theoretical mt-QSAR models based on LDA and DRAGON descriptors may become a useful tool in this sense. Theoretical studies such as QSAR models have become very useful in this context because they substantially reduce the need for time- and resource-consuming experiments. In this work, we developed a new LDA model using the DRAGON descriptors and a large database of about 20000 different drugs obtained from the ChEMBL server. We conclude that a large database gives a much more precise model; the use of resources such as the ChEMBL database enables us to develop models with large databases, and this helps to make the results more reliable.

Table 1. Details of the DRAGON mt-QSAR model for neural enzyme inhibitors

Table 2. Results of different DRAGON multi-target classification models
Sensitivity = Sn = Positive Correct / Positive Total; Specificity = Sp = Negative Correct / Negative Total; Accuracy = Ac = Total Correct / Overall Total