Hypertension: An mt-QSAR Model for Seeking New Drugs for Hypertension Treatment Using Multiple Conditions

Hypertension is a multifactorial disease in which the blood vessels are persistently exposed to a higher pressure than usual; this pressure places more strain on the heart, which needs a greater cardiac output to pump blood to the body. The World Health Organization (WHO) classifies hypertension as one of the main risk factors for disability and premature death in the world population. WHO has strengthened various health services around the world and lists the groups of basic medicines for high blood pressure, such as angiotensin-converting enzyme inhibitors, thiazide diuretics, beta blockers and long-acting calcium channel blockers, among other groups used to treat the population with this condition. The discovery of new drugs with better activity and less toxicity for the treatment of hypertension is a goal of major importance. In this sense, theoretical models such as QSAR can be useful to discover new drugs for hypertension treatment. For this reason, we developed a new multi-target QSAR (mt-QSAR) model to discover new drugs. We used the public database ChEMBL, which contains large data sets of multi-target assays of inhibitors of a group of receptors with special relevance in hypertension. Almost all known computational models, however, focus on only one target or receptor. In this work, Beta-2 adrenergic receptor, Adrenergic receptor beta, Type-1 angiotensin II receptor, Angiotensin-converting enzyme, Beta-adrenergic receptor, Cytochrome P450 11B2 and Renin were used as receptor inputs in the model. Artificial Neural Networks (ANNs) were used as the statistical method. As inputs we used topological indices, specifically the Wiener, Barabasi and Harary indices calculated with the Dragon software. These operators quantify the deviations of the structure of one drug from the expected values for all drugs assayed under different boundary conditions, such as type of receptor, type of assay, type of target and target mapping. Overall training performance was 90%. Overall validation predictability performance was 90%.


INTRODUCTION
Hypertension is a multifactorial disease in which the blood vessels are persistently exposed to a higher pressure than usual; this pressure places more strain on the heart, which needs a greater cardiac output to pump blood to the body. The World Health Organization (WHO) classifies hypertension as one of the main risk factors for disability and premature death in the world population. WHO has strengthened various health services around the world and lists the groups of basic medicines for high blood pressure, such as angiotensin-converting enzyme inhibitors, thiazide diuretics, beta blockers and long-acting calcium channel blockers, among other groups of drug treatment (1,2). Research into better hypertension treatments continues to grow. Bioinformatics has been widely used in this research and has become one of its most important tools. Within bioinformatics, multi-target (mt) models are built on a large and heterogeneous database of compounds. In such a model, the compounds are categorized as active or inactive. From these molecules, a portion was taken in order to determine whether there was a positive relationship (3).
In this study, the proteins related to hypertension that were used as receptor inputs in the mt-QSAR model are Beta-2 adrenergic receptor, Adrenergic receptor beta, Type-1 angiotensin II receptor, Angiotensin-converting enzyme, Beta-adrenergic receptor, Cytochrome P450 11B2 and Renin. The design of new enzyme inhibitors for the treatment of hypertension is a main objective. From our point of view, QSAR techniques may be very helpful in this case. Unfortunately, some QSAR techniques predict new outcomes only for one specific condition. We can avoid this limitation by developing new multi-target/multiplexing QSAR models. These approaches are useful to process very large collections of compounds assayed against multiple molecular or cellular targets under different assay conditions (cj), as is the case of ChEMBL (4,5). One compound may lead to one or more statistical cases because it may give different outcomes for alternative biological assays carried out under diverse sets of multiple conditions. In this work, we defined cj according to the ontology rt => (au, cj, rt, te, sx). The conditions that may change in the dataset are: receptors (rt), biological assays (au), molecular or cellular targets (te), and standard type of activity measure (sx). Notably, multi-target QSAR models are able to predict the results of the assay of different drugs for multiple targets. However, these models cannot predict different results for a given series of targets when the assay, organism, target and assay type conditions defined for each target change. Fortunately, the new mt-QSAR is useful not only for different targets but also for different multiplexing assay conditions (cj) for all targets. Indeed, we have previously reported the first QSAR models for multiplexing assays of anti-Alzheimer, anti-parasitic, anti-fungal and anti-bacterial activity (6)(7)(8)(9)(10)(11)(12)(13)(14)(15).
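The idea of handling several assay conditions at once can be illustrated with a small sketch. The abstract describes the model inputs as deviations of a drug's structure from the expected value for all drugs assayed under the same conditions; the code below assumes the simplest form of such an operator, the descriptor value minus the average over all cases sharing the same condition cj. The exact operator is not given in this section, and the compound labels, descriptor values and the helper name `condition_deviations` are hypothetical.

```python
from collections import defaultdict


def condition_deviations(cases, descriptor="ti"):
    """Return, for each case, its descriptor value minus the mean of that
    descriptor over all cases assayed under the same condition.

    Assumption: the deviation operator is a plain difference from the
    condition average; the paper does not spell out its exact form here.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for case in cases:
        key = case["condition"]
        sums[key] += case[descriptor]
        counts[key] += 1
    means = {key: sums[key] / counts[key] for key in sums}
    return [case[descriptor] - means[case["condition"]] for case in cases]


# Hypothetical toy data: compound "A" appears twice, once per condition,
# so one compound gives rise to two statistical cases.
cases = [
    {"compound": "A", "condition": ("Renin", "IC50"), "ti": 10.0},
    {"compound": "B", "condition": ("Renin", "IC50"), "ti": 14.0},
    {"compound": "A", "condition": ("ACE", "Ki"), "ti": 10.0},
    {"compound": "C", "condition": ("ACE", "Ki"), "ti": 20.0},
]
deviations = condition_deviations(cases)
```

In this toy data the same descriptor value for compound A yields a different input in each condition, which is what lets a single model distinguish outcomes across conditions.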
QSAR methods rely on molecular descriptors, numerical series that encode useful chemical information and enable statistical associations with biological properties (16,17). Diverse topological indices (TIs) of molecular graphs (G) can be used to speed up the encoding of the molecular structure of drugs in bioinformatics studies. For the first time, we used the TI molecular descriptors developed by Wiener, Barabasi and Harary to develop one multi-target/multiplexing QSAR model for inhibitors of 7 different enzymes related to hypertension.
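As an illustration of how a topological index is obtained from a molecular graph, the sketch below computes two of the indices named above on a hydrogen-suppressed graph. It uses the common definitions: the Wiener index as the sum of shortest-path distances over all atom pairs, and the Harary index as the sum of the reciprocals of those distances. This is not the DRAGON implementation; the function names and the example graph are assumptions, and the Barabasi index is not covered.

```python
from collections import deque


def shortest_path_lengths(adjacency, source):
    """Breadth-first search distances (in bonds) from source to every atom."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in adjacency[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist


def wiener_and_harary(adjacency):
    """Return (Wiener, Harary) for a connected, unweighted molecular graph."""
    nodes = sorted(adjacency)
    wiener = 0
    harary = 0.0
    for i, u in enumerate(nodes):
        dist = shortest_path_lengths(adjacency, u)
        for v in nodes[i + 1:]:  # count each unordered pair once
            wiener += dist[v]
            harary += 1.0 / dist[v]
    return wiener, harary


# Carbon skeleton of n-butane: a chain of four atoms, C0-C1-C2-C3.
butane = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
wiener, harary = wiener_and_harary(butane)
```

For the butane chain the pair distances are 1, 1, 1, 2, 2 and 3, so the Wiener index is 10 and the Harary index is 13/3. The sketch assumes a connected graph; a disconnected input would raise a `KeyError`.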

Nonlinear classifiers
A data set from the ChEMBL database (18) containing assayed anti-hypertension drugs was used to conduct this experiment. The DRAGON software 4.0 (19) was used; it provides 1664 descriptors classified as zero- (0D), one- (1D), two- (2D) and three-dimensional (3D) descriptors, depending on whether they are computed from the chemical formula, substructure list representation, molecular graph or geometrical representation of the molecule, respectively (20). In this research, the following descriptors were calculated: Wiener, Barabasi and Harary indices. The data were processed with different Artificial Neural Networks (ANNs) using the STATISTICA 6.0 software (21), looking for the best model to predict anti-hypertension activity. Five types of ANNs were used, namely Probabilistic Neural Network (PNN), Radial Basis Function (RBF), Three-Layer Perceptron (MLP-3), Four-Layer Perceptron (MLP-4) and Linear Neural Network (LNN) (22)(23)(24)(25)(26)(27).
A very simple type of ANN called Four-Layer Perceptron (MLP-4) can be used to fit this discriminant function. The model classifies a set of compounds as having or not having affinity for the different receptors. A dummy variable, Affinity Class (AC), was used to codify the affinity. This variable indicates either high (AC = 1) or low (AC = 0) affinity of the drug for the receptor. The output of the model is S(DTP)pred, the predicted DTP affinity score; it is a continuous dimensionless score that sorts compounds from low to high affinity for the target, so that DTPs receive the higher values of S(DTP)pred and nDTPs the lowest values. A forward stepwise algorithm was used for variable selection (28).
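The way a perceptron turns descriptors into a continuous score and then into an affinity class can be sketched as follows. This is a minimal illustration only: the weights are random rather than trained, sigmoid activations and a 0.5 cut-off on the score are assumptions, and the layer sizes (42 inputs, 27 hidden neurons, 1 output) are borrowed from the 42-27-1 topology that the paper reports for its best network. It is not the STATISTICA implementation, and all function names are hypothetical.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def make_layer(n_in, n_out, rng):
    """Random weights and zero biases; a stand-in for trained parameters."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def forward(layer, inputs):
    """Apply one fully connected layer followed by a sigmoid activation."""
    weights, biases = layer
    return [
        sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def predict(layers, descriptors, threshold=0.5):
    """Return the continuous score and the class (1 = high affinity, 0 = low)."""
    activations = descriptors
    for layer in layers:
        activations = forward(layer, activations)
    score = activations[0]
    return score, int(score >= threshold)


rng = random.Random(0)  # fixed seed so the sketch is reproducible
layers = [make_layer(42, 27, rng), make_layer(27, 1, rng)]
descriptors = [rng.uniform(-1.0, 1.0) for _ in range(42)]  # hypothetical inputs
score, affinity_class = predict(layers, descriptors)
```

In a trained network the score would play the role of S(DTP)pred, and sorting compounds by it would rank them from low to high predicted affinity.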
Let kχ(G) be the molecular descriptors of the drugs and kξ(R) the descriptors of the receptors or drug targets, for different drugs (d) with different receptors; we attempted to develop a simple linear classifier of mt-QSAR type with a general formula (Equation 1). The quality of the models was assessed with different statistical parameters: Specificity (see Equation 2), Sensitivity (see Equation 3), Accuracy (see Equation 4) and the ROC curve (Receiver Operating Characteristic curve), which is a graphical plot of the sensitivity, or true positive rate, vs. (1−specificity), or false positive rate. Here NTN is the number of true negatives, NFP is the number of false positives, NTP is the number of true positives and NFN is the number of false negatives.
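In terms of the counts defined in this paragraph, the three quality measures take the following standard forms. These are written from the usual definitions of the measures, on the assumption that they match the equations cited above; the general formula of the classifier cannot be recovered from the text and is not reproduced.

```latex
\begin{align}
\text{Specificity} &= \frac{N_{TN}}{N_{TN} + N_{FP}} \tag{2} \\
\text{Sensitivity} &= \frac{N_{TP}}{N_{TP} + N_{FN}} \tag{3} \\
\text{Accuracy} &= \frac{N_{TP} + N_{TN}}{N_{TP} + N_{TN} + N_{FP} + N_{FN}} \tag{4}
\end{align}
```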
The data set used in this article was obtained from the ChEMBL database (29)(30)(31)(32)(33). It has more than 11000 cases and more than 6500 different compounds. ChEMBL normalizes the bioactivities into a uniform set of endpoints and units where possible, and also tags the links between a molecular target and a published assay with a set of varying confidence levels. The data are abstracted and curated from the primary scientific literature, and cover a significant fraction of the structure-activity relationship (SAR) and discovery of modern drugs.

ANN Multi-target model of drug-hypertension receptor interaction.
ANN models are non-linear models used to predict the biological activity of a large dataset of molecules. This technique is an alternative to linear methods such as LDA (34,35). However, one must note that the profiles of each network indicate that these are highly nonlinear and complicated models. Different types of networks were compared to obtain a better model; Table 1 shows the classification matrix of the ANN network. MLP 42: 42-27-1:1 was taken as the main network because it presented a wider range of variables, 42 inputs in the first layer and 27 neurons in the second layer, and two sets of cases (Training and Validation). The network found was an MLP and it showed a training performance higher than 89.9%. The best model correctly classifies 667 out of 794 active compounds (84.00%) and 2971 out of 3251 non-active compounds (91.39%) in the training series. Overall training performance was 89.94%. Validation of the model was carried out by means of an external prediction series, in which the model correctly classifies 6083 out of 6733 non-active compounds (90.34%) and 1325 out of 1527 active compounds (86.77%). Overall validation predictability performance was 89.7%. Accuracy values higher than 75% are considered acceptable for ANN models according to previous reports (36)(37)(38)(39)(40)(41)(42)(43)(44)(45). Notably, the area under the ROC curve of the model was greater than 0.9, well above the 0.5 expected for a random classifier (see Figure 1). One compound may lead to one or more statistical cases because it may give different outcomes for alternative biological assays carried out in different organisms with different enzymes as targets (46). We used a large data set from the ChEMBL database restricted to anti-hypertension drugs, and the model gave good results. To our knowledge, this is the first work on hypertension in which an mt-QSAR model is used with different enzymes, assays and types of proteins.
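The percentages reported above can be checked directly from the classification counts. The short sketch below recomputes sensitivity, specificity and accuracy for both series, taking active compounds as the positive class; the counts come from the text, while the function name and the dictionary layout are illustrative choices.

```python
def classification_metrics(tp, fn, tn, fp):
    """Sensitivity, specificity and accuracy as fractions between 0 and 1."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
    }


# Counts taken from the text; misclassified cases are totals minus correct ones.
training = classification_metrics(tp=667, fn=794 - 667, tn=2971, fp=3251 - 2971)
validation = classification_metrics(tp=1325, fn=1527 - 1325, tn=6083, fp=6733 - 6083)

for name, metrics in (("training", training), ("validation", validation)):
    print(name, {key: round(100 * value, 2) for key, value in metrics.items()})
```

The recomputed values agree with the reported ones to within rounding in the second decimal place.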

Table 1. Results of the MLP classification model