SciForum MOL2NET An approach toward the identification of new antileishmaniasic compounds.

Herein we present the results of a quantitative structure–activity relationship (QSAR) study aimed at identifying new antileishmaniasic compounds (Leishmania amazonensis) using a set of more than 2000 0D-2D Dragon molecular descriptors and machine learning techniques. A data set of organic chemicals with antileishmaniasic activity against promastigote forms of the parasite was used to develop four QSAR models based on k-nearest neighbors, support vector machine, multilayer perceptron and classification tree techniques. External validation procedures were carried out to demonstrate the predictive power of the models. The promastigote models correctly classify more than 89% of the chemicals in both the training and external prediction groups. In addition to the individual techniques, a majority-vote ensemble was customized with the aim of improving on the results of the individual models. To identify new compounds with potential activity against this parasite, a virtual screening of the DrugBank international database was performed. More than five hundred new potential antileishmaniasic compounds were identified. The current results constitute a step forward in the search for efficient ways to discover new antileishmaniasic lead compounds.


Introduction:
Leishmaniasis, a disease caused by obligate intracellular protozoa of the genus Leishmania, is an old but largely unknown disease that afflicts the world's poorest populations [1]. It presents a broad spectrum of clinical forms and is transmitted to humans and animals through the bite of insects of the Psychodidae family [2]. The World Health Organization (WHO) has reported more than 20 species of Leishmania; among them, Leishmania amazonensis is of vital importance for the American continent because it causes a wide variety of clinical manifestations, some of them potentially fatal [3][4].
"In silico" methods are useful tools for screening chemicals, especially in the early stages of the drug discovery process [5][6][7][8]. In the last two decades, these studies have played a fundamental role in the development of a number of drugs that are currently on the market [9].

Materials and Methods:
All the compounds included in this research were gathered from bioassays published in PubChem. We specifically selected studies carried out against promastigotes of Leishmania amazonensis. These studies were reported by different research groups in recent years, in several high-impact journals indexed in the Web of Science. To verify the structural diversity of the compounds in the database, a cluster analysis (CA) was performed with the STATISTICA 8.0 software [10].
Models for the promastigote form of the parasite were obtained with the WEKA software [11], using k-nearest neighbors (IBK), classification trees (J48), an artificial neural network (MLP, for MultiLayer Perceptron) and a support vector machine (SMO, for Sequential Minimal Optimization). External validation procedures were carried out to demonstrate the predictive power of the four resulting models. Virtual screening was performed using the DrugBank international database.
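As an illustration of the simplest of the four techniques, the following is a minimal sketch of k-nearest-neighbor classification on descriptor vectors. It is not the WEKA IBK implementation used in this work: the function name `knn_predict`, the Euclidean distance, the value k = 3 and the toy descriptor values are all assumptions made for the example.

```python
import math
from collections import Counter

def knn_predict(train_X, train_y, query, k=3):
    """Classify `query` by the majority label among its k nearest training points."""
    # Euclidean distance from the query to every training compound.
    distances = [
        (math.dist(x, query), label) for x, label in zip(train_X, train_y)
    ]
    # Keep the k closest neighbours (sorted by distance only).
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical data: two descriptors per compound, not taken from the study.
train_X = [(0.1, 0.2), (0.2, 0.1), (0.9, 0.8), (0.8, 0.9), (0.85, 0.75)]
train_y = ["inactive", "inactive", "active", "active", "active"]

print(knn_predict(train_X, train_y, (0.15, 0.15)))  # → inactive
print(knn_predict(train_X, train_y, (0.80, 0.80)))  # → active
```

In practice the descriptors would normally be scaled before computing distances, since molecular descriptors span very different numeric ranges.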

Results and Discussion:
A new database of antileishmaniasic compounds with a high degree of structural variability was assembled. The structures were parameterized using 2489 0D-2D molecular descriptors implemented in the DRAGON software. WEKA's attribute selection procedures were used to obtain a subset of variables for model development.
Active and inactive compounds were each divided into subsets using k-means cluster analysis (k-MCA), from which the training and test sets were obtained following the procedure shown in Figure 1.
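The idea of drawing training and test compounds from every cluster can be sketched as follows. This is a minimal illustration under stated assumptions, not the exact procedure of Figure 1: the cluster labels are assumed to be already available from the clustering step, and the function name `split_by_cluster`, the 25% test fraction and the compound identifiers are hypothetical.

```python
import random
from collections import defaultdict

def split_by_cluster(compound_ids, cluster_labels, test_fraction=0.25, seed=0):
    """Draw the test set proportionally from every cluster."""
    rng = random.Random(seed)
    # Group compound identifiers by the cluster they were assigned to.
    clusters = defaultdict(list)
    for cid, cluster in zip(compound_ids, cluster_labels):
        clusters[cluster].append(cid)

    train, test = [], []
    for cluster in sorted(clusters):
        members = clusters[cluster][:]
        rng.shuffle(members)
        # Take a share of each cluster, but always leave at least one
        # member of the cluster in the training set.
        n_test = min(len(members) - 1, round(len(members) * test_fraction))
        test.extend(members[:n_test])
        train.extend(members[n_test:])
    return train, test

# Hypothetical active compounds already assigned to two clusters.
active_ids = ["A1", "A2", "A3", "A4", "A5", "A6", "A7", "A8"]
active_clusters = [0, 0, 0, 0, 1, 1, 1, 1]

train_act, test_act = split_by_cluster(active_ids, active_clusters)
print(len(train_act), len(test_act))  # → 6 2
```

The same function would be applied separately to the inactive compounds, and the two training (and two test) subsets merged, so that both sets cover the structural classes found by the clustering.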
The promastigote models correctly classify more than 82% and 80% of the chemicals in the training and external prediction groups, respectively. Figure 2 shows the accuracy percentages obtained for the models: SVM is the most accurate model, followed by IBK and MLP. The four promastigote models were also externally validated on a new set of 22 compounds previously evaluated in PubChem bioassays, with positive results. In addition to the individual techniques, a majority-vote ensemble was customized with the aim of improving on the results of the individual models [12].
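A majority-vote ensemble over the four classifiers can be sketched as below. The text does not describe how the customized system handles a 2-2 split among four models, so the tie-breaking rule here (fall back to "inactive") is purely an assumption, as are the function name `majority_vote` and the example predictions.

```python
from collections import Counter

def majority_vote(model_predictions, tie_label="inactive"):
    """Combine per-model label lists into one consensus label per compound."""
    n_compounds = len(model_predictions[0])
    consensus = []
    for i in range(n_compounds):
        # Count the labels that the individual models assign to compound i.
        votes = Counter(preds[i] for preds in model_predictions)
        ranked = votes.most_common()
        if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
            # Assumed rule: a tied vote is resolved conservatively.
            consensus.append(tie_label)
        else:
            consensus.append(ranked[0][0])
    return consensus

# Hypothetical predictions of the four models for five compounds.
ibk = ["active", "active", "inactive", "active", "active"]
j48 = ["active", "inactive", "inactive", "inactive", "inactive"]
mlp = ["active", "active", "inactive", "inactive", "active"]
smo = ["active", "active", "active", "inactive", "inactive"]

print(majority_vote([ibk, j48, mlp, smo]))
# → ['active', 'active', 'inactive', 'inactive', 'inactive']
```

Because a compound is kept only when most models agree, the consensus list of predicted actives tends to be shorter than the union of the individual models' hits.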

Figure 2. Accuracy percentages obtained for the training and test series of the final models.
To identify new compounds with potential activity against promastigote forms of L. amazonensis, a virtual screening of the DrugBank international database was performed. More than five hundred new potential antileishmaniasic compounds were identified. Applying the majority-vote ensemble reduced the number of screened compounds flagged as potentially active. These compounds can be evaluated experimentally to corroborate their activity against L. amazonensis, saving both time and chemical reagents. The current results therefore constitute a step forward in the search for efficient ways to discover new antileishmaniasic lead compounds.

Conclusions:
Using artificial intelligence techniques, we developed four validated models with good statistical parameters. The models were able to identify new compounds with potential activity against promastigote forms of L. amazonensis through virtual screening of databases. This work constitutes a useful tool in the search for new lead compounds against this parasite.

Figure 1. Procedure for obtaining the training and test sets.