Machine Learning and Drug Repositioning

Carlos Santolaria

doi:10.3390/mol2net-07-11829

Abstract:

Machine learning has also proved useful in pharmacology. Drug
repositioning is a well-known strategy among large pharmaceutical companies, by which the use of an already-approved
drug (originally intended to treat a given disease) is extended to a different
disorder against which its efficacy has been demonstrated.
In most cases where drug response is predicted through machine learning models,
a first step is needed in which information is selected and filtered. Here
researchers must bear in mind that there is always a set of patients who will show
an extreme response, which makes it almost compulsory to use a wider range of samples or cell
lines. For classifying and selecting information, three of the most widely used methods
are elastic net regression models, random forest models and specifically designed algorithms.
A training phase must always be incorporated into the process in order to calibrate the model.
Following this stage, an independent evaluation must be carried out by performing multiple
tests, to ensure that the putative model is accurate in its
predictions. Lastly, it is necessary to test the model on data that resemble clinical data.
• The evaluation can be performed via two procedures: K-fold or leave-one-out
cross-validation. The first one divides the “raw” dataset into K parts (folds); each
fold is used once as the testing dataset while the remaining folds form the training
dataset. Leave-one-out works similarly, but it holds out only a single sample from the
“raw” dataset as the test set, which makes it necessary to repeat this stage once for
every sample.
• More generally, machine learning techniques can be divided into two types:
supervised machine learning, which uses groups already defined within the training
data, and unsupervised machine learning, which derives these groups from the training data.
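The two evaluation procedures above can be illustrated with a minimal sketch in plain Python. Everything in it is an assumption made for illustration: the toy "dose" and "response" values are synthetic, the helper names (`k_fold_indices`, `fit_line`, `cross_validate`) are hypothetical, and a one-feature least-squares line stands in for the elastic net or random forest models mentioned in the text. Leave-one-out is treated simply as K-fold with K equal to the number of samples.

```python
import random
from statistics import mean


def k_fold_indices(n_samples, k, seed=0):
    """Shuffle sample indices and split them into k near-equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def fit_line(xs, ys):
    """Ordinary least squares for one feature: y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx else 0.0
    return slope, y_bar - slope * x_bar


def cross_validate(xs, ys, k, seed=0):
    """Return the mean squared error measured on each held-out fold."""
    errors = []
    for test_idx in k_fold_indices(len(xs), k, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(xs)) if i not in held_out]
        slope, intercept = fit_line(
            [xs[i] for i in train_idx], [ys[i] for i in train_idx]
        )
        errors.append(
            mean((ys[i] - (slope * xs[i] + intercept)) ** 2 for i in test_idx)
        )
    return errors


# Synthetic toy data (not from any study): a dose-like feature, a noisy response.
rng = random.Random(42)
doses = [i / 2 for i in range(20)]
responses = [1.5 * d + 2.0 + rng.gauss(0, 0.3) for d in doses]

five_fold_errors = cross_validate(doses, responses, k=5)
loo_errors = cross_validate(doses, responses, k=len(doses))  # leave-one-out

print(f"5-fold mean MSE: {mean(five_fold_errors):.3f}")
print(f"Leave-one-out mean MSE: {mean(loo_errors):.3f}")
```

Leave-one-out fits the model once per sample, so its cost grows with the dataset, whereas K-fold keeps the number of fits fixed at K. In practice, libraries such as scikit-learn provide ready-made splitters (`KFold`, `LeaveOneOut`) for the same purpose.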
As for the process of building drug repositioning approaches, it also comprises several steps.
When seeking relationships between drugs and diseases, network-based
methods can be used in either of their forms: clustering (searching for relationships between
drugs and targets among clusters of these) or the propagation approach. The latter can examine
a network in a single region or in its entirety. In either case, networks can include homogeneous or
heterogeneous data.
• One example of this is the study by Zhao and So, who applied several algorithms to
transcriptomic data to examine the effects of various drugs on protein synthesis and
expression and to explore other potential applications for them.
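As an illustration of the propagation approach mentioned above, the following sketch runs a random walk with restart over a small heterogeneous network. The text does not name a specific propagation algorithm, so the choice of random walk with restart is an assumption. The node names, edges and the `restart` value are hypothetical, and the function assumes every node has at least one neighbour.

```python
def random_walk_with_restart(adjacency, seeds, restart=0.3, tol=1e-9, max_iter=1000):
    """Propagate scores from seed nodes over an undirected graph.

    adjacency: dict mapping each node to the set of its neighbours
               (every node is assumed to have at least one neighbour).
    seeds:     nodes at which the walk starts and restarts.
    Returns a dict of approximate steady-state visiting probabilities.
    """
    nodes = list(adjacency)
    start = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    scores = dict(start)
    for _ in range(max_iter):
        # With probability `restart` the walker jumps back to a seed node.
        updated = {n: restart * start[n] for n in nodes}
        # Otherwise it moves to a neighbour chosen uniformly at random.
        for node in nodes:
            share = (1.0 - restart) * scores[node] / len(adjacency[node])
            for neighbour in adjacency[node]:
                updated[neighbour] += share
        change = sum(abs(updated[n] - scores[n]) for n in nodes)
        scores = updated
        if change < tol:
            break
    return scores


# Hypothetical heterogeneous network linking drugs, targets and diseases.
edges = [
    ("drug_A", "target_1"), ("drug_A", "target_2"),
    ("drug_B", "target_2"), ("drug_B", "target_3"),
    ("drug_C", "target_3"),
    ("disease_X", "target_1"), ("disease_X", "target_2"),
    ("disease_Y", "target_3"),
]
adjacency = {}
for a, b in edges:
    adjacency.setdefault(a, set()).add(b)
    adjacency.setdefault(b, set()).add(a)

# Rank drugs by how strongly the walk started at disease_X visits them.
scores = random_walk_with_restart(adjacency, seeds={"disease_X"})
drug_ranking = sorted(
    (n for n in scores if n.startswith("drug_")), key=scores.get, reverse=True
)
print(drug_ranking)
```

A higher `restart` value keeps the walk close to the seed and so examines a single region of the network, while a lower value lets scores spread across the network in its entirety.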
Nevertheless, to date only a few machine learning approaches have been applied in
clinical trials. This is mainly due to the difficulties faced when filtering the huge
amounts of data involved. Moreover, data-filtering procedures are sometimes not
systematic, which limits their possible uses. However, machine learning offers very attractive
benefits compared to clinical trials, as it can save researchers much time and money.
Extracted from:
1. Rethinking Drug Repositioning and Development with Artificial Intelligence, Machine
Learning, and Omics [Internet]. [cited 17 November 2021]. Available from:
https://www.liebertpub.com/doi/epdf/10.1089/omi.2019.0151