A proposal for distributed processing for large scale virtual screening using Python


Bioinformatics can be defined as the use of computational tools to study biological problems, drawing on several areas of knowledge such as biochemistry and computer science. Within this field, virtual screening is a standard step that precedes laboratory experiments in the discovery of new drugs. The technique involves calculating the estimated affinities (molecular docking) and the plausible binding modes of many drug candidates. The main focus of this work is the use of AutoDock Vina, a tool for molecular docking and virtual screening, together with a distributed implementation written in the Python programming language. The experiments were configured with VS Framework, a tool for preparing virtual screening that lets the user set up an experiment in a simplified way through a web platform. Jobs were executed with RPyC (Remote Python Call), a Python library for symmetric remote procedure calls; in essence, RPyC uses the Remote Procedure Call (RPC) protocol for a client-server implementation. The distributed virtual screening architecture proposed in this work is defined in 5 steps. In Step 1 (configuration of the experiment), VS Framework is used for the initial configuration of the experiment; the remaining steps are automated by a Python script. In Step 2 (preparing files), all virtual screening inputs are converted with AutoDock Tools to the default formats used by the rest of the run. In Step 3 (upload the files), the converted files are sent to the computers that take part in the distributed execution. In Step 4 (perform virtual screening) and Step 5 (download the results), the virtual screening is processed and the results are returned to the computer that started the experiment. The results were compared across four combinations of simplified distributed computing infrastructures, spread across 5 virtual machines. Processing each receptor-ligand pair in a distributed way performed better as the number of computers in the infrastructure increased. Finally, the analysis showed that these platforms are suitable for virtual screening runs with different receptor/ligand sizes. These findings may guide scientists in choosing the best computing infrastructure for their large-scale experiments.
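
As an illustration of how Steps 3 to 5 might be automated, the following is a minimal Python sketch and not the script used in this work. It assumes that each worker machine runs an RPyC classic-mode server on its default port (18812) and has a `vina` executable on its PATH. The function names (`build_vina_command`, `assign_jobs`, `run_remote`), the host names, the file names, and the round-robin assignment are all hypothetical choices made for the example.

```python
import itertools
import shlex
from collections import defaultdict


def build_vina_command(receptor, ligand, config, out):
    """Return the AutoDock Vina command line for one receptor-ligand pair."""
    return ["vina", "--receptor", receptor, "--ligand", ligand,
            "--config", config, "--out", out]


def assign_jobs(receptors, ligands, hosts):
    """Spread every receptor-ligand pair over the hosts in round-robin order.

    Round-robin is an assumption of this sketch; the original work does not
    state how jobs were assigned to machines.
    """
    plan = defaultdict(list)
    pairs = itertools.product(receptors, ligands)
    for (receptor, ligand), host in zip(pairs, itertools.cycle(hosts)):
        plan[host].append((receptor, ligand))
    return dict(plan)


def run_remote(host, command, port=18812):
    """Run one command on a worker through an RPyC classic-mode server.

    Assumes the worker was started with RPyC's classic server and that the
    input files are already present there. Returns the exit status reported
    by the remote os.system call.
    """
    import rpyc  # imported lazily so the planning helpers work without RPyC

    conn = rpyc.classic.connect(host, port=port)
    try:
        # conn.modules exposes the worker's own modules, so os.system
        # executes on the remote machine, not on the client.
        return conn.modules.os.system(shlex.join(command))
    finally:
        conn.close()


if __name__ == "__main__":
    # Hypothetical inputs: one receptor, three ligands, two virtual machines.
    plan = assign_jobs(["receptor.pdbqt"],
                       ["lig_a.pdbqt", "lig_b.pdbqt", "lig_c.pdbqt"],
                       ["vm1", "vm2"])
    for host, jobs in plan.items():
        for receptor, ligand in jobs:
            cmd = build_vina_command(receptor, ligand, "conf.txt",
                                     "out_" + ligand)
            print(host, shlex.join(cmd))
```

Run as a script, the sketch only prints the planned commands per host. Calling `run_remote(host, cmd)` in place of `print` would dispatch each job to its worker, provided the RPyC servers are reachable.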

Keywords: Virtual screening; AutoDock; Python; Distributed processing