Surveying Alignment-free Features for Ortholog Detection in Related Yeast Proteomes by using Supervised Big Data Classifiers

Deborah Galpert Cañizares; Alberto Fernández; Francisco Herrera; Agostinho Antunes; Reinaldo Molina Ruiz; Guillermin Agüero-Chapin

doi:10.3390/mol2net-04-05468

Deborah Galpert Cañizares ¹, Alberto Fernández ², Francisco Herrera ², Agostinho Antunes ³,⁴, Reinaldo Molina Ruiz ⁵, Guillermin Agüero-Chapin *,⁴,⁶

¹ Departamento de Ciencias de la Computación, Universidad Central “Marta Abreu” de Las Villas (UCLV), Santa Clara, 54830, Cuba
² Department of Computer Science and Artificial Intelligence, Research Center on Information and Communications Technology (CITIC-UGR), University of Granada, 18071 Granada, Spain
³ CIMAR/CIIMAR, Centro Interdisciplinar de Investigação Marinha e Ambiental, Universidade do Porto, Rua dos Bragas, 177, 4050-123 Porto, Portugal
⁴ Departamento de Biologia, Faculdade de Ciências, Universidade do Porto, Rua do Campo Alegre, 4169-007 Porto, Portugal
⁵ Centro de Bioactivos Químicos, Universidad Central “Marta Abreu” de Las Villas (UCLV), Santa Clara, 54830, Cuba
⁶ CIMAR/CIIMAR, Centro Interdisciplinar de Investigação Marinha e Ambiental, Universidade do Porto, Terminal de Cruzeiros do Porto de Leixões, Av. General Norton de Matos s/n 4450-208 Matosinhos, Porto, Portugal.

Published: 01 August 2018 by MDPI in MOL2NET'18, Conference on Molecular, Biomed., Comput. & Network Science and Engineering, 4th ed. congress USEDAT-04: USA-Europe Data Analysis Training Program Workshop, Cambridge, UK-Bilbao, Spain-Miami, USA, 2018

https://doi.org/10.3390/mol2net-04-05468

Abstract:

Methods for pairwise ortholog detection (POD) usually rely on alignment-based (AB) similarity measures. However, AB algorithms remain limited for POD because they may fail in the presence of certain evolutionary and genetic events. POD is therefore an open field in bioinformatics that demands either constant improvement of existing methods or new algorithms that scale effectively to Big Data.

In a previous paper, we developed a Big Data supervised POD approach that considered several AB pairwise gene features and the low ortholog pair ratios found between two proteomes (Galpert, del Río et al. 2015). Although our supervised POD models achieved higher sensitivity than classical POD methodologies when comparatively evaluated on the Saccharomycete yeast benchmark dataset (Salichos and Rokas 2011), they were implemented in the MapReduce framework and tested on only a single yeast genome pair.

In Galpert, Fernández et al. (2018) (https://doi.org/10.1186/s12859-018-2148-8), we propose improvements to our supervised POD approach by (i) surveying the incorporation of alignment-free pairwise similarity measures, (ii) evaluating other classifiers under the Big Data Spark platform, and (iii) extending the test set to other related Saccharomycete yeast proteomes.
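As an illustration of what an alignment-free pairwise similarity measure can look like, the following minimal Python sketch compares two protein sequences by the cosine similarity of their k-mer count profiles. This is a generic textbook measure chosen for illustration only; the abstract does not state which alignment-free features were surveyed, and the function names (`kmer_counts`, `cosine_similarity`, `alignment_free_similarity`) and the default `k=2` are assumptions made here, not part of the published method.

```python
from collections import Counter
from math import sqrt


def kmer_counts(seq, k=2):
    """Count overlapping k-mers in a sequence (hypothetical helper)."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def cosine_similarity(a, b):
    """Cosine similarity between two sparse count profiles."""
    # Only k-mers present in both profiles contribute to the dot product.
    dot = sum(a[key] * b[key] for key in a.keys() & b.keys())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        # A sequence shorter than k has an empty profile.
        return 0.0
    return dot / (norm_a * norm_b)


def alignment_free_similarity(seq_a, seq_b, k=2):
    """Similarity in [0, 1] computed without aligning the sequences."""
    return cosine_similarity(kmer_counts(seq_a, k), kmer_counts(seq_b, k))
```

A score of this kind could serve as one pairwise feature, alongside AB features, in the input to a supervised classifier; how the authors actually combine features is described in the cited paper rather than here.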

Keywords: Pairwise ortholog detection; Alignment-free similarity measures; Big data; Supervised classification; Yeast
