Using the RRegrs R package for automating predictive modelling

¹ School of Chemical Engineering, National Technical University of Athens, 15780, Greece
² RNASA-IMEDIR Group, Computer Science Faculty, University of A Coruña, 15071 A Coruña, Spain
³ Stanford Cancer Institute, Stanford University, C. J. Huang Building, 780 Welch Road, Palo Alto, CA
⁴ Department of Bioinformatics‑BiGCaT, NUTRIM, Maastricht University, P.O. Box 616, UNS50 Box 19, 6200 MD Maastricht, The Netherlands

Published: 04 December 2015 by MDPI in MOL2NET'15, Conference on Molecular, Biomed., Comput. & Network Science and Engineering, 1st ed. congress CHEMBIO.INFO-01: Cheminfo., Chemom., Comput. Quantum Chem. & Bioinfo. Congress, Cambridge, UK-Chapel Hill and Richmond, USA, 2015

https://doi.org/10.3390/MOL2NET-1-F009

Abstract:

Cheminformatics and bioinformatics make extensive use of predictive modelling and need standardized methodologies for data splitting, cross-validation, best-model criteria and Y-randomization. RRegrs is a new R package, available at https://www.github.com/enanomapper/RRegrs (0.05 release), that provides an integrated framework to assist model selection and speed up predictive model development. The tool offers a fully validated scheme that employs repeated 10-fold and leave-one-out cross-validation for ten linear and non-linear regression methods. Standardized reports are produced to compare the output of the modelling algorithms and to assess cross-validation results for selected models. Here, we demonstrate the capabilities of RRegrs in terms of performance using five well-established data sets.
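The validation ideas named in the abstract (repeated 10-fold cross-validation, leave-one-out cross-validation and Y-randomization) can be illustrated with a minimal sketch. This is not RRegrs code and does not use the RRegrs API: it is written in plain Python for illustration, fits a single-descriptor least-squares line instead of the ten regression methods, and runs on synthetic data. All function names (`fit_line`, `kfold_indices`, `cv_q2`, `y_randomization`) are hypothetical, and the cross-validated R² shown here is only one common definition.

```python
import random
import statistics


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x (one descriptor only)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def kfold_indices(n, k, rng):
    """Shuffle 0..n-1 and deal the indices into k folds of near-equal size."""
    idx = list(range(n))
    rng.shuffle(idx)
    return [idx[i::k] for i in range(k)]


def cv_q2(xs, ys, k, repeats, rng):
    """Cross-validated R^2 (Q^2), averaged over `repeats` random k-fold splits."""
    n = len(xs)
    my = statistics.fmean(ys)
    ss_tot = sum((y - my) ** 2 for y in ys)
    scores = []
    for _ in range(repeats):
        press = 0.0  # sum of squared errors on held-out samples
        for fold in kfold_indices(n, k, rng):
            held = set(fold)
            train = [i for i in range(n) if i not in held]
            a, b = fit_line([xs[i] for i in train], [ys[i] for i in train])
            press += sum((ys[i] - (a + b * xs[i])) ** 2 for i in fold)
        scores.append(1.0 - press / ss_tot)
    return statistics.fmean(scores)


def y_randomization(xs, ys, k, repeats, n_perm, rng):
    """Q^2 values obtained after shuffling the response n_perm times."""
    results = []
    for _ in range(n_perm):
        shuffled = ys[:]
        rng.shuffle(shuffled)
        results.append(cv_q2(xs, shuffled, k, repeats, rng))
    return results


# Synthetic data (an assumption for illustration): a noisy straight line.
rng = random.Random(0)
xs = [i / 10 for i in range(50)]
ys = [2.0 + 1.5 * x + rng.gauss(0, 0.5) for x in xs]

q2_10fold = cv_q2(xs, ys, k=10, repeats=10, rng=rng)
q2_loo = cv_q2(xs, ys, k=len(xs), repeats=1, rng=rng)  # LOO is n-fold CV
q2_random = y_randomization(xs, ys, k=10, repeats=1, n_perm=20, rng=rng)

print("repeated 10-fold Q2:", round(q2_10fold, 3))
print("leave-one-out Q2:", round(q2_loo, 3))
print("best Q2 after Y-randomization:", round(max(q2_random), 3))
```

If the model captures a real relationship, the cross-validated scores on the original response should be clearly higher than any score obtained after shuffling the response; similar scores in both cases would suggest chance correlation.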

Keywords: Multiple regression; QSAR; cross-validation; model selection

Comments on this paper

Humbert G. Díaz

8 December 2015

Are you planning to release a version that does not require installing R?

Dear authors,

Thank you for your strong support of the conference.

Are you planning to release a user-friendly version for experimental scientists?
That is, a version with a user interface that does not require installing the R package, etc.?

Once again, thank you for submitting this very interesting contribution and for your support of mol2net!
Please also take part in the conference by signing up or logging in and posting scientific questions and comments on other papers.
Link to other works: https://sciforum.net/conference/MOL2NET-1/page/allcontributions

Georgia Tsiliki

8 December 2015

Hello,
Thanks for your comment. Currently we are focusing on polishing the code and possibly including extra models in RRegrs. Following your argument, though, I think it is useful to point out that RRegrs already offers a fairly user-friendly automated process for all models and cross-validation options. A single function call is needed, and the output is summarized in well-structured Excel sheets. So a user who is not comfortable with model fitting practically only needs to supply the data.

Georgia Tsiliki

Cristian Munteanu

Jose Seoane

Carlos Fernandez-Lozano

Haralambos Sarimveis

Egon Willighagen