Cheminformatics and bioinformatics make extensive use of predictive modelling and need standardized methodologies for data splitting, cross-validation, best-model criteria and Y-randomization. RRegrs is a new R package, available at https://www.github.com/enanomapper/RRegrs (release 0.05), that offers an integrated framework to assist model selection and speed up the development of predictive models. The tool implements a fully validated scheme that applies repeated 10-fold and leave-one-out cross-validation to ten linear and non-linear regression methods. It produces standardized reports that compare the output of the modelling algorithms and assess the cross-validation results for selected models. Here, we demonstrate the capabilities of RRegrs in terms of performance on five well-established data sets.
Using the RRegrs R package for automating predictive modelling
Published: 04 December 2015 by MDPI AG in MOL2NET, International Conference on Multidisciplinary Sciences, session Scientific Software
Keywords: Multiple regression; QSAR; cross-validation; model selection
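
The validation steps named in the abstract (repeated 10-fold cross-validation, leave-one-out cross-validation and Y-randomization) can be illustrated with a minimal sketch. It is written in plain Python rather than R and does not use or reproduce the RRegrs API. The function names, the single-descriptor least-squares model and the synthetic data are all assumptions made for illustration only.

```python
import random
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for one descriptor: y = a + b * x."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - b * mx, b


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    my = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - my) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


def repeated_kfold_q2(xs, ys, k=10, repeats=5, seed=0):
    """Cross-validated R^2 averaged over `repeats` shuffled k-fold splits.

    Setting k = len(xs) gives leave-one-out cross-validation.
    """
    rng = random.Random(seed)
    n = len(xs)
    scores = []
    for _ in range(repeats):
        idx = list(range(n))
        rng.shuffle(idx)
        preds = [0.0] * n
        for f in range(k):
            test = idx[f::k]  # every k-th shuffled index forms one fold
            held_out = set(test)
            train = [i for i in idx if i not in held_out]
            a, b = fit_line([xs[i] for i in train], [ys[i] for i in train])
            for i in test:
                preds[i] = a + b * xs[i]  # predict only on held-out points
        scores.append(r_squared(ys, preds))
    return mean(scores)


def y_randomization(xs, ys, k=10, n_perm=20, seed=1):
    """Mean cross-validated R^2 after shuffling the response values.

    A model that captures real structure should score far lower here
    than on the unshuffled response.
    """
    rng = random.Random(seed)
    scores = []
    for _ in range(n_perm):
        shuffled = ys[:]
        rng.shuffle(shuffled)
        scores.append(
            repeated_kfold_q2(xs, shuffled, k=k, repeats=1,
                              seed=rng.randrange(10**6))
        )
    return mean(scores)


def make_demo_data(n=50, seed=42):
    """Synthetic data (hypothetical): y = 2 + 3x plus Gaussian noise."""
    rng = random.Random(seed)
    xs = [i / 10 for i in range(n)]
    ys = [2 + 3 * x + rng.gauss(0, 0.5) for x in xs]
    return xs, ys


if __name__ == "__main__":
    xs, ys = make_demo_data()
    print("repeated 10-fold Q2:", repeated_kfold_q2(xs, ys))
    print("leave-one-out Q2:   ", repeated_kfold_q2(xs, ys, k=len(xs), repeats=1))
    print("Y-randomized Q2:    ", y_randomization(xs, ys))
```

On data with a genuine linear signal, one would expect the two cross-validated scores to be high and the Y-randomized score to fall near or below zero. A large gap between them is the usual evidence that a model is not fitting chance correlations.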