Efficient genomic selection using ensemble learning and ensemble feature reduction

Banerjee, Rohan and Marathi, Balram and Singh, Manish (2020) Efficient genomic selection using ensemble learning and ensemble feature reduction. Journal of Crop Science and Biotechnology, 23 (4). pp. 311-323. ISSN 1975-9479

Full text not available from this repository.


Genomic selection (GS) is a popular breeding method that uses genome-wide markers to predict plant phenotypes. Empirical studies and simulations have shown that GS can greatly accelerate the breeding cycle, beyond what is possible with traditional quantitative trait locus (QTL) approaches. GS is a regression problem in which single-nucleotide polymorphisms (SNPs) are typically used to predict phenotypes. Because SNP data are extremely high-dimensional, on the order of 100K dimensions, it is difficult to make accurate phenotypic predictions. Moreover, finding the optimal prediction model is computationally very costly. Out of thousands of SNPs, usually only a few influence a particular phenotypic trait. We first show that ensemble-based regression techniques give better prediction accuracy than the traditional regression methods used in existing papers. We then further improve the prediction accuracy by using an ensemble of feature selection and feature extraction techniques, which also reduces the time needed to compute the regression model parameters. We predict three traits (grain yield, time to 50% flowering, and plant height), for which the existing methods give accuracies of 0.304, 0.627 and 0.341, respectively. Our proposed regression model gives accuracies of 0.330, 0.674 and 0.458 for these traits. We also propose a computationally efficient regression model that reduces the computation time by as much as 90% and gives accuracies of 0.342, 0.580 and 0.411, respectively.

IITH Creators (ORCiD):
Banerjee, Rohan: UNSPECIFIED
Marathi, Balram: UNSPECIFIED
Singh, Manish: http://orcid.org/0000-0001-5787-1833
Item Type: Article
Uncontrolled Keywords: Dimensionality reduction; Genomic selection; Machine learning; Rice
Subjects: Computer science
Divisions: Department of Computer Science & Engineering
Depositing User: . LibTrainee 2021
Date Deposited: 09 Jul 2021 10:29
Last Modified: 09 Jul 2021 10:29
URI: http://raiith.iith.ac.in/id/eprint/8200
Publisher URL: http://doi.org/10.1007/s12892-020-00039-4
OA policy: https://v2.sherpa.ac.uk/id/publication/14385
