Application of Machine Learning Algorithms in Plant Breeding: Predicting Yield From Hyperspectral Reflectance in Soybean
- PMID: 33510761
- PMCID: PMC7835636
- DOI: 10.3389/fpls.2020.624273
Abstract
Recent substantial advances in high-throughput field phenotyping have given plant breeders affordable and efficient tools for evaluating large numbers of genotypes for important agronomic traits at early growth stages. Nevertheless, using the large datasets generated by high-throughput phenotyping tools, such as hyperspectral reflectance, in cultivar development programs remains challenging because it requires substantial expertise in computational and statistical analysis. In this study, the robustness of three common machine learning (ML) algorithms, multilayer perceptron (MLP), support vector machine (SVM), and random forest (RF), was evaluated for predicting soybean (Glycine max) seed yield from hyperspectral reflectance. To this end, hyperspectral reflectance data covering the full spectrum from 395 to 1005 nm were collected at the R4 and R5 growth stages on 250 soybean genotypes grown in four environments. Recursive feature elimination (RFE) was used to reduce the dimensionality of the hyperspectral reflectance data and to select the variables with the largest importance values. The results indicated that R5 is the more informative stage for measuring hyperspectral reflectance to predict seed yield. The 395 nm reflectance band was also identified as a high-ranking band for predicting soybean seed yield. Using either the full or the selected set of input variables, the ML algorithms were evaluated both individually and in combination, through the ensemble-stacking (E-S) method, for predicting soybean yield. Among the individual algorithms tested, RF performed best, with a yield classification accuracy of 84%. Therefore, with RF selected as the meta-classifier for the E-S method, prediction accuracy increased to 93% using all variables and 87% using the selected variables, showing the success of E-S as an ensemble technique. This study demonstrated that soybean breeders could apply the E-S algorithm to either the full or the selected reflectance spectra to identify high-yielding soybean genotypes, among a large number of genotypes, at early growth stages.
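The workflow summarized in the abstract (RFE for band selection, then a stacked ensemble of MLP, SVM, and RF with RF as the meta-classifier) can be sketched roughly as follows. This is a minimal illustration using scikit-learn on synthetic data, not the authors' implementation: the dataset, the number of bands, the number of selected features, the binary yield classes, and all hyperparameters are assumptions chosen only to keep the example small.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.feature_selection import RFE
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# ASSUMPTION: synthetic stand-in for reflectance data. Rows play the role of
# plots, columns the role of spectral bands, and y a binary yield class.
X, y = make_classification(
    n_samples=240, n_features=40, n_informative=8, n_redundant=10, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Recursive feature elimination: repeatedly fit a random forest and drop the
# 5 least important bands until 10 remain (both numbers are illustrative).
selector = RFE(
    RandomForestClassifier(n_estimators=50, random_state=0),
    n_features_to_select=10,
    step=5,
)
selector.fit(X_train, y_train)
X_train_sel = selector.transform(X_train)
X_test_sel = selector.transform(X_test)


def build_stack():
    """Stacked ensemble: MLP, SVM, and RF base learners, RF meta-classifier."""
    base_learners = [
        ("mlp", make_pipeline(
            StandardScaler(),
            MLPClassifier(hidden_layer_sizes=(16,), max_iter=500, random_state=0),
        )),
        ("svm", make_pipeline(StandardScaler(), SVC(random_state=0))),
        ("rf", RandomForestClassifier(n_estimators=50, random_state=0)),
    ]
    return StackingClassifier(
        estimators=base_learners,
        final_estimator=RandomForestClassifier(n_estimators=50, random_state=0),
        cv=3,  # out-of-fold base predictions are used to train the meta-classifier
    )


# Compare the ensemble on all bands versus the RFE-selected bands.
acc_full = build_stack().fit(X_train, y_train).score(X_test, y_test)
acc_sel = build_stack().fit(X_train_sel, y_train).score(X_test_sel, y_test)
print(f"stacking accuracy, all bands:      {acc_full:.2f}")
print(f"stacking accuracy, selected bands: {acc_sel:.2f}")
```

Accuracies from this synthetic example say nothing about the values reported in the study; the sketch only shows how the two input settings (full versus selected variables) could be compared under the same ensemble.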
Keywords: artificial intelligence; data-driven model; ensemble methods; high-throughput phenotyping; random forest; recursive feature elimination.
Copyright © 2021 Yoosefzadeh-Najafabadi, Earl, Tulpan, Sulik and Eskandari.
Conflict of interest statement
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.