
Maize Genomics and Genetics 2025, Vol.16, No.3, 139-148
http://cropscipublisher.com/index.php/mgg

learning (GS-ML) framework, focused on methodological progress, and emphasized the importance of these integrated methods for accelerating the breeding of drought-tolerant maize varieties and ensuring sustainable crop production under climate change. The aim is to support the development of drought-tolerant maize varieties and to help agricultural production cope with the challenges of climate change.

2 Progress in Drought-Tolerance-Oriented Genomic Prediction

2.1 Advances in genetic mapping for drought traits
To understand how maize tolerates drought, researchers have relied on two common approaches: genome-wide association studies (GWAS) and quantitative trait locus (QTL) mapping. Both have identified many loci associated with drought tolerance. Improved GWAS models have also detected hundreds of quantitative trait nucleotides (QTNs) associated with grain yield and flowering time, many of which are linked to transcription factors such as AP2-EREBP and TCP (Li et al., 2016; Yuan et al., 2019; Zhang et al., 2023; Amadu et al., 2025). More recently, researchers have combined high-throughput phenotyping with GWAS and identified thousands of loci associated with drought-tolerance traits. These results have deepened our understanding of how maize responds to drought (Wu et al., 2021; Li et al., 2024).

2.2 Molecular markers used in drought tolerance selection
Breeding programs for drought-tolerant maize use several types of molecular markers, including SNP, SilicoDArT, RFLP, SSR and AFLP (Hao et al., 2011; Zhang et al., 2022; Chen et al., 2024). SNP markers are the most widely used because they are abundant and highly informative. They help identify useful genetic variants and provide a reliable basis for selection (Wang et al., 2019).
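As a rough illustration of how SNP markers can be related to a drought trait, the sketch below codes each SNP as an allele dosage (0, 1 or 2 copies) and regresses a phenotype on one marker at a time. It is a deliberately simplified stand-in for the GWAS models cited above, which additionally account for population structure and kinship. The marker names, genotypes and yield values are invented for illustration, and `marker_effect` is a hypothetical helper, not a function from any published pipeline.

```python
from statistics import mean

def marker_effect(dosages, phenotypes):
    """Least-squares slope and r^2 of a phenotype on allele dosage (0, 1, 2)."""
    x_bar, y_bar = mean(dosages), mean(phenotypes)
    sxx = sum((x - x_bar) ** 2 for x in dosages)
    syy = sum((y - y_bar) ** 2 for y in phenotypes)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(dosages, phenotypes))
    if sxx == 0 or syy == 0:
        return 0.0, 0.0  # monomorphic marker or constant trait: no signal
    return sxy / sxx, (sxy ** 2) / (sxx * syy)

# Invented toy data: 8 inbred lines, 3 SNPs, grain yield under drought (t/ha).
genotypes = {
    "snp_A": [0, 0, 1, 1, 2, 2, 1, 0],
    "snp_B": [1, 0, 2, 1, 0, 2, 1, 1],
    "snp_C": [0, 0, 0, 0, 0, 0, 0, 0],  # monomorphic, carries no information
}
yield_t_ha = [3.1, 3.0, 3.9, 4.1, 5.0, 4.8, 4.0, 3.2]

# Test every marker separately, then rank markers by variance explained.
results = {name: marker_effect(d, yield_t_ha) for name, d in genotypes.items()}
ranked = sorted(results, key=lambda name: results[name][1], reverse=True)

for name in ranked:
    slope, r2 = results[name]
    print(f"{name}: effect per allele copy = {slope:.2f} t/ha, r^2 = {r2:.2f}")
```

In this toy set only `snp_A` tracks the phenotype closely. A real analysis would test each marker for statistical significance and correct for multiple testing rather than rank markers by r^2 alone.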
Some studies have also combined QTL analysis with transcriptome data to narrow the candidate intervals for drought-related genes, which makes breeding targets clearer (Marino et al., 2009; Li et al., 2024).

2.3 Limitations of traditional genomic prediction methods
Conventional genomic prediction methods such as RR-BLUP achieve only moderate accuracy when predicting drought tolerance. The trait is inherently complex: it is controlled by many genes and is strongly influenced by the environment (Amadu et al., 2025). Moreover, these methods usually cannot capture interactions between genes accurately, and they have difficulty handling variation caused by environmental change (Dias et al., 2018; Zhang et al., 2022). Adding trait-associated markers or modeling genotype-by-environment interaction helps, but the improvement remains limited. Improving prediction accuracy, especially for regions where drought is more severe, will require more powerful algorithms and more advanced models.

3 Machine Learning Approaches for Yield Prediction

3.1 Typical ML models used in crop prediction (RF, XGBoost, ANN)
Three machine learning methods are commonly used to predict crop yield: random forest (RF), extreme gradient boosting (XGBoost) and artificial neural networks (ANN). Many studies have found that XGBoost and RF usually outperform other models. XGBoost in particular often gives higher R² values, meaning that it explains more of the variation in yield and leaves smaller prediction errors (Dhaliwal et al., 2022; Shawon et al., 2023; Gharakhanlou and Perez, 2024). Artificial neural networks are also popular, especially when complex relationships need to be modeled. However, they require more data and computing resources (Van Klompenburg et al., 2020; Malphedwar et al., 2024).
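To make the R² comparison concrete, the following minimal sketch computes R² (as 1 minus the ratio of residual to total sum of squares) and the root mean squared error for two sets of predictions. The observed yields and both prediction vectors are invented, and `model_a` and `model_b` are hypothetical placeholders rather than results from any of the cited studies.

```python
from math import sqrt

def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

def rmse(observed, predicted):
    """Root mean squared error, in the same units as the observations."""
    n = len(observed)
    return sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)

# Invented yields (t/ha) and predictions from two hypothetical models.
observed = [4.0, 5.0, 6.0, 7.0, 8.0]
model_a = [4.2, 4.9, 6.1, 6.8, 8.1]
model_b = [5.0, 5.5, 6.0, 6.5, 7.0]

for name, pred in (("model_a", model_a), ("model_b", model_b)):
    print(f"{name}: R2 = {r_squared(observed, pred):.3f}, "
          f"RMSE = {rmse(observed, pred):.3f} t/ha")
```

On a fixed test set the two metrics always rank models the same way, because both depend on the same sum of squared errors. Both should be computed on data the model did not see during training, since values computed on training data can overstate accuracy.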
Combining several models, for example in a hybrid or ensemble model, can sometimes improve prediction accuracy further (Oikonomidis et al., 2022).

3.2 Data normalization and overfitting prevention
Data preprocessing is an important step before modeling. Standardizing values or engineering features, for example, makes training more stable and helps the model learn faster (Abbasi et al., 2025). The raw data can first be scaled so that all variables fall within a similar range, and derived indicators such as a "soil fertility index" can be added. Weather, soil and field management data can also be combined (Nossam et al., 2024). In order to prevent the model from "remembering too much", which is the
