Maize Genomics and Genetics 2025, Vol.16, No.5, 239-250 http://cropscipublisher.com/index.php/mgg 243

4.2 Integration strategies
Integrating GS with ML is now common, and the range of practices keeps growing. Some studies first preprocess all data in a uniform way and then feed genomic, phenotypic, and environmental data together into models such as random forests or deep neural networks. This approach is often called "multimodal fusion", and its aim is to improve overall prediction performance. Other teams adopt transfer learning: they first train models on datasets of yield-related traits and then fine-tune them on the target data, as the TrG2P framework does. The advantage of this strategy is that it can exploit data that are not direct yield measurements, which helps the model learn useful information more quickly. Researchers who prioritize model interpretability tend to use deep learning models with attention mechanisms, which show which variables have the greatest impact (Togninalli et al., 2023). Another approach is to start with variable screening. Feature selection techniques pick out highly informative SNPs or environmental factors in advance, which both reduces dimensionality and lowers the risk of overfitting (Bayer et al., 2021; Sirsat et al., 2022). Which features are useful varies from dataset to dataset, however, and there is no universal answer.

4.3 Current research trends
Many current studies combine GS with ML to improve the accuracy of yield prediction for different crops across environments. Compared with traditional methods, multi-omics and multimodal machine learning models perform more stably overall, and their advantages are most apparent when high-throughput phenotypic and environmental data are considered simultaneously.
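
The "process uniformly, then fuse" and variable-screening ideas described in Section 4.2 can be illustrated with a minimal sketch. It is not the method of any cited study: the function names (`zscore`, `pearson`, `fuse_and_screen`), the toy data, and the choice of absolute Pearson correlation as the screening criterion are all assumptions made for illustration. The studies cited above use random forests or deep networks on the fused features, a step that is omitted here.

```python
from math import sqrt


def zscore(column):
    """Standardize one feature column to mean 0 and SD 1 (constant columns become zeros)."""
    n = len(column)
    mean = sum(column) / n
    sd = sqrt(sum((v - mean) ** 2 for v in column) / n)
    if sd == 0:
        return [0.0] * n
    return [(v - mean) / sd for v in column]


def pearson(x, y):
    """Pearson correlation; returns 0.0 when either input has no variance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def fuse_and_screen(modalities, target, top_k):
    """Standardize every feature, merge all modalities into one table,
    and keep the top_k features ranked by |correlation| with the target."""
    fused = {}
    for mod_name, features in modalities.items():
        for feat_name, column in features.items():
            fused[f"{mod_name}:{feat_name}"] = zscore(column)
    ranked = sorted(fused, key=lambda name: abs(pearson(fused[name], target)), reverse=True)
    return {name: fused[name] for name in ranked[:top_k]}


# Hypothetical toy data: six maize lines, three data modalities.
modalities = {
    "geno": {"snp1": [0, 1, 2, 1, 0, 2], "snp2": [1, 1, 1, 1, 1, 1]},
    "pheno": {"spad": [40.0, 45.0, 52.0, 47.0, 41.0, 55.0]},
    "env": {"rain_mm": [300.0, 320.0, 310.0, 305.0, 315.0, 300.0]},
}
yield_t_ha = [5.1, 6.0, 7.4, 6.3, 5.2, 7.9]

selected = fuse_and_screen(modalities, yield_t_ha, top_k=2)
print(sorted(selected))
```

In a real analysis the screening step should be run inside each cross-validation fold, because ranking features against yield on the full dataset leaks information from the test lines into the model.
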
Transfer learning and deep learning frameworks are receiving increasing attention because they can use data with complex structures and information from related traits. These methods have recently been reported to increase the accuracy of corn yield prediction by approximately 6.8% (Li et al., 2024), a meaningful gain. Several less visible issues remain, however. Will an accurate model still perform well in other environments? How can different types of data be harmonized? Do breeders find the tools easy to use? All of these call for dedicated tools and platforms, in addition to supporting infrastructure such as policies, data platforms, and hardware. If these links lag behind, applying the combined approach to large-scale breeding programs may still involve many detours.

5 Drought-Specific Modeling Approaches
5.1 Incorporating drought response traits
Not all yield prediction models take drought resistance traits into account, but doing so has become increasingly common. Physiological or agronomic indicators that directly reflect the plant's response to water stress, such as SPAD (relative chlorophyll content), LAI (leaf area index), flowering and silking time, and stress tolerance indices (STI, DTI), are often treated as important variables and incorporated into statistical or machine learning models. The combination of SPAD and LAI works particularly well: some studies have observed that their correlation with yield is strongest at the VT (tasseling) stage of corn (Szeles et al., 2023). This does not mean that other stages are unimportant, only that the strength of the correlation may change across stages.
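
To make the idea of a stress index concrete, the sketch below computes STI for a few genotypes. The text does not give a formula, so this assumes STI refers to the widely used stress tolerance index commonly attributed to Fernandez (1992), STI_i = (Yp_i × Ys_i) / (mean Yp)², where Yp and Ys are yields under non-stress and drought conditions. The function name, genotype labels, and yield values are hypothetical.

```python
def stress_tolerance_index(yield_potential, yield_stress):
    """Return STI per genotype, assuming STI_i = (Yp_i * Ys_i) / mean(Yp)^2.

    yield_potential: dict genotype -> yield under well-watered conditions
    yield_stress:    dict genotype -> yield under drought stress
    """
    if yield_potential.keys() != yield_stress.keys():
        raise ValueError("both conditions must cover the same genotypes")
    mean_yp = sum(yield_potential.values()) / len(yield_potential)
    return {
        g: yield_potential[g] * yield_stress[g] / mean_yp ** 2
        for g in yield_potential
    }


# Hypothetical yields (t/ha) for three hybrids.
yp = {"H1": 10.0, "H2": 8.0, "H3": 12.0}
ys = {"H1": 6.0, "H2": 7.0, "H3": 4.0}

sti = stress_tolerance_index(yp, ys)
ranking = sorted(sti, key=sti.get, reverse=True)  # highest STI first
print(ranking)
```

A value computed this way could then enter a prediction model as one more covariate alongside SPAD or LAI. Note that H3 has the highest yield potential but ranks last here, because the index rewards genotypes that yield well under both conditions.
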
In breeding practice, some studies have gone further, introducing a multi-trait index and a "genotype-ideotype distance" calculation to select more drought-resistant materials (Kumar et al., 2022). In addition, incorporating marker-trait association information related to drought resistance into genomic prediction models has improved the accuracy of yield prediction under stress conditions, especially for complex traits under polygenic control.

5.2 Environment-specific modeling
Drought differs from year to year and from place to place, which poses a challenge for modeling. Environment-specific modeling aims to address this by making the prediction framework more fine-grained, accounting for temporal and spatial variation in drought occurrence. Remote sensing and process-based indicators such as solar-induced chlorophyll fluorescence (SIF), simulated soil moisture, and the cumulative drought index (CDI) have been widely used in recent years to describe drought conditions. Adding these variables to the model can significantly improve prediction accuracy, especially in years with particularly severe