LGG_2025v16n2

Legume Genomics and Genetics 2025, Vol.16, No.2, 91-99 http://cropscipublisher.com/index.php/lgg

and evaluated the effect of integrating phenotypic and genomic data for trait prediction, together with its practical significance for soybean breeding programs.

This research will promote the modernization and precision of soybean breeding. By enhancing the efficiency and accuracy of trait prediction, machine learning methods are expected to accelerate the breeding of high-yield, high-protein soybean varieties, thereby contributing to sustainable agriculture and to global food security goals.

2 Genetic Basis of Soybean Yield and Protein Traits
2.1 Major and minor QTLs associated with yield and quality in soybean
Neither yield nor protein content is determined by a single gene. Each is usually controlled by a group of QTLs acting together, some with large effects and some with small ones (Figure 1). Years of mapping and GWAS studies have identified many key loci. Some QTLs have obvious effects, while many others are minor-effect QTLs with long-term effects (Diers et al., 2018). For instance, major QTLs related to protein content are often located on chromosomes 20 and 15. QTLs that regulate yield and quality tend to cluster in the same regions. This overlap is not accidental and is likely related to pleiotropy or close linkage (Tayade et al., 2023). The POWR1 gene is a case in point: it is the key gene within a QTL on chromosome 20, regulating protein content while also influencing oil content and yield (Goettel et al., 2022). Similar candidate genes are still being discovered, adding new tools to the breeding toolbox (Doszhanova et al., 2024; Dong et al., 2025).

Figure 1 Composition of stored mature soybean seeds.
The percentage value indicates the relative weight of the corresponding component in a seed (Liu, 1997) (Adapted from Duan et al., 2023)

2.2 Genotype-phenotype interactions and their impact on complex traits
Not every genotype behaves the same in every environment, and soybean yield and protein content are typical examples of traits shaped by genotype-by-environment (G×E) interaction. Some QTLs have strong effects in one environment but weaker effects in another. Even the behavior of an allele can change when the genetic background changes (Gao et al., 2024). It is therefore not enough to look at the QTL alone; one must also consider the genotype that carries it and the environment it encounters. Vymyslicky et al. (2025) emphasized the importance of introducing G×E analysis into the breeding process. Network analyses are also informative: many genes do not act alone but show complementary or epistatic effects on one another, and some genes influence multiple traits (Fang et al., 2017). These intricate interactions make trait prediction more difficult and further highlight the need for multi-environment trials and higher-order models.
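As a rough illustration of what an environment-dependent QTL effect looks like in data, the following minimal Python sketch estimates the effect of one locus separately in two environments and then takes the difference between the two estimates as a simple G×E contrast. The records, the site names, and the helper `allele_effect` are hypothetical and invented for illustration only; they are not data or methods from the studies cited above.

```python
from statistics import mean

# Hypothetical toy records: (genotype at one QTL, environment, seed protein %).
# Genotype is coded as the count of the favorable allele (0 or 2, as in inbred lines).
# All values are invented for illustration; they do not come from the article.
records = [
    (0, "site_A", 40.0), (0, "site_A", 40.4),
    (2, "site_A", 42.0), (2, "site_A", 42.4),
    (0, "site_B", 39.0), (0, "site_B", 39.2),
    (2, "site_B", 39.4), (2, "site_B", 39.6),
]


def allele_effect(records, env):
    """Difference in mean phenotype between the two homozygous classes in one environment."""
    high = [y for g, e, y in records if e == env and g == 2]
    low = [y for g, e, y in records if e == env and g == 0]
    return mean(high) - mean(low)


# QTL effect estimated within each environment.
effect_a = allele_effect(records, "site_A")
effect_b = allele_effect(records, "site_B")

# A non-zero difference between the two estimates indicates that the
# QTL effect depends on the environment (a simple G×E contrast).
gxe_contrast = effect_a - effect_b

print(f"effect at site_A: {effect_a:.2f}")
print(f"effect at site_B: {effect_b:.2f}")
print(f"G×E contrast:     {gxe_contrast:.2f}")
```

In this toy case the same allele raises protein content much more at one site than at the other, which is the pattern described above for QTLs that are strong in one environment and weak in another. A real analysis would use replicated multi-environment trials and a statistical model with an explicit interaction term rather than a raw difference of means.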
