Legume Genomics and Genetics 2026, Vol.17, No.1, 49-67
http://cropscipublisher.com/index.php/lgg

stress tolerance and yield (Viana et al., 2022). Whole-genome resequencing and large-scale SNP discovery in global soybean panels have provided millions of variants that can be mined for signatures of selection, adaptive introgression, and domestication sweeps, as well as for fine mapping of quantitative trait loci underlying agronomic traits (Song et al., 2015). For emerging production regions such as Kazakhstan and sub-Saharan Africa, SNP-based characterization of local germplasm in the context of global collections offers actionable guidance on whether to prioritize introgression of exotic and wild alleles, or to intensify selection within existing adapted gene pools (Zatybekov et al., 2025). Against this backdrop, a comprehensive genomic diversity and population structure analysis of global soybean germplasm using SNP markers is both timely and necessary to support strategic conservation and accelerate the development of high-performing, resilient cultivars for diverse agroecological zones.

2 Current Status of Global Soybean Germplasm Resources and Genetic Diversity Research

2.1 Distribution and conservation of global soybean germplasm resources

Soybean germplasm is conserved in large ex situ collections as well as in situ in traditional farming systems and natural habitats of wild relatives. Major global repositories, such as the USDA Soybean Germplasm Collection, Asian national gene banks, and international centers, collectively maintain tens of thousands of accessions representing cultivated soybean (Glycine max), its wild progenitor (Glycine soja), and breeding lines from diverse agroecological zones (Nawaz et al., 2020).
Regional collections, including those in Africa, South America, Central and Eastern Europe, and Central Asia, increasingly capture germplasm adapted to local environments and emerging production regions (Shaibu et al., 2021; Zatybekov et al., 2023). Wild soybean populations remain especially important reservoirs of adaptive variation for biotic and abiotic stress tolerance, and targeted collections in centers of diversity such as East Asia are recognized as priorities for long-term soybean improvement (Nawaz et al., 2020).

Conservation strategies emphasize both safeguarding genetic resources and generating characterization data that enable efficient use. Large collections often show substantial redundancy and uneven representation of geographic regions, maturity groups, and end-use types, underscoring the need for systematic molecular characterization to rationalize holdings and identify gaps (Rani et al., 2023). Recent SNP- and SSR-based surveys in Africa, India, Kazakhstan and other regions highlight contrasting patterns: some collections exhibit relatively broad diversity (e.g., TGx lines in sub-Saharan Africa), while others display narrow genetic bases linked to repeated use of a few elite parents (Zatybekov et al., 2023). Genomic data are therefore being used not only to inform core and mini-core set development and to flag duplicate accessions, but also to guide targeted introgression of wild and exotic germplasm into locally adapted backgrounds to counteract genetic erosion and enhance resilience (Rani et al., 2023).

2.2 Main methods for studying soybean genetic diversity

Research on soybean genetic diversity has evolved from reliance on phenotypic descriptors to extensive use of DNA marker technologies.
Early studies used morphological and agronomic traits (e.g., plant height, maturity, seed size, yield) to estimate diversity and relationships among accessions, but these traits are strongly influenced by the environment and often provide limited resolution (Rani et al., 2023). Biochemical markers and multi-environment field evaluations have helped to refine phenotypic clustering, yet environmental noise and the small number of measurable characters have constrained their utility for detailed population structure analysis and germplasm management (Ullah et al., 2021). Consequently, morphological data are now typically combined with molecular information to capture both adaptive differentiation and underlying genomic variation (Perić et al., 2025).

DNA-based markers have become central tools for characterizing soybean diversity and population structure. A wide range of marker systems (including RAPD, AFLP, ISSR, SSR, EST-SSR, DArT and SNPs) has been applied to differentiate cultivars, landraces, and wild accessions, estimate allelic richness, and dissect within- and among-population variation (Wibisono et al., 2025). SSR markers, in particular, have been extensively used