Cotton Genomics and Genetics 2025, Vol.16, No.6, 259-268
http://cropscipublisher.com/index.php/cgg

7.3 Pan-genomics and SV diversity
If genome sequencing has revealed the overall picture of cotton genetics, pan-genomics shows that the genomes of individual varieties differ from one another. By integrating data from hundreds or even thousands of cotton germplasm accessions, researchers have drawn a more complete genetic picture: tens of thousands of structural variations have been identified, a considerable number of which are associated with key traits such as fiber quality and yield. These studies not only reveal non-reference genes that were lost during domestication and selection, but also bring rare and lineage-specific variations back into view. For breeders, this means being able to mine material from a broader genetic pool. Meanwhile, the construction of graph-based pan-genomes and graph lineage genomes has made the associations between structural variations and phenotypes clearer. Such tools are helping to design a new generation of breeding strategies, and strategies that maximize the use of beneficial SV alleles are gradually becoming possible.

8 Concluding Remarks and Future Perspectives
After years of research, it has become clear that the genetic diversity of cotton is far more complex than previously imagined. Recent genome sequencing and pan-genomic studies have shown that the genomes of upland cotton and Pima cotton carry a large number of structural variations (SVs), ranging from insertions and deletions to inversions and presence/absence variations (PAVs), and that these variations are almost ubiquitous. They not only enrich genome structure but also directly promote phenotypic differentiation. The division of labor between subgenomes in trait regulation is particularly evident: the D subgenome is more strongly associated with fiber quality, while the A subgenome leans towards yield-related traits.
Many SVs can also regulate gene expression and the structure of gene networks. Trait differences that remain "invisible" in SNP analysis are often precisely where SVs are at play. Introgression and long-term domestication have further shaped the distribution of these variations and have left valuable allelic resources for breeding.

However, quite a few problems remain. The cotton genome is large and complex, especially in polyploid species, which often makes comprehensive detection and validation of structural variations difficult. Many SVs are located near repetitive sequences or centromeres, where assembly and annotation remain challenging to this day. More troublesome still, although the number of identified SVs keeps growing, functional analysis has not kept pace, and it often remains undetermined which variations truly affect traits. Furthermore, the application of SV data to actual breeding is still in its infancy. Current genotyping technology is costly and has limited throughput, and there is still a long way to go before it can be widely adopted in practice.

Future work should focus more on "filling in the gaps". Constructing more complete, gap-free reference genomes is crucial, especially for improving the ability to identify SVs in complex genomic regions. Meanwhile, functional genomics methods, such as CRISPR editing or transcriptome analysis, can be used to verify the specific effects of candidate SVs on gene expression and phenotype. Expanding the coverage of pan-genome resources and combining SV information with other genomic, epigenomic and phenotypic data is also expected to uncover new superior alleles. Finally, if efficient and user-friendly tools for SV genotyping and marker-assisted selection can be developed, these research results will enter the breeding pipeline more quickly, completing the transition from genomic discovery to field application.
Acknowledgments
We are grateful to Mr. Ma for critically reading the manuscript and providing valuable feedback that improved the clarity of the text.

Conflict of Interest Disclosure
The authors affirm that this research was conducted without any commercial or financial relationships that could be construed as a potential conflict of interest.

References
Chang X., He X., Li J., Liu Z., Pi R., Luo X., Wang R., Hu X., Lu S., Zhang X., and Wang M., 2024, High-quality Gossypium hirsutum and Gossypium barbadense genome assemblies reveal the landscape and evolution of centromeres, Plant Communications, 5(2): 100722.
https://doi.org/10.1016/j.xplc.2023.100722