CGG_2025v16n5

Cotton Genomics and Genetics 2025, Vol.16, No.5, 210-221
http://cropscipublisher.com/index.php/cgg

promoters, for example. They are the key elements that switch genes on and off. A variant in even a small region within a promoter, such as the TATA-box, can produce a marked difference in expression strength. PRE1 is a typical example: it affects fiber elongation and is expressed more strongly from the At subgenome, probably because its At promoter is better suited to driving expression (Zhao et al., 2018).

Of course, changes at the DNA sequence level are only part of the story. Chromatin state also affects whether a gene can be "read". Studies have found that on the At subgenome, levels of histone marks associated with activity (such as H3K4me3) are relatively high, while levels of marks associated with repression (such as H3K27me3) are relatively low (Zhang et al., 2021). In other words, genes on the At side are more likely to sit in chromatin that permits expression.

Another frequently overlooked point is that the three-dimensional structure of the genome within the nucleus is not static either. The positions of cis-regulatory elements (CREs) shift during domestication, and these changes indirectly affect the expression patterns of the two subgenomes. No single mechanism acts alone, however; multiple regulatory factors usually work together. In addition to cis factors, trans regulation is also crucial. Trans-eQTLs in particular often act across subgenomes, in effect building a bridge that lets the two genomes communicate and coordinate (Bao et al., 2019; You et al., 2023; Yang et al., 2024).
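As a rough illustration of how the expression bias between At and Dt homoeologs can be quantified, the sketch below computes a log2 ratio of At to Dt expression for each gene pair and assigns a label. It is a minimal sketch, not the method of any study cited here: the function name `homoeolog_bias`, the pseudocount, the two-fold threshold, and the three expression values are all hypothetical and chosen only for illustration.

```python
import math

def homoeolog_bias(at_expr, dt_expr, pseudocount=1.0, threshold=1.0):
    """Classify one At/Dt homoeolog pair by its log2 expression ratio.

    pseudocount and threshold are illustrative assumptions: the pseudocount
    avoids division by zero, and a threshold of 1.0 means a two-fold
    difference after the pseudocount is added.
    """
    log2_ratio = math.log2((at_expr + pseudocount) / (dt_expr + pseudocount))
    if log2_ratio >= threshold:
        label = "At-biased"
    elif log2_ratio <= -threshold:
        label = "Dt-biased"
    else:
        label = "balanced"
    return log2_ratio, label

# Hypothetical expression values (At copy, Dt copy); not real measurements.
pairs = {
    "pair_1": (31.0, 7.0),
    "pair_2": (4.0, 19.0),
    "pair_3": (10.0, 12.0),
}

results = {name: homoeolog_bias(at, dt) for name, (at, dt) in pairs.items()}

for name, (ratio, label) in results.items():
    print(f"{name}: log2(At/Dt) = {ratio:+.2f} -> {label}")
```

A symmetric log2 scale treats At and Dt bias equally: a two-fold excess in either direction lies the same distance from zero. In practice the threshold would be paired with a statistical test across replicates rather than applied to single values.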
So what we observe is expression bias, but behind it several layers act simultaneously: promoter sequence, chromatin activity, reorganization of spatial structure, and coordination within regulatory networks.

5.3 Functional implications for fiber development, stress responses, and reproduction

When we study gene expression bias, the aim is often not merely to determine which copy is expressed more, but to understand what actual changes the bias brings. Genes biased toward the At subgenome are generally concentrated in the fiber initiation and early elongation stages, whereas Dt-biased expression often occurs in secondary wall synthesis and stress response (Xing et al., 2024). The two subgenomes are not in opposition; they divide the labor. This division of labor also helps cotton cope with challenges under different ecological conditions. For instance, the functional division within the CesA gene family illustrates this coordination well (Wang and Zhang, 2024): the coordinated expression of its members between At and Dt supports the entire course of fiber development. Nor is this expression pattern merely a "natural phenomenon". Some traits related to photoperiod sensitivity and fiber quality have been found to be associated with subgenome-specific regulatory networks and epigenetic modifications (Han et al., 2023). Expression bias may therefore have been "quietly selected" during domestication.

6 Case Study

6.1 Data integration from multiple cultivars and wild relatives

A single reference genome can only do so much. To capture the full genetic picture of cotton, studies have integrated data from as many as 1,961 germplasm accessions, including mainstream cultivated varieties, local varieties and some wild relatives.
This effort is notable not only for its scale; more importantly, it captures genetic variation that a single reference sequence overlooks, especially presence/absence variations (PAVs) and signals related to domestication and improvement. Through this integration, researchers identified over 450 Mb of selected regions and pinpointed 162 loci associated with 16 agronomic traits, including 84 previously unreported loci (Figure 2) (Li et al., 2021). These data make one point clear: only when sampling is broad enough is there a chance to discover hidden genetic diversity that would otherwise be "averaged out".

6.2 Discovery of key genes absent in reference genomes but linked to elite traits

The reference genome is undoubtedly important, but it is not all-encompassing. For instance, in a whole-genome analysis of upland cotton (G. hirsutum), researchers discovered over 30,000 "new genes" that do not appear in the reference genome at all. Although missing from the "mainstream" version, these genes are present in some cultivated or wild materials. More notably, among these non-reference genes, 124 are related to fiber quality and yield traits, and 47 are directly associated with multiple superior agronomic traits. Such findings suggest that if we remain confined within the framework of reference sequences, we are likely to miss some
