
Rice Genomics and Genetics 2025, Vol.16, No.3, 159-179
http://cropscipublisher.com/index.php/rgg

a minor indel in one of the HSA genes), preventing full fertility in hybrids. Pan-genome analysis has pinpointed such incompatibility regions and even guided the discovery of “neutral” alleles (so-called wide-compatibility genes) that breeders use to enable indica–japonica crosses.

Some rice varieties clearly show traces of hybridization in their genomes. For example, aromatic or Basmati rice is a genetic blend of indica and japonica types. Data from the rice pan-genome reveal that while aromatic rice mostly carries an indica background, it also contains segments inherited from japonica and has its own unique gene variants. These special genetic combinations, including chromosomal rearrangements and introgressed blocks, reflect how human breeding created a distinct subgroup. In recent breeding work, parts of the japonica genome have been introduced into indica varieties and vice versa, forming genomes that are a patchwork of both. This kind of mosaic structure can be difficult to spot using a single reference genome. With a pan-genome approach, however, the origin of each DNA segment can be traced through the variants and haplotypes tied to specific lineages. In short, the rice pan-genome maps out both the ancient crosses and the modern breeding efforts that have shaped today’s diverse rice types.

7 Case Studies in Rice Pan-genome Research

7.1 Case 1: the 3,000 rice genomes project

The 3,000 Rice Genomes Project (3K RGP), published in 2018, was one of the first major efforts to explore rice pan-genomics at scale. It focused on sequencing over 3,000 rice varieties from Asia and Africa using low-coverage, short-read methods. Although these genomes were not assembled de novo, the project still uncovered an enormous amount of genetic variation.
Researchers identified more than 29 million SNPs and hundreds of thousands of small insertions and deletions, all by comparing the sequences to the Nipponbare reference genome. They also investigated larger structural differences by analyzing read depth and assembling unmapped reads. One of the key outcomes was a draft version of the rice pan-genome. This included core genes shared by all varieties and others that appeared in only some. About 12,000 gene families were consistently present across all accessions, while nearly half of all gene families were found to vary between varieties, a clear sign of rice’s genomic diversity.

The project did more than produce data. It shed light on rice population structure, revealing nine distinct subgroups within cultivated rice. It also enabled genome-wide association studies for traits such as grain size, pericarp color, and disease resistance. Importantly, the team made their findings publicly available, including a user-friendly rice pan-genome browser (RPAN) for tracking gene presence or absence.

Despite its limitations in capturing large structural variants, the 3K RGP paved the way for more advanced pan-genome work. It revealed how much valuable genetic information had been missed by relying on a single reference genome and underscored the importance of sequencing rice more deeply and broadly. For many researchers, it was both a wake-up call and a strong foundation for what came next.

7.2 Case 2: graph-based pan-genome

A representative case of rice pan-genome research is the work by Song et al. (2021), who used a graph-based genome approach to better understand complex traits. Their team combined 12 different rice genomes, spanning indica, japonica, and wild types, into one variation graph. This setup made it possible to align sequencing data from over 400 rice lines to a more comprehensive reference, instead of relying on a single genome such as Nipponbare.
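The idea of a variation graph can be illustrated with a minimal Python sketch. It is a toy model, not the pipeline used by Song et al. (2021): the node sequences, the path names, and the helper functions (`spell`, `all_walks`, `read_matches_graph`) are all hypothetical, and exact substring matching stands in for real graph alignment. It shows only why a read spanning a segment absent from the linear reference can still be placed on the graph.

```python
# Toy variation graph: nodes hold sequence segments, edges list allowed successors.
# All sequences and names here are hypothetical, chosen only for illustration.
nodes = {
    1: "ACGT",
    2: "GGA",   # segment present in only some genomes (a presence/absence variant)
    3: "TTCA",
}
edges = {1: [2, 3], 2: [3], 3: []}

# Each genome is represented as a walk (path) through the graph.
paths = {
    "linear_reference": [1, 3],           # lacks the variable segment
    "accession_with_segment": [1, 2, 3],  # carries the variable segment
}


def spell(path):
    """Concatenate the node sequences along a path."""
    return "".join(nodes[n] for n in path)


def all_walks(start):
    """Enumerate every walk from `start` to a node with no successors."""
    stack = [[start]]
    while stack:
        walk = stack.pop()
        successors = edges[walk[-1]]
        if not successors:
            yield walk
        for nxt in successors:
            stack.append(walk + [nxt])


def read_matches_linear(read, reference_path):
    """Exact-match placement against a single linear reference."""
    return read in spell(reference_path)


def read_matches_graph(read, start=1):
    """Exact-match placement against any walk through the graph."""
    return any(read in spell(walk) for walk in all_walks(start))


# A read that spans the variable segment.
read = "GTGGATT"
print(read_matches_linear(read, paths["linear_reference"]))  # False
print(read_matches_graph(read))                              # True
```

In this sketch the read fails against the single reference path but is recovered once alternative paths are considered, which is the intuition behind reduced reference bias in graph-based alignment.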
The graph-based reference uncovered a wider range of genetic variation, including structural changes and presence/absence variants (PAVs) that typical single-reference methods often miss. One of the most significant findings was a new QTL linked to grain weight, tied to a gene the authors referred to as the qGW candidate. This gene was not detectable with a standard linear reference, but it became visible within the graph-based framework.

Beyond trait discovery, this method also improved the accuracy of variant detection, especially in repetitive or hard-to-map regions. By offering multiple alignment paths, the graph genome reduced the bias that comes from forcing data to fit a single template. Overall, this study highlights how graph-based pan-genomes can move rice
