Rice Genomics and Genetics 2025, Vol.16, No.3, 159-179 http://cropscipublisher.com/index.php/rgg 162

like PacBio and Oxford Nanopore. These tools read much longer stretches of DNA, helping researchers piece together more complete genomes. They have made it possible to detect large insertions and deletions that used to go unnoticed. Rice is a good example: scientists used long-read data to assemble over 30 genomes and found many hidden structural differences (Shang et al., 2022). Newer techniques, like HiFi reads and linked-read sequencing, along with scaffolding methods such as Hi-C, have made it much easier to build high-quality reference genomes for numerous plant lines. A standout example is the 251-genome rice pan-genome, which was built using high-coverage Nanopore sequencing and Hi-C, resulting in very large, high-contiguity assemblies.

Computational tools now let us compare multiple genomes more efficiently. Programs like MUMmer and Minimap2 are commonly used to detect structural differences and to determine which genes are present or absent. Recently, instead of using one fixed genome as a reference, researchers have started working with graph-based genome models. These graphs represent the diversity across many genomes in a single structure. Tools such as the VG toolkit make it easier to find structural variants in plants. This shift is changing the way we study crop genomes, especially in species where genetic diversity plays a big role in breeding and adaptation.

3 Construction Strategies for Rice Pan-genomes

3.1 Rice as a model crop: diversity and genome complexity

Rice has long been a model for cereal genomics due to its relatively small genome (~390 Mb for the japonica subspecies) and its enormous agricultural importance. Despite its compact genome (especially compared to polyploid crops like wheat), rice exhibits tremendous genetic diversity. Asian cultivated rice (Oryza sativa) was domesticated from the wild progenitor O.
rufipogon and consists of two major subspecies: indica (also called O. sativa subsp. indica) and japonica (O. sativa subsp. japonica). These subspecies further subdivide into distinct genetic subpopulations (including aus, aromatic/basmati, tropical japonica, and temperate japonica), which differ in their geographic origins and traits. Indica rices are typically grown in tropical regions and have broad genetic variation, whereas japonica rices are adapted to temperate climates and tend to be more genetically uniform. The divergence of indica and japonica, which likely began thousands of years ago, and their partial reproductive isolation created deep structural variation and allelic differences between their gene pools. For instance, certain genomic segments are known to cause hybrid sterility when indica and japonica varieties are crossed, reflecting accumulated structural incompatibilities (Figure 1) (Wu et al., 2023).

Beyond O. sativa, the genus Oryza contains over 20 wild species with AA, BB, CC (and other) genome types. Some wild relatives (like O. nivara and O. rufipogon) readily cross with O. sativa and have contributed alleles for traits such as disease resistance and flood tolerance in breeding programs. African cultivated rice (O. glaberrima) is another cultivated species, domesticated independently in West Africa, with a separate but overlapping gene pool. This range of subspecies and wild species makes rice an ideal candidate for pan-genome analysis: the goal is to capture the full diversity arising from domestication, varietal group differentiation, and introgression from wild gene pools. Characterizing this diversity at the whole-genome level is essential to unlock novel alleles for crop improvement.

3.2 Sequencing strategies (e.g., short reads, long reads, HiFi, Hi-C)

Early rice pan-genome research mostly relied on short-read sequencing because it was cheap and adequate for the purpose.
The 3,000 Rice Genomes Project is a classic example: researchers mapped short reads to a reference genome to find genetic differences. This method worked well for detecting small variants but was poorly suited to assembling new genomes or detecting large structural changes. In recent years, long-read sequencing has filled that gap. Technologies like PacBio’s CLR and HiFi, as well as Oxford Nanopore, are better at reading complex or repetitive regions. One project used these tools to assemble genomes for 32 O. sativa accessions and one O. glaberrima accession, revealing a much more diverse pan-genome than expected. Another study used Nanopore to sequence over 250 rice genomes, both wild and cultivated, at high depth, creating a valuable resource for future work.

Combining long reads with scaffolding technologies greatly improves assembly contiguity. Hi-C sequencing, which captures information about chromatin contacts, has been used to order and orient contigs into