Rice Genomics and Genetics 2025, Vol.16, No.3, 159-179 http://cropscipublisher.com/index.php/rgg 177

One particular integration challenge is dealing with gene presence/absence in gene expression studies. If a gene is missing from the genome of some lines, differential expression analysis must treat it as missing in those lines rather than as a gene with zero expression. New computational methods are needed to handle such scenarios seamlessly. As these integrative analyses mature, we expect a more holistic understanding: not just which genes exist in the pan-genome, but which are turned on or off, methylated or not, and ultimately how they drive complex traits across the myriad contexts in which rice is grown.

9.3 From graph-based pan-genomes to pan-transcriptomes

Graph-based representations of the rice pan-genome are likely to serve as a foundation for next-generation multi-omics integration. By encoding all genomic variants in a graph, we set the stage for aligning not only DNA reads but also RNA transcripts and other sequence-based data to the same structure. A pan-transcriptome can be built by aligning RNA-seq reads from diverse rice varieties to a pan-genome graph rather than a single reference (Woldegiorgis et al., 2022). This approach would allow discovery of transcripts arising from sequences unique to certain subpopulations. For example, if an indica-specific gene is expressed under heat stress, its mRNA reads will align to the indica branch of the pan-genome graph and be correctly assembled into an indica-specific transcript, rather than being lost or misaligned when using a japonica reference. Early efforts in model plant systems hint at the promise of this approach; for instance, graph-based read mapping has improved the detection of allele-specific expression and splice variants in Arabidopsis.
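The indica-branch example can be made concrete with a toy sequence graph. The sketch below is illustrative only: the node names, sequences, and the `map_read` helper are all hypothetical, and exact substring matching stands in for the seeded, error-tolerant alignment that real graph mappers perform.

```python
from typing import Dict, List, Optional

# Toy sequence graph (all sequences are made up for illustration).
NODES: Dict[str, str] = {
    "n1": "ATGGCA",    # shared upstream segment
    "n2": "TTC",       # allele carried by the japonica-like reference
    "n3": "GGATCCGA",  # insertion found only on the indica-like branch
    "n4": "CTGAAA",    # shared downstream segment
}
EDGES: Dict[str, List[str]] = {
    "n1": ["n2", "n3"],
    "n2": ["n4"],
    "n3": ["n4"],
    "n4": [],
}


def walks(node: str, edges: Dict[str, List[str]]) -> List[List[str]]:
    """Enumerate every walk from `node` to a sink (fine for a tiny DAG)."""
    if not edges[node]:
        return [[node]]
    return [[node] + rest for nxt in edges[node] for rest in walks(nxt, edges)]


def map_read(read: str, nodes: Dict[str, str], edges: Dict[str, List[str]],
             source: str = "n1") -> Optional[List[str]]:
    """Return the first walk whose spelled sequence contains the read, else None."""
    for walk in walks(source, edges):
        if read in "".join(nodes[n] for n in walk):
            return walk
    return None


# A single linear reference keeps only the japonica-like path.
LINEAR_REFERENCE = NODES["n1"] + NODES["n2"] + NODES["n4"]

# A hypothetical RNA-seq read spanning the indica-like insertion.
read = "CAGGATCCGACT"
print("found in linear reference:", read in LINEAR_REFERENCE)
print("graph walk:", map_read(read, NODES, EDGES))
```

The read has no match in the linear reference but is placed on the walk through `n3` in the graph, which is the behaviour the paragraph describes. Enumerating all walks is only workable at toy scale; practical tools index the graph instead.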
In rice, a graph-based pan-transcriptome could reveal new isoforms that are present only in, say, aus rice, or alternative splicing patterns that differ between indica and japonica due to structural variants affecting splice sites.

Another frontier is constructing pan-metabolic networks or pan-proteomes, extending the concept beyond the genome. A pan-genome graph can be annotated with protein-coding and non-coding elements, and one could overlay proteomic data (peptide mass spectra) to detect peptides from subspecies-specific genes (Shrestha et al., 2024). Similarly, one could map chromatin immunoprecipitation (ChIP-seq) data for transcription factors or histone marks onto the pan-genome graph to see how the regulatory landscape differs across genomes.

All of this will require robust graph bioinformatics tools. The graph approach is computationally intensive: the rice genome is already ~390 Mb, and a graph representing hundreds of genomes will be larger. Algorithms for mapping RNA or DNA reads to a variation graph (such as vg, GraphAligner, or minigraph) are improving, but handling the sheer data volumes of population-scale transcriptomes is still challenging. Moreover, visualizing and interpreting results on a graph is non-trivial for biologists. User-friendly interfaces will be needed to query, for example, “show expression of this gene across all varieties, including those where the gene is absent or truncated.”

Despite these challenges, the concept of pan-transcriptomics is on the horizon. As graph-based pan-genomes become standard, it is natural that all sequence-based assays (RNA-seq, ATAC-seq, methyl-seq) will be re-analyzed in a graph context to maximize discovery. This will lead to richer functional catalogs, identifying not just the static presence of genes, but their dynamic usage across the species.
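A query such as “show expression of this gene across all varieties, including those where the gene is absent” depends on keeping absence distinct from zero expression. The following is a minimal sketch under assumed conventions: the variety names and values are invented, `None` is used here to mark a gene that is absent from a variety's genome, and `summarize` is a hypothetical helper rather than part of any existing package.

```python
import math
from statistics import mean
from typing import Dict, Optional

# Hypothetical expression values (e.g. TPM) for one gene in four varieties.
# None = gene absent from that genome; 0.0 = gene present but not expressed.
expression: Dict[str, Optional[float]] = {
    "varA": 12.0,
    "varB": 8.0,
    "varC": None,
    "varD": 0.0,
}


def summarize(expr: Dict[str, Optional[float]]) -> dict:
    """Summarize expression while keeping gene absence separate from zero."""
    present = {v: x for v, x in expr.items() if x is not None}
    absent = sorted(v for v, x in expr.items() if x is None)
    return {
        "n_present": len(present),
        "absent_in": absent,
        # Mean over varieties that actually carry the gene.
        "mean_present": mean(present.values()) if present else math.nan,
        # Naive mean that wrongly counts absent genes as zero expression.
        "mean_naive": mean(0.0 if x is None else x for x in expr.values()),
    }


result = summarize(expression)
print(result)
```

In this toy case the naive mean is pulled down by the variety that lacks the gene, while the presence-aware mean is computed only over varieties that could express it. A real analysis would carry the same distinction into the statistical model rather than into a simple average.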
9.4 Prospects for pan-genomes in sustainable agriculture and climate adaptation

Pan-genomics is poised to play a key role in breeding crops that can withstand the challenges of climate change and ensure sustainable agriculture (Shang et al., 2022). By capturing the full genetic diversity of rice, the pan-genome provides a reservoir of traits that can be tapped for adaptation. For instance, climate change is leading to more erratic weather patterns (droughts, floods, extreme temperatures), and pan-genome analyses have identified many stress-response genes that were left behind during domestication. These include heat shock factors, dehydration-responsive element-binding proteins, and various osmoprotectant biosynthesis genes present in wild or traditional varieties but absent in modern high-yield cultivars. Armed with this knowledge, breeders can re-introduce such genes into elite backgrounds to create climate-resilient varieties. Indeed, efforts are underway to breed “climate-smart” rice, such as flood-tolerant varieties (Sub1 introgression lines) and drought-tolerant varieties, and pan-genomic data accelerate the identification of new tolerance genes and the markers to track them.