Journal of Tea Science Research, 2024, Vol.14, No.2, 79-91
http://hortherbpublisher.com/index.php/jtsr

Continuous innovation in sequencing technologies has led to unprecedented advancements in tea genome research, opening new dimensions in genetic diversity, functional genomics, and biotechnological applications. Together, these advancements deepen the understanding of the tea genome, enabling researchers to use genomic resources for precision breeding, develop tea plant varieties resistant to biotic and abiotic stresses, and improve tea quality and nutritional value.

3 Genome Assembly Challenges

3.1 Complexities in tea genome

Assembling the tea genome is challenging because of its inherent complexity: a large genome size, abundant repetitive sequences, and high heterozygosity. The tea genome typically ranges from 2.5 to 4.0 gigabases (Gb), depending on the species and ploidy level, and this size complicates sequencing, assembly, and subsequent annotation. Repetitive sequences account for at least 64% of the tea genome (Wei et al., 2018) and can lead to misassemblies and gaps in the final genome sequence. High heterozygosity and structural variation further complicate assembly (Chin et al., 2016), because the resulting allelic variation and haplotype diversity must be resolved during genome reconstruction. In addition, the tea genome has undergone multiple whole-genome duplications, producing homologous gene copies that are difficult to distinguish from one another (Wei et al., 2018).

These complexities highlight the need for robust computational algorithms and innovative sequencing strategies to achieve the high-quality, contiguous assemblies that downstream genomic analysis and biotechnological applications depend on.
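To make the repeat problem concrete: at a repeat content of at least 64%, a 2.5 to 4.0 Gb genome contains roughly 1.6 to 2.56 Gb of repetitive sequence. One common way to see why this hinders assembly is to count k-mers, since a k-mer that occurs more than once cannot be placed unambiguously by overlap alone. The following is a minimal Python sketch under stated assumptions: the function names and the toy sequences are hypothetical illustrations and are not taken from the cited studies.

```python
from collections import Counter


def kmer_counts(seq: str, k: int) -> Counter:
    """Count every overlapping k-mer in seq (toy helper, hypothetical name)."""
    if k <= 0:
        raise ValueError("k must be positive")
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def repeated_kmer_fraction(seq: str, k: int) -> float:
    """Fraction of k-mer occurrences whose k-mer appears more than once.

    A higher value means more positions at which an assembler relying on
    k-mer overlaps faces an ambiguous choice.
    """
    counts = kmer_counts(seq, k)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    repeated = sum(c for c in counts.values() if c > 1)
    return repeated / total


if __name__ == "__main__":
    # Toy sequences (assumed for illustration only, not real tea data).
    unique_like = "ACGTTGCA"
    repeat_rich = "ACGTACGTACGTACGT"
    print(repeated_kmer_fraction(unique_like, 4))
    print(repeated_kmer_fraction(repeat_rich, 4))
```

Longer reads help because they raise the effective k: a repeat shorter than the read length is spanned by a single read and no longer creates ambiguity.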
3.2 Tools and techniques for effective assembly

To address these complexities, various tools and techniques have been developed and optimized for genome assembly. De novo assemblers (e.g., SOAPdenovo, SPAdes) and the read aligners used in reference-guided workflows (e.g., BWA-MEM, Bowtie2) play pivotal roles in reconstructing tea genome sequences from raw sequencing data. Long-read sequencing technologies, such as Oxford Nanopore's MinION and PacBio, have become essential for spanning repetitive regions and achieving more contiguous assemblies (Chin et al., 2016; Mgwatyu et al., 2022). Optical mapping and chromosome conformation capture provide complementary structural information that supports the scaffolding and validation of assembled sequences.

The FALCON and FALCON-Unzip algorithms are particularly effective for highly heterozygous genomes, because they generate phased diploid assemblies that accurately represent the haplotype structure (Chin et al., 2016). FALCON-Phase addresses phase switching by using Hi-C short reads to reconstruct long haplotype blocks, and it has demonstrated high accuracy (>96%) in benchmark tests (Kronenberg et al., 2018). Additionally, the Bridger package has shown superior performance in transcriptome assembly, providing the high completeness and accuracy needed to study gene expression and regulation in tea plants (Li et al., 2019).

3.3 Case studies of successful genome assemblies

Several successful genome assembly projects have been conducted on tea plants and related species, providing valuable insights and resources for the research community. For instance, Xia et al. (2020a) assembled the genome of Camellia sinensis var. sinensis using a combination of Single-Molecule Real-Time (SMRT) sequencing and Chromosome Conformation Capture (Hi-C) technology, resulting in a high-quality genome assembly (Figure 1). Through comparative genomics, phylogenetics, transcriptomics, and population genetics analyses, they gained deep insights into the evolution and adaptability of the tea plant genome. The study produced a highly contiguous assembly that improved the resolution of repetitive elements and gene-rich regions, enabling comprehensive gene annotation and comparative genomic analysis between tea varieties.
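Contiguity of an assembly is commonly summarized with the N50 statistic: the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the total assembled length. The text above does not state which metric or values were reported, so the following Python sketch only illustrates the calculation; the function name `n50` and the contig lengths are hypothetical toy values, not results from the cited assembly.

```python
from typing import Iterable


def n50(contig_lengths: Iterable[int]) -> int:
    """Return the N50 of a collection of contig (or scaffold) lengths.

    Contigs are visited from longest to shortest; the N50 is the length of
    the contig at which the running total first reaches half of the
    total assembled length.
    """
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("at least one contig length is required")
    if any(length <= 0 for length in lengths):
        raise ValueError("contig lengths must be positive")

    total = sum(lengths)
    running = 0
    for length in lengths:
        running += length
        # Integer comparison avoids floating-point issues with total / 2.
        if running * 2 >= total:
            return length
    raise AssertionError("unreachable: running total always reaches the total")


if __name__ == "__main__":
    # Toy contig lengths (assumed values for illustration only).
    fragmented = [2, 2, 2, 3, 3, 4, 8, 8]
    print(n50(fragmented))
```

Because N50 depends only on lengths, a high value does not by itself show that an assembly is correct; misjoined contigs also raise it, which is why independent evidence such as Hi-C contact maps is used for validation.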