International Journal of Marine Science, 2025, Vol.15, No.3, 130-143
http://www.aquapublisher.com/index.php/ijms

3 Application of Whole-Genome Data in Phylogenetic Reconstruction
3.1 Advances in high-throughput sequencing
In the past two decades, high-throughput sequencing has developed rapidly and has made large volumes of data available to phylogenetic research (Gao et al., 2024). Early Sanger sequencing typically yielded a single gene sequence at a time, whereas today's second- and third-generation platforms can produce sequence data covering an entire genome in a short time. In fish genomics, hundreds of species have had their genomes sequenced since the first fish genome (pufferfish) was published in 2002. Breakthroughs have also been made in recent years for non-model marine fish such as the Spanish mackerels (genus Scomberomorus). For example, in 2024 researchers assembled the first high-quality genome for this genus, a chromosome-level whole genome of the Indo-Pacific S. guttatus. The study combined PacBio HiFi long-read sequencing with Hi-C chromosome conformation capture to construct a genome map of 24 pairs of chromosomes with a total length of approximately 798 million bases and 25,886 predicted protein-coding genes. This is the first high-quality reference genome for the Spanish mackerel genus. Beyond Spanish mackerels, genomes have also been published for other members of the family Scombridae, such as tunas and mackerels; the examples cited include bigeye tuna and true trevally. Notably, in 2022 researchers published a genome assembly of the Atlantic chub mackerel (Scomber colias), using third-generation sequencing and graph-based assembly methods to construct a highly contiguous reference genome. The accumulation of these genomic resources has laid the foundation for comparative genomic and phylogenetic studies of scombrid fishes.
Genome skimming offers a further option: shallow sequencing recovers high-copy sequences such as complete mitochondrial genomes and ribosomal genes, which can be used to build phylogenetic frameworks quickly. The emergence of third-generation long-read sequencing (PacBio, Nanopore) also opens new possibilities for phylogenetics: long reads not only improve genome assembly and the detection of structural variation, but also support phylogenetic analyses based on k-mer frequencies or other alignment-free approaches. For example, studies have shown that species relationships can be reconstructed without alignment by comparing the repetitive-sequence composition or the shared k-mers of different genomes.

3.2 Coordinated analysis of nuclear and mitochondrial genomes
In phylogenetic studies, nuclear and mitochondrial genomes each have distinct characteristics, and analyzing the two together can yield more comprehensive and reliable evolutionary signals (Sun et al., 2022). Mitochondrial DNA (mtDNA) is widely used in animal phylogenetic and population genetic studies because of its high mutation rate and its maternal, haploid inheritance. However, mtDNA reflects only the maternal evolutionary history and is easily affected by factors such as hybridization and mate preference, so mitochondrial trees can be inconsistent with the true relationships among species (so-called gene tree/species tree discordance). In earlier phylogenetic studies of mackerels, mitochondrial and nuclear genes were often analyzed separately. Whole-genome sequencing now makes it possible to obtain complete mitochondrial genome sequences and tens of thousands of nuclear gene markers in a single experiment. For example, with low-depth genome sequencing (genome skimming), a complete mitochondrial genome can be assembled from the reads while a large number of single-copy nuclear gene sequences are detected from the same data.
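The mismatch between a mitochondrial tree and a nuclear tree described above can be made concrete by counting the clades on which two trees disagree. The following is a minimal Python sketch, not a method taken from the studies cited here. It assumes both trees are rooted, share the same taxa, and are written as nested tuples. The taxon labels (`sp_A` to `sp_D`), the two topologies, and the function names `clades` and `clade_distance` are hypothetical placeholders chosen for illustration.

```python
def clades(tree):
    """Return the non-trivial clades of a rooted tree given as nested tuples.

    Each clade is a frozenset of leaf names; single leaves and the
    clade containing all taxa are excluded.
    """
    found = set()

    def leaves(node):
        # A string is a leaf (a taxon label); a tuple is an internal node.
        if isinstance(node, str):
            return frozenset([node])
        members = frozenset().union(*(leaves(child) for child in node))
        found.add(members)
        return members

    all_taxa = leaves(tree)
    found.discard(all_taxa)  # the root clade is shared by every tree
    return found


def clade_distance(tree_a, tree_b):
    """Number of clades found in exactly one of the two rooted trees
    (a rooted Robinson-Foulds-style distance)."""
    return len(clades(tree_a) ^ clades(tree_b))


# Hypothetical topologies, for illustration only (not results from any study).
mito_tree = ((("sp_A", "sp_B"), "sp_C"), "sp_D")
nuclear_tree = ((("sp_A", "sp_C"), "sp_B"), "sp_D")

conflicts = clade_distance(mito_tree, nuclear_tree)
shared = clades(mito_tree) & clades(nuclear_tree)
print("conflicting clades:", conflicts)
print("shared clades:", [sorted(c) for c in shared])
```

A distance of zero means the two markers support the same rooted topology. Larger values flag clades that deserve a closer look, for instance for signs of hybridization. In practice, an established phylogenetics library would normally be used for this comparison rather than a hand-written function.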
Phylogenetic trees built from complete mitochondrial genomes usually have high resolution and can effectively distinguish closely related species and cryptic lineages (Jeena et al., 2022). A recent study characterized the mitochondrial genomes of two Spanish mackerel species in the family Scombridae and reconstructed the phylogeny of the family from complete mitochondrial sequences; the results are broadly consistent with the traditional classification (Lorenzen et al., 2021). The nuclear genome, on the other hand, provides a very large number of independent sites. Species trees can be inferred from it with supermatrix methods (concatenating many genes) or with coalescent-based methods (such as multi-gene coalescent models), which usually yield highly supported phylogenies and make it possible to quantify potential conflict signals on