
International Journal of Molecular Ecology and Conservation, 2025, Vol.15, No.3, 134-143 http://ecoevopublisher.com/index.php/ijmec

important foundation for the identification of functional genes, comparative genomics and breeding research in goats. However, since the reference genome only represents the genetic composition of a single individual (or a single species), it cannot encompass all the variation of the entire species (Li et al., 2023a). Studies have shown that, compared with the goat reference sequence, different breeds and wild relatives carry a large number of sequence fragments and genes that are "missing" from the reference genome (Pogorevc et al., 2024). Furthermore, many structural variations (such as long insertions/deletions and gene copy number variations) are not easily detected by reference-based variant calling, which biases the understanding of certain functional variations (Li et al., 2019; Li et al., 2023a; Li et al., 2023b).

3.3 Progress in the application of pan-genome in livestock

Research on the pan-genome of livestock is advancing rapidly, and the picture of genetic differences within and between species is becoming clearer and more detailed. Pigs are a typical example: approximately 206 million bases were found to be missing from the primary reference genome, and a vast number of structural variations have been detected, many of which involve genes related to high-altitude adaptation and reproductive strategies (Li et al., 2020). The same is true for sheep: 195 million new bases and 2,678 previously unannotated genes have been added, and variations related to key traits such as tail shape (fat tail, thin tail) have been discovered (Dai et al., 2023). Progress in cattle has gone even further. After integrating long-read assemblies from multiple breeds, Crysnanto et al. (2021) constructed a multi-assembly genome graph and revealed a large number of new functional fragments.
As a result, the complexity of the bovine genome has been characterized in greater detail. For goats, early exploration has already begun: Liao et al. (2023) were the first to attempt to fill in the fragments missing from the goat reference genome by combining the genomes of several closely related species; subsequently, Li et al. (2023b) constructed a goat pan-genome from hundreds of goat genomes sampled worldwide and obtained valuable results. It can be foreseen that in-depth pan-genome research on the goat genus will increasingly be benchmarked against the achievements in other domestic animals, thereby comprehensively enhancing our understanding of genomic variations and traits in domestic animals.

4 Methods and Technical Routes for Constructing the Pan-genome

4.1 Basic concepts and components of the pan-genome

The pan-genome is the collection of genomic sequences of all individuals within a given species or taxonomic unit, and consists of two parts: the "core genome" and the "variable genome" (Gao et al., 2019). The core genome comprises the sequences and genes that are present in all individuals, typically including conserved functional elements necessary for maintaining basic life activities. The variable genome (also known as the accessory or dispensable genome) is composed of sequences that are not shared by all individuals, including fragments specific to a particular subgroup, species, or even a single individual. In practice, the pan-genome is often represented as a "non-redundant sequence set": the genome assemblies of multiple individuals are integrated and redundant sequences are removed, yielding the union of all unique sequences (Liao et al., 2023). The pan-genome concept has greatly advanced the understanding of intra-species variation patterns; it has been extended from microorganisms to animals and plants and provides a new perspective for the study of genome evolution and function (Gao et al., 2019; Gong et al., 2023).
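The core/variable distinction described above can be illustrated with a minimal sketch. It assumes a simple gene presence/absence table; the function name `partition_pan_genome`, the gene names, and the individual labels are hypothetical and are not taken from the study.

```python
from typing import Dict, Set, Tuple


def partition_pan_genome(
    presence: Dict[str, Set[str]], individuals: Set[str]
) -> Tuple[Set[str], Set[str]]:
    """Split genes into core (found in every individual) and variable (found in only some)."""
    # A gene is core when its set of carriers covers all sampled individuals.
    core = {gene for gene, carriers in presence.items() if carriers >= individuals}
    # Every other gene that is carried by at least one individual is variable.
    variable = {
        gene for gene, carriers in presence.items() if carriers and gene not in core
    }
    return core, variable


# Hypothetical toy data: labels are illustrative only.
individuals = {"goat_A", "goat_B", "ibex_C"}
presence = {
    "gene1": {"goat_A", "goat_B", "ibex_C"},  # shared by all individuals
    "gene2": {"goat_A", "goat_B"},            # shared by a subgroup
    "gene3": {"ibex_C"},                      # specific to one individual
}

core, variable = partition_pan_genome(presence, individuals)
pan_genome = core | variable  # union of core and variable parts
```

In this toy case only `gene1` is core, while `gene2` and `gene3` fall into the variable genome. Real analyses work on sequences and alignments rather than gene labels, so this only conveys the set logic behind the definitions.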
4.2 The construction process of the goat genus pan-genome

This study adopted a systematic approach to construct the pan-genome of the goat genus. For sample selection, the major domestic goat breeds worldwide (Asian, African and European native breeds and improved breeds) and wild relatives (ibex, northern goat, etc.) were covered to ensure lineage and geographical representativeness (Li et al., 2023b). For data acquisition, PacBio/Nanopore long-read sequencing was used to obtain chromosome-level assemblies, combined with short-read resequencing data to supplement variant detection (Crysnanto et al., 2021). The specific process includes: identification of non-reference insertion fragments by multiple sequence alignment (Liao et al., 2023); integration of the new sequences into a graph structure to construct a graph pan-genome with multiple branches (Bao et al., 2019); a "map-to-pan" iterative strategy, in which unaligned reads are reassembled and incorporated into the pan-genome; generate a non-redundant sequence set
