
Computational Molecular Biology 2024, Vol.14, No.2, 54-63
http://bioscipublisher.com/index.php/cmb

phenotyping and deep learning approaches can leverage the large amount of genomic and phenotypic data collected across different growing seasons and environments to increase heritability estimates, selection intensity, and selection accuracy (Merrick et al., 2022).

3 Data Requirements and Management

3.1 High-throughput genotyping

High-throughput genotyping is a cornerstone of modern plant and animal breeding programs, enabling the identification and utilization of genetic variation on a genome-wide scale. Single nucleotide polymorphisms (SNPs) are the most commonly used markers due to their abundance and the development of high-throughput genotyping technologies such as SNP arrays and whole-genome sequencing (WGS). SNP arrays, like the TaBW280K developed for wheat, allow for efficient genotyping of large populations, providing valuable data for diversity analyses and breeding programs (Rimbert et al., 2018). Similarly, genotyping-by-sequencing (GBS) has emerged as a cost-effective alternative, combining marker discovery and genotyping in a single step, which is particularly useful for species with large genomes (He et al., 2014; Gorjanc et al., 2015).

The effectiveness of genomic selection (GS) is highly dependent on the density and coverage of genetic markers. High-density SNP arrays and WGS provide comprehensive coverage of the genome, capturing a wide range of genetic variation. For instance, the TaBW280K array for wheat includes 280,226 SNPs, covering both genic and intergenic regions, which enhances the resolution of genetic mapping and the accuracy of GS models (Rimbert et al., 2018). In livestock, GBS has been shown to provide comparable accuracy to SNP arrays when a sufficient number of markers and appropriate sequencing depth are used (Gorjanc et al., 2015).
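To illustrate why sequencing depth matters for GBS, the following minimal sketch computes the chance that both alleles of a heterozygous site are observed at a given read depth. It assumes that each read samples either allele with equal probability and that there is no sequencing error; the function name `prob_het_detected` is hypothetical and is not taken from the cited studies.

```python
def prob_het_detected(depth: int) -> float:
    """Probability that both alleles of a heterozygote appear among `depth` reads.

    Assumption (not from the source): each read samples one of the two
    alleles with probability 0.5, independently, with no sequencing error.
    """
    if depth < 0:
        raise ValueError("depth must be non-negative")
    if depth < 2:
        return 0.0  # fewer than two reads can never show both alleles
    # Complement of "all reads carry allele A" or "all reads carry allele B".
    return 1.0 - 2.0 * 0.5 ** depth


if __name__ == "__main__":
    for depth in (1, 2, 4, 8, 16):
        print(depth, round(prob_het_detected(depth), 4))
```

Under these assumptions, a heterozygote covered by only two reads shows both alleles half the time, so low-depth data carry more genotype uncertainty per marker than array calls.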
The choice between SNP arrays and WGS often depends on the specific requirements of the breeding program, including the species, genome size, and available resources (Bhat et al., 2016; Moraes et al., 2018).

Cost and efficiency are critical factors in the selection of genotyping methods. SNP arrays, while having a high initial development cost, offer a cost-effective solution for routine genotyping once established. For example, species-specific SNP arrays can be expensive to develop, but once available they provide high-throughput and reliable genotyping for large breeding populations (Grattapaglia et al., 2011; Moraes et al., 2018). On the other hand, GBS and other next-generation sequencing (NGS)-based methods offer flexibility and lower initial costs, making them suitable for species where SNP arrays are not available or economically feasible (He et al., 2014; Gorjanc et al., 2015). The continuous decline in sequencing costs is expected to further enhance the feasibility of WGS for GS in the near future (Bhat et al., 2016).

3.2 Phenotypic data collection

Accurate phenotypic data are essential for the success of GS. High-throughput phenotyping technologies are being developed to complement genotyping efforts, enabling the collection of large-scale, precise phenotypic data. These technologies include automated imaging systems, remote sensing, and various sensor-based methods that can capture complex traits in real time. The integration of high-throughput phenotyping with genotyping data is crucial for improving the accuracy of genomic predictions and achieving significant genetic gains in breeding programs (Figure 1) (Bhat et al., 2016; Wang et al., 2016). Bhat et al. (2016) found that combining high-throughput phenotyping (HTP) with genomic estimated breeding values (GEBV) enables precise prediction of an individual’s breeding value, thereby accelerating the identification, testing, and promotion of superior genotypes.
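As a rough illustration of how genotypes and phenotypes are combined into GEBVs, the sketch below fits a ridge-regression model of marker effects on a training population and then sums those effects for each selection candidate. The source does not specify a prediction model, so ridge regression, the simulated data, the shrinkage value `lam`, and the function names `fit_marker_effects` and `predict_gebv` are all assumptions made for illustration only.

```python
import numpy as np


def fit_marker_effects(genotypes, phenotypes, lam=1.0):
    """Estimate marker effects by ridge regression (illustrative assumption).

    genotypes: (n_individuals, n_markers) array coded 0/1/2 (allele counts).
    phenotypes: (n_individuals,) array of trait values.
    lam: hypothetical shrinkage parameter; in practice it would be tuned.
    Returns (marker_means, trait_mean, effects).
    """
    X = np.asarray(genotypes, dtype=float)
    y = np.asarray(phenotypes, dtype=float)
    marker_means = X.mean(axis=0)
    trait_mean = y.mean()
    Xc = X - marker_means  # centre each marker column
    yc = y - trait_mean    # centre the phenotypes
    n_markers = X.shape[1]
    # Solve (Xc'Xc + lam * I) b = Xc'yc for the marker effects b.
    lhs = Xc.T @ Xc + lam * np.eye(n_markers)
    effects = np.linalg.solve(lhs, Xc.T @ yc)
    return marker_means, trait_mean, effects


def predict_gebv(genotypes, marker_means, effects):
    """GEBV = sum over markers of (centred allele count x estimated effect)."""
    X = np.asarray(genotypes, dtype=float)
    return (X - marker_means) @ effects


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n_train, n_candidates, n_markers = 200, 20, 50
    true_effects = rng.normal(0.0, 0.5, n_markers)  # simulated, not real data
    train = rng.integers(0, 3, size=(n_train, n_markers))
    candidates = rng.integers(0, 3, size=(n_candidates, n_markers))
    pheno = train @ true_effects + rng.normal(0.0, 1.0, n_train)

    means, mu, b = fit_marker_effects(train, pheno, lam=1.0)
    gebv = predict_gebv(candidates, means, b)
    ranking = np.argsort(gebv)[::-1]  # candidates ordered from best to worst
    print("Top candidate index:", ranking[0])
```

In this sketch the candidates are ranked from genotypes alone, which is how GEBVs allow selection before phenotypes are available; more precise phenotypes in the training set would be expected to yield better effect estimates.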
NGS and HTP technologies significantly enhance the efficiency and accuracy of genomic selection by increasing the coverage of genotype data and the precision of phenotype data collection, speeding up the breeding process for superior varieties. The application of these technologies reduces costs, optimizes breeding resources, and provides powerful tools for crop improvement.

3.3 Data integration and management

The integration and management of large-scale genotypic and phenotypic data pose significant challenges. Effective data management systems are required to handle the vast amounts of data generated by high-throughput genotyping and phenotyping technologies. These systems must support data storage, retrieval, and analysis,
