
relationships. Today, deep learning models that integrate multi-source data have significantly improved prediction accuracy. AI is not limited to classifying images; it already does substantial practical work in agriculture. Cotton yield prediction is a case in point: where growers once relied on experience, deep learning now removes much of the guesswork. Yang et al. (2025) noted that structures such as convolutional and pooling layers can automatically extract useful features from complex data without manual variable selection, and can process inputs from different sources, whether images or sensor readings, within a single model. This is not only a theoretical claim; studies have verified it. When CNN models are combined with images captured by unmanned aerial vehicles (UAVs), the average error of plot-level yield prediction can be kept within 8%. Some teams have gone further, for example using an improved YOLOv8 to detect cotton bolls and estimate yield from the counts, with an error of only about 7.7%, a remarkable result. Feng et al. (2025) developed a yield estimation method that combines multispectral remote sensing with machine learning: relying on a single sensor gave only moderate accuracy, but once the visible, red-edge, and near-infrared bands were integrated, prediction accuracy rose markedly. In this sense, letting machine learning assemble the full picture from multiple features is considerably more reliable than traditional methods built on a single variable. Beyond yield estimation, AI can also categorize and "identify" cotton: tools such as support vector machines, random forests, and deep neural networks can determine whether a variety is insect-resistant cotton or recognize growth stages such as the seedling, bud, and flowering stages.

3.3 Multi-modal data fusion and automated feature extraction
The phenotype of cotton is shaped jointly by genes, environment, and planting management, so understanding it requires combining data from different aspects. Multimodal data fusion means analyzing data of different types and from different sources together, for instance images, spectra, meteorology, and soil, so that the status of the cotton crop can be described more comprehensively and accurately. In cotton phenotypic analysis, multimodal fusion has become an important way to improve model performance and to uncover problems that a single source would miss. An optical image of the canopy alone, for example, may not be able to distinguish nitrogen deficiency from water stress; adding thermal infrared images (which reveal leaf temperature) and soil moisture data makes it much easier to determine which problem is present. This "1+1>2" effect has been demonstrated in many studies. Wang et al. (2022) combined Sentinel-2 satellite imagery from multiple time points with meteorological information and not only improved the accuracy of cotton yield prediction but also identified which growth period is most suitable for yield estimation.
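As an illustration of this kind of multi-source fusion, the sketch below combines vegetation indices derived from multispectral bands with meteorological features in a random-forest yield regressor. It is a minimal example on synthetic placeholder data, not the pipeline used by Feng et al. (2025) or Wang et al. (2022); the band arrays, weather columns, and yield values are all assumed for illustration.

```python
# Minimal sketch: fusing multispectral vegetation indices with
# meteorological features for plot-level yield regression.
# All inputs below are synthetic placeholders, not real measurements.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score

def vegetation_indices(red, nir, red_edge):
    """Per-plot mean indices from reflectance arrays (one raster per plot)."""
    ndvi = (nir - red) / (nir + red + 1e-6)            # canopy greenness
    ndre = (nir - red_edge) / (nir + red_edge + 1e-6)  # nitrogen-status proxy
    return np.array([ndvi.mean(), ndre.mean()])

n_plots = 120
rng = np.random.default_rng(0)

# Hypothetical per-plot reflectance rasters (red, NIR, red-edge bands)
spectral = np.vstack([
    vegetation_indices(rng.random((32, 32)), rng.random((32, 32)), rng.random((32, 32)))
    for _ in range(n_plots)
])
# Hypothetical meteorological features (e.g., degree days, rainfall, mean temperature)
weather = rng.random((n_plots, 3))

X = np.hstack([spectral, weather])      # multi-source feature fusion
y = rng.random(n_plots) * 2000 + 3000   # placeholder yields (kg/ha)

model = RandomForestRegressor(n_estimators=300, random_state=0)
scores = cross_val_score(model, X, y, cv=5, scoring="neg_mean_absolute_error")
print("Cross-validated mean absolute error:", -scores.mean())
```

The point of the sketch is the feature-level fusion step: spectral indices and weather variables enter the model as one feature matrix, mirroring the idea that combined sources predict yield better than any single variable.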
Multimodal data fusion can overcome the limitations of a single data source and provide more comprehensive explanatory power for complex agronomic traits. AI-driven platforms are naturally suited to multimodal fusion analysis. On the one hand, advances in sensor technology make it possible to acquire multimodal data simultaneously: an unmanned aerial vehicle (UAV) platform, for instance, can carry an RGB camera for visible-light images, a multispectral camera for vegetation indices, a hyperspectral imager for fine spectral curves, and a thermal imager for temperature distributions, collecting multimodal phenotypes in a single flight. Ground-based phenotyping vehicles can likewise integrate LiDAR and imaging sensors to capture three-dimensional structural and spectral information at the same time. On the other hand, machine learning, and deep learning in particular, provides a powerful framework for multimodal data fusion. A convolutional neural network can use separate branches to process each modality and then fuse them at the level of high-level features, and attention-based models can automatically learn the weight assigned to each modality (Zhang et al., 2024). Multimodal fusion not only improves prediction accuracy but also offers a new perspective for revealing the relationships between different data sources and traits. Automated feature extraction is another important characteristic of AI phenotypic analysis. Traditional analysis often relies on manual feature selection, such as choosing specific band ratios as vegetation indices or selecting several hand-crafted features in advance.
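The sketch below illustrates the multi-branch idea described above: two convolutional branches learn features directly from raw RGB and thermal images, with no hand-picked indices, and a small attention layer learns a weight for each modality before fusion. It is a minimal PyTorch example, not the architecture of Zhang et al. (2024); the choice of modalities, input sizes, and layer widths are assumptions made for illustration.

```python
# Minimal sketch of multi-branch fusion with learned modality weights.
# Shapes, widths, and the two modalities (RGB + thermal) are illustrative.
import torch
import torch.nn as nn

class TwoModalityFusion(nn.Module):
    def __init__(self, n_outputs=1):
        super().__init__()
        # Branch 1: RGB canopy image (3 x 64 x 64)
        self.rgb_branch = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
        )
        # Branch 2: thermal image (1 x 64 x 64)
        self.thermal_branch = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
        )
        # Attention: learn one weight per modality from the concatenated features
        self.modality_attention = nn.Sequential(nn.Linear(64, 2), nn.Softmax(dim=1))
        self.head = nn.Linear(32, n_outputs)  # e.g., a yield or trait value

    def forward(self, rgb, thermal):
        f_rgb = self.rgb_branch(rgb)         # (batch, 32)
        f_th = self.thermal_branch(thermal)  # (batch, 32)
        w = self.modality_attention(torch.cat([f_rgb, f_th], dim=1))  # (batch, 2)
        fused = w[:, :1] * f_rgb + w[:, 1:] * f_th  # weighted high-level fusion
        return self.head(fused)

model = TwoModalityFusion()
pred = model(torch.randn(4, 3, 64, 64), torch.randn(4, 1, 64, 64))
print(pred.shape)  # torch.Size([4, 1])
```

Because the branches operate directly on raw images, the features that feed the fusion step are learned rather than manually selected, which is the automated-feature-extraction point made in the paragraph above.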
