Abstract:
Objective Feed areas in monitoring images are elongated, have fuzzy boundaries, and vary widely in shape and size. This study aimed to segment the residual-feed and consumed-feed areas more accurately, so that feed consumption status can be monitored reliably.
Method This study proposed a semantic segmentation model based on Swin-Unet. A ConvNeXt block was applied at the beginning of the Swin Transformer block to strengthen the model’s encoding of feature information and provide a better feature representation. Depth-wise convolution replaced the linear attention projection to supply local spatial context information, and a novel wide receptive field module replaced the multi-layer perceptron to enrich multi-scale spatial context information. In addition, at the beginning of the encoder, the linear embedding layer was replaced with a convolutional embedding layer that compresses features in stages, introducing more spatial context information between and within patches. Finally, a multi-scale input strategy, a deep supervision strategy and a feature fusion module were introduced to strengthen feature fusion.
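To illustrate why a depth-wise convolution supplies local spatial context that a per-token linear projection does not, the following is a minimal pure-Python sketch, not the authors' implementation. The function name `depthwise_conv2d`, the 3x3 kernel size, the stride of 1 and the zero padding are all assumptions made for illustration; the abstract does not specify them.

```python
def depthwise_conv2d(x, kernels):
    """Depth-wise 3x3 convolution (illustrative sketch).

    x:       feature map as nested lists with shape C x H x W.
    kernels: one 3x3 kernel per channel, shape C x 3 x 3 (assumed size).
    Uses stride 1 and zero padding of 1, so the output is also C x H x W.
    Each channel is filtered only with its own kernel, so no information
    is mixed across channels; each output value mixes a 3x3 neighbourhood.
    """
    out = []
    for channel, kernel in zip(x, kernels):
        h, w = len(channel), len(channel[0])
        result = [[0.0] * w for _ in range(h)]
        for i in range(h):
            for j in range(w):
                acc = 0.0
                for di in (-1, 0, 1):
                    for dj in (-1, 0, 1):
                        ii, jj = i + di, j + dj
                        # Positions outside the map count as zero (padding).
                        if 0 <= ii < h and 0 <= jj < w:
                            acc += channel[ii][jj] * kernel[di + 1][dj + 1]
                result[i][j] = acc
        out.append(result)
    return out


# Usage: one channel of ones filtered with an all-ones kernel counts,
# for each position, how many neighbours fall inside the map.
ones = [[[1.0] * 3 for _ in range(3)]]
box = [[[1.0] * 3 for _ in range(3)]]
summed = depthwise_conv2d(ones, box)
```

In this sketch each output value depends on its spatial neighbours, whereas a linear projection transforms every token independently of the tokens around it.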
Result The proposed method achieved a mean intersection over union of 86.46%, an accuracy of 98.60%, an F1-score of 92.29% and an operation speed of 23 frames/s. Compared with Swin-Unet, the first three metrics were higher by 4.36, 2.90 and 0.65 percentage points respectively, and the operation speed was 15% higher.
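For reference, the three accuracy metrics reported above can be computed from a pixel-level confusion matrix. The sketch below shows one common formulation; the function name `segmentation_metrics` is hypothetical, and averaging the F1-score over classes (macro averaging) is an assumption, since the abstract does not state how the F1-score was aggregated.

```python
def segmentation_metrics(conf):
    """Compute mIoU, pixel accuracy and macro F1 from a confusion matrix.

    conf[i][j] is the number of pixels whose true class is i and whose
    predicted class is j. Macro averaging of F1 is an assumption here.
    """
    n = len(conf)
    total = sum(sum(row) for row in conf)
    ious, f1s = [], []
    for k in range(n):
        tp = conf[k][k]
        fn = sum(conf[k]) - tp                          # true k, predicted other
        fp = sum(conf[i][k] for i in range(n)) - tp     # predicted k, true other
        iou_den = tp + fp + fn
        f1_den = 2 * tp + fp + fn
        ious.append(tp / iou_den if iou_den else 0.0)
        f1s.append(2 * tp / f1_den if f1_den else 0.0)
    correct = sum(conf[k][k] for k in range(n))
    return {
        "miou": sum(ious) / n,
        "accuracy": correct / total if total else 0.0,
        "f1": sum(f1s) / n,
    }


# Usage with a made-up two-class confusion matrix (not data from the study).
metrics = segmentation_metrics([[8, 2], [1, 9]])
```

Because intersection over union penalizes both false positives and false negatives in its denominator, the mean intersection over union is typically lower than pixel accuracy on the same predictions, which is consistent with the ordering of the reported values.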
Conclusion Applying image semantic segmentation to the automatic monitoring of feed consumption status is feasible. By introducing convolution into Swin-Unet, the proposed method effectively improves both segmentation accuracy and computing efficiency, which is of great significance for improving production management efficiency.