PAN Weihao, SHENG Huizi, WANG Chunyu, et al. Pig state audio recognition based on underdetermined blind source separation and deep learning[J]. Journal of South China Agricultural University, 2024, 45(5): 730-742. DOI: 10.7671/j.issn.1001-411X.202312011

    Pig state audio recognition based on underdetermined blind source separation and deep learning

    • Objective To address the difficulty of separating and recognizing pig audio in group-rearing environments, we propose a pig state audio recognition method based on underdetermined blind source separation and ECA-EfficientNetV2.
      Method Four types of pig audio signals were mixed to simulate the observation signals recorded in a group-rearing environment. After the signals were sparsely represented, the mixing matrix was estimated by hierarchical clustering, and the pig audio signals were reconstructed by solving an lp-norm minimization problem. The reconstructed signals were converted into spectrograms in four categories: eating, roaring, humming and estrus sounds. The spectrograms were then classified with the ECA-EfficientNetV2 network model to obtain the state of the pigs.
      Result The normalized mean square error of the mixing matrix estimate was as low as 3.266×10⁻⁴, and the signal-to-noise ratios of the separated, reconstructed audio ranged from 3.254 to 4.267 dB. ECA-EfficientNetV2 recognized the spectrograms with an accuracy of up to 98.35%, which is 2.88 and 1.81 percentage points higher than the classical convolutional neural networks ResNet50 and VGG16, respectively. Compared with the original EfficientNetV2, the accuracy decreased by 0.52 percentage points, but the number of model parameters was reduced by 33.56%, the floating-point operations (FLOPs) by 1.86 G, and the inference time by 9.40 ms.
      Conclusion The method, which combines blind source separation with an improved EfficientNetV2, separates and recognizes the audio signals of group-raised pigs in a lightweight and efficient manner.
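
The separation steps in the Method paragraph (clustering-based mixing matrix estimation followed by lp-norm reconstruction) can be illustrated with a minimal Python sketch on toy data. This is not the authors' implementation. Everything below is an assumption made for illustration: two observation channels, four synthetic sources with exactly one source active per sample (an idealized form of sparsity, with no time-frequency transform), average-linkage clustering with a cosine metric, p = 0.5, a search over basic solutions as a simple stand-in for the lp-norm solver, and the NMSE and SNR definitions in the code. All function names are hypothetical.

```python
from itertools import combinations

import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage


def estimate_mixing_matrix(X, n_sources, energy_quantile=0.8):
    """Estimate unit-norm mixing columns from sparse observations X of shape (m, T)."""
    norms = np.linalg.norm(X, axis=0)
    keep = norms >= np.quantile(norms, energy_quantile)   # drop low-energy samples
    V = X[:, keep] / norms[keep]                           # project onto the unit sphere
    V = V * np.where(V[0] < 0, -1.0, 1.0)                  # treat v and -v as one direction
    Z = linkage(V.T, method="average", metric="cosine")    # agglomerative clustering
    labels = fcluster(Z, t=n_sources, criterion="maxclust")
    centers = [V[:, labels == k].mean(axis=1) for k in np.unique(labels)]
    A_hat = np.stack(centers, axis=1)
    return A_hat / np.linalg.norm(A_hat, axis=0)


def align_columns(A_true, A_hat):
    """Greedily reorder and sign-flip A_hat so its columns match A_true (toy helper)."""
    corr = A_true.T @ A_hat
    order = np.argmax(np.abs(corr), axis=1)
    signs = np.sign(corr[np.arange(A_true.shape[1]), order])
    return A_hat[:, order] * signs


def nmse(A_true, A_est):
    """Assumed definition: squared Frobenius error relative to the true matrix."""
    return np.sum((A_true - A_est) ** 2) / np.sum(A_true ** 2)


def lp_reconstruct(X, A, p=0.5):
    """Per sample, keep the basic solution of A s = x with the smallest sum |s_i|^p."""
    m, n = A.shape
    S = np.zeros((n, X.shape[1]))
    best = np.full(X.shape[1], np.inf)
    for idx in combinations(range(n), m):
        idx = list(idx)
        coef = np.linalg.solve(A[:, idx], X)               # coefficients for this column subset
        cost = np.sum(np.abs(coef) ** p, axis=0)
        better = cost < best
        S[:, better] = 0.0                                 # clear the previous best solution
        S[np.ix_(idx, np.flatnonzero(better))] = coef[:, better]
        best[better] = cost[better]
    return S


def snr_db(s_true, s_est):
    """Assumed definition: signal energy over reconstruction-error energy, in dB."""
    return 10 * np.log10(np.sum(s_true ** 2) / np.sum((s_true - s_est) ** 2))


# Toy data: 4 sparse sources mixed into 2 channels (underdetermined case).
rng = np.random.default_rng(0)
n, T = 4, 2000
angles = np.deg2rad([10.0, 35.0, 55.0, 80.0])
A = np.vstack([np.cos(angles), np.sin(angles)])            # true mixing matrix, shape (2, 4)
S = np.zeros((n, T))
S[rng.integers(0, n, size=T), np.arange(T)] = rng.laplace(size=T)
X = A @ S + 0.01 * rng.standard_normal((2, T))             # observations with mild noise

A_hat = align_columns(A, estimate_mixing_matrix(X, n))
S_hat = lp_reconstruct(X, A_hat)
err = nmse(A, A_hat)
snrs = [snr_db(S[k], S_hat[k]) for k in range(n)]
print(f"NMSE = {err:.3e}; SNR per source (dB) = {np.round(snrs, 2)}")
```

Because the toy sources never overlap, the figures this sketch prints are not comparable with the values reported in the Result paragraph; real pig audio would first need a sparsifying transform and would overlap in time and frequency.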