Citation: ZHOU Yuncheng, LIU Zhongying, DENG Hanbing, et al. Depth estimation for corn plant images based on hybrid group dilated convolution[J]. Journal of South China Agricultural University, 2024, 45(2): 280-292. DOI: 10.7671/j.issn.1001-411X.202304019

Depth estimation for corn plant images based on hybrid group dilated convolution

More Information
  • Received Date: April 13, 2023
  • Available Online: December 10, 2023
  • Published Date: December 06, 2023
    Objective

    To study image depth estimation methods for corn field scenes, to solve the insufficient accuracy of depth estimation models caused by the lack of effective photometric loss measures, and to provide technical support for the vision system design and navigation obstacle avoidance of intelligent field agricultural machinery.

    Method 

    This study used binocular cameras as visual sensors and proposed an unsupervised depth estimation model based on hybrid group dilated convolution. A hybrid group dilated convolution structure and a matching self-attention regulation mechanism were designed, inverted residual modules were built from them, and these modules were stacked into a deep neural network serving as the model backbone. Illumination-insensitive image gradients and Gabor texture features were introduced into the measurement of apparent differences between views, and the model optimization objective was constructed on that basis. Model training and validation experiments were carried out on corn plant images. The two sketches below illustrate the grouped convolution structure and the appearance loss.
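
    The structural idea can be made concrete with a short sketch. The following is a minimal PyTorch interpretation, assuming an SE-style channel gate as the self-attention regulation mechanism; the class name HybridGroupDilatedConv, the dilation factors and the bottleneck ratio are illustrative assumptions, not the paper's published implementation.

```python
# Minimal sketch: channels are split into groups, each group is convolved
# with a different dilation factor, and an SE-style attention branch
# re-weights the groups so the effective receptive field can adapt.
import torch
import torch.nn as nn

class HybridGroupDilatedConv(nn.Module):
    def __init__(self, channels: int, dilations=(1, 2, 3, 5)):
        super().__init__()
        assert channels % len(dilations) == 0, "channels must split evenly"
        self.group_channels = channels // len(dilations)
        # One 3x3 convolution per group, each with its own dilation factor;
        # padding equals dilation so the spatial size is preserved.
        self.branches = nn.ModuleList([
            nn.Conv2d(self.group_channels, self.group_channels, 3,
                      padding=d, dilation=d)
            for d in dilations
        ])
        # SE-style gate over the concatenated groups: global average pooling,
        # a small bottleneck MLP, and a sigmoid (bottleneck ratio assumed).
        self.attention = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // 4, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // 4, channels, 1),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        groups = torch.split(x, self.group_channels, dim=1)
        out = torch.cat(
            [branch(g) for branch, g in zip(self.branches, groups)], dim=1)
        # Channel-wise gating: groups with unhelpful dilation factors can be
        # suppressed, one way to realize receptive-field selection.
        return out * self.attention(out)

# Example: a 64-channel feature map split over 4 dilation factors.
feats = torch.randn(1, 64, 48, 64)
print(HybridGroupDilatedConv(64)(feats).shape)  # torch.Size([1, 64, 48, 64])
```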

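    The appearance loss can be sketched in the same spirit. Below is a hedged illustration of a photometric objective that compares image gradients in addition to raw intensities, since gradients are insensitive to slowly varying illumination; the alpha weighting and the use of forward differences are assumptions, and the paper's Gabor texture term could be added analogously with a fixed Gabor filter bank.

```python
# Hedged sketch of an illumination-tolerant appearance loss between a target
# view and a view warped from the other camera using the predicted depth.
import torch

def image_gradients(img: torch.Tensor):
    """Forward differences along width and height for an (N, C, H, W) tensor."""
    gx = img[:, :, :, 1:] - img[:, :, :, :-1]
    gy = img[:, :, 1:, :] - img[:, :, :-1, :]
    return gx, gy

def appearance_loss(target: torch.Tensor, warped: torch.Tensor,
                    alpha: float = 0.5) -> torch.Tensor:
    # Raw intensity difference between the views.
    l1 = (target - warped).abs().mean()
    # Gradient difference: tolerant to illumination changes between views.
    tgx, tgy = image_gradients(target)
    wgx, wgy = image_gradients(warped)
    grad = (tgx - wgx).abs().mean() + (tgy - wgy).abs().mean()
    # alpha balances the two terms; its value here is an assumption.
    return (1 - alpha) * l1 + alpha * grad
```
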
    Result 

    Compared with a fixed dilation factor, the hybrid grouping reduced the average relative error of field corn plant depth estimation by 63.9%, and reduced the average absolute error and root mean square error by 32.3% and 10.2% respectively, significantly improving model accuracy. Introducing image gradients, Gabor texture features and the self-attention mechanism further reduced the mean absolute error and root mean square error of field scene depth estimation by 3.2% and 4.6% respectively. Increasing the network width and depth of the shallow encoder stages significantly improved depth estimation accuracy, whereas the same treatment had no obvious effect on the deep encoder stages. The self-attention mechanism designed in this study was selective toward the convolution groups with different dilation factors in the shallow inverted residual modules of the encoder, indicating that the mechanism can adjust the receptive field. Compared with Monodepth2, the proposed model reduced the average relative error and average absolute error of field corn plant depth estimation by 48.2% and 17.1% respectively. Within a sampling range of 20 m, the average absolute error of the estimated depth was no more than 16 cm, and the inference speed was 14.3 frames per second. The error metrics are computed as sketched below.
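
    The quoted error metrics are the conventional ones for depth estimation (cf. the protocol of Eigen et al. [9]). A minimal sketch of how absolute relative error, mean absolute error and root mean square error are typically computed over valid ground-truth pixels follows; the function name and the valid-pixel mask are illustrative.

```python
# Standard depth-estimation error metrics over valid ground-truth pixels.
import torch

def depth_metrics(pred: torch.Tensor, gt: torch.Tensor,
                  max_depth: float = 20.0):
    # Restrict evaluation to pixels with valid ground truth inside the
    # sampling range (20 m in the experiments reported above).
    valid = (gt > 0) & (gt <= max_depth)
    p, g = pred[valid], gt[valid]
    abs_rel = ((p - g).abs() / g).mean()   # average relative error
    mae = (p - g).abs().mean()             # average absolute error
    rmse = ((p - g) ** 2).mean().sqrt()    # root mean square error
    return abs_rel.item(), mae.item(), rmse.item()
```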

    Conclusion 

    The image depth estimation model based on hybrid group dilated convolution outperforms existing methods, effectively improves depth estimation accuracy, and meets the depth estimation requirements of field corn plant images.

  • [1]
    MALAVAZI F B P, GUYONNEAU R, FASQUEL J B, et al. LiDAR-only based navigation algorithm for an autonomous agricultural robot[J]. Computers and Electronics in Agriculture, 2018, 154: 71-79. doi: 10.1016/j.compag.2018.08.034
    [2]
MAO Wenju, LIU Heng, WANG Xiaole, et al. Design and experiment of orchard transport robot with dual navigation modes[J]. Transactions of the Chinese Society for Agricultural Machinery, 2022, 53(3): 27-39.
    [3]
WANG Liang, ZHAI Zhiqiang, ZHU Zhongxiang, et al. Method for tractor recognition and localization based on depth image and neural network[J]. Transactions of the Chinese Society for Agricultural Machinery, 2020, 51(S2): 554-560.
    [4]
HE Yong, JIANG Hao, FANG Hui, et al. Research progress of intelligent obstacle detection methods for vehicles and their agricultural applications[J]. Transactions of the Chinese Society of Agricultural Engineering, 2018, 34(9): 21-32. doi: 10.11975/j.issn.1002-6819.2018.09.003
    [5]
JING Liang, WANG Rui, LIU Hui, et al. Orchard pedestrian detection and localization based on binocular camera and improved YOLOv3 algorithm[J]. Transactions of the Chinese Society for Agricultural Machinery, 2020, 51(9): 34-39. doi: 10.6041/j.issn.1000-1298.2020.09.004
    [6]
WEI Jiansheng, PAN Shuguo, TIAN Guangzhao, et al. Design and experiment of a binocular vision obstacle perception system for agricultural vehicles[J]. Transactions of the Chinese Society of Agricultural Engineering, 2021, 37(9): 55-63. doi: 10.11975/j.issn.1002-6819.2021.09.007
    [7]
ZHAI Zhiqiang, XIONG Kun, WANG Liang, et al. Crop row detection and tracking based on binocular vision and adaptive Kalman filter[J]. Transactions of the Chinese Society of Agricultural Engineering, 2022, 38(8): 143-151.
    [8]
HONG Zijia, LI Yanming, LIN Hongzhen, et al. Method for detecting farmland boundary distance in the pre-planting stage based on binocular vision[J]. Transactions of the Chinese Society for Agricultural Machinery, 2022, 53(5): 27-33.
    [9]
    EIGEN D, PUHRSCH C, FERGUS R. Depth map prediction from a single image using a multi-scale deep network[C]//Proceedings of the 27th International Conference on Neural Information Processing Systems (NIPS 2014). Montreal, Canada: ACM, 2014: 2366-2374.
    [10]
    XIE J Y, GIRSHICK R, FARHADI A. Deep3D: Fully automatic 2D-to-3D video conversion with deep convolutional neural networks[C]//14th European Conference on Computer Vision (ECCV 2016). Amsterdam, Netherlands: Springer, 2016: 842-857.
    [11]
    GARG R, BG V K, CARNEIRO G, et al. Unsupervised CNN for single view depth estimation: Geometry to the rescue[C]//14th European Conference on Computer Vision (ECCV 2016). Amsterdam, Netherlands: Springer, 2016: 740-756.
    [12]
    ZHOU T H, BROWN M, SNAVELY N, et al. Unsupervised learning of depth and ego-motion from video[C]//2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Honolulu, HI, USA: IEEE, 2017: 6612-6619.
    [13]
GODARD C, MAC AODHA O, BROSTOW G J. Unsupervised monocular depth estimation with left-right consistency[C]//2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Honolulu, HI, USA: IEEE, 2017: 6602-6611.
    [14]
GODARD C, MAC AODHA O, FIRMAN M, et al. Digging into self-supervised monocular depth estimation[C]//2019 IEEE/CVF International Conference on Computer Vision (ICCV). Seoul, Korea: IEEE, 2020: 3827-3837.
    [15]
ZHOU Yuncheng, DENG Hanbing, XU Tongyu, et al. Unsupervised depth estimation model of tomato plant images based on dense autoencoder[J]. Transactions of the Chinese Society of Agricultural Engineering, 2020, 36(11): 182-192.
    [16]
    PILZER A, XU D, PUSCAS M, et al. Unsupervised adversarial depth estimation using cycled generative networks[C]//2018 International Conference on 3D Vision. Verona, Italy: IEEE, 2018: 587-595.
    [17]
    MIYATO T, KATAOKA T, KOYAMA M, et al. Spectral normalization for generative adversarial networks[EB/OL]. arXiv: 1802.05957. https://arxiv.org/abs/1802.05957.pdf.
    [18]
    WAN Y C, ZHAO Q K, GUO C, et al. Multi-sensor fusion self-supervised deep odometry and depth estimation[J]. Remote Sensing, 2022, 14(5): 1228. doi: 10.3390/rs14051228.
    [19]
    JADERBERG M, SIMONYAN K, ZISSERMAN A, et al. Spatial transformer networks[C]//Proceedings of the 28th International Conference on Neural Information Processing Systems (NIPS 2015). Montreal, Canada: ACM, 2015: 2017-2025.
    [20]
    IDRISSA M, ACHEROY M. Texture classification using Gabor filters[J]. Pattern Recognition Letters, 2002, 23(9): 1095-1102. doi: 10.1016/S0167-8655(02)00056-9
    [21]
    RONNEBERGER O, FISCHER P, BROX T. U-net: Convolutional networks for biomedical image segmentation[C]//International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI 2015). Munich, Germany: Springer, 2015: 234-241.
    [22]
YU F, KOLTUN V. Multi-scale context aggregation by dilated convolutions[C]//4th International Conference on Learning Representations (ICLR). San Juan, Puerto Rico, 2016.
    [23]
    WANG P Q, CHEN P F, YUAN Y, et al. Understanding convolution for semantic segmentation[C]//2018 IEEE Winter Conference on Applications of Computer Vision. Lake Tahoe, NV, USA: IEEE, 2018: 1451-1460.
    [24]
    YU F, KOLTUN V, FUNKHOUSER T. Dilated residual networks[C]//2017 IEEE Conference on Computer Vision and Pattern Recognition. Honolulu, HI, USA: IEEE, 2017: 636-644.
    [25]
    SZEGEDY C, LIU W, JIA Y Q, et al. Going deeper with convolutions[C]//2015 IEEE Conference on Computer Vision and Pattern Recognition. Boston, MA, USA: IEEE, 2015: 1-9.
    [26]
    SZEGEDY C, IOFFE S, VANHOUCKE V, et al. Inception-v4, inception-ResNet and the impact of residual connections on learning[C]//Proceedings of the 31st AAAI Conference on Artificial Intelligence (AAAI-17). San Francisco, California, USA: ACM, 2017: 4278-4284.
    [27]
    CHEN Y P, FAN H Q, XU B, et al. Drop an octave: Reducing spatial redundancy in convolutional neural networks with octave convolution[C]//2019 IEEE/CVF International Conference on Computer Vision. Seoul, Korea: IEEE, 2020: 3434-3443.
    [28]
CHOLLET F. Xception: Deep learning with depthwise separable convolutions[C]//2017 IEEE Conference on Computer Vision and Pattern Recognition. Honolulu, HI, USA: IEEE, 2017: 1800-1807.
    [29]
    SANDLER M, HOWARD A, ZHU M, et al. MobileNetV2: Inverted residuals and linear bottlenecks[C]//2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City, UT, USA: IEEE, 2018: 4510-4520.
    [30]
    HOWARD A, SANDLER M, CHEN B, et al. Searching for MobileNetV3[C]//2019 IEEE/CVF International Conference on Computer Vision. Seoul, Korea: IEEE, 2020: 1314-1324.
    [31]
    HU J, SHEN L, SUN G. Squeeze-and-excitation networks[C]//2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City, UT, USA: IEEE, 2018: 7132-7141.
    [32]
KINGMA D P, BA J. Adam: A method for stochastic optimization[EB/OL]. arXiv: 1412.6980. https://arxiv.org/abs/1412.6980.
    [33]
    HIRSCHMÜLLER H. Stereo processing by semiglobal matching and mutual information[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2008, 30(2): 328-341. doi: 10.1109/TPAMI.2007.1166
    [34]
SMOLYANSKIY N, KAMENEV A, BIRCHFIELD S. On the importance of stereo for accurate depth estimation: An efficient semi-supervised deep neural network approach[C]//2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops. Salt Lake City, UT, USA: IEEE, 2018: 11200-11208.
