A lightweight crop pest identification method based on multi-head attention
-
Abstract: Objective To address the problems that current pest identification methods have many parameters and a heavy computational load, making them difficult to deploy on edge embedded devices, and thereby to achieve accurate identification of crop pests and diseases and improve crop yield and quality.
Method A lightweight convolutional network fusing multi-head attention, named multi-head attention to convolutional neural network (M2CNet), was proposed. M2CNet adopts a hierarchical pyramid structure. First, a local capture block is built by combining a depthwise separable residual with a cyclic fully connected residual to capture short-range information. Second, a lightweight global capture block is built by combining global subsampled attention with a lightweight feed-forward network to capture long-range information. Three variants, M2CNet-S, M2CNet-B and M2CNet-L, are provided to meet different edge deployment requirements.
Result M2CNet-S/B/L have parameter counts of 1.8M, 3.5M and 5.8M, and floating point operations (FLOPs) of 0.23G, 0.39G and 0.60G, respectively. They achieved Top5 accuracy greater than 99.7% and Top1 accuracy greater than 95.9% on the PlantVillage disease dataset, and Top5 accuracy greater than 88.4% and Top1 accuracy greater than 67.0% on the IP102 pest dataset, outperforming models of comparable size.
Conclusion The proposed method identifies crop diseases and pests effectively and provides a useful reference for edge-side engineering deployment.
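As background for the parameter and FLOP figures quoted above, the saving from a depthwise separable convolution (one building block of the local capture block) can be checked with a short calculation. The layer sizes below are illustrative, not M2CNet's exact configuration, and bias terms are ignored:

```python
# Parameter and FLOP (multiply-accumulate) counts for a standard vs a
# depthwise-separable k x k convolution on an h x w feature map.
# Illustrative only: the sizes are hypothetical, not taken from M2CNet.

def standard_conv(cin, cout, k, h, w):
    params = cin * cout * k * k
    flops = params * h * w          # one MAC per weight per output position
    return params, flops

def depthwise_separable_conv(cin, cout, k, h, w):
    dw = cin * k * k                # one k x k filter per input channel
    pw = cin * cout                 # 1x1 pointwise channel mixing
    params = dw + pw
    flops = params * h * w
    return params, flops

p_std, f_std = standard_conv(144, 144, 3, 14, 14)
p_sep, f_sep = depthwise_separable_conv(144, 144, 3, 14, 14)
print(p_std, p_sep)  # the separable layer is roughly k*k times smaller
```

At these sizes the standard layer needs 186 624 weights against 22 032 for the separable one, which is the kind of reduction that keeps the M2CNet variants in the 1.8M to 5.8M parameter range.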
-
The Tibetan pig is a rare plateau-type local pig breed and a valuable indigenous genetic resource in China [1]. Surveys show that litter size in Tibetan pigs is not low; poor maternal nutrition, inadequate mammary gland development and harsh living conditions are likely the main causes of high piglet mortality [2]. Sound mammary development is a prerequisite for normal lactation, and piglet survival is closely tied to the sow's mammary development. Studying mammary gland development in pregnant Tibetan pigs is therefore important for judging whether the gland develops normally and for improving the reproductive performance of the breed.
Gestation is the key period for mammary development in sows, especially the last third of gestation (after day 75), when the gland develops rapidly, its mass increases quickly, and its structure shifts from being dominated by adipocytes in early gestation to being dominated by ducts and alveoli in late gestation [3]. Mammary development during gestation is regulated by hormones such as estradiol (E2), progesterone (P) and prolactin (PRL) [4]. E2 plays an important role in ductal elongation and branching [5]; P, like E2, is secreted by the ovary and regulates ductal branching and alveolus formation [6]; PRL promotes alveolar development and milk secretion [7-8]. At the signaling level, PI3K/Akt is an important intracellular signal transduction pathway with key biological functions in the proliferation, differentiation and apoptosis of mammary cells [9-11], and the Jak2/STAT5 pathway is an important regulator of alveologenesis and of the transcription of many milk protein genes [12-13]. However, mammary development during gestation in Tibetan pigs, and its regulation by hormones and signaling pathways, remain unclear.
In this study, Tibetan pigs were sampled at different time points of gestation. Building on observations of mammary morphology, we further examined serum E2, P and PRL levels, the expression of hormone receptors in the mammary gland, and changes in the key developmental signaling pathways PI3K/Akt and Jak2/STAT5 at each time point. The aim was to provide a preliminary picture of mammary development and its potential regulation in pregnant Tibetan pigs, supplying a scientific basis for revealing the developmental pattern of the Tibetan pig mammary gland and for conserving this local breed.
1. Materials and methods
1.1 Experimental animals
Pregnant Tibetan pigs were slaughtered and sampled at four time points (day 33, 50, 75 and 90 of gestation). Blood was collected and centrifuged to obtain serum; the 3rd and 4th pairs of mammary glands were collected for protein extraction, and tissue from the 4th pair near the teat was taken for paraffin sectioning and staining.
1.2 Materials
Estradiol, progesterone and prolactin ELISA kits were purchased from Nanjing Jiancheng Bioengineering Institute. Antibodies against prolactin receptor (PRLR, No. 382057), estrogen receptor (ER, No. 220467) and progesterone receptor (PR, No. 220124) were purchased from 正能生物 (ZenBio); antibodies against Janus kinase 2 (Jak2, No. 3230), phosphorylated Jak2 (p-Jak2, No. 3771), signal transducer and activator of transcription 5 (STAT5, No. 9359), phosphorylated STAT5 (p-STAT5, No. 4322), phosphatidylinositol 3-kinase (PI3K, No. 4249), phosphorylated PI3K (p-PI3K, No. 4228), protein kinase B (AKT, No. 9272) and phosphorylated AKT (p-AKT, No. 4060) were purchased from Cell Signaling Technology. Hematoxylin and eosin staining solutions, hematoxylin differentiation solution and bluing solution were purchased from Servicebio. The BCA protein assay kit was purchased from Baitaike Biotechnology Co., Ltd. (Beijing), and ECL chemiluminescence reagent from Shanghai Epizyme Biomedical Technology Co., Ltd.
1.3 Measurements and methods
Mammary gland collection and HE staining: the ventral gland of the right side (4th pair of mammary glands) was excised, fixed in 4% paraformaldehyde for 24 h, paraffin-embedded, sectioned and HE-stained, then observed and photographed under a microscope.
Western blot: tissue was homogenized in lysis buffer (100 μL per 10 μg of mammary tissue), protein was extracted with a kit, and protein concentration was determined with the BCA assay kit. Concentrations were adjusted and samples prepared with 5× loading buffer, and 20 μg of total protein per lane was loaded for electrophoresis. After transfer to a polyvinylidene fluoride (PVDF) membrane, the membrane was blocked, incubated with primary antibody overnight, washed with TBST buffer, incubated with secondary antibody, and washed again. ECL working solution was prepared 1∶1 according to the chemiluminescent substrate instructions and reacted with the membrane for 30 s before exposure and imaging in a chemiluminescence imager.
Serum hormone assays: assays followed the Nanjing Jiancheng ELISA kit instructions. Prepared samples, standards and biotin-labeled antigen were added and incubated at 37 ℃ for 30 min; the plate was washed 5 times, avidin-HRP was added, and the plate was incubated at 37 ℃ for 30 min; after 5 more washes, chromogenic solutions A and B were added and color was developed at 37 ℃ for 10 min; stop solution was then added, D450 nm was read within 10 min, and concentrations were calculated.
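The final step, turning a D450 nm reading into a concentration, can be illustrated with piecewise-linear interpolation along a standard curve. The standard concentrations and OD values below are hypothetical, and real kits often fit a four-parameter logistic curve instead; this is only a sketch of the read-off:

```python
# Reading a concentration off an ELISA standard curve by linear interpolation.
# The calibration points are hypothetical, not the kit's actual data.

def conc_from_od(od, standards):
    # standards: (concentration, OD450) pairs; interpolate between brackets
    standards = sorted(standards, key=lambda p: p[1])
    for (c0, a0), (c1, a1) in zip(standards, standards[1:]):
        if a0 <= od <= a1:
            return c0 + (od - a0) * (c1 - c0) / (a1 - a0)
    raise ValueError("OD outside the standard curve; dilute and re-assay")

# hypothetical standards: ng/L vs OD450
curve = [(0, 0.05), (10, 0.30), (20, 0.55), (40, 1.00), (80, 1.80)]
print(round(conc_from_od(0.425, curve), 1))  # midway between two standards
```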
1.4 Statistical analysis
Results are expressed as mean ± standard error. Statistical analysis was performed in SigmaPlot 12.5 using one-way ANOVA, with Duncan's method for multiple comparisons among groups.
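The one-way ANOVA step can be sketched as follows. The group values are made-up illustrations, not the measured hormone data, and the subsequent Duncan's multiple-comparison step is omitted:

```python
# F statistic for a one-way ANOVA: between-group mean square over
# within-group mean square. Group values are hypothetical illustrations.

def one_way_anova_f(groups):
    n = sum(len(g) for g in groups)
    k = len(groups)
    grand = sum(sum(g) for g in groups) / n
    ss_between = sum(len(g) * (sum(g) / len(g) - grand) ** 2 for g in groups)
    ss_within = sum(sum((x - sum(g) / len(g)) ** 2 for x in g) for g in groups)
    df_b, df_w = k - 1, n - k
    return (ss_between / df_b) / (ss_within / df_w)

# four hypothetical groups with well-separated means and small scatter
groups = [[14.1, 14.9, 14.7], [22.0, 22.6, 22.3],
          [27.2, 27.9, 27.6], [42.5, 43.1, 42.9]]
f = one_way_anova_f(groups)
print(round(f, 1))  # large F: means differ far more than within-group scatter
```

A large F licenses rejecting the null of equal group means; the per-pair letter groupings reported in the tables then come from the multiple-comparison procedure.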
2. Results and analysis
2.1 Morphological changes of the mammary gland during gestation in Tibetan pigs
HE staining (Figure 1) showed that on day 33 of gestation the Tibetan pig mammary gland consisted mainly of ductal structures; a few alveolar structures appeared by day 50; alveoli increased rapidly by day 75; and by day 90 the gland was dominated by alveolar structures.
2.2 Expression of mammary development marker proteins during gestation in Tibetan pigs
Western blotting was used to examine the mammary development marker proteins Elf-5 and PLIN2 at different gestation time points. Elf-5 expression on days 50, 75 and 90 was significantly higher than on day 33, and PLIN2 was significantly higher on days 75 and 90 than on day 33 (Figures 2 and 3).
Figure 3. Relative expressions of marker proteins for mammary gland development at different time points during gestation in Tibetan pigs. Different lowercase letters on bars of the same marker protein indicate significant differences (P<0.05, Duncan's method)
2.3 Serum hormone levels during gestation in Tibetan pigs
As shown in Table 1, serum E2, P and PRL levels rose as gestation progressed. E2 increased steadily, peaking at 42.82 ng/L on day 90. P was significantly higher on days 75 and 90 than on days 33 and 50, reaching 36.76 μg/L on day 90. PRL rose by day 50; on day 75 it was significantly higher than on day 33 but did not differ significantly from day 50, and on day 90 it peaked at 66.53 μg/L, significantly higher than at all other time points.
Table 1. Serum hormone levels at different time points during gestation in Tibetan pigs 1)

Days of gestation   ρ(E2)/(ng·L−1)   ρ(P)/(μg·L−1)   ρ(PRL)/(μg·L−1)
33                  14.56±0.82a      31.55±1.15a     44.86±1.36a
50                  22.30±0.71b      32.36±0.62a     54.12±2.73b
75                  27.56±0.91c      35.19±0.90b     52.91±1.31b
90                  42.82±2.25d      36.76±0.94b     66.53±2.87c

1) Different lowercase letters in the same column indicate significant differences (P<0.05, Duncan's method)
2.4 Protein expression patterns of hormone receptors during gestation in Tibetan pigs
Western blotting was used to examine hormone receptor proteins in the mammary gland at different gestation time points. PRLR expression on day 50 was significantly higher than on day 33 and peaked on day 90. ER increased significantly by day 50 and remained at comparable levels on days 50, 75 and 90. PR expression on day 75 was significantly higher than on days 33 and 50 and rose further by day 90 (Figures 4 and 5).
Figure 5. Relative expressions of the hormone receptors at different time points during gestation in mammary glands of Tibetan pigs. Different lowercase letters on bars of the same hormone receptor indicate significant differences (P<0.05, Duncan's method)
2.5 Changes in mammary-development signaling pathways during gestation in Tibetan pigs
Western blotting was used to examine activation of the mammary-development signaling pathways Jak2/STAT5 and PI3K/AKT at different gestation time points. As shown in Figures 6 and 7, phosphorylation of Jak2, STAT5, PI3K and AKT was significantly elevated on day 75, and phosphorylation of Jak2, STAT5 and PI3K was significantly elevated on day 90, indicating marked activation of both the Jak2/STAT5 and PI3K/AKT pathways.
Figure 7. Relative expressions of proteins from Jak2/STAT5 and PI3K/AKT signaling pathways at different time points during gestation in mammary glands of Tibetan pigs. Different lowercase letters on bars of the same signaling pathway indicate significant differences (P<0.05, Duncan's method)
3. Discussion and conclusions
3.1 Morphological changes and marker protein expression in the Tibetan pig mammary gland during gestation
Gestation is an important period for mammary development in sows. We found that on day 33 the Tibetan pig mammary gland consisted mainly of ducts, a few alveoli appeared by day 50, alveoli multiplied rapidly by day 75, and the gland was dominated by alveoli by day 90. Consistent with this, Ji et al. [14] reported that sow mammary glands weighed only 316 g at day 45 of gestation, 1606 g at day 75 and 2357 g at day 90; Kensinger et al. [15] showed that alveolar number peaks around day 90, and that between days 90 and 105 the alveoli begin secreting and accumulating large amounts of milk as lactation approaches. In dairy goats, Gao et al. [16] likewise found that the mammary gland does not enter rapid proliferation and differentiation in early gestation, showing instead enhanced metabolism and respiration, with extensive cell proliferation and differentiation in mid-gestation.
Elf-5 plays an important role in alveolar proliferation and differentiation during gestation and lactation and is an indispensable regulator of mammary development [17-18]; PLIN2 is a key molecule in milk fat synthesis [19]. Our results show that Elf-5 and PLIN2 expression rose significantly after day 50. Taken together with the morphological findings, this indicates that alveoli begin to form around day 50, the gland enters a phase of rapid development by day 75, and by day 90 it reaches a more advanced, predominantly alveolar state.
3.2 Serum hormone levels, mammary hormone receptors and key signaling pathways during gestation in Tibetan pigs
Mammary development during gestation is regulated by multiple hormones: E2 and P are the main regulators of gestational mammary development, while PRL is the main regulator of mammary development and milk secretion during lactation [5-6]. Hormones act by binding their receptors, and receptor knockout prevents normal mammary development [20]. Fang et al. [21] found that PR is expressed at low levels in yak mammary tissue during early and mid-gestation, consistent with the low PR protein levels we observed in Tibetan pigs at the same stages. Our results show that E2, P and PRL and their receptors rose over gestation, matching the degree of mammary development. Similarly, Horigan et al. [22] injected ovariectomized, PRL-suppressed pigs with E2, P, E2+PRL or E2+PRL+P and found that only the three-hormone combination maximally promoted duct and alveolus formation, indicating that the interplay of E2, P and PRL is critical for sow mammary development.
The PI3K/Akt and Jak2/STAT5 signaling pathways play important regulatory roles in mammary development. PI3K/Akt is a major intracellular signaling pathway involved in cell growth, proliferation and differentiation [9, 23], and Meng et al. [10-11] showed that it is important for mammary development and mammary cell proliferation. The Jak2/STAT5 pathway, in turn, is an important regulator of alveologenesis and of the transcription of many milk protein genes [12-13]. We found that both pathways were significantly activated after day 75, consistent with the advanced development of the gland and the onset of lactation. Palin et al. [24] reported that STAT5A and STAT5B expression in mammary parenchyma differs between Meishan and Large White sows: phosphorylated STAT5 translocates to the nucleus and binds the promoters of milk-production target genes to activate and maintain lactation, and Meishan sows, which express more STAT5A and STAT5B in gestational mammary tissue and generate more phosphorylated dimers for nuclear translocation, show better mammary development and higher lactation capacity.
In summary, we examined mammary morphology, development marker proteins, related hormones and signaling pathways across gestation in Tibetan pigs. Alveolar development begins around day 50 of gestation, the gland enters a phase of rapid alveolar growth by day 75, and development is more advanced by day 90, accompanied by significant increases in serum mammary-development hormones (E2, P and PRL), higher expression of their receptors in the gland, and activation of the PI3K/AKT and Jak2/STAT5 pathways. These results provide a scientific basis for understanding mammary development and reproductive function in Tibetan pigs and a theoretical foundation for conserving Tibetan pig resources.
-
Figure 1. Overall structure of the M2CNet network
LCB: local capture block; LGCB: lightweight global capture block; H and W represent the height and width of the input image, respectively; $C_i$: number of channels in stage i; $L_i$: number of local capture blocks and lightweight global capture blocks in stage i
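The stage resolutions of the hierarchical pyramid can be reproduced from the downsampling strides listed in Table 2 (4, then 2, 2, 2) for the default 224 × 224 input:

```python
# Spatial sizes produced by the pyramid's downsampling layers.

def pyramid_sizes(input_size, strides):
    sizes, size = [], input_size
    for s in strides:
        size //= s                  # each stage halves (or quarters) the map
        sizes.append(size)
    return sizes

print(pyramid_sizes(224, [4, 2, 2, 2]))  # [56, 28, 14, 7]
```

These match the per-stage output sizes (56×56, 28×28, 14×14, 7×7) in the architecture table.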
Figure 3. Comparison of standard multi-head attention (a) and global subsampled attention (b)
Q, K and V represent query, key and value, respectively; H and W represent the height and width of the input image; s represents the sub-window size; C represents the number of channels
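The shape reduction behind global subsampled attention, where queries keep full resolution while keys and values come from s × s sub-windows, can be sketched as below. This is a single-head, projection-free simplification: the use of average pooling and the omission of learned Q/K/V projections and multiple heads are assumptions for illustration, not the authors' exact implementation:

```python
import numpy as np

# Single-head sketch of global subsampled attention: Q stays at full H x W
# resolution, while the K/V source is average-pooled over non-overlapping
# s x s sub-windows, shrinking the attention matrix by a factor of s*s.

def subsampled_attention(x, s):
    H, W, C = x.shape
    q = x.reshape(H * W, C)                        # queries at full resolution
    pooled = x.reshape(H // s, s, W // s, s, C).mean(axis=(1, 3))
    kv = pooled.reshape((H // s) * (W // s), C)    # pooled keys/values
    attn = q @ kv.T / np.sqrt(C)                   # (HW) x (HW / s^2) scores
    attn = np.exp(attn - attn.max(axis=1, keepdims=True))
    attn /= attn.sum(axis=1, keepdims=True)        # softmax over pooled keys
    return (attn @ kv).reshape(H, W, C)

x = np.random.default_rng(0).standard_normal((14, 14, 8))
y = subsampled_attention(x, s=2)
print(y.shape)  # (14, 14, 8); the score matrix is 196 x 49, not 196 x 196
```

With s = 2 the attention matrix shrinks fourfold, which is where the method saves computation relative to standard multi-head attention.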
Figure 4. Lightweight global capture block
a: Conditional position encoding; b: Global subsampled attention; c: Lightweight feed-forward network; $d_i$: channel dimension; s: sub-window size; h: number of attention heads; H and W represent the height and width of the input features, respectively
Figure 6. Identification results on the pest and disease datasets
Bar width scales linearly with the model's parameter count (the more parameters, the wider the bar); bars of the same color family belong to the same comparison, and the darkest bar within each color family corresponds to the M2CNet variant
Table 1. Taxonomy of the IP102 dataset at different class levels

Crop        Pest classes   Training set   Test set
Rice        14             6734           1683
Corn        13             11212          2803
Wheat       9              2734           684
Sugarbeet   8              3536           884
Alfalfa     13             8312           2078
Grape       16             14041          3510
Orange      19             5818           1455
Mango       10             7790           1948
Total       102            60177          15045
Table 2. M2CNet-S/B/L network architecture 1)

Each stage applies "Conv. downsampling" followed by repeated capture-block groups; one group stacks a depthwise separable convolution (3×3, 1×1 and 3×1, 1×3 kernels), a multi-layer cyclic fully connected layer, global subsampled attention, and a lightweight feed-forward network.

Stage 1 (output 56×56; H1=1, s1=4, R1=4):
  Conv. downsampling: 4×4, stride 4; channels 36 (S) / 48 (B) / 48 (L)
  Capture-block groups: ×1 (S) / ×1 (B) / ×1 (L)
Stage 2 (output 28×28; H2=2, s2=2, R2=4):
  Conv. downsampling: 2×2, stride 2; channels 72 (S) / 96 (B) / 96 (L)
  Capture-block groups: ×2 (S) / ×1 (B) / ×2 (L)
Stage 3 (output 14×14; H3=4, s3=2, R3=4):
  Conv. downsampling: 2×2, stride 2; channels 144 (S) / 192 (B) / 192 (L)
  Capture-block groups: ×3 (S) / ×4 (B) / ×6 (L)
Stage 4 (output 7×7; H4=8, s4=1, R4=4):
  Conv. downsampling: 2×2, stride 2; channels 288 (S) / 384 (B) / 384 (L)
  Capture-block groups: ×2 (S) / ×2 (B) / ×4 (L)
Output (1×1): fully connected, 100
No. of parameters: 1.83M (S) / 3.52M (B) / 5.76M (L)
Floating point operations: 0.23G (S) / 0.39G (B) / 0.60G (L)

1) The input image size is 224 pixels × 224 pixels by default; Conv. denotes a convolution operation and stride its step; Hi and si are the number of heads and the subsampling size of the i-th global subsampled attention; Ri is the feature-size scaling ratio of the i-th lightweight feed-forward network

Table 3. Model comparison results on the CIFAR100 dataset
Model                 No. of parameters (M)   FLOPs (G)   Top5 accuracy/%   Top1 accuracy/%
ShuffleNet-V2 0.5     0.4                     0.04        72.74             41.83
ShuffleNet-V2 1.0     1.4                     0.15        86.21             59.65
ShuffleNet-V2 1.5     2.6                     0.30        90.08             66.56
ShuffleNet-V2 2.0     5.6                     0.56        93.06             72.79
SqueezeNet 1.0        0.8                     0.75        78.48             49.68
SqueezeNet 1.1        0.8                     0.30        78.12             50.14
MobileNet-V3-Small    1.6                     0.06        87.90             61.74
MobileNet-V2          2.4                     0.31        91.69             69.16
MobileNet-V3-Large    4.3                     0.23        93.57             73.27
MnasNet 0.5           1.1                     0.11        88.13             62.60
MnasNet 0.75          2.0                     0.22        91.44             69.20
MnasNet 1.0           3.2                     0.32        92.81             72.70
MnasNet 1.3           5.1                     0.54        94.41             76.64
EfficientNet B0       4.1                     0.40        94.63             76.00
EfficientNet B1       6.6                     0.60        94.95             77.96
ResNet 18             11.2                    1.80        94.66             76.85
VGG 11                129.2                   7.60        94.25             75.82
VGG 13                129.4                   11.30       94.38             76.46
VGG 16                134.7                   15.50       94.63             78.19
VGG 19                140.0                   19.60       95.25             78.19
MobileViT-XXS         1.0                     0.33        84.98             55.96
MobileViT-XS          2.0                     0.90        89.55             64.34
MobileViT-S           5.1                     1.75        93.64             72.93
M2CNet-S              1.8                     0.23        92.46             71.09
M2CNet-B              3.5                     0.39        94.16             75.32
M2CNet-L              5.8                     0.60        95.31             78.39