WOA-BP rice yield prediction based on Spark
Abstract:
Objective: Current rice yield prediction models suffer from low accuracy, overly large prediction regions, and long model optimization times. To address these problems, a Spark-based whale optimization algorithm-backpropagation (WOA-BP) rice yield prediction method is proposed.
Method: Rice yield and meteorological data for the counties, cities, and districts of western Guangdong Province were taken as the research object. WOA was used to optimize the weights and biases of a BP neural network, and a rice yield prediction model was built on the optimized network to improve prediction accuracy. In addition, the WOA-BP algorithm was parallelized in the Spark framework to reduce its time overhead.
Result: In terms of model accuracy, comparing the prediction results after inverse normalization, the WOA-optimized BP neural network reduced the mean absolute percentage error (MAPE) from 8.354% to 7.068%, the mean absolute error (MAE) from 31.320 kg to 26.982 kg, and the root mean square error (RMSE) from 41.008 kg to 33.546 kg. In terms of run time, a 3-node Spark cluster ran 11 742 s faster than the non-Spark mode, a 44% reduction in time overhead.
Conclusion: The Spark-based WOA-BP method predicts rice yield in the counties, cities, and districts of western Guangdong well and reflects the influence of meteorological factors on rice yield in the region. It provides a reference for studying rice yield in western Guangdong and in Guangdong Province as a whole.
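The Method paragraph describes WOA searching over the weights and biases of a BP network. The following is a minimal, non-authoritative sketch of that idea in plain Python, not the authors' implementation. Several things are assumptions made for illustration: the network shape (2 inputs, 3 hidden tanh units, 1 linear output), the toy data, the search bounds, the population size, and the use of one scalar `A` and `C` per whale. The update rules follow the standard WOA formulation (encircling, random search, and spiral moves); the names `mlp_mse` and `woa_minimize` are hypothetical.

```python
import math
import random


def mlp_mse(vec, samples, n_in, n_hidden):
    """Decode a flat vector into a one-hidden-layer network; return its MSE."""
    k = n_in * n_hidden
    w1 = [vec[j * n_in:(j + 1) * n_in] for j in range(n_hidden)]
    b1 = vec[k:k + n_hidden]
    w2 = vec[k + n_hidden:k + 2 * n_hidden]
    b2 = vec[k + 2 * n_hidden]
    total = 0.0
    for x, y in samples:
        hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
                  for row, b in zip(w1, b1)]
        out = sum(w * h for w, h in zip(w2, hidden)) + b2
        total += (out - y) ** 2
    return total / len(samples)


def woa_minimize(fitness, dim, n_whales=20, n_iter=100,
                 lower=-1.0, upper=1.0, seed=0):
    """Minimize `fitness` over [lower, upper]^dim with a basic WOA loop."""
    rng = random.Random(seed)
    whales = [[rng.uniform(lower, upper) for _ in range(dim)]
              for _ in range(n_whales)]
    best = list(min(whales, key=fitness))
    best_fit = fitness(best)
    for t in range(n_iter):
        a = 2.0 * (1.0 - t / n_iter)  # shrinks linearly from 2 toward 0
        for i in range(n_whales):
            x = whales[i]
            A = 2.0 * a * rng.random() - a
            C = 2.0 * rng.random()
            if rng.random() < 0.5:
                # Encircle the best whale (|A| < 1) or explore around a random one.
                ref = best if abs(A) < 1.0 else whales[rng.randrange(n_whales)]
                new = [r - A * abs(C * r - xi) for r, xi in zip(ref, x)]
            else:
                # Spiral (bubble-net) move toward the best whale, with b = 1.
                l = rng.uniform(-1.0, 1.0)
                new = [abs(bj - xi) * math.exp(l) * math.cos(2.0 * math.pi * l) + bj
                       for bj, xi in zip(best, x)]
            # Clamp every coordinate back into the search bounds.
            whales[i] = [min(max(v, lower), upper) for v in new]
        # Update the global best after all whales have moved.
        for w in whales:
            f = fitness(w)
            if f < best_fit:
                best, best_fit = list(w), f
    return best, best_fit


# Toy data (made up for illustration): y = 0.5*x1 - 0.3*x2 on a small grid.
N_IN, N_HIDDEN = 2, 3
SAMPLES = [((i / 4, j / 4), 0.5 * i / 4 - 0.3 * j / 4)
           for i in range(5) for j in range(5)]
DIM = N_IN * N_HIDDEN + 2 * N_HIDDEN + 1  # all weights and biases, flattened


def fitness(vec):
    return mlp_mse(vec, SAMPLES, N_IN, N_HIDDEN)


start, start_fit = woa_minimize(fitness, DIM, n_iter=0)
best, best_fit = woa_minimize(fitness, DIM, n_iter=100)
print(f"MSE of best initial whale: {start_fit:.5f}")
print(f"MSE after 100 WOA iterations: {best_fit:.5f}")
```

In this sketch the fitness evaluations of the whales within one iteration are independent of each other, which is the part a framework such as Spark could distribute across partitions. The vector returned by `woa_minimize` might then seed gradient-based BP training, a step omitted here.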
Table 1 Precision comparison of the three models

| Model | Mean absolute percentage error (MAPE)/% | Mean absolute error (MAE)/kg | Root mean square error (RMSE)/kg |
| --- | --- | --- | --- |
| BP | 8.354 | 31.320 | 41.008 |
| PSO-BP | 7.890 | 29.999 | 38.786 |
| WOA-BP | 7.068 | 26.982 | 33.546 |
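The three error measures in Table 1 are computed on predictions after inverse normalization. As a hedged sketch of how such values could be obtained, the Python below assumes min-max normalization (the source does not state which normalization was used) and uses made-up numbers rather than the paper's data; the function names are hypothetical.

```python
import math


def denormalize(values, y_min, y_max):
    """Invert min-max scaling (assumed scheme): v in [0, 1] back to kg."""
    return [v * (y_max - y_min) + y_min for v in values]


def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


def mae(actual, predicted):
    """Mean absolute error, in the unit of the inputs."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean square error, in the unit of the inputs."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


# Illustrative values only, not taken from the paper.
actual_kg = [100.0, 200.0]
predicted_kg = denormalize([0.05, 0.4], y_min=100.0, y_max=300.0)
print(mape(actual_kg, predicted_kg), mae(actual_kg, predicted_kg), rmse(actual_kg, predicted_kg))
```

Computing the errors on denormalized values keeps MAE and RMSE in kilograms, which is what makes them directly comparable across the three models.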
Table 2 Performance comparison and configuration information for different numbers of nodes

| Number of nodes | Total memory/GB | Total physical cores | Number of partitions | t/s |
| --- | --- | --- | --- | --- |
| 1 | 16 | 12 | 24 | 24 534 |
| 2 | 32 | 24 | 48 | 18 955 |
| 3 | 48 | 36 | 72 | 14 895 |
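The run times in Table 2 can be related to the 44% figure in the abstract with a short calculation. The sketch below assumes that the 3-node time in Table 2 (14 895 s) is the run compared against the non-Spark mode, so the non-Spark run time is inferred rather than reported. Speedup and parallel efficiency relative to the 1-node run are derived quantities added here for illustration.

```python
import math

# Run times in seconds from Table 2, keyed by number of nodes.
run_time_s = {1: 24534, 2: 18955, 3: 14895}
saved_s = 11742  # reduction versus non-Spark mode, from the abstract

# Inferred non-Spark run time (assumption: compared against the 3-node run).
baseline_s = run_time_s[3] + saved_s
reduction_pct = 100.0 * saved_s / baseline_s

# Speedup and efficiency relative to the single-node Spark run.
speedup = {n: run_time_s[1] / t for n, t in run_time_s.items()}
efficiency = {n: s / n for n, s in speedup.items()}

print(f"inferred non-Spark time: {baseline_s} s, reduction: {reduction_pct:.1f}%")
for n in sorted(run_time_s):
    print(f"{n} node(s): speedup {speedup[n]:.2f}, efficiency {efficiency[n]:.2f}")
```

Under that assumption the inferred reduction rounds to the 44% stated in the abstract, and the speedup over one node grows more slowly than the node count.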