“双碳”目标下绿色人工智能技术研究综述 / A review of green AI research under carbon peaking and neutrality goals
卢毓东,陈益
LU Yudong,CHEN Yi
摘要(Abstract):
人工智能大规模训练导致了计算资源需求、能源需求及碳排放量的急剧攀升,不仅使人工智能技术自身实现“双碳”目标受到了严峻挑战,也限制了人工智能在电力巡检机器人、无人机等边缘设备中的应用。在“双碳”目标下的电网数字化转型期,研究绿色人工智能技术,实现节能减碳,对促进新型电力系统建设和人工智能技术进步具有重要意义。首先介绍了绿色人工智能的由来、定义及影响模型能耗的关键因素;接着探讨了绿色人工智能模型技术的发展现状、关键问题、改进方法和效果;然后讨论了高效硬件基础设施节能减碳的措施;最后对绿色人工智能技术的未来发展提出相关建议和展望。
The large-scale training of artificial intelligence (AI) models has led to a sharp surge in computational resource demands, energy consumption, and carbon emissions, which not only poses severe challenges to the realization of carbon peaking and neutrality goals by AI technology itself, but also limits the application of AI in edge devices such as power inspection robots and UAVs. During the digital transformation of power grids under carbon peaking and neutrality goals, studying green AI technologies for energy saving and carbon reduction is of great significance to the construction of new-type power systems and the advancement of AI. Firstly, the origin and definition of green AI and the key factors affecting model energy consumption are introduced; then the current development status, key issues, improvement methods, and effects of green AI model technologies are discussed; afterwards, energy-saving and carbon-reduction measures for high-efficiency hardware infrastructure are examined; finally, suggestions and outlooks for the future development of green AI are presented.
关键词(KeyWords):
绿色人工智能;新型电力系统;节能;碳排放;模型加速
green AI;new-type power systems;energy conservation;carbon emission;model acceleration
基金项目(Foundation): 国网浙江省电力有限公司科技项目(5211DS22001L)
作者(Author):
卢毓东,陈益
LU Yudong,CHEN Yi
DOI: 10.19585/j.zjdl.202310006
参考文献(References):
- [1]叶琳,项中明,张静,等.基于多强化学习智能体架构的电网运行方式调节方法[J].浙江电力,2022,41(6):1-7.YE Lin,XIANG Zhongming,ZHANG Jing,et al.An operating condition adjustment method for power grid using multi-DRL-agent architecture[J].Zhejiang Electric Power,2022,41(6):1-7.
- [2]叶琳,杨滢,洪道鉴,等.深度学习在电力系统中的应用研究综述[J].浙江电力,2019,38(5):83-89.YE Lin,YANG Ying,HONG Daojian,et al.A survey of deep learning technology application in power system[J].Zhejiang Electric Power,2019,38(5):83-89.
- [3]刘吉臻.支撑新型电力系统建设的电力智能化发展路径[J].能源科技,2022,20(4):3-7.LIU Jizhen. Development path of power intelligence supporting the construction of new power system[J].Energy Science and Technology,2022,20(4):3-7.
- [4] SHARIR O,PELEG B,SHOHAM Y.The cost of training NLP models:a concise overview[EB/OL].(2020-04-19)[2022-11-01].https://arxiv.org/abs/2004.08900.
- [5] AMODEI D,HERNANDEZ D.AI and compute[EB/OL].(2018-05-16)[2022-11-01].https://openai.com/blog/ai-and-compute.
- [6] PATTERSON D,GONZALEZ J,LE Q,et al.Carbon emissions and large neural network training[EB/OL].(2021-04-21)[2022-11-01].https://arxiv.org/abs/2104.10350.
- [7] SCHWARTZ R,DODGE J,SMITH N A,et al.Green AI[J].Communications of the ACM,2020,63(12):54-63.
- [8] STRUBELL E,GANESH A,MCCALLUM A.Energy and policy considerations for deep learning in NLP[EB/OL].(2019-06-05)[2022-11-01].https://arxiv.org/abs/1906.02243.
- [9] HU H Y,LI A,CALANDRIELLO D,et al.One pass ImageNet[EB/OL].(2021-11-03)[2022-11-02].https://arxiv.org/abs/2111.01956.
- [10] PATTERSON D,GONZALEZ J,HÖLZLE U,et al.The carbon footprint of machine learning training will plateau,then shrink[J].Computer,2022,55(7):18-28.
- [11] HOWARD A G,ZHU M L,CHEN B,et al.MobileNets:efficient convolutional neural networks for mobile vision applications[EB/OL].(2017-04-17)[2022-11-02].https://arxiv.org/abs/1704.04861.
- [12] CHOLLET F.Xception:deep learning with depthwise separable convolutions[C]//2017 IEEE Conference on Computer Vision and Pattern Recognition(CVPR),July 21-26,2017,Honolulu,HI,USA.IEEE,2017:1800-1807.
- [13] IANDOLA F N,HAN S,MOSKEWICZ M W,et al.SqueezeNet:AlexNet-level accuracy with 50x fewer parameters and <0.5 MB model size[EB/OL].(2016-04-24)[2022-11-02].https://arxiv.org/abs/1602.07360.
- [14] GHOLAMI A,KWON K,WU B C,et al.SqueezeNext:hardware-aware neural network design[C]//2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops(CVPRW),June 18-22,2018,Salt Lake City,UT,USA.IEEE,2018:1719-1728.
- [15] XIE S N,GIRSHICK R,DOLLÁR P,et al.Aggregated residual transformations for deep neural networks[C]//2017 IEEE Conference on Computer Vision and Pattern Recognition(CVPR),July 21-26,2017,Honolulu,HI,USA.IEEE,2017:5987-5995.
- [16] HE K M,ZHANG X Y,REN S Q,et al.Deep residual learning for image recognition[C]//2016 IEEE Conference on Computer Vision and Pattern Recognition(CVPR),June 27-30,2016,Las Vegas,NV,USA.IEEE,2016:770-778.
- [17] ZHANG X Y,ZHOU X Y,LIN M X,et al.ShuffleNet:an extremely efficient convolutional neural network for mobile devices[C]//2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition,June 18-23,2018,Salt Lake City,UT,USA.IEEE,2018:6848-6856.
- [18] MA N N,ZHANG X Y,ZHENG H T,et al.ShuffleNet V2:practical guidelines for efficient CNN architecture design[C]//Computer Vision-ECCV 2018:15th European Conference,September 8-14,2018,Munich,Germany.New York:ACM,2018:122-138.
- [19] LECUN Y,DENKER J,SOLLA S.Optimal brain damage[J]. Advances in Neural Information Processing Systems,1989,2:598-605.
- [20] HASSIBI B,STORK D.Second order derivatives for network pruning:optimal brain surgeon[EB/OL].(1992-11-30)[2022-11-01].https://www.semanticscholar.org/paper/Second-Order-Derivatives-for-Network-Pruning%3ABrain-Hassibi-Stork/a42954d4b9d0ccdf1036e0af46d87a01b94c3516.
- [21] HAN S,POOL J,TRAN J,et al.Learning both weights and connections for efficient neural networks[C]//Proceedings of the 28th International Conference on Neural Information Processing Systems.New York:ACM,2015:1135-1143.
- [22] SUZUKI K,HORIBA I,SUGIE N.A simple neural network pruning algorithm with application to filter synthesis[J].Neural Processing Letters,2001,13(1):43-53.
- [23] ARDAKANI A,CONDO C,GROSS W J.Sparsely-connected neural networks:towards efficient VLSI implementation of deep neural networks[EB/OL].(2016-11-03)[2022-11-04].https://arxiv.org/abs/1611.01427.
- [24] SRINIVAS S,BABU R V. Data-free parameter pruning for deep neural networks[EB/OL].(2015-07-22)[2022-11-04].https://arxiv.org/abs/1507.06149.
- [25] HU H Y,PENG R,TAI Y W,et al.Network trimming:a data-driven neuron pruning approach towards efficient deep architectures[EB/OL].(2016-07-12)[2022-11-04].https://arxiv.org/abs/1607.03250.
- [26] BABAEIZADEH M,SMARAGDIS P,CAMPBELL R.NoiseOut:a simple way to prune neural networks[EB/OL].(2016-11-18)[2022-11-05].https://arxiv.org/abs/1611.06211.
- [27] YU R C,LI A,CHEN C F,et al.NISP:pruning networks using neuron importance score propagation[C]//2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition,June 18-23,2018,Salt Lake City,UT,USA.IEEE,2018:9194-9203.
- [28] LI H,KADAV A,DURDANOVIC I,et al.Pruning filters for efficient convnets[EB/OL].(2016-08-31)[2022-11-05].https://arxiv.org/abs/1608.08710.
- [29] LUO J H,ZHANG H,ZHOU H Y,et al.ThiNet:pruning CNN filters for a thinner net[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence,2019,41(10):2525-2538.
- [30] LIU B Y,WANG M,FOROOSH H,et al.Sparse convolutional neural networks[C]//2015 IEEE Conference on Computer Vision and Pattern Recognition(CVPR),June 7-12,2015,Boston,MA,USA.IEEE,2015:806-814.
- [31] HE Y H,ZHANG X Y,SUN J.Channel pruning for accelerating very deep neural networks[C]//2017 IEEE International Conference on Computer Vision(ICCV),October 22-29,2017,Venice,Italy.IEEE,2017:1398-1406.
- [32] LIU Z,LI J G,SHEN Z Q,et al.Learning efficient convolutional networks through network slimming[C]//2017 IEEE International Conference on Computer Vision(ICCV),October 22-29,2017,Venice,Italy.IEEE,2017:2755-2763.
- [33] CHEN S,ZHAO Q. Shallowing deep networks:layerwise pruning based on feature representations[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence,2019,41(12):3048-3056.
- [34] FRANKLE J,CARBIN M.The lottery ticket hypothesis:finding sparse,trainable neural networks[EB/OL].(2018-03-09)[2022-11-05].https://arxiv.org/abs/1803.03635.
- [35] PRASANNA S,ROGERS A,RUMSHISKY A.When BERT plays the lottery,all tickets are winning[EB/OL].(2020-05-01)[2022-11-05].https://arxiv.org/abs/2005.00561.
- [36] LIU S W,CHEN T L,CHEN X H,et al.The unreasonable effectiveness of random pruning:return of the most naive baseline for sparse training[EB/OL].(2022-02-05)[2022-11-06].https://arxiv.org/abs/2202.02643.
- [37] HOROWITZ M. 1.1 Computing’s energy problem(and what we can do about it)[C]//2014 IEEE International Solid-State Circuits Conference Digest of Technical Papers(ISSCC),February 9-13,2014,San Francisco,CA,USA.IEEE,2014:10-14.
- [38] HAN S,MAO H Z,DALLY W.Deep compression:compressing deep neural networks with pruning,trained quantization and huffman coding[EB/OL].(2015-10-01)[2022-11-06].https://arxiv.org/abs/1510.00149.
- [39] XU Y H,WANG Y Z,ZHOU A J,et al.Deep neural network compression with single and multiple level quantization[J].Proceedings of the AAAI Conference on Artificial Intelligence,2018,32(1):4335-4342.
- [40] MIYASHITA D,LEE E H,MURMANN B. Convolutional neural networks using logarithmic data representation[EB/OL].(2016-03-03)[2022-11-06].https://arxiv.org/abs/1603.01025.
- [41] LI R D,WANG Y,LIANG F,et al.Fully quantized network for object detection[C]//2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition(CVPR),June 15-20,2019,Long Beach,CA,USA.IEEE,2020:2805-2814.
- [42] JUNG S,SON C,LEE S,et al.Learning to quantize deep networks by optimizing quantization intervals with task loss[C]//2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition(CVPR),June 15-20,2019,Long Beach,CA,USA.IEEE,2020:4345-4354.
- [43] ZHUANG B H,SHEN C H,TAN M K,et al.Towards effective low-bitwidth convolutional neural networks[C]//2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition,June 18-23,2018,Salt Lake City,UT,USA.IEEE,2018:7920-7928.
- [44] WANG P S,CHEN Q,HE X Y,et al.Towards accurate post-training network quantization via bit-split and stitching[C]//Proceedings of the 37th International Conference on Machine Learning.New York:ACM,2020:9847-9856.
- [45] JACOB B,KLIGYS S,CHEN B,et al.Quantization and training of neural networks for efficient integer-arithmeticonly inference[C]//2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition,June 18-23,2018,Salt Lake City,UT,USA.IEEE,2018:2704-2713.
- [46] COURBARIAUX M,BENGIO Y,DAVID J P.BinaryConnect:training deep neural networks with binary weights during propagations[C]//Proceedings of the 28th International Conference on Neural Information Processing Systems-Volume 2.New York:ACM,2015:3123-3131.
- [47] HUBARA I,COURBARIAUX M,SOUDRY D,et al.Binarized neural networks[C]//Proceedings of the 30th International Conference on Neural Information Processing Systems.New York:ACM,2016:4114-4122.
- [48] RASTEGARI M,ORDONEZ V,REDMON J,et al.XNOR-Net:ImageNet classification using binary convolutional neural networks[EB/OL].(2016-03-25)[2022-11-06].https://arxiv.org/abs/1603.05279.
- [49] REDFERN A J,ZHU L J,NEWQUIST M K.BCNN:a binary CNN with all matrix Ops quantized to 1 bit precision[C]//2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops(CVPRW),June 19-25,2021,Nashville,TN,USA.IEEE,2021:4599-4607.
- [50] QIN H T,GONG R H,LIU X L,et al.Forward and backward information retention for accurate binary neural networks[C]//2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition(CVPR),June 13-19,2020,Seattle,WA,USA.IEEE,2020:2247-2256.
- [51] KRISHNAN S,LAM M,CHITLANGIA S,et al.QuaRL:quantization for sustainable reinforcement learning[EB/OL].(2016-03-25)[2022-11-06].https://arxiv.org/abs/1910.01055.
- [52] JADERBERG M,VEDALDI A,ZISSERMAN A.Speeding up convolutional neural networks with low rank expansions[EB/OL].(2014-05-15)[2022-11-06].https://arxiv.org/abs/1405.3866.
- [53] LEBEDEV V,GANIN Y,RAKHUBA M,et al.Speeding-up convolutional neural networks using fine-tuned CP-decomposition[EB/OL].(2014-12-19)[2022-11-06].https://arxiv.org/abs/1412.6553.
- [54] NOVIKOV A,PODOPRIKHIN D,OSOKIN A,et al.Tensorizing neural networks[C]//Proceedings of the 28th International Conference on Neural Information Processing Systems-Volume 1. New York:ACM,2015:442-450.
- [55] YU X Y,LIU T L,WANG X C,et al.On compressing deep models by low rank and sparse decomposition[C]//2017 IEEE Conference on Computer Vision and Pattern Recognition(CVPR),July 21-26,2017,Honolulu,HI,USA.IEEE,2017:67-76.
- [56] HINTON G,VINYALS O,DEAN J. Distilling the knowledge in a neural network[EB/OL].(2015-03-09)[2022-11-06].https://arxiv.org/abs/1503.02531.
- [57] ROMERO A,BALLAS N,KAHOU S E,et al.FitNets:hints for thin deep nets[EB/OL].(2014-12-19)[2022-11-06].https://arxiv.org/abs/1412.6550.
- [58] CHEN T Q,GOODFELLOW I,SHLENS J.Net2Net:accelerating learning via knowledge transfer[EB/OL].(2015-11-18)[2022-11-06].https://arxiv.org/abs/1511.05641.
- [59] SANH V,DEBUT L,CHAUMOND J,et al. DistilBERT,a distilled version of BERT:smaller,faster,cheaper and lighter[EB/OL].(2019-10-02)[2022-11-07].https://arxiv.org/abs/1910.01108.
- [60] SUN Z Q,YU H K,SONG X D,et al.MobileBERT:a compact task-agnostic BERT for resource-limited devices[EB/OL].(2020-04-06)[2022-11-07].https://arxiv.org/abs/2004.02984.
- [61] JIAO X Q,YIN Y C,SHANG L F,et al.TinyBERT:distilling BERT for natural language understanding[EB/OL].(2019-09-23)[2022-11-07].https://arxiv.org/abs/1909.10351.
- [62] ZHAO B R,CUI Q,SONG R J,et al.Decoupled knowledge distillation[C]//2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition(CVPR),June 18-24,2022,New Orleans,LA,USA.IEEE,2022:11943-11952.
- [63]马进,白雨生.应用于绝缘子缺陷检测的轻量化YOLOv4研究[J].电子测量技术,2022,45(14):123-130.MA Jin,BAI Yusheng.Research on lightweight YOLOv4 applied to insulator defect detection[J].Electronic Measurement Technology,2022,45(14):123-130.
- [64]杨锴,周顺勇,曾雅兰,等.基于轻量级Fast-Unet网络的航拍图像电力线快速精确分割[J].四川轻化工大学学报(自然科学版),2022,35(1):74-83.YANG Kai,ZHOU Shunyong,ZENG Yalan,et al.Fast and accurate segmentation of aerial image power lines based on lightweight Fast-Unet network[J].Journal of Sichuan University of Science & Engineering(Natural Science Edition),2022,35(1):74-83.
- [65]赵红成,田秀霞,杨泽森,等.YOLO-S:一种新型轻量的安全帽佩戴检测模型[J].华东师范大学学报(自然科学版),2021(5):134-145.ZHAO Hongcheng,TIAN Xiuxia,YANG Zesen,et al.YOLO-S:a new lightweight helmet wearing detection model[J].Journal of East China Normal University(Natural Science),2021(5):134-145.
- [66]魏敏,王刘旺.基于MSSST和强化轻量级卷积神经网络的有载分接开关运行工况识别[J].浙江电力,2022,41(4):51-61.WEI Min,WANG Liuwang.Operating condition identification of on-load tap changer based on MSSST and RLCNN[J].Zhejiang Electric Power,2022,41(4):51-61.
- [67]肖昭男.基于深度学习的电力负荷识别方法[D].南昌:南昌大学,2021.XIAO Zhaonan.Power load identification method based on deep learning[D].Nanchang:Nanchang University,2021.
- [68] ZOPH B,LE Q V. Neural architecture search with reinforcement learning[EB/OL].(2016-11-04)[2022-11-07].https://arxiv.org/abs/1611.01578.
- [69] ZOPH B,VASUDEVAN V,SHLENS J,et al.Learning transferable architectures for scalable image recognition[C]//2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition,June 18-23,2018,Salt Lake City,UT,USA.IEEE,2018:8697-8710.
- [70] TAN M X,CHEN B,PANG R M,et al. MnasNet:platform-aware neural architecture search for mobile[C]//2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition(CVPR),June 15-20,2019,Long Beach,CA,USA.IEEE,2020:2815-2823.
- [71] PHAM H,GUAN M,ZOPH B,et al.Efficient neural architecture search via parameter sharing[EB/OL].(2018-02-09)[2022-11-08].https://arxiv.org/abs/1802.03268.
- [72] LIU H X,SIMONYAN K,YANG Y M.DARTS:differentiable architecture search[EB/OL].(2018-06-24)[2022-11-08].https://arxiv.org/abs/1806.09055.
- [73] WU B C,DAI X L,ZHANG P Z,et al.FBNet:hardware-aware efficient ConvNet design via differentiable neural architecture search[C]//2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition(CVPR),June 15-20,2019,Long Beach,CA,USA.IEEE,2020:10726-10734.
- [74] CAI H,ZHU L G,HAN S.ProxylessNAS:direct neural architecture search on target task and hardware[EB/OL].(2018-09-27)[2022-11-08].https://arxiv.org/abs/1812.00332.
- [75] CAI H,GAN C,WANG T Z,et al.Once-for-all:train one network and specialize it for efficient deployment[EB/OL].(2019-08-26)[2022-11-08].https://arxiv.org/abs/1908.09791.
- [76] TOLIA N,WANG Z K,MARWAH M,et al.Delivering energy proportionality with non energy-proportional systems:optimizing the ensemble[C]//Proceedings of the 2008 Conference on Power Aware Computing and Systems.New York:ACM,2008:2-6.
- [77] HEATH T,DINIZ B,CARRERA E V,et al.Energy conservation in heterogeneous server clusters[C]//Proceedings of the Tenth ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming.New York:ACM,2005:186-195.
- [78] FU X,WANG X R,LEFURGY C.How much power oversubscription is safe and allowed in data centers[C]//Proceedings of the 8th ACM International Conference on Autonomic Computing.New York:ACM,2011:21-30.
- [79] CHASE J S,ANDERSON D C,THAKAR P N,et al.Managing energy and server resources in hosting centers[J].ACM SIGOPS Operating Systems Review,2001,35(5):103-116.
- [80] NARAYANAN D,SANTHANAM K,KAZHAMIAKA F,et al. Heterogeneity-aware cluster scheduling policies for deep learning workloads[C]//Proceedings of the 14th USENIX Conference on Operating Systems Design and Implementation.New York:ACM,2020:481-498.
- [81] WANG Q,CHU X W. GPGPU performance estimation with core and memory frequency scaling[J].IEEE Transactions on Parallel and Distributed Systems,2020,31(12):2865-2881.
- [82] MEI X X,CHU X W,LIU H,et al.Energy efficient real-time task scheduling on CPU-GPU hybrid clusters[C]//IEEE INFOCOM 2017-IEEE Conference on Computer Communications,May 1-4,2017,Atlanta,GA,USA.IEEE,2017:1-9.
- [83] DODGE J,PREWITT T,DES COMBES R T,et al.Measuring the carbon intensity of AI in cloud instances[C]//Proceedings of the 2022 ACM Conference on Fairness,Accountability,and Transparency.New York:ACM,2022:1877-1894.
- [84]伍康文,柴华.全太阳能数据中心整体技术方案与实践[J].微型机与应用,2012,31(21):1-3.WU Kangwen,CHAI Hua.The technology plan and practice of solar data center[J].Microcomputer&Its Applications,2012,31(21):1-3.
- [85] SCHUMAN C D,POTOK T,PATTON R,et al.A survey of neuromorphic computing and neural networks in hardware[EB/OL].(2017-05-19)[2022-11-10].https://arxiv.org/abs/1705.06963.