Computer Science ›› 2012, Vol. 39 ›› Issue (12): 171-176.

• Artificial Intelligence •

Classifying Communication Dispatch System Logs of Smart Grid Based on Active Semi-supervised Learning

Nian Sulei (年素磊), Li Ming (黎铭), Du Ke (杜科), Jiang Yuan (姜远), Lin Weimin (林为民), Guo Jinghong (郭经红)

  1. (State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing 210093) (China Electric Power Research Institute, Nanjing 211106)
  • Online: 2018-11-16 Published: 2018-11-16

Abstract: The communication dispatch system guarantees the normal operation of the smart grid. To keep the system running correctly, operators on duty must record its operational status, emergencies, accidents and faults, and the corresponding handling measures. To help managers keep up with the working status of the system and discover potential security risks, these logs are usually labeled with their log type so that they can be queried and retrieved easily. The system is therefore required to classify the logs recorded each day automatically, according to management needs. Automatically classifying a large number of logs that operators write according to their own understanding and habits requires learning from a large amount of log data labeled by dispatch experts. However, because manual reading and labeling is time-consuming and labor-intensive, only a small number of labels can usually be provided in practice, which degrades the performance of automatic classification. To address this problem, this paper proposes an automatic log classification method based on active semi-supervised learning. On the one hand, the method uses active learning to find the logs that are most helpful for learning and acquires their type labels; on the other hand, it exploits a large number of unlabeled logs to further improve learning performance. Results on smart grid communication dispatch logs from the State Grid show that the method based on active semi-supervised learning achieves better automatic log classification performance than existing methods.

Key words: Data mining, Machine learning, Active semi-supervised learning, Communication dispatch log classification, Smart grid
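
The abstract describes the method only at a high level, so the following is a minimal illustrative sketch of the general idea and not the authors' algorithm. It assumes margin-based uncertainty sampling for the active step, self-training with pseudo-labels for the semi-supervised step, and a toy word-overlap classifier. All function names, parameters, log strings, and labels are hypothetical.

```python
from collections import Counter, defaultdict

def featurize(text):
    """Bag-of-words counts for one log entry (whitespace tokenization)."""
    return Counter(text.lower().split())

def train(labeled):
    """Sum word counts per class; `labeled` is a list of (text, label) pairs."""
    model = defaultdict(Counter)
    for text, label in labeled:
        model[label].update(featurize(text))
    return model

def scores(model, text):
    """Per-class score: fraction of the entry's words already seen in that class."""
    words = featurize(text)
    total = sum(words.values()) or 1
    return {label: sum(n for w, n in words.items() if w in seen) / total
            for label, seen in model.items()}

def predict(model, text):
    """Highest-scoring class; ties are broken alphabetically."""
    s = scores(model, text)
    return max(sorted(s), key=s.get)

def margin(model, text):
    """Gap between the two best class scores; a small gap means uncertainty."""
    ranked = sorted(scores(model, text).values(), reverse=True)
    return ranked[0] - ranked[1] if len(ranked) > 1 else ranked[0]

def active_semi_supervised(labeled, unlabeled, oracle, queries=2, threshold=0.5):
    """Alternate expert queries (active) with pseudo-labeling (semi-supervised)."""
    labeled, unlabeled = list(labeled), list(unlabeled)
    for _ in range(queries):
        if not unlabeled:
            break
        model = train(labeled)
        # Active step: ask the expert (oracle) to label the most uncertain entry.
        query = min(unlabeled, key=lambda t: margin(model, t))
        unlabeled.remove(query)
        labeled.append((query, oracle(query)))
        # Semi-supervised step: adopt predictions the retrained model is confident in.
        model = train(labeled)
        confident = [t for t in unlabeled if margin(model, t) >= threshold]
        for t in confident:
            unlabeled.remove(t)
            labeled.append((t, predict(model, t)))
    return train(labeled)

# Hypothetical toy data: two expert-labeled logs and a pool of unlabeled logs.
seed = [("router link down alarm", "fault"),
        ("weekly backup completed normally", "routine")]
pool = ["switch link down again", "backup completed on schedule",
        "alarm after backup link test", "fiber link alarm cleared"]
# Stand-in for the dispatch expert who answers label queries.
truth = {"switch link down again": "fault",
         "backup completed on schedule": "routine",
         "alarm after backup link test": "fault",
         "fiber link alarm cleared": "fault"}

model = active_semi_supervised(seed, pool, oracle=truth.__getitem__)
print(predict(model, "link down alarm at substation"))
print(predict(model, "backup completed normally"))
```

In this sketch the expert is consulted only for the entries on which the current model is least certain, while entries it classifies with a wide margin are added with their predicted labels. A real system would need a stronger text classifier and a more careful confidence criterion, since wrong pseudo-labels can reinforce themselves.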
