Computer Science ›› 2021, Vol. 48 ›› Issue (1): 247-252.doi: 10.11896/jsjkx.191200088

• Artificial Intelligence •

Semantic Slot Filling Based on BERT and BiLSTM

ZHANG Yu-shuai, ZHAO Huan, LI Bo   

  1. College of Computer Science and Electronic Engineering, Hunan University, Changsha 410082, China
  • Received: 2019-12-13  Revised: 2020-05-01  Online: 2021-01-15  Published: 2021-01-15
  • About author: ZHANG Yu-shuai, born in 1993, master, is a member of China Computer Federation. His main research interest is natural language processing.
    ZHAO Huan, born in 1967, Ph.D, professor, is a member of China Computer Federation. Her main research interests include speech information processing, natural language processing and intelligent computing.
  • Supported by:
    National Key R&D Program of China (2018YFC0831800).

Abstract: Semantic slot filling is an important task in dialogue systems, which aims to label each word of an input sentence correctly. Slot filling performance has a marked impact on the subsequent dialogue management module. At present, random word vectors or pre-trained word vectors are usually used to initialize the deep learning models applied to the slot filling task. However, random word vectors carry no semantic or grammatical information, and pre-trained word vectors assign only a single meaning to each word; neither can provide context-dependent word vectors for the model. We propose an end-to-end neural network model based on the pre-trained model BERT and the Long Short-Term Memory network (LSTM). First, the pre-trained model BERT encodes the input sentence into context-dependent word embeddings. These embeddings then serve as input to a subsequent Bidirectional Long Short-Term Memory network (BiLSTM), and finally the Softmax function and a conditional random field decode the predicted labels. The pre-trained model BERT and the BiLSTM network are trained as a whole in order to improve the performance of the semantic slot filling task. The model achieves F1 scores of 78.74%, 87.60% and 71.54% on three data sets (MIT Restaurant Corpus, MIT Movie Corpus and MIT Movie Trivia Corpus) respectively. The experimental results show that our model significantly improves the F1 score of the semantic slot filling task.
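The decoding stage described in the abstract, where a conditional random field turns per-token scores into a slot-label sequence, can be sketched in isolation. The snippet below is a minimal, self-contained illustration, not the paper's implementation: the label set, the emission scores (standing in for BiLSTM outputs) and the transition matrix are toy values chosen by hand, and Viterbi search picks the globally best label path rather than labeling each token independently.

```python
# Hypothetical sketch of CRF-style decoding for slot filling.
# emissions[t][i] stands in for the BiLSTM's score of label i at token t;
# transitions[i][j] is a learned score for moving from label i to label j.
# All numbers are toy values for illustration only.

LABELS = ["O", "B-Dish", "I-Dish"]

emissions = [
    [2.0, 0.5, 0.1],   # token "book"  -> locally prefers O
    [0.3, 2.5, 0.2],   # token "pad"   -> locally prefers B-Dish
    [0.2, 0.4, 2.2],   # token "thai"  -> locally prefers I-Dish
]

transitions = [
    [0.5, 0.5, -2.0],  # O -> I-Dish penalized (invalid BIO transition)
    [0.2, -1.0, 1.0],  # B-Dish -> I-Dish encouraged
    [0.3, 0.1, 0.8],
]

def viterbi(emissions, transitions):
    """Return the highest-scoring label index sequence."""
    n_labels = len(emissions[0])
    score = list(emissions[0])          # first token: emissions only
    backpointers = []
    for em in emissions[1:]:
        new_score, bp = [], []
        for j in range(n_labels):
            # best previous label when arriving at label j
            best_prev = max(range(n_labels),
                            key=lambda i: score[i] + transitions[i][j])
            new_score.append(score[best_prev] + transitions[best_prev][j] + em[j])
            bp.append(best_prev)
        score = new_score
        backpointers.append(bp)
    # trace the best path back from the best final label
    best_last = max(range(n_labels), key=lambda j: score[j])
    path = [best_last]
    for bp in reversed(backpointers):
        path.append(bp[path[-1]])
    return list(reversed(path))

path = viterbi(emissions, transitions)
print([LABELS[i] for i in path])  # -> ['O', 'B-Dish', 'I-Dish']
```

Note how the transition penalty matters: the "O -> I-Dish" score of -2.0 prevents the decoder from producing an I-Dish tag without a preceding B-Dish, a BIO constraint that a plain Softmax over each token could not enforce.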

Key words: Slot filling, Pre-trained model, Long short-term memory network, Context-dependent, Word embedding

CLC Number: TP391