Computer Science, 2024, 51(8): 256-262. doi: 10.11896/jsjkx.230600204

• Artificial Intelligence •

Contrastive Learning-based Prompt Generation Method for Large-scale Language Model Reverse Dictionary Task

TIAN Sicheng1, HUANG Shaobin1, WANG Rui1, LI Rongsheng1, DU Zhijuan2,3   

  1. College of Computer Science and Technology, Harbin Engineering University, Harbin 150001, China
  2. Engineering Research Center of Ecological Big Data, Ministry of Education, Inner Mongolia 010021, China
  3. College of Computer, Inner Mongolia University, Inner Mongolia 010021, China
  • Received: 2023-06-26  Revised: 2023-11-14  Online: 2024-08-15  Published: 2024-08-13
  • About author: TIAN Sicheng, born in 1997, Ph.D. His main research interests include natural language processing and smart healthcare.
    HUANG Shaobin, born in 1965, Ph.D., professor, Ph.D. supervisor. His main research interests include machine learning and natural language processing.
  • Supported by: Open Project of the Engineering Research Center of Ecological Big Data, Ministry of Education.

Abstract: The reverse dictionary task is an emerging task that aims to find the word corresponding to a given definition. Large-scale language models offer new possibilities for this task, but their performance depends on the quality of the prompt sentences. To this end, this paper proposes a contrastive learning-based prompt generation method. The method extracts definition semantics at multiple semantic levels, and enhances the model's generalization ability by incorporating negative examples through contrastive learning. With this method, the target word can be narrowed down to a small candidate range, from which a large model then selects the most semantically consistent word. Experimental results show that the proposed method effectively improves the performance of large-scale language models on the reverse dictionary task: the prompt generation model produces a range containing the target word with 94.7% probability, and the large-scale language model directly selects the target word with 58.03% probability and includes the target word among five candidate words with 74.55% probability.
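Code sketch: The abstract describes a two-stage pipeline in which a contrastively trained prompt generation model first narrows the target word down to a small candidate range, and a large-scale language model then selects the most semantically consistent word from that range. The following minimal PyTorch sketch illustrates that idea; it is not the authors' implementation. The function names, the InfoNCE-style contrastive loss, and the choice of five candidates are illustrative assumptions, and the multi-scale semantic extraction is abstracted into a single definition embedding.

    import torch
    import torch.nn.functional as F

    def info_nce(def_emb, word_emb, neg_emb, temperature=0.07):
        """Contrastive (InfoNCE-style) loss: pull each definition embedding
        toward its target word's embedding, push it away from K negative words."""
        def_emb = F.normalize(def_emb, dim=-1)    # (B, d) definition embeddings
        word_emb = F.normalize(word_emb, dim=-1)  # (B, d) positive word embeddings
        neg_emb = F.normalize(neg_emb, dim=-1)    # (B, K, d) negative word embeddings
        pos = (def_emb * word_emb).sum(-1, keepdim=True)     # (B, 1)
        neg = torch.einsum("bd,bkd->bk", def_emb, neg_emb)   # (B, K)
        logits = torch.cat([pos, neg], dim=1) / temperature  # (B, 1+K)
        labels = torch.zeros(logits.size(0), dtype=torch.long, device=logits.device)
        return F.cross_entropy(logits, labels)               # positive sits at index 0

    def top_k_candidates(def_emb, vocab_emb, vocab, k=5):
        """Narrow the target word to a small range: the k vocabulary words whose
        embeddings are most similar to the definition embedding."""
        sims = F.normalize(def_emb, dim=-1) @ F.normalize(vocab_emb, dim=-1).T
        idx = sims.topk(k, dim=-1).indices                   # (B, k)
        return [[vocab[j] for j in row] for row in idx.tolist()]

    def build_prompt(definition, candidates):
        """Prompt asking a large language model to pick the most semantically
        consistent word from the candidate range."""
        return (f"Definition: {definition}\n"
                f"Candidates: {', '.join(candidates)}\n"
                "Answer with the single candidate word that best matches the definition.")

    # Toy run with random tensors standing in for a trained definition encoder
    # and a pre-trained word-embedding table.
    B, K, d, V = 4, 8, 32, 1000
    loss = info_nce(torch.randn(B, d), torch.randn(B, d), torch.randn(B, K, d))
    vocab = [f"word{i}" for i in range(V)]
    ranges = top_k_candidates(torch.randn(B, d), torch.randn(V, d), vocab, k=5)
    print(loss.item())
    print(build_prompt("a small domesticated feline", ranges[0]))

Under this framing, the 94.7% figure reported in the abstract corresponds to the candidate range containing the target word, and the 58.03%/74.55% figures to the language model's selection accuracy over that range.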

Key words: Reverse dictionary, Large-scale language model, Contrastive learning, Multiple semantic scales, Contrastive loss

CLC Number: TP391