Computer Science, 2025, Vol. 52, Issue (6A): 240500127-8. doi: 10.11896/jsjkx.240500127

• Artificial Intelligence •

Commodity Attribute Classification Method Based on Dual Pre-training

ZHAO Zheyu, WANG Zhongqing, WANG Hongling

  1. School of Computer Science and Technology, Soochow University, Suzhou, Jiangsu 215006, China
  • Online: 2025-06-16 Published: 2025-06-12
  • Corresponding author: ZHAO Zheyu (zyzhao0104@hotmail.com)
  • About author: ZHAO Zheyu, born in 2001, postgraduate. His main research interests include natural language processing and text classification.
    WANG Hongling, born in 1975, Ph.D, associate professor, is a member of CCF (No. 14272M). Her main research interests include natural language processing and information retrieval.
  • Supported by:
    National Natural Science Foundation of China (62076175, 61976146) and Jiangsu Innovation Doctor Plan.

Abstract: Commodity attribute classification analyzes the descriptive text of a product and assigns a class to each of several attributes. It helps people understand products from multiple perspectives and supports marketing and product management. Although large language models are increasingly widely used, general-purpose models perform poorly on this task because they lack domain knowledge and information about correlations among attributes. To address this issue, this paper proposes a commodity attribute classification method based on dual pre-training, which aims to improve the performance of large language models on the task through task-specific pre-training. Building on the T5 model, the paper introduces two pre-training methods: domain-specific text pre-training and pre-training based on correlations among attributes. These methods enhance the model's understanding of the task from both the input-text and output-text perspectives, facilitating the classification of multiple product attributes. Experimental results on the Clothing Fit Data dataset show that the dual pre-trained T5 model achieves some improvement on every attribute over models without this pre-training and over other baseline models, validating the effectiveness of the proposed method.

Key words: Dual pre-training, Multi-attribute classification, Large language model, T5, Commodity attribute classification
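
The abstract describes multi-attribute classification with T5, a text-to-text model, but does not give the input or output format. As a rough illustration only, the sketch below shows one way a review and its attribute labels could be serialized into a source/target string pair and parsed back. The attribute names, label sets, task prefix, and the helpers `to_text_pair` and `parse_target` are all hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Tuple

# Hypothetical attribute schema: the abstract does not list attributes or labels.
ATTRIBUTES: Dict[str, List[str]] = {
    "fit": ["small", "fit", "large"],
    "length": ["short", "just right", "long"],
}


def to_text_pair(review: str, labels: Dict[str, str]) -> Tuple[str, str]:
    """Serialize one review and its labels as a (source, target) string pair."""
    # The task prefix is an assumed convention, not the paper's prompt.
    source = "classify attributes: " + review.strip()
    # Attributes are emitted in the fixed order defined by ATTRIBUTES.
    target = "; ".join(f"{name}: {labels[name]}" for name in ATTRIBUTES)
    return source, target


def parse_target(text: str) -> Dict[str, str]:
    """Parse a generated target string into {attribute: label}.

    Unknown attributes and labels outside the schema are dropped.
    """
    result: Dict[str, str] = {}
    for part in text.split(";"):
        name, sep, value = part.partition(":")
        name, value = name.strip(), value.strip()
        if sep and value in ATTRIBUTES.get(name, []):
            result[name] = value
    return result


source, target = to_text_pair(
    "Runs a bit tight in the shoulders. ",
    {"fit": "small", "length": "just right"},
)
```

Emitting all attributes in a single target string lets one generated sequence carry every label, which is one plausible place for attribute correlations to be learned; whether the paper uses this layout is not stated in the abstract.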

CLC Number: TP391