Computer Science, 2024, Vol. 51, Issue 11A: 240300150-6. doi: 10.11896/jsjkx.240300150
郑骐健, 刘峰
ZHENG Qijian, LIU Feng
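Abstract: With the arrival of the internet economy, the efficient management of e-commerce platforms has drawn wide attention from both academia and industry. In particular, the accuracy and degree of automation of product classification directly affect user experience and operational efficiency. This study therefore examines latent-space representations of product information and proposes BEML, a hybrid learning analysis paradigm for product latent-space representation. The framework combines Bidirectional Encoder Representations from Transformers (BERT) with traditional machine learning methods, aiming to improve both the efficiency of automated product classification and its accuracy through a fine-grained analysis of the latent space of product information. In experiments comparing BEML against current mainstream deep learning and machine learning algorithms on the Amazon online analysis dataset used in this study, its best classification result reaches a macro-averaged F1 of 85.79% and a micro-averaged F1 of 84.73%, both above the previous best F1 of 83.3%, setting a new state of the art (SOTA). Beyond its theoretical novelty, the framework has practical value for information management and automated processing in e-commerce, and it offers the field of technology and business studies an efficient and reliable hybrid learning analysis paradigm.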
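The abstract reports both macro- and micro-averaged F1. As a reminder of how the two averages differ, the following is a minimal sketch in plain Python. The function name `f1_scores` and the toy labels are illustrative assumptions, not code or data from the paper.

```python
from collections import Counter


def f1_scores(y_true, y_pred):
    """Return (macro_f1, micro_f1) for single-label multi-class predictions."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1          # correct prediction for class t
        else:
            fp[p] += 1          # class p was predicted wrongly
            fn[t] += 1          # class t was missed

    def f1(tp_, fp_, fn_):
        denom = 2 * tp_ + fp_ + fn_
        return 2 * tp_ / denom if denom else 0.0

    # Macro: average the per-class F1 values, so every class weighs the same.
    macro = sum(f1(tp[c], fp[c], fn[c]) for c in labels) / len(labels)
    # Micro: pool the counts over all classes, then compute a single F1.
    micro = f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    return macro, micro


# Toy, made-up product labels (assumption for illustration only).
y_true = ["book", "book", "book", "book", "toy", "toy"]
y_pred = ["book", "book", "book", "toy", "toy", "book"]
macro, micro = f1_scores(y_true, y_pred)
print(f"macro-F1 = {macro:.4f}, micro-F1 = {micro:.4f}")
```

In the single-label setting, micro-averaged F1 coincides with accuracy, while macro-averaged F1 gives small categories the same weight as large ones, which is why the two figures can differ on an imbalanced product catalogue.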