Computer Science ›› 2021, Vol. 48 ›› Issue (4): 117-122. doi: 10.11896/jsjkx.200800160

• Computer Graphics & Multimedia •

Video Character Relation Extraction Based on Multi-feature Fusion and Fine-granularity Analysis

LYU Jin-na, XING Chun-yu, LI Li

1. School of Information Management, Beijing Information Science & Technology University, Beijing 100192, China
  • Received: 2020-06-24  Revised: 2020-10-16  Online: 2021-04-15  Published: 2021-04-09
  • About author: LYU Jin-na, born in 1981, Ph.D, lecturer, is a member of China Computer Federation. Her main research interests include multimedia content analysis and social network analysis.
  • Supported by:
    Beijing University of Information Technology Foundation (2035012).

Abstract: Video character relation extraction is an important information extraction task and is valuable for video description, video retrieval, character search, public security supervision, etc. Because of the large gap between the low-level pixels of video data and the high-level relation semantics, relations are difficult to extract accurately. Most existing studies rely on coarse-granularity analysis, such as character co-occurrence, and ignore fine-granularity information. To address the difficulty of extracting video character relations accurately and completely, this paper proposes a new extraction method based on multi-feature fusion and fine-granularity analysis. First, a new character entity recognition model, CRMF (Character Recognition based on Multi-feature Fusion), is proposed; by fusing face and body features, it generates a more complete character set. Second, a character relation recognition model based on fine-granularity features, FGAG (Fine-Granularity Analysis based on GCN), is developed; it not only fuses spatio-temporal features but also considers the fine-granularity object information related to the characters, so that a better mapping can be established to identify character relations accurately. Comprehensive evaluations on movie video and the SRIV character relationship recognition dataset show that the proposed method outperforms state-of-the-art methods on character entity and relation recognition, with the F1 value increasing by 14.4% and accuracy increasing by 10.1%.
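
The abstract describes two components: fusing face and body features into a single character representation (CRMF), and running a graph convolutional network over character and fine-granularity object nodes to classify relations (FGAG). The following is a minimal PyTorch-style sketch of that general idea only, not the authors' implementation; the module names (CharacterFusion, GCNLayer, RelationHead), the feature dimensions, and the pre-normalized adjacency matrix adj_norm are illustrative assumptions.

# Illustrative sketch only; not the code of CRMF/FGAG from the paper.
import torch
import torch.nn as nn
import torch.nn.functional as F

class CharacterFusion(nn.Module):
    """Fuse a face embedding and a body (re-ID) embedding into one character vector."""
    def __init__(self, face_dim=512, body_dim=2048, out_dim=512):
        super().__init__()
        self.proj = nn.Linear(face_dim + body_dim, out_dim)

    def forward(self, face_feat, body_feat):
        # Concatenate the two modalities and project to a joint character embedding.
        return F.relu(self.proj(torch.cat([face_feat, body_feat], dim=-1)))

class GCNLayer(nn.Module):
    """One graph-convolution step H' = ReLU(A_hat H W) over a node-feature matrix."""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.weight = nn.Linear(in_dim, out_dim, bias=False)

    def forward(self, adj_norm, node_feats):
        # adj_norm: (N, N) normalized adjacency; node_feats: (N, in_dim)
        return F.relu(adj_norm @ self.weight(node_feats))

class RelationHead(nn.Module):
    """Classify the relation between two character nodes from their GCN embeddings."""
    def __init__(self, dim=512, num_relations=8):
        super().__init__()
        self.gcn1 = GCNLayer(dim, dim)
        self.gcn2 = GCNLayer(dim, dim)
        self.cls = nn.Linear(2 * dim, num_relations)

    def forward(self, adj_norm, node_feats, i, j):
        # Propagate information over the character/object graph, then pair nodes i and j.
        h = self.gcn2(adj_norm, self.gcn1(adj_norm, node_feats))
        pair = torch.cat([h[i], h[j]], dim=-1)
        return self.cls(pair)

In such a sketch, the node-feature matrix would hold both character embeddings (from CharacterFusion combined with spatio-temporal features) and fine-granularity object features, while adj_norm would encode which objects appear together with which characters in a scene.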

Key words: Character relation recognition, Deep learning, Multi-feature fusion, Relation extraction, Video analysis

CLC Number: TP391