Computer Science, 2023, 50(11A): 221000241-10. DOI: 10.11896/jsjkx.221000241

• Image Processing & Multimedia Technology •

Controlled Facial Gender Forgery Combining Wavelet Transform High Frequency Information

CHEN Wanze, CHEN Jiazhen, HUANG Liqing, YE Feng, HUANG Tianqiang, LUO Haifeng   

  1. College of Computer and Cyber Security, Fujian Normal University, Fuzhou 350117, China
  • Published: 2023-11-09
  • About author: CHEN Wanze, born in 1995, postgraduate. His main research interests include facial image synthesis, facial attribute manipulation, etc.
    CHEN Jiazhen, born in 1971, associate professor. Her main research interest is information security.
  • Supported by:
    National Natural Science Foundation of China (62072106), Natural Science Foundation of Fujian Province, China (2020J01168, 2022J01190) and Scientific Research Foundation of the Education Department of Fujian Province, China (JAT210053).

Abstract: Image-to-image translation (I2I) based on generative adversarial networks has achieved a series of breakthroughs and is widely used in image synthesis, image colorization and image super-resolution, and especially in face attribute manipulation. To address the disparity in the quality of generated images across translation directions, caused by model architecture and data imbalance, a high-frequency injection GAN (HFIGAN) is proposed, which transmits high-frequency information to achieve controlled facial gender forgery. First, in the wavelet module that transmits high-frequency information, the features of the encoding stage are decomposed at the feature level by a discrete wavelet transform, and the resulting high-frequency information is injected back during the decoding stage, so that the mix of source-domain and target-domain information stays at a more desirable ratio. Second, a dynamic consistency loss addresses the unequal translation difficulty of different directions in multi-domain I2I tasks: the redesigned loss rescales the contributions of hard and easy samples, strengthens the feedback from hard samples, and makes the model concentrate its training on them to improve performance. Finally, a diversity regularization term based on style features is proposed, which adds the distance between style vectors in different spaces to the traditional diversity loss, enabling the model to maintain the diversity of generated images while improving generation quality. Experiments on the CelebA-HQ and FFHQ datasets verify the effectiveness of the proposed method, and the generality of the loss function is verified by combining it with mainstream I2I models. Experimental results show that HFIGAN outperforms previous state-of-the-art methods on facial gender forgery and that the proposed loss function generalizes well.
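Note: to make the three mechanisms in the abstract concrete, the sketches below give one plausible PyTorch reading of each. They are illustrations under stated assumptions, not the authors' released implementation; every module name, fusion choice and hyperparameter (e.g. the focusing parameter gamma, the 1×1 fusion convolution) is assumed.

The wavelet module can be read as a skip connection that keeps only the high-frequency Haar sub-bands of an encoder feature and adds them back to the matching decoder feature:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def haar_dwt2d(x):
    """Single-level 2D Haar DWT of a feature map x of shape (N, C, H, W).
    Returns the low-frequency band LL and the high-frequency bands LH, HL, HH."""
    a = x[:, :, 0::2, 0::2]   # even rows, even cols
    b = x[:, :, 0::2, 1::2]   # even rows, odd cols
    c = x[:, :, 1::2, 0::2]   # odd rows,  even cols
    d = x[:, :, 1::2, 1::2]   # odd rows,  odd cols
    ll = (a + b + c + d) / 2
    lh = (a + b - c - d) / 2
    hl = (a - b + c - d) / 2
    hh = (a - b - c + d) / 2
    return ll, (lh, hl, hh)

class HighFreqInjection(nn.Module):
    """Hypothetical wavelet skip module: keeps only the high-frequency
    sub-bands of an encoder feature and adds them, after a 1x1 fusion
    convolution (an assumed design choice), to the decoder feature."""
    def __init__(self, channels):
        super().__init__()
        self.fuse = nn.Conv2d(3 * channels, channels, kernel_size=1)

    def forward(self, enc_feat, dec_feat):
        _, (lh, hl, hh) = haar_dwt2d(enc_feat)
        hf = torch.cat([lh, hl, hh], dim=1)               # (N, 3C, H/2, W/2)
        hf = F.interpolate(hf, size=dec_feat.shape[-2:])  # match decoder size
        return dec_feat + self.fuse(hf)                   # residual injection
```

The dynamic consistency loss is described as a focal-loss-style rescaling (cf. Lin et al. [13]) of the per-sample consistency error, so that hard samples and hard translation directions contribute more gradient; a minimal reading:

```python
def dynamic_consistency_loss(x, x_rec, gamma=2.0):
    """Focal-style reweighting of a per-sample reconstruction (consistency)
    error; gamma is an assumed focusing parameter, as in focal loss."""
    err = (x - x_rec).abs().mean(dim=(1, 2, 3))        # per-sample L1 error
    # Normalise to [0, 1] and raise to gamma: easy samples (small error)
    # are down-weighted, hard samples keep close to full weight.
    weight = (err / (err.max() + 1e-8)).detach() ** gamma
    return (weight * err).mean()
```

The style-based diversity regularization term augments the usual diversity-sensitive loss (Mao et al. [12]) with a distance between style vectors measured in a second space; here that second space is assumed to be a style encoder's re-encoding of the two outputs:

```python
def style_diversity_reg(gen, style_enc, x_src, s1, s2):
    """Diversity term with an extra style-space distance. gen(x, s) is the
    generator and style_enc re-encodes an image into a style vector; both
    signatures are assumptions for illustration."""
    y1, y2 = gen(x_src, s1), gen(x_src, s2)
    img_dist = (y1 - y2).abs().mean()                        # image-space distance
    sty_dist = (style_enc(y1) - style_enc(y2)).abs().mean()  # style-space distance
    return -(img_dist + sty_dist)   # minimised by the generator, so both grow
```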

Key words: Image generation, Generative adversarial network, Image-to-image translation, Facial attribute manipulation, Focal loss

CLC Number: TP391
[1]GOODFELLOW I,POUGET-ABADIE J,MIRZA M,et al.Generative adversarial nets[C]//Proceedings of the 27th International Conference on Neural Information Processing Systems-Volume 2.Cambridge,MA,US:MIT Press,2014:2672-2680.
[2]MIRZA M,OSINDERO S.Conditional Generative Adversarial Nets [EB/OL].(2014-11-06)[2022-08-16].https://arxiv.org/abs/1411.1784.
[3]ISOLA P,ZHU J Y,ZHOU T,et al.Image-to-image translation with conditional adversarial networks[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.Honolulu,HI,USA:IEEE,2017:1125-1134.
[4]PARK T,LIU M Y,WANG T C,et al.Semantic image synthesis with spatially-adaptive normalization[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition.Long Beach,CA,USA:IEEE,2019:2337-2346.
[5]LEDIG C,THEIS L,HUSZAR F,et al.Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.2017:4681-4690.
[6]LI X,ZHANG S,HU J,et al.Image-to-image Translation via Hierarchical Style Disentanglement[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition.Virtual:IEEE,2021:8639-8648.
[7]ZHU J Y,PARK T,ISOLA P,et al.Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks[C]//Proceedings of the IEEE International Conference on Computer Vision.Venice,Italy:IEEE,2017:2223-2232.
[8]HUANG X,LIU M Y,BELONGIE S,et al.Multimodal unsupervised image-to-image translation[C]//Proceedings of the European Conference on Computer Vision(ECCV).Munich,Germany,2018:172-189.
[9]LEE H Y,TSENG H Y,MAO Q,et al.DRIT++:Diverse Image-to-Image Translation via Disentangled Representations [EB/OL].(2019-05-02) [2022-08-16].https://arxiv.org/abs/1905.01270.
[10]CHOI Y,UH Y,YOO J,et al.StarGAN v2:Diverse image synthesis for multiple domains[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition.Seattle,WA,USA:IEEE,2020:8188-8197.
[11]HUANG X,BELONGIE S.Arbitrary Style Transfer in Real-time with Adaptive Instance Normalization[C]//Proceedings of the IEEE International Conference on Computer Vision.Venice,Italy:IEEE,2017:1501-1510.
[12]MAO Q,LEE H Y,TSENG H Y,et al.Mode seeking generative adversarial networks for diverse image synthesis[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition.Long Beach,CA,USA:IEEE,2019:1429-1437.
[13]LIN T Y,GOYAL P,GIRSHICK R,et al.Focal Loss for Dense Object Detection[C]//Proceedings of the IEEE International Conference on Computer Vision.2017:2980-2988.
[14]KARRAS T,AILA T,LAINE S,et al.Progressive Growing of GANs for Improved Quality,Stability,and Variation [EB/OL].(2018-02-26) [2022-08-16].https://arxiv.org/abs/1710.10196.
[15]KARRAS T,LAINE S,AILA T.A Style-Based Generator Architecture for Generative Adversarial Networks[C]//IEEE/CVF Conference on Computer Vision and Pattern Recognition(CVPR).Long Beach,CA,USA:IEEE,2019:4401-4410.
[16]HE Z,ZUO W,KAN M,et al.AttGAN:Facial Attribute Editing by Only Changing What You Want[J].IEEE Transactions on Image Processing,2019,28(11):5464-5478.
[17]LIU M,DING Y,XIA M,et al.STGAN:A unified selective transfer network for arbitrary image attribute editing[C]//IEEE/CVF Conference on Computer Vision and Pattern Recognition(CVPR).Long Beach,CA,USA:IEEE,2019:3673-3682.
[18]CHOI Y,CHOI M,KIM M,et al.StarGAN:Unified Generative Adversarial Networks for Multi-Domain Image-to-Image Translation[C]//IEEE/CVF Conference on Computer Vision and Pattern Recognition(CVPR).Salt Lake City,UT,USA:IEEE,2018:8789-8797.
[19]YANG G,FEI N,DING M,et al.L2M-GAN:Learning to Manipulate Latent Space Semantics for Facial Attribute Editing[C]//IEEE/CVF Conference on Computer Vision and Pattern Recognition(CVPR).Virtual:IEEE,2021:2950-2959.
[20]GRASSUCCI E,SIGILLO L,UNCINI A,et al.Hypercomplex Image-to-Image Translation [EB/OL].(2022-05-04) [2022-08-16].https://arxiv.org/abs/2205.02087.
[21]LIU Y,SANGINETO E,NADAI M D,et al.Smoothing the Disentangled Latent Style Space for Unsupervised Image-to-Image Translation[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition(CVPR).IEEE,2021.
[22]ZHOU T,KRÄHENBÜHL P,AUBRY M,et al.Learning Dense Correspondence via 3D-guided Cycle Consistency[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition(CVPR).Las Vegas,NV,USA:IEEE,2016.
[23]ZHOU T,BROWN M,SNAVELY N,et al.Unsupervised Learning of Depth and Ego-Motion from Video[C]//2017 IEEE Conference on Computer Vision and Pattern Recognition(CVPR).IEEE,2017.
[24]HOFFMAN J,TZENG E,PARK T,et al.CyCADA:Cycle-consistent adversarial domain adaptation[C]//International Conference on Machine Learning.PMLR,2018:1989-1998.
[25]ZHU J Y,PARK T,ISOLA P,et al.Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks[C]//Proceedings of the IEEE International Conference on Computer Vision.Venice,Italy:IEEE,2017:2223-2232.
[26]LI X,WANG W,WU L,et al.Generalized focal loss:Learning qualified and distributed bounding boxes for dense object detection[J].Advances in Neural Information Processing Systems,2020,33:21002-21012.
[27]SPIEGL B.Contrastive Unpaired Translation using Focal Loss for Patch Classification[J].arXiv:2109.12431,2021.
[28]YUN P,TAI L,WANG Y,et al.Focal loss in 3d object detection[J].IEEE Robotics and Automation Letters,2019,4(2):1263-1270.
[29]RIDNIK T,BEN-BARUCH E,ZAMIR N,et al.Asymmetric loss for multi-label classification[C]//Proceedings of the IEEE/CVF International Conference on Computer Vision.2021:82-91.
[30]SMITH L N.Cyclical Focal Loss[EB/OL].(2022-02-16)[2022-08-16].https://arxiv.org/abs/2202.08978.
[31]HEUSEL M,RAMSAUER H,UNTERTHINER T,et al.GANs Trained by a Two Time-Scale Update Rule Converge to a Local Nash Equilibrium[C]//Neural Information Processing Systems(NIPS).Long Beach,CA,USA:MIT Press,2017:6626-6637.
[32]ZHANG R,ISOLA P,EFROS A A,et al.The unreasonable effectiveness of deep features as a perceptual metric[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.Salt Lake City,UT,USA:IEEE,2018:586-595.
[33]PARKHI O M,VEDALDI A,ZISSERMAN A.Deep Face Recognition[C]//British Machine Vision Conference.Swansea,UK,2015.