Computer Science ›› 2023, Vol. 50 ›› Issue (6A): 220100205-6. DOI: 10.11896/jsjkx.220100205

• Image Processing & Multimedia Technology •

Font Transfer Based on Glyph Perception and Attentive Normalization

LYU Wenrui1, PU Yuanyuan1,2, ZHAO Zhengpeng1, XU Dan1, QIAN Wenhua1

  1. 1 School of Information Science and Engineering,Yunnan University,Kunming 650504,China;
    2 University Key Laboratory of Internet of Things Technology and Application,Yunnan Province,Kunming 650504,China
  • Online:2023-06-10 Published:2023-06-12
  • Corresponding author: PU Yuanyuan(yuanyuanpu@ynu.edu.cn)
  • About author:LYU Wenrui,born in 1994,postgraduate(1947663831@qq.com).His main research interests include computer vision,image processing and font transfer. PU Yuanyuan,born in 1972,professor,Ph.D. supervisor,is a member of China Computer Federation.Her main research interests include digital image processing and computer vision.
  • Supported by:
    National Natural Science Foundation of China(62162068,61271361,61761046,62061049),Yunnan Applied Basic Research Project(2018FB100) and Key Program of Applied Basic Research of Yunnan Science and Technology Department(202001BB050043,2019FA044).

Abstract: Font transfer is a challenging task whose aim is to transfer a target font onto a source font through a learned mapping, so as to realize the conversion between fonts. Existing font transfer methods have limited robustness: in particular, they preserve the structural integrity of the generated glyphs poorly, and none of them produces satisfactory results when the two fonts differ greatly in style. To address these problems, an end-to-end font transfer network is proposed. Attentive normalization is introduced to better extract the high-level semantic features of glyph images, thereby improving the quality of the generated images, and adaptive instance normalization is used to fuse font features with content features to realize the font conversion. To preserve the integrity of the glyph structure, a perceptual loss and a contextual loss are designed to constrain glyph generation, and a regularization term is added to the adversarial loss to stabilize GAN training. To verify the validity of the model, multiple groups of training and testing experiments are conducted on the publicly available FET-GAN datasets, and the model is compared with FET-GAN, CycleGAN and StarGAN v2. Experimental results show that the model achieves mutual font transfer among the given font domains, and that both its transfer quality and its generalization ability have advantages over existing methods.

Key words: Font transfer, Adaptive instance normalization, Attentive normalization, Contextual loss, Perceptual loss
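
To make the style-content fusion step concrete, the sketch below shows a minimal adaptive instance normalization (AdaIN) routine: the content feature map is normalized per channel and then rescaled with the channel-wise statistics of the font (style) feature map. This is an illustrative NumPy sketch of the standard AdaIN formulation, not the authors' implementation; the function name `adain`, the array shapes and the `eps` constant are assumptions made for the example.

```python
import numpy as np

def adain(content_feat: np.ndarray, style_feat: np.ndarray, eps: float = 1e-5) -> np.ndarray:
    """Adaptive instance normalization over feature maps of shape (C, H, W).

    The content features are whitened per channel and re-colored with the
    channel-wise mean/std of the style (font) features, so the output keeps
    the content's spatial layout but adopts the style's feature statistics.
    """
    c_mean = content_feat.mean(axis=(1, 2), keepdims=True)
    c_std = content_feat.std(axis=(1, 2), keepdims=True)
    s_mean = style_feat.mean(axis=(1, 2), keepdims=True)
    s_std = style_feat.std(axis=(1, 2), keepdims=True)
    normalized = (content_feat - c_mean) / (c_std + eps)
    return normalized * s_std + s_mean

# Toy usage with 64-channel feature maps (shapes are illustrative only).
content = np.random.randn(64, 32, 32).astype(np.float32)
style = np.random.randn(64, 32, 32).astype(np.float32)
fused = adain(content, style)
print(fused.shape)  # (64, 32, 32)
```

In a full model such as the one described above, the style statistics would presumably be produced by an encoder over one or more reference glyph images (as in the k-shot AdaIN setting of FET-GAN [17]), while attentive normalization refines the high-level feature extraction.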

CLC Number: TP391
[1]GATYS L A,ECKER A S,BETHGE M.Image style transfer using convolutional neural networks[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.2016:2414-2423.
[2]GATYS L,ECKER A S,BETHGE M.Texture synthesis using convolutional neural networks[J].Advances in Neural Information Processing Systems,2015,28:262-270.
[3]JING Y,YANG Y,FENG Z,et al.Neural Style Transfer:A Review[J/OL].IEEE Transactions on Visualization and Computer Graphics,2019.https://xueshu.baidu.com/usercenter/paper/show?paperid=1e5m0ae0sj700mg0774e0ck0nj242457&site=xueshu_se.
[4]LI Y,FANG C,YANG J,et al.Universal style transfer via feature transforms[C/OL]//Advances in Neural Information Processing Systems.2017.https://xueshu.baidu.com/usercenter/paper/show?paperid=af912f3490e8e1a6c23a027c8aa87cd8&site=xueshu_se.
[5]CAMPBELL N D F,KAUTZ J.Learning a manifold of fonts[J]. ACM Transactions on Graphics(TOG),2014,33(4):1-11.
[6]GOODFELLOW I,POUGET-ABADIE J,MIRZA M,et al.Generative adversarial nets[J/OL].Advances in Neural Information Processing Systems,2014,27.https://xueshu.baidu.com/usercenter/paper/show?paperid=8c5fb216c54c0422b63463c859e8d23f&site=xueshu_se&hitarticle=1.
[7]YANG S,LIU J,WANG W,et al.TET-GAN:Text effects transfer via stylization and destylization[C]//Proceedings of the AAAI Conference on Artificial Intelligence.2019:1238-1245.
[8]LIAN Z,ZHAO B,CHEN X,et al.EasyFont:A Style Learning-Based System to Easily Build Your Large-Scale Handwriting Fonts[J].ACM Transactions on Graphics,2018,38(1):1-18.
[9]BALASHOVA E,BERMANO A H,KIM V G,et al.Learning a Stroke-Based Representation for Fonts[C]//Computer Graphics Forum.2019:429-442.
[10]BALUJA S.Learning typographic style:from discrimination to synthesis[J].Machine Vision and Applications,2017,28(5):551-568.
[11]UPCHURCH P,SNAVELY N,BALA K.From A to Z:Supervised Transfer of Style and Content Using Deep Neural Network Generators[OL].2016.https://xueshu.baidu.com/usercenter/paper/show?paperid=046c1f9642aba596f8612603f1ceccd9&site=xueshu_se&hitarticle=1.
[12]LYU P,BAI X,YAO C,et al.Auto-encoder guided GAN for Chinese calligraphy synthesis[C]//2017 14th IAPR International Conference on Document Analysis and Recognition(ICDAR).IEEE,2017:1095-1100.
[13]ZHANG R,ZHAN Y S,YANG M H.Handwritten Drawing Order Recovery Method Based on Endpoint Sequential Prediction[J].Computer Science,2019,46(11A):264-267.
[14]MAO Q,LEE H Y,TSENG H Y,et al.Mode seeking generative adversarial networks for diverse image synthesis[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition.2019:1429-1437.
[15]LEE H Y,TSENG H Y,HUANG J B,et al.Diverse image-to-image translation via disentangled representations[C]//Proceedings of the European Conference on Computer Vision(ECCV).2018:35-51.
[16]IIZUKA S,SIMO-SERRA E,ISHIKAWA H.Globally and locally consistent image completion[J].ACM Transactions on Graphics(ToG),2017,36(4):1-14.
[17]LI W,HE Y,QI Y,et al.FET-GAN:Font and Effect Transfer via K-shot Adaptive Instance Normalization[C]//Proceedings of the AAAI Conference on Artificial Intelligence.2020:1717-1724.
[18]KULKARNI T D,WHITNEY W,KOHLI P,et al.Deep convolutional inverse graphics network[J/OL].Advances in Neural Information Processing Systems,2015,28.https://xueshu.baidu.com/usercenter/paper/show?paperid=313d0148d77f64010501f5cde4f39df9&site=xueshu_se.
[19]MECHREZ R,TALMI I,ZELNIK-MANOR L.The contextual loss for image transformation with non-aligned data[C]//Proceedings of the European Conference on Computer Vision(ECCV).2018:768-783.
[20]JOHNSON J,ALAHI A,FEI-FEI L.Perceptual losses for real-time style transfer and super-resolution[C]//European Conference on Computer Vision.Cham:Springer,2016:694-711.
[21]MIYATO T,KATAOKA T,KOYAMA M,et al.Spectral normalization for generative adversarial networks[C/OL]//International Conference on Learning Representations.2018.https://xueshu.baidu.com/usercenter/paper/show?paperid=bca8ce69d0885365284cc84a0f9ddccd&site=xueshu_se.
[22]MESCHEDER L,GEIGER A,NOWOZIN S.Which training methods for GANs do actually converge?[C]//International Conference on Machine Learning.PMLR,2018:3481-3490.
[23]WANG Z,BOVIK A C,SHEIKH H R,et al.Image quality assessment:from error visibility to structural similarity[J].IEEE Transactions on Image Processing,2004,13(4):600-612.
[24]BABAEE A,SHAHRTASH S M,NAJAFIPOUR A.Comparing the trustworthiness of signal-to-noise ratio and peak signal-to-noise ratio in processing noisy partial discharge signals[J].IET Science,Measurement & Technology,2013,7(2):112-118.
[25]HEUSEL M,RAMSAUER H,UNTERTHINER T,et al.GANs trained by a two time-scale update rule converge to a local Nash equilibrium[J/OL].Advances in Neural Information Processing Systems,2017,30.https://xueshu.baidu.com/usercenter/paper/show?paperid=c060c67e8f8e928c565d8da6ddc44300&site=xueshu_se&hitarticle=1.
[26]ZHU J Y,PARK T,ISOLA P,et al.Unpaired image-to-image translation using cycle-consistent adversarial networks[C]//Proceedings of the IEEE International Conference on Computer Vision.2017.
[27]CHOI Y,UH Y,YOO J,et al.Stargan v2:Diverse image synthesis for multiple domains[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition.2020:8188-8197.