Computer Science, 2022, Vol. 49, Issue (1): 41-46. doi: 10.11896/jsjkx.210900012
• Multilingual Computing Advanced Technology •
LIU Yan, XIONG De-yi