Research on Process of the Bilingual Corpus Alignment Tool

  • Dandan Zhang, Shandong Jianzhu University
Article ID: 2664
Keywords: Bilingual corpus, Corpus alignment


As language researchers and translators have come to recognize the importance of corpus development, many institutions at home and abroad have begun to devote themselves to corpus research and construction. In building a corpus, bilingual alignment is an indispensable step. However, alignment software based on existing alignment technology still cannot meet the needs of users and translators. Building on previous studies, this paper explores the process of automatic sentence-level alignment in bilingual corpora. Chinese and English files are imported into corpus alignment tools and aligned one by one, with the sentence as the translation unit. The paper presents the alignment process and proposes solutions to the errors that arise during it, with the aim of further improving the efficiency of bilingual corpus alignment.
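The sentence-level workflow described above (import a Chinese and an English text, split each into sentences, pair them in order, then review the errors) can be illustrated with a minimal sketch. It is not the tool or the method used in the paper: the function names `split_zh`, `split_en` and `align_one_to_one`, the punctuation-based splitting rules, and the length-ratio thresholds are all assumptions chosen for illustration.

```python
import re
from itertools import zip_longest

# Hypothetical helpers for illustration only; not taken from the paper
# or from any specific alignment tool.

def split_zh(text):
    """Split Chinese text into sentences ending in 。 ！ or ？."""
    parts = re.findall(r"[^。！？]+[。！？]?", text)
    return [p.strip() for p in parts if p.strip()]

def split_en(text):
    """Split English text after . ! ? when followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text)
    return [p.strip() for p in parts if p.strip()]

def align_one_to_one(zh_sents, en_sents, low=1.0, high=6.0):
    """Pair sentences by position and flag pairs that need manual review.

    A pair is flagged when one side is missing, or when the ratio of
    English characters to Chinese characters falls outside [low, high].
    The thresholds are assumed values, not figures from the paper.
    """
    rows = []
    for zh, en in zip_longest(zh_sents, en_sents, fillvalue=""):
        if not zh or not en:
            flagged = True
        else:
            ratio = len(en) / len(zh)
            flagged = not (low <= ratio <= high)
        rows.append((zh, en, flagged))
    return rows

if __name__ == "__main__":
    zh_text = "今天天气很好。我们去公园吧！"
    en_text = "The weather is nice today. Let's go to the park!"
    for zh, en, flagged in align_one_to_one(split_zh(zh_text), split_en(en_text)):
        print("REVIEW" if flagged else "OK", "|", zh, "|", en)
```

Flagged rows mark the places where a translator would merge or split sentences by hand. Such one-to-many and many-to-one cases are a typical source of the alignment errors the abstract mentions, and a purely positional pairing like this one cannot resolve them on its own.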
How to Cite
Zhang, D. (2022). Research on Process of the Bilingual Corpus Alignment Tool. Learning & Education, 10(5), 35-38.

