In proceedings details

Parallel Texts Alignment

Oct 2009

Aligning parallel texts (texts that are translations of each other) is a required step in many applications that use such texts, including statistical machine translation, automatic extraction of translation equivalents, and automatic creation of concordances. Most existing methods for parallel text alignment try to infer, simultaneously, a bilingual word lexicon and a set of correspondences between the occurrences of those words in the texts. Some authors suggest that an external lexicon can complement the inferred one, but they tend to treat it as secondary or optional. We argue that lexicon inference should not be embedded in the alignment process, and we present LEXIC-AL, a new alignment method that relies exclusively on externally managed lexicons. In our experiments with the European Constitution corpus, LEXIC-AL achieves 84.45% precision and 84.55% recall.
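
For readers unfamiliar with the reported metrics, the following is a minimal sketch of how precision and recall are commonly computed for an alignment, assuming the output and the reference are both represented as sets of (source position, target position) links. The function name, the link representation, and the toy data are illustrative assumptions; they are not taken from the paper and do not reproduce its evaluation setup.

```python
def precision_recall(predicted, gold):
    """Return (precision, recall) for two collections of alignment links.

    Each link is assumed to be a (source_index, target_index) pair.
    This representation is hypothetical, chosen only for illustration.
    """
    predicted, gold = set(predicted), set(gold)
    correct = len(predicted & gold)  # links present in both sets
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical toy data, not from the European Constitution corpus.
gold_links = {(0, 0), (1, 2), (2, 1), (3, 3)}
predicted_links = {(0, 0), (1, 2), (2, 3), (3, 3)}

p, r = precision_recall(predicted_links, gold_links)
print(f"precision={p:.2%} recall={r:.2%}")
```

Precision is the share of proposed links that are correct, and recall is the share of reference links that were found, so the two figures can differ whenever the aligner proposes more or fewer links than the reference contains.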

Publisher: Universidade de Aveiro

Authors: Luís Gomes, José Aires, Gabriel Pereira Lopes

Volume: 0

Url: http://epia2009.web.ua.pt/onlineEdition.asp

Pages: 513 to 524

Publication Date: 1 Oct 2009
