Article details

Extracting Translation Equivalents from Portuguese-Chinese Parallel Texts

01 Jan 2001

This paper describes a method for extracting Portuguese-Chinese word translation equivalents from aligned parallel texts. The method uses the standard log-likelihood statistic to measure the similarity between two words. Parallel texts are aligned using a simple method that extends previous work by Pascale Fung & Kathleen McKeown and by Melamed. Unlike that earlier work, the method in this paper does not use statistically unsupported heuristics to filter reliable correspondence points. Instead, it provides the statistical support those authors could not claim by using confidence bands of linear regressions. The points of the linear regression line are generated from the positions of homograph words which occur with the same frequency in parallel text segments. With this alignment method, we are able to extract word translation equivalents (about 90 of the best 100 are correct equivalents).
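The log-likelihood similarity mentioned in the abstract can be illustrated with a minimal sketch. It assumes the common formulation of the statistic over a 2x2 contingency table of aligned-segment counts; the abstract does not say how the counts are built, so the function name `log_likelihood` and the four count parameters are hypothetical and chosen only for illustration.

```python
import math


def log_likelihood(k11, k12, k21, k22):
    """Log-likelihood ratio (G^2) for a 2x2 contingency table.

    Assumed meaning of the counts (not taken from the paper):
      k11: aligned segments containing both the source and the target word
      k12: segments containing the source word only
      k21: segments containing the target word only
      k22: segments containing neither word
    """
    def xlogx_sum(*counts):
        # Sum of k * ln(k), treating 0 * ln(0) as 0.
        return sum(k * math.log(k) for k in counts if k > 0)

    n = k11 + k12 + k21 + k22
    cells = xlogx_sum(k11, k12, k21, k22)
    rows = xlogx_sum(k11 + k12, k21 + k22)
    cols = xlogx_sum(k11 + k21, k12 + k22)
    total = xlogx_sum(n)
    # G^2 = 2 * sum(O * ln(O / E)), expanded into sums of k * ln(k).
    return 2.0 * (cells - rows - cols + total)


# Two words that always co-occur score higher than two independent words.
associated = log_likelihood(10, 0, 0, 10)
independent = log_likelihood(10, 10, 10, 10)
```

Under this formulation, a higher score means the co-occurrence pattern of the two words is less compatible with independence, so candidate pairs can be ranked by score.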

Journal: Studies in Lexicography

Authors: António Ribeiro, Gabriel Pereira Lopes, João Tiago Mexia

Volume: 11

Number: 1

Pages: 181 to 194

Publication Date: 1 Jan 2001