Detail

Publication date: 1 June 2021

PCTRANS

PCTRANS is a phrase-based statistical machine translation (SMT) system that works on top of a parallel corpus aligned at the sub-sentence level. It therefore depends on LEXICAL, a supervised, evolving parallel-text aligner that uses phrase translations, previously extracted automatically and then validated, to realign any parallel corpus at the sub-sentence level.
In preliminary results, PCTRANS attained a BLEU score of 71% for translation from Portuguese into English and 65% for translation from English into Portuguese. This means we obtained an improvement of 10 BLEU points in both translation directions over the results reported for state-of-the-art phrase-based SMT systems on the same language pair. Because our approach assumes a higher alignment precision, it does not need to treat alignment as a hidden variable; as a consequence, translation is computationally lighter than in conventional SMT.
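For readers unfamiliar with the metric, the following is a minimal sketch of how a corpus-level BLEU score can be computed. It is not the evaluation code used for PCTRANS: the function names (ngrams, corpus_bleu), the single-reference setup, the whitespace-tokenized input, and the absence of smoothing are all assumptions made for illustration.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu(candidates, references, max_n=4):
    """Corpus-level BLEU with one reference per candidate, in the range [0, 1].

    Hypothetical helper for illustration; assumes pre-tokenized sentences.
    """
    matched = [0] * max_n   # clipped n-gram matches, per n-gram order
    total = [0] * max_n     # candidate n-gram counts, per n-gram order
    cand_len = ref_len = 0

    for cand, ref in zip(candidates, references):
        cand_len += len(cand)
        ref_len += len(ref)
        for n in range(1, max_n + 1):
            cand_counts = ngrams(cand, n)
            ref_counts = ngrams(ref, n)
            # Clip each candidate n-gram count by its count in the reference.
            matched[n - 1] += sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
            total[n - 1] += sum(cand_counts.values())

    # Without smoothing, any order with zero matches makes the score zero.
    if cand_len == 0 or min(matched) == 0:
        return 0.0

    # Geometric mean of the modified n-gram precisions.
    log_precision = sum(math.log(m / t) for m, t in zip(matched, total)) / max_n
    # The brevity penalty lowers the score of candidates shorter than the reference.
    brevity = 1.0 if cand_len > ref_len else math.exp(1 - ref_len / cand_len)
    return brevity * math.exp(log_precision)


if __name__ == "__main__":
    hypothesis = [["the", "cat", "sat", "on", "the", "mat"]]
    reference = [["the", "cat", "sat", "on", "a", "mat"]]
    print(f"BLEU = {100 * corpus_bleu(hypothesis, reference):.1f}%")
```

The function returns a value between 0 and 1; multiplying by 100 gives the percentage scale on which the scores above are quoted. Published evaluations normally rely on an established implementation with fixed tokenization, so numbers from this sketch should not be compared directly with reported scores.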

Authors

Gabriel Pereira Lopes, José Aires

Date 01/06/2009