Publication date: 1 June 2021
TRANSTOR
Our approach to Machine Translation diverges from the prevailing Statistical Machine Translation (SMT) philosophy, which keeps no memory of knowledge that has already been acquired and validated (accepted as correct, or rejected as incorrect or incomplete). As a consequence, any new translation engine trained on a given parallel text collection (a collection of texts that are translations of each other) will be identical to every engine previously trained on that collection, and so shows no capacity for improvement unless the number of models considered is increased. Even then, the improvement is very small: BLEU measures translation quality on a scale from 0 to 100, and a gain of a single BLEU point is enough to warrant a new publication.
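To make the 0-100 BLEU scale concrete, here is a minimal Python sketch of a sentence-level BLEU score with a single reference and no smoothing. It is an illustration only, not the evaluation setup used in this work: BLEU is normally computed over a whole test corpus, and the function names `ngrams` and `bleu` are chosen here for the example.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of the n-grams (as tuples) found in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu(candidate, reference, max_n=4):
    """Sentence-level BLEU on a 0-100 scale (one reference, no smoothing)."""
    cand, ref = candidate.split(), reference.split()
    if not cand:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts = ngrams(cand, n)
        ref_counts = ngrams(ref, n)
        # Clipped matches: a candidate n-gram counts at most as often
        # as it occurs in the reference.
        matches = sum(min(count, ref_counts[gram])
                      for gram, count in cand_counts.items())
        total = sum(cand_counts.values())
        if matches == 0 or total == 0:
            # Without smoothing, one empty precision zeroes the score.
            return 0.0
        log_precisions.append(math.log(matches / total))
    # Brevity penalty: punish candidates shorter than the reference.
    if len(cand) >= len(ref):
        brevity = 1.0
    else:
        brevity = math.exp(1 - len(ref) / len(cand))
    # Geometric mean of the n-gram precisions, scaled to 0-100.
    return 100.0 * brevity * math.exp(sum(log_precisions) / max_n)
```

A candidate identical to its reference scores 100, a candidate sharing no words with it scores 0, and a near-match such as `bleu("the cat sat on a mat", "the cat sat on the mat")` falls in between.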
Because we incorporate human validation, both by evaluating automatically extracted translation equivalents and by accommodating the results of post-editing (revision of machine-translated texts), our systems can rapidly improve their performance over time and achieve better translation quality than MOSES, the state-of-the-art SMT system, trained on the same parallel text collection. On average, we score 14 BLEU points higher than MOSES across 16 translation directions among Portuguese, English, French, Spanish and German (German-French and French-Spanish, which would complete the 20 translation directions, are not included). We have not published as much as we could in this area because we intend to patent our discoveries by 2014. Other aspects also set our work apart from mainstream SMT research, notably our use of phrase-based rather than word-based alignment of parallel texts.
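The count of translation directions can be checked with a short Python sketch. It assumes that each excluded language pair is left out in both directions, which is what the figures 16 and 20 imply; the variable names are illustrative.

```python
from itertools import permutations

languages = ["Portuguese", "English", "French", "Spanish", "German"]
# Language pairs not evaluated, assumed excluded in both directions.
excluded_pairs = [{"German", "French"}, {"French", "Spanish"}]

# Every ordered (source, target) pair of distinct languages.
all_directions = list(permutations(languages, 2))
# Keep only the directions whose language pair is not excluded.
evaluated = [d for d in all_directions if set(d) not in excluded_pairs]

print(len(all_directions), len(evaluated))
```

Five languages give 5 × 4 = 20 ordered pairs, and removing the two excluded pairs in both directions leaves 16.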
As a consequence of this work, based on one PhD thesis completed in 2002 and five ongoing PhD theses, a start-up was launched in July 2013 in the translation area: ISTRION BOX TRANSLATION AND REVISION, LDA.
| Date | 01/06/2013 |
|---|---|