Publication date: 1 June 2021
Compressed Bilingual Text Framework
This prototype manages collections of parallel texts in two different languages using compressed data structures. Two texts are parallel when each is a translation of the other.
With this framework, huge text collections can be indexed in main memory while still supporting linear-time query operations.
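To illustrate what indexing a text for in-memory queries looks like, here is a minimal sketch that uses a plain, uncompressed suffix array. This is an assumption made for illustration only: the page does not say which compressed structures the prototype uses, and the names `build_suffix_array` and `find_occurrences` are hypothetical. The naive construction and the binary search shown here do not reproduce the space or time bounds claimed for the framework.

```python
def build_suffix_array(text: str) -> list:
    """Return the start positions of all suffixes of `text`, sorted lexicographically.

    Naive construction (sorts full suffixes); fine for a small illustration,
    far too slow and memory-hungry for a huge collection.
    """
    return sorted(range(len(text)), key=lambda i: text[i:])


def find_occurrences(text: str, sa: list, pattern: str) -> list:
    """Return every start position of `pattern` in `text`, in increasing order."""
    m = len(pattern)

    # Lower bound: first suffix whose m-character prefix is >= pattern.
    lo, hi = 0, len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    first = lo

    # Upper bound: first suffix whose m-character prefix is > pattern.
    hi = len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] <= pattern:
            lo = mid + 1
        else:
            hi = mid

    # All suffixes in sa[first:lo] start with the pattern.
    return sorted(sa[first:lo])
```

Once the array is built, each query only touches a logarithmic number of suffixes, which is the general idea behind keeping the index resident in main memory. A compressed index would aim for the same kind of lookup in much less space.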
The framework also represents the alignment of the parallel texts: given two parallel texts, text_en and text_pt, it records that segment_A of text_en is translated by segment_B of text_pt.
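The alignment described above can be sketched as a list of paired character ranges. This is a minimal, uncompressed illustration under stated assumptions, not the prototype's actual representation: the class names `AlignedSegment` and `ParallelText`, the method `translation_of`, and the use of character offsets are all hypothetical.

```python
from bisect import bisect_right
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class AlignedSegment:
    """One alignment link: text_en[en_start:en_end] <-> text_pt[pt_start:pt_end]."""
    en_start: int
    en_end: int  # exclusive
    pt_start: int
    pt_end: int  # exclusive


class ParallelText:
    """A pair of parallel texts plus their segment alignment (hypothetical sketch)."""

    def __init__(self, text_en: str, text_pt: str, alignment: List[AlignedSegment]):
        self.text_en = text_en
        self.text_pt = text_pt
        # Keep segments ordered by their English start offset so that
        # lookups can use binary search.
        self.alignment = sorted(alignment, key=lambda s: s.en_start)
        self._starts = [s.en_start for s in self.alignment]

    def translation_of(self, en_offset: int) -> Optional[str]:
        """Return the Portuguese segment aligned with the English segment
        that covers `en_offset`, or None if no segment covers it."""
        i = bisect_right(self._starts, en_offset) - 1
        if i < 0:
            return None
        seg = self.alignment[i]
        if en_offset >= seg.en_end:
            return None
        return self.text_pt[seg.pt_start:seg.pt_end]
```

For example, with `text_en = "Good morning. Thank you."` and `text_pt = "Bom dia. Obrigado."`, the two links `AlignedSegment(0, 13, 0, 8)` and `AlignedSegment(14, 24, 9, 18)` pair each English sentence with its Portuguese counterpart. Combined with an index over text_en, a lookup like this is the core of a simple concordancer: find where a phrase occurs, then show the aligned segment in the other language.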
The framework can be used in several Machine Translation tasks, such as building a concordancer, extracting translation candidates, and aligning parallel texts.
| Date | 01/01/2013 |
|---|---|