In proceedings details

  • Identifying Bilingual Segments for Translation Generation
  • Oct 2014
  • We present an approach that uses known translation forms in a validated bilingual lexicon to identify bilingual stem and suffix segments. By applying the longest sequence common to a pair of orthographically similar translations, we first induce the bilingual suffix transformations (replacement rules). Redundant analyses are discarded by examining the distribution of stem pairs and their associated transformations. Sets of bilingual suffixes that conflate various translation forms are grouped. Stem pairs sharing similar transformations are subsequently clustered, which serves as a basis for the generative approach. The primary motivation behind this work is to eventually improve lexicon coverage by utilising the correct bilingual entries to suggest translations for OOV words. In preliminary generation results, 90% of the generated translations are correct. This was achieved when both bilingual segments (bilingual stem and bilingual suffix) in the bilingual pair being analysed are known to have occurred in the training data set.
  • Springer Berlin Heidelberg
  • Kavitha Mahesh, Luís Gomes, Gabriel Pereira Lopes
  • Lecture Notes in Computer Science
  • 8819
  • http://www.springer.com/computer/database+management+%26+information+retrieval/book/978-3-319
  • 191 to 212
  • 1 Oct 2014