Book chapter details

Extracting Concepts from dynamic legislative text collections

Jan 2005

Selecting discriminating terms to represent the contents of texts is a critical problem for many applications in Information Retrieval. Most Information Retrieval systems index documents by individual words, which are not specific enough to capture the contents of texts. As a consequence, there has been growing interest in developing techniques for automatic term extraction. In this context, we propose a new architecture for retrieving relevant documents in a dynamic text collection. It combines the SINO search engine with the SENTA software, which is designed for the automatic extraction of multiword lexemes. In this paper, we focus in particular on the SENTA module, which has recently been added to the global architecture.

Keywords: Multiword Lexical Unit Extraction, Information Retrieval, Web Interface.

Book title: Meaningful Texts: The Extraction of Semantic Information from Monolingual and Multilingual Corpora

Publisher: Continuum

Authors: Gaël Dias, Sara Madeira, Gabriel Pereira Lopes

Editors: Geoff Barnbrook, Pernilla Danielsson, Michaela Mahlberg

Edition:

Series: Research in Corpus and Discourse

Volume:

ISBN: 0-8264-7490-X

Url: http://www.continuumbooks.com

Notes:

Bibtex Key:

DOI:

Pages: 5 to 16

Publication Date: 1 Jan 2005

Publication File: