Hegler Tissot: "OBIE: Ontology-based Information Extraction. An Approach to Extract and Deal with Imprecise Temporal Data and Spelling Errors"

Speaker: Hegler Tissot (Federal University of Paraná, Brazil)

Date: Tuesday, 25 February 2014

Time: 12:00

Venue: Room 1.03, ETSI Informática, UNED


As more and more unstructured knowledge is represented in computer-readable form, it is necessary not only to understand how to use it, but also to build tools that can effectively extract and analyze information and make its meaning useful. Information Extraction (IE) systems and techniques can deal with the vast amount of textual information available today. Ontology-based Information Extraction (OBIE) is a subfield of IE that uses ontologies to assist in the extraction of domain-specific information. However, ontologies lack semantics for temporal information, so they cannot support reasoning about temporal knowledge. Medical records are an example of textual content in which events are related to temporal information. Without the ability to extract temporal data that places events on a timeline, it is difficult to understand how such events are ordered chronologically. Temporal information, however, is not always expressed precisely, as in expressions like "some days ago". Moreover, input text may contain spelling errors, which can further complicate the understanding of such expressions. Although some proposed approaches address the identification of explicit, implicit, or even imprecise date and time expressions in text during the IE process, existing OBIE solutions lack integrated support for uncertain temporal knowledge coupled with text that contains spelling errors. In this work we present a proposal to extract and deal with precise and imprecise temporal data within an OBIE process. Temporal expressions that contain spelling errors are handled by means of a novel phonetic search solution. We will present a set of components in an OBIE framework for handling these problems. We have already developed the phonetic search solution, which is a first result toward the final work.



Hegler Tissot has been working with information systems for the last 23 years, in particular designing and maintaining large relational databases. He received his MSc from the Federal University of Santa Catarina (Brazil) in 2004, in the Engineering and Production Systems Department, working on requirements for data warehouse systems. He has worked in several companies in Brazil as a systems and database analyst, on projects in areas such as telecom and public health/medical record systems. He is currently enrolled in a PhD program (2012-2016) at the Federal University of Paraná (Brazil), researching the extraction of imprecise temporal information from text and the handling of spelling errors.


