Building Domain Ontologies from Text Analysis: an application for Question Answering

Delmonte, Rodolfo

In information extraction and automatic question answering, access to a domain ontology can be of great help. The main problem, however, is building such an ontology, which is a difficult and time-consuming task. We propose an approach in which the domain ontology is learned from the linguistic analysis of a number of texts that represent the domain. NLP analysis is carried out with the GETARUNS system, which builds a Discourse Model and assigns a relevance score to each entity. From the Discourse Model we extract the best candidates to become concepts in the domain ontology. To arrange the concepts in the correct hierarchy, we use the WordNet taxonomy. Once the domain ontology is built, we revisit the texts to extract information. In this phase, the entities recognized at discourse level are used to create instances of the concepts, and the predicate-argument structure of the verb is used to construct instance slots for the concepts. Finally, question answering is performed by translating the natural language question into a suitable form and using it to query the Discourse Model enriched by the ontology.
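The pipeline outlined above can be illustrated with a small sketch. It is not the GETARUNS implementation, whose data structures are not described here. All names (`TOY_HYPERNYMS`, `Instance`, `select_concepts`, `build_instances`, `answer`), the relevance threshold, and the sample data are hypothetical, and a hand-written hypernym table stands in for the WordNet taxonomy.

```python
from dataclasses import dataclass, field

# Hypothetical stand-in for WordNet hypernym links (child -> parent).
TOY_HYPERNYMS = {"dog": "animal", "cat": "animal", "animal": "entity",
                 "bone": "object", "object": "entity"}


@dataclass
class Instance:
    name: str
    concept: str
    # Each slot is (predicate, role, filler name), taken from a verb's arguments.
    slots: list = field(default_factory=list)


def select_concepts(relevance, threshold):
    """Keep entities whose relevance score reaches the (assumed) threshold."""
    return sorted(c for c, score in relevance.items() if score >= threshold)


def hypernym_chain(concept):
    """Walk from a concept up to the root of the toy taxonomy."""
    chain = [concept]
    while chain[-1] in TOY_HYPERNYMS:
        chain.append(TOY_HYPERNYMS[chain[-1]])
    return chain


def build_instances(facts, concepts):
    """Create one instance per entity and fill its slots from each predicate."""
    instances = {}
    for predicate, args in facts:
        for role, (name, concept) in args.items():
            if concept not in concepts:
                continue
            inst = instances.setdefault(name, Instance(name, concept))
            for other_role, (other_name, _) in args.items():
                if other_role != role:
                    inst.slots.append((predicate, other_role, other_name))
    return instances


def answer(instances, predicate, role, filler, asked_concept):
    """Return instances under asked_concept that have a matching slot."""
    return sorted(
        inst.name for inst in instances.values()
        if asked_concept in hypernym_chain(inst.concept)
        and (predicate, role, filler) in inst.slots
    )


# Invented sample data: relevance scores and one predicate-argument structure.
relevance = {"dog": 0.9, "bone": 0.7, "yesterday": 0.1}
facts = [("eat", {"agent": ("Fido", "dog"), "theme": ("bone_1", "bone")})]

concepts = select_concepts(relevance, threshold=0.5)
instances = build_instances(facts, concepts)
# Structured form of a question such as "Which animal ate bone_1?"
result = answer(instances, "eat", "theme", "bone_1", "animal")
print(result)
```

In this sketch the question is already given in structured form; translating a natural language question into that form is the step the linguistic analysis would have to supply.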