Combining Knowledge- and Corpus-based Word-Sense-Disambiguation Methods

Andres Montoyo                                                                             montoyo@dlsi.ua.es

Dept. of Software and Computing Systems

University of Alicante, Spain 

Armando Suárez                                                                               armando@dlsi.ua.es

Dept. of Software and Computing Systems

University of Alicante, Spain 

German Rigau                                                                                      rigau@si.ehu.es

IXA Research Group

Computer Science Department

Basque Country University, Donostia 

Manuel Palomar                                                                            mpalomar@dlsi.ua.es

Dept. of Software and Computing Systems

University of Alicante, Spain

Abstract:

This paper addresses the resolution of the lexical ambiguity that arises when a word has several different meanings, a task commonly referred to as word sense disambiguation (WSD). WSD consists of assigning the correct sense to each word, using an electronic dictionary as the source of word definitions. We present two WSD methods that represent the two main methodological approaches in this research area: a knowledge-based method and a corpus-based method. Our hypothesis is that word sense disambiguation requires several knowledge sources in order to resolve the semantic ambiguity of words. These sources can be of different kinds, for example syntagmatic, paradigmatic, or statistical information. Our approach combines various sources of knowledge through combinations of the two WSD methods mentioned above. The paper focuses mainly on how to combine these methods and sources of information in order to achieve good disambiguation results. Finally, it presents a comprehensive study and experimental evaluation of the methods and their combinations.