Björn Gambäck, Swedish Institute of Computer Science
Lexical Acquisition
The Swedish VEX System

Abstract: "The paper describes S-VEX, the lexical acquisition component of the Swedish Core Language Engine (S-CLE). The S-CLE is a general-purpose natural language processing system for Swedish developed from its English counterpart, the SRI Core Language Engine. In parallel with the development of the S-CLE, a Swedish version of the English VEX (Vocabulary EXpander) system was designed. S-VEX allows for the creation of lexicon entries by users with knowledge of an application domain but not of linguistics or of the detailed workings of the system. The approach taken is based on eliciting grammaticality judgments and information about inflected forms interactively from the user. The S-VEX system and the lexicon of the S-CLE are described, as well as the problems of the specific lexical acquisition task and their solutions. The only 'real' linguistic information the user needs to provide is the general category (noun, verb, or adjective) of the new lexical entry. All other information needed to create lexicon entries is obtained from the user's answers to questions/examples presented to her by the system. For determining the morphological inflections, the system is equipped with knowledge of 41 different inflectional classes. When constructing the syntactic usage part of the lexical entries, S-VEX is provided with pointers to entries in a 'paradigm' lexicon for a number of representative word usages and declarative knowledge of the range of sentential contexts in which these usages can occur. This knowledge is encoded in 58 sentence patterns. The paradigm lexicon of the present system contains 62 different paradigms: 5 for nouns, 10 for adjectives, and all the others for verbs."
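
The elicitation idea in the abstract can be illustrated with a minimal sketch. It is not the S-VEX implementation: the three sentence patterns, the three verb paradigms, and the names `PATTERNS`, `PARADIGMS`, and `matching_paradigms` are all hypothetical stand-ins for the 58 patterns and 62 paradigms the abstract mentions. The sketch assumes that each paradigm is characterised by the set of patterns in which it is grammatical, and that a new word is assigned to the paradigms whose set matches the user's yes/no judgments exactly.

```python
from dataclasses import dataclass

# Hypothetical sentence patterns; {form} is an inflected form supplied by the user.
PATTERNS = {
    "intransitive": "Hon {form}.",
    "transitive": "Hon {form} den.",
    "ditransitive": "Hon {form} honom den.",
}


@dataclass(frozen=True)
class Paradigm:
    name: str           # representative word standing for this usage
    accepts: frozenset  # ids of the patterns in which the usage is grammatical


# Toy paradigm lexicon (invented for illustration, not taken from the paper).
PARADIGMS = [
    Paradigm("sova", frozenset({"intransitive"})),
    Paradigm("köpa", frozenset({"transitive"})),
    Paradigm("ge", frozenset({"transitive", "ditransitive"})),
]


def matching_paradigms(form, judge):
    """Return the names of the paradigms whose accepted patterns equal
    the set of patterns the user judged grammatical for `form`.

    `judge` is a callable taking an example sentence and returning
    True if the user accepts it as grammatical.
    """
    accepted = frozenset(
        pattern_id
        for pattern_id, template in PATTERNS.items()
        if judge(template.format(form=form))
    )
    return [p.name for p in PARADIGMS if p.accepts == accepted]
```

In an interactive setting, `judge` would present each example sentence to the user and read a yes/no answer; in the sketch it is passed in as a function so that the matching logic can be exercised without a terminal session.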