Abstract: In this paper we present research results and propose solutions for natural language string
lookup techniques. In particular, we suggest a fast method for searching dictionary entries for possible matches
of sentence words without using relational databases or loading the full dictionary into random access memory.
Such an approach is essential for minimizing the dependency of lookup speed on dictionary size and available
machine resources, as well as for the scalability of the analyzer software. The method is based on an
implementation of Aho-Corasick [Aho, Corasick, 1977] automata with a number of optimizations in the indexing
and lookup algorithm.
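
For orientation, the textbook Aho-Corasick construction that the abstract refers to can be sketched as follows. This is a minimal in-memory illustration only; it does not reproduce the paper's on-disk indexing or its optimizations, and the names `build_automaton` and `find_entries` are hypothetical, chosen for this sketch.

```python
from collections import deque


def build_automaton(entries):
    """Build goto, failure and output tables for a list of dictionary entries."""
    goto = [{}]   # goto[state][char] -> next state (a trie)
    fail = [0]    # fail[state] -> state for the longest proper suffix in the trie
    out = [[]]    # out[state] -> entries that end at this state

    # Phase 1: insert every entry into the trie.
    for entry in entries:
        state = 0
        for ch in entry:
            nxt = goto[state].get(ch)
            if nxt is None:
                nxt = len(goto)
                goto[state][ch] = nxt
                goto.append({})
                fail.append(0)
                out.append([])
            state = nxt
        out[state].append(entry)

    # Phase 2: breadth-first pass to fill in failure links and merge outputs.
    queue = deque(goto[0].values())  # depth-1 states fail to the root
    while queue:
        state = queue.popleft()
        for ch, nxt in goto[state].items():
            f = fail[state]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(ch, 0)
            out[nxt] = out[nxt] + out[fail[nxt]]
            queue.append(nxt)
    return goto, fail, out


def find_entries(text, automaton):
    """Return (start_index, entry) for every entry occurring in text."""
    goto, fail, out = automaton
    state = 0
    matches = []
    for i, ch in enumerate(text):
        # Follow failure links until a transition on ch exists or we reach the root.
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        for entry in out[state]:
            matches.append((i - len(entry) + 1, entry))
    return matches


automaton = build_automaton(["he", "she", "his", "hers"])
print(find_entries("ushers", automaton))
# prints [(1, 'she'), (2, 'he'), (2, 'hers')]
```

The input text is scanned once, and all matching entries are reported in a single pass, so the scanning cost does not depend on how many entries the dictionary holds. Here the whole automaton sits in memory, which is the situation the abstract's method is designed to avoid.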
Keywords: UNL, natural language processing, dictionary lookup, indexing, search, XML, pattern matching
machine, string matching algorithm, information search
ACM Classification Keywords: F.2.2 Non-numerical Algorithms and Problems – Pattern matching, Sorting and
searching, I.2.7 Natural Language Processing – Text analysis, Language parsing and understanding, I.7.2
Document Preparation – Index generation, Markup languages, H.3.1 Content Analysis and Indexing –
Dictionaries, Indexing Methods, H.3.3 Information Search and Retrieval – Retrieval models, Search process.
Link:
IMPLEMENTATION OF DICTIONARY LOOKUP AUTOMATA
FOR UNL ANALYSIS AND GENERATION
Igor Zaslawskiy, Aram Avetisyan, Vardan Gevorgyan
http://www.foibg.com/ijita/vol17/ijita17-2-p03.pdf