MULTI-AGENT SYSTEM FOR SIMILARITY SEARCH IN STRING SETS
By: Katarzyna Harężlak, Michał Sala

Abstract: This paper presents the assumptions and architecture of a system for similarity search in string sets. The research analysed all the required steps of text document processing: text extraction, pruning, stemming and lemmatization. Models for describing text documents and a method for creating a feature vector were developed as well. This vector consists, inter alia, of chosen words and the number of their occurrences. Text analysis is supported by a set of dictionaries (Stop-words, Domain and Lemma), all of which were considered in the context of the Polish language. Because the Lemma dictionary is expected to contain many entries, an efficient method of optimising access to it was developed. Various measures for calculating the degree of similarity between text documents were also studied. Moreover, a method for determining how well user queries and text documents match was proposed. The system was implemented as a multi-agent system; its functionality is provided by a set of agents running in separate threads. Tests of the system's efficiency were also performed.
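
To illustrate the feature vector and similarity measures mentioned in the abstract, the following is a minimal sketch under stated assumptions, not the authors' implementation. The toy stop-word list and lemma dictionary, and the names `STOP_WORDS`, `LEMMAS`, `feature_vector` and `cosine_similarity`, are hypothetical and introduced only for this example. Cosine similarity is used as one common measure; the abstract does not say which measures the paper adopts.

```python
import math
import re
from collections import Counter

# Hypothetical toy dictionaries (assumption): the paper's Polish dictionaries
# are far larger and are not reproduced here.
STOP_WORDS = {"i", "w", "na", "z"}
LEMMAS = {"koty": "kot", "kota": "kot", "psy": "pies", "psa": "pies"}


def feature_vector(text: str) -> Counter:
    """Map each retained lemma to its number of occurrences."""
    tokens = re.findall(r"\w+", text.lower())
    kept = (t for t in tokens if t not in STOP_WORDS)  # drop stop-words
    return Counter(LEMMAS.get(t, t) for t in kept)     # lemmatize and count


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty vector is treated as dissimilar to everything
    return dot / (norm_a * norm_b)


doc = feature_vector("Koty i psy na kanapie")
query = feature_vector("kot z psa")
score = cosine_similarity(doc, query)
```

Here the document and the query share the lemmas "kot" and "pies", so `score` is greater than zero even though no surface word form matches exactly, which is the point of consulting a lemma dictionary before counting.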

Keywords: agent systems, text similarity search

ACM Classification Keywords: I.7 Document And Text Processing

Link:

http://www.foibg.com/ibs_isc/ibs-26/ibs-26-p09.pdf
