Abstract: RUBRYX is a document classifier developed in the 2000s for processing large
volumes of Web information. RUBRYX uses a weighted sum of n-grams (n=1,2,3) extracted
from a very limited number of samples (about 5-10) and takes into account their relative
positions in a given text. This sophisticated algorithm proves to be very effective in
classifying primary medical records presented in free-text form. In this paper we study
the capabilities of RUBRYX (version 2.2) on a limited document set in Spanish.
These documents are medical histories related to stomach diseases. This area should be
considered a narrow subset of medical records. The high quality of the achieved results
(accuracy of 80%-90%) allows us to recommend RUBRYX for similar applications.
Keywords: natural language processing, medical diagnostics, document classification
ACM Classification Keywords: I.2.7 Natural Language Processing
Link:
CLASSIFICATION OF PRIMARY MEDICAL RECORDS WITH RUBRYX-2: FIRST EXPERIENCE
Olga Kaurova, Mikhail Alexandrov, Ales Bourek
http://www.foibg.com/ibs_isc/ibs-26/ibs-26-p04.pdf
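
The abstract describes scoring a text by a weighted sum of n-grams (n=1,2,3) taken from a handful of samples. The sketch below illustrates that general idea only; it is not RUBRYX's actual algorithm, which the abstract does not specify. The function names, the weights, and the toy samples are all assumptions made for illustration, and the positional component mentioned in the abstract is left out.

```python
from collections import Counter

# Hypothetical weights per n-gram length (not taken from the paper):
# longer matches are assumed to be stronger evidence.
WEIGHTS = {1: 1.0, 2: 2.0, 3: 3.0}


def ngrams(tokens, n_max=3):
    """Yield all word n-grams of length 1..n_max as tuples."""
    for n in range(1, n_max + 1):
        for i in range(len(tokens) - n + 1):
            yield tuple(tokens[i:i + n])


def build_profile(samples, n_max=3):
    """Count the n-grams found in a small set of sample texts for one class."""
    profile = Counter()
    for text in samples:
        profile.update(ngrams(text.lower().split(), n_max))
    return profile


def score(text, profile, n_max=3):
    """Weighted sum over the text's n-grams that also occur in the profile."""
    tokens = text.lower().split()
    return sum(WEIGHTS[len(g)] for g in ngrams(tokens, n_max) if g in profile)


def classify(text, profiles):
    """Return the label whose profile gives the highest score."""
    return max(profiles, key=lambda label: score(text, profiles[label]))


# Toy samples invented for this example; real use would take about 5-10 per class.
profiles = {
    "a": build_profile(["dolor abdominal agudo"]),
    "b": build_profile(["nausea y vomito"]),
}
label = classify("dolor abdominal", profiles)
```

Plain whitespace tokenization and integer weights keep the sketch short; a real system would need proper tokenization for Spanish and weights tuned on the samples.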