Abstract: The classification of scientific papers depends on the ability of an artificial system (here called
ARSA, i.e. Automated Review of Scientific Articles) to reflect both the similarity of different scientific papers and
the differences between similar papers. To identify a text as similar to or different from other texts, a set of
characteristics must be used. This paper considers an approach to extracting “linguistic items” from a scientific
paper that provide representative information about the document content.
Keywords: text mining, word’s properties, text pattern.
ACM Classification Keywords: I.2 Artificial intelligence: I.2.7 Natural Language Processing: Text analysis.
Link:
NUMERIC-LINGUAL DISTINGUISHING FEATURES OF SCIENTIFIC DOCUMENTS
Vladimir Lovitskii, Ina Markova, Krassimir Markov, Ilia Mitov
http://foibg.com/ibs_isc/ibs-16/ibs-16-p11.pdf