Abstract: This paper proposes a logical-linguistic model for extracting semi-structured facts from
English texts. To identify a fact, the entities expressed by lexical units and the semantic relations
between them are determined in the text. The semantic relations are expressed by the semantic functions
of sentence participants. A fact is written as a triplet, Subject - Predicate - Object, in which the
Predicate represents the relation and the Subject and Object denote subjects, objects, or concepts. Two
types of facts are defined: the first describes a relation between two entities; the second fixes the
value of a predetermined attribute. The semantic functions are described by predicates of the algebra of
finite predicates. The mathematical model associates the meaning relations between the concepts of a
sentence with elements of the syntactic and morphological structure of the English sentence. The model
is applied at the semantic stage of the linguistic processor of an information subsystem that identifies
facts essential for business analysis in semi-structured texts. A software implementation of the model
has been designed. As input, the subsystem receives text streams from the disparate information sources
of an integrated corporate system; as output, it produces the basic facts of the system space. The
accuracy and completeness of the facts extracted from English texts by the subsystem are comparable to
those of facts extracted by an expert.
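The triplet representation and the two fact types described above can be illustrated with a minimal sketch. It is only an assumed data layout, not the authors' implementation: the names `Fact`, `FactType`, `make_fact`, and `PREDETERMINED_ATTRIBUTES`, as well as the sample entities and attribute names, are hypothetical, and the classification rule is a plain lookup standing in for the predicates of the algebra of finite predicates used in the paper.

```python
from dataclasses import dataclass
from enum import Enum


class FactType(Enum):
    RELATION = "relation"    # fact describing a relation between two entities
    ATTRIBUTE = "attribute"  # fact fixing the value of a predetermined attribute


@dataclass(frozen=True)
class Fact:
    """A fact written as a Subject - Predicate - Object triplet."""
    subject: str
    predicate: str
    obj: str
    fact_type: FactType

    def as_triplet(self):
        return (self.subject, self.predicate, self.obj)


# Hypothetical attribute names; the paper does not list the actual ones.
PREDETERMINED_ATTRIBUTES = frozenset({"revenue", "headquarters"})


def make_fact(subject, predicate, obj):
    """Build a fact, typing it by a simple lookup (an assumption, not the paper's rule)."""
    if predicate in PREDETERMINED_ATTRIBUTES:
        fact_type = FactType.ATTRIBUTE
    else:
        fact_type = FactType.RELATION
    return Fact(subject, predicate, obj, fact_type)


# Invented sample data, used only to show the two fact types.
relation_fact = make_fact("Acme", "acquires", "Globex")
attribute_fact = make_fact("Acme", "revenue", "5 million USD")
```

Here `relation_fact` links two entities, while `attribute_fact` records the value of an attribute fixed in advance. In the model itself, the decision would come from the semantic functions of the sentence participants rather than from a lookup table.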
Keywords: fact generation system, automatic fact extraction, semantic relations, algebra of finite
predicates, natural language processing.
ACM Classification Keywords: H.3.3 Information Search and Retrieval; I.2.4 Knowledge
Representation Formalisms and Methods
Link:
LOGIC-LINGUISTIC MODEL OF FACT GENERATION FROM TEXT STREAMS OF A CORPORATE
INFORMATION SYSTEM
Nina Khairova, Natalia Sharonova, Ajit Pratap Singh Gautam
http://www.foibg.com/ijita/vol22/ijita22-02-p04.pdf