Abstract: RICHE (Research Inventory of Child Health in Europe) is a platform developed and funded under the Health
domain of the 7th European Framework Programme. The platform's search engine is expected to use a multilingual taxonomy of
terms for processing and classifying the large volumes of documents in the RICHE repository. So far, the experts participating in
this project have produced an initial version of an expert-based taxonomy of terms relating to child health (based on existing
taxonomies). In this paper we propose a simple man-machine technique for continuous support and development of the
existing term list. The technique consists of three steps: 1) construction of several keyword lists extracted from a topic-oriented
document set using different levels of word specificity; 2) selection of the most useful keyword lists using subjective criteria such as
the precision of selection and the number of new words; 3) manual selection of new terms. The experimental material
consisted of documents from three organizations active in child health improvement policies: the World Bank, the World
Health Organization (WHO), and DG SANCO of the European Commission (EC). The selection was performed in order to
assess which terms used in these documents may be absent from the RICHE taxonomy. The presented work should be considered
as a pilot (feasibility) study. The objective of the RICHE platform is to identify gaps in European child health research, so
an extensive mapping exercise has been started. Classification of the identified studies is essential and cannot be based only on
the traditional terms of existing taxonomies. Emergent terms (such as “cyberbullying”) need to be identified and
included in existing taxonomies. In our future work we will focus on developing techniques for multilevel and multiword
term selection.
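
Illustrative sketch (not part of the original paper): the three steps described in the abstract could be outlined roughly as follows. The abstract does not define word specificity or precision, so this sketch assumes that specificity is the ratio of a word's relative frequency in the topic document set to its relative frequency in a general reference list, and that precision is the share of new words in a list that an expert accepts. All function names, the `floor` parameter, and the thresholds are hypothetical.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def specificity(corpus_tokens, reference_freq, floor=1e-6):
    """Assumed definition: relative frequency in the topic corpus divided by
    relative frequency in a general reference list. Words missing from the
    reference list get the (hypothetical) floor frequency."""
    counts = Counter(corpus_tokens)
    total = sum(counts.values())
    return {
        word: (count / total) / max(reference_freq.get(word, 0.0), floor)
        for word, count in counts.items()
    }


def keyword_lists(spec, thresholds):
    """Step 1: build one keyword list per specificity threshold."""
    return {k: {w for w, s in spec.items() if s >= k} for k in thresholds}


def score_list(keywords, taxonomy, expert_accepted):
    """Step 2: count words absent from the taxonomy and estimate precision
    as the share of those new words that the expert accepted (assumption)."""
    new_words = keywords - taxonomy
    if not new_words:
        return 0, 0.0
    precision = len(new_words & expert_accepted) / len(new_words)
    return len(new_words), precision
```

Step 3, the manual selection of new terms, stays with the expert: in this sketch it is represented only by the `expert_accepted` set passed to `score_list`, which an expert would supply after reviewing a sample of each list.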
Keywords: Child health, natural language processing, taxonomy, term selection
ACM Classification Keywords: I.2.7 Natural Language Processing
Link:
FOLKSONOMY - SUPPLEMENTING RICHE EXPERT BASED TAXONOMY BY TERMS FROM ONLINE DOCUMENTS (Pilot Study)
Aleš Bourek, Mikhail Alexandrov, Roque Lopez
http://foibg.com/ibs_isc/ibs-23/ibs-23-p11.pdf