Abstract: This paper describes the rationale and partial results of an ongoing experiment that aims to analyse
and visualize the social and organizational structure of the Polish scientific community. The work described in
this paper concerns the automated extraction of information (related to DSc theses) from Internet-based sources and
the visualization of the resulting data set. The primary database mined was the scientific information repository
“Polish Science” (“Nauka Polska” in Polish), maintained by the Information Processing Centre (OPI – “Ośrodek
Przetwarzania Informacji” in Polish). The nature of this repository, specifically the lack of any data export facility
and the web scraping prevention measures it implements (even though the database is supported by public funds),
complicated the data extraction process. However, once downloaded, verified and processed, the data set
proved to be very interesting and valuable, making statistical analysis and geospatial visualization possible.
Specifically, this paper describes the creation of a software tool that generates geographical maps depicting
collaboration between research institutions in Poland during the review of DSc theses. It should
be regarded as a report on work in progress. It is part of the larger SYNAT project, a government-funded
initiative to create an ICT infrastructure supporting scientific collaboration and the exchange of research data in
Poland. The results presented herein constitute just a proof of concept for a visualization approach that will be
implemented once more detailed data concerning the Polish scientific community have been collected during other
SYNAT activities.
Keywords: bibliometrics, information visualization, social network analysis, web mining
ACM Classification Keywords: J.4 SOCIAL AND BEHAVIORAL SCIENCES
Link:
ANALYSING AND VISUALIZING POLISH SCIENTIFIC COMMUNITY
Piotr Gawrysiak, Dominik Ryżko
http://foibg.com/ibs_isc/ibs-23/ibs-23-p05.pdf