Abstract: This paper describes a clustering method for labeled-link networks (semantic graphs) that groups important nodes (highly connected nodes) together with their relevant link labels, using a technique from multilinear algebra known as PARAFAC tensor decomposition. In this kind of network, the adjacency matrix cannot fully describe the network structure. We therefore extend the matrix to a 3-way adjacency tensor, which records not only which nodes a node connects to but also the link labels through which it connects. Applying PARAFAC decomposition yields, for each decomposition group, two lists, one of nodes and one of link labels, each with scores attached. Clustering to obtain the important nodes along with their relevant labels then reduces to sorting the lists in decreasing order of score. To test the method, we construct a labeled-link network from a blog dataset, where the blogs are the nodes and the labeled links are the words they share. The similarity between the results and standard measures looks promising, at about 0.87 for the two most important tasks: finding the words most relevant to a blog query and finding the blogs most similar to a blog query.
Keywords: Blogs, Clustering Method, Labeled-link Network, PARAFAC Decomposition.
ACM Classification Keywords: I.7.1 Document management
Link:
WEBLOG CLUSTERING IN MULTILINEAR ALGEBRA PERSPECTIVE
Andri Mirzal
http://foibg.com/ijita/vol16/IJITA16-4-p03.pdf
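
The pipeline outlined in the abstract (3-way adjacency tensor, PARAFAC decomposition, sorting by score) can be sketched as follows. This is a minimal illustration and not the paper's implementation: the five-blog toy dataset, the helper names (unfold, khatri_rao, parafac_als, top_items), the SVD-based initialisation and the plain alternating-least-squares solver are all assumptions made for this example. Self-links are kept and the blogs in each group use exactly the same words, so that the toy tensor has exact rank 2.

```python
import numpy as np

# Hypothetical toy data: each blog is a node, each shared word a link label.
blogs = {
    "blog_a": {"python", "numpy", "tensor"},
    "blog_b": {"python", "numpy", "tensor"},
    "blog_c": {"python", "numpy", "tensor"},
    "blog_d": {"recipe", "baking"},
    "blog_e": {"recipe", "baking"},
}
blog_names = sorted(blogs)
word_names = sorted(set().union(*blogs.values()))

# 3-way adjacency tensor: x[i, j, k] = 1 if blogs i and j both use word k.
# Self-links (i == j) are kept, a simplification chosen for this toy example.
x = np.zeros((len(blog_names), len(blog_names), len(word_names)))
for i, bi in enumerate(blog_names):
    for j, bj in enumerate(blog_names):
        for k, word in enumerate(word_names):
            if word in blogs[bi] and word in blogs[bj]:
                x[i, j, k] = 1.0


def unfold(tensor, mode):
    """Matricize a tensor so that axis `mode` indexes the rows."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)


def khatri_rao(a, b):
    """Column-wise Kronecker product; row order matches `unfold`."""
    return (a[:, None, :] * b[None, :, :]).reshape(-1, a.shape[1])


def parafac_als(tensor, rank, n_iter=50):
    """Rank-`rank` PARAFAC of a 3-way tensor by alternating least squares."""
    # Assumed initialisation: leading left singular vectors of each unfolding.
    factors = [
        np.linalg.svd(unfold(tensor, m), full_matrices=False)[0][:, :rank]
        for m in range(3)
    ]
    for _ in range(n_iter):
        for mode in range(3):
            first, second = [factors[m] for m in range(3) if m != mode]
            gram = (first.T @ first) * (second.T @ second)
            factors[mode] = (
                unfold(tensor, mode) @ khatri_rao(first, second) @ np.linalg.pinv(gram)
            )
    return factors


def top_items(names, scores, k):
    """Names sorted by decreasing absolute score (a component's sign is arbitrary)."""
    order = np.argsort(-np.abs(scores))[:k]
    return [names[i] for i in order]


A, B, C = parafac_als(x, rank=2)
reconstruction = np.einsum("ir,jr,kr->ijk", A, B, C)
rel_err = np.linalg.norm(x - reconstruction) / np.linalg.norm(x)

for r in range(2):
    print(f"group {r}: blogs={top_items(blog_names, A[:, r], 3)}, "
          f"words={top_items(word_names, C[:, r], 3)}")
print("relative reconstruction error:", rel_err)
```

Each column of A (nodes) and C (link labels) plays the role of one decomposition group, and sorting a column by score gives the ranked lists the abstract refers to. On real blog data the tensor would be sparse and only approximately low rank, so a dedicated tensor library and a larger rank would likely be more appropriate than this dense sketch.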