Publication date:
2017
Abstract:
The maturity of structured knowledge bases and semantic resources has contributed to the enhancement of document clustering algorithms, which may take advantage of conceptual representations as an alternative to classic bag-of-words models. However, operating in the semantic space is not always the best choice in those domains where the choice of terms also matters. Moreover, users are usually required to provide a valid number of clusters as input, but this parameter is often hard to guess, due to the exploratory nature of the clustering process. To address these limitations, we propose a multi-view co-clustering approach that simultaneously processes the classic document-term matrix and an enhanced document-concept representation of the same collection of documents. Our algorithm has multiple key features: it finds an arbitrary number of clusters and provides clusters of terms and concepts as easy-to-interpret summaries. We show the effectiveness of our approach in an extensive experimental study involving several corpora with different levels of complexity.
CRIS type:
04A-Conference paper in volume
Keywords:
co-clustering, semantic enrichment, multi-view clustering
Authors:
Rho, Valentina; Pensa, Ruggero G.
Book title:
Foundations of Intelligent Systems. ISMIS 2017.
Published in: