Publication date:
2008
Abstract:
In this paper we investigate whether conceptual taxonomies can be constructed automatically and evaluate the results that can be achieved. The hierarchy is built by Ward's algorithm, guided by Goodman-Kruskal τ as the proximity measure. We then provide a concise description of each cluster through a representative keyword selected by PageRank.
The resulting hierarchy offers the same advantages, both descriptive and operational, as keyword indices that partition a set of documents according to their content.
We performed experiments on a real case: the abstracts of the papers published in ACM TODS, which have been manually classified into the ACM Computing Taxonomy (CT). We objectively evaluated the generated hierarchy (KH) by two methods, the Jaccard measure and entropy, and obtained good results with both. Finally, we evaluated how easily documents can be classified into the categories of the two taxonomies, showing that classification is easier with KH than with CT.
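The proximity measure and the two evaluation measures named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the encoding of the inputs as plain Python sequences, and the use of base-2 logarithms for entropy are all assumptions made for illustration. The first function computes the Goodman-Kruskal τ of one categorical variable with respect to another, the second the Jaccard measure between two document sets, and the third the entropy of the reference categories found inside one cluster.

```python
import math
from collections import Counter


def goodman_kruskal_tau(x, y):
    """Goodman-Kruskal tau: proportional reduction in the error of
    predicting y when x is known (asymmetric, in [0, 1])."""
    n = len(x)
    if n == 0 or n != len(y):
        raise ValueError("x and y must be non-empty and of equal length")
    joint = Counter(zip(x, y))
    x_counts = Counter(x)
    y_counts = Counter(y)
    # Error in predicting y from its marginal distribution alone.
    err_y = 1.0 - sum((c / n) ** 2 for c in y_counts.values())
    if err_y == 0.0:
        return 0.0  # y is constant, so there is no error to reduce
    # Error in predicting y from its distribution conditional on x.
    err_y_given_x = 1.0 - sum(
        c * c / (x_counts[xv] * n) for (xv, _), c in joint.items()
    )
    return (err_y - err_y_given_x) / err_y


def jaccard(a, b):
    """Jaccard measure between two collections treated as sets."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_entropy(labels):
    """Entropy (in bits) of the reference labels inside one cluster.
    Zero means that every document in the cluster shares one label."""
    n = len(labels)
    counts = Counter(labels)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())
```

Under these assumptions, a generated cluster could be compared with a manually assigned category by calling `jaccard` on the two sets of document identifiers, while `cluster_entropy` would take the list of manual category labels of the documents in that cluster.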
CRIS type:
04A-Conference paper in volume
Keywords:
knowledge discovery; taxonomy; page ranks; proximity measure
Authors:
Meo, Rosa; Ienco, Dino
Book title:
DATA WAREHOUSING AND KNOWLEDGE DISCOVERY
Published in: