Please use this identifier to cite or link to this item: http://acervodigital.unesp.br/handle/11449/73556
Title: 
Improving hierarchical document cluster labels through candidate term selection
Author(s): 
Institution: 
  • Universidade de São Paulo (USP)
  • Universidade Estadual Paulista (UNESP)
ISSN: 
  • 1872-4981
  • 1875-8843
Abstract: 
One way to organize knowledge and make its search and retrieval easier is to create a structural representation divided by hierarchically related topics. Once this structure is built, it is necessary to find labels for each of the obtained clusters. In many cases the labels must be built using all the terms in the documents of the collection. This paper presents the SeCLAR method, which explores the use of association rules in the selection of good candidates for labels of hierarchical document clusters. The purpose of this method is to select a subset of terms by exploring the relationship among the terms of each document. Thus, these candidates can be processed by a classical method to generate the labels. An experimental study demonstrates the potential of the proposed approach to improve the precision and recall of labels obtained by classical methods by considering only the terms that are potentially more discriminative. © 2012 - IOS Press and the authors. All rights reserved.
Issue Date: 
3-Sep-2012
Citation: 
Intelligent Decision Technologies, v. 6, n. 1, p. 43-58, 2012.
Pages: 
43-58
Keywords: 
  • association rules
  • Labeling hierarchical clustering
  • text mining
  • Classical methods
  • Experimental studies
  • Hierarchical clustering
  • Hierarchical document
  • Precision and recall
  • Search and retrieval
  • Structural representation
  • Data mining
Source: 
http://dx.doi.org/10.3233/IDT-2012-0121
URI: 
Access Rights: 
Restricted access
Type: 
Other
Source:
http://repositorio.unesp.br/handle/11449/73556
Appears in Collections: Artigos, TCCs, Teses e Dissertações da Unesp

There are no files associated with this item.
 
