Keyword-based browsing and analysis of large document sets

I. Dagan, Ronen Feldman, Haym Hirsh

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review


Knowledge Discovery in Databases (KDD) focuses on the computerized exploration of large amounts of data and on the discovery of interesting patterns within them. While most work on KDD has been concerned with structured databases, there has been little work on handling the huge amount of information that is available only in unstructured textual form. This paper describes the KDT system for Knowledge Discovery in Texts. It is built on top of a text-categorization paradigm where text articles are annotated with keywords organized in a hierarchical structure. Knowledge discovery is performed by analyzing the co-occurrence frequencies of keywords from this hierarchy in the various documents. We show how this term-frequency approach supports a range of KDD operations, providing a general framework for knowledge discovery and exploration in collections of unstructured text.
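The co-occurrence analysis described in the abstract can be illustrated with a minimal sketch. It is not the KDT implementation: the keyword hierarchy, the document annotations, and the function names `cooccurrence_counts` and `distribution_over_category` are all hypothetical, chosen only to show how keyword-annotated documents might be counted and then compared across one branch of a hierarchy.

```python
from collections import Counter
from itertools import combinations

# Hypothetical keyword hierarchy: child keyword -> parent category.
HIERARCHY = {
    "gold": "commodities",
    "wheat": "commodities",
    "usa": "countries",
    "canada": "countries",
}

# Hypothetical annotations: document id -> set of keywords assigned to it.
DOCUMENTS = {
    "doc1": {"gold", "usa"},
    "doc2": {"gold", "canada"},
    "doc3": {"wheat", "usa"},
    "doc4": {"gold", "usa"},
}


def cooccurrence_counts(documents):
    """Count, for each unordered keyword pair, the documents containing both."""
    counts = Counter()
    for keywords in documents.values():
        # Sorting gives every pair a canonical (alphabetical) order.
        for pair in combinations(sorted(keywords), 2):
            counts[pair] += 1
    return counts


def distribution_over_category(documents, hierarchy, keyword, category):
    """Relative frequency of each child of `category` among the documents
    annotated with `keyword`, normalized over all such co-occurrences."""
    counts = Counter()
    for keywords in documents.values():
        if keyword in keywords:
            for other in keywords:
                if hierarchy.get(other) == category:
                    counts[other] += 1
    total = sum(counts.values())
    return {k: v / total for k, v in counts.items()} if total else {}


pair_counts = cooccurrence_counts(DOCUMENTS)
gold_by_country = distribution_over_category(
    DOCUMENTS, HIERARCHY, "gold", "countries"
)
```

In this toy data, `pair_counts[("gold", "usa")]` counts the documents annotated with both keywords, and `gold_by_country` gives the share of each country keyword among documents about gold. Distributions of this kind are one plausible starting point for the browsing and comparison operations the abstract mentions.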
Original language: American English
Title of host publication: Symposium on Document Analysis and Information Retrieval (SDAIR-96)
State: Published - 1996

Bibliographical note

Place of conference: Las Vegas, Nevada, USA

