Abstract
Knowledge Discovery in Databases (KDD)
focuses on the computerized exploration of
large amounts of data and on the discovery of
interesting patterns within them. While most
work on KDD has been concerned with
structured databases, there has been little
work on handling the huge amount of
information that is available only in
unstructured textual form. This paper
describes the KDT system for Knowledge
Discovery in Texts. It is built on top of a
text-categorization paradigm where text
articles are annotated with keywords
organized in a hierarchical structure.
Knowledge discovery is performed by
analyzing the co-occurrence frequencies of
keywords from this hierarchy in the various
documents. We show how this term-frequency
approach supports a range of
KDD operations, providing a general
framework for knowledge discovery and
exploration in collections of unstructured
text.
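
As a rough illustration of the co-occurrence analysis described above, the following minimal sketch counts how often pairs of keywords annotate the same document and computes the share of documents carrying a given keyword that also carry each keyword of one hierarchy category. It is not the KDT system's implementation: the hierarchy, the documents, and the function names `cooccurrence_counts` and `conditional_distribution` are hypothetical and exist only for this example.

```python
from collections import Counter
from itertools import combinations

# Hypothetical keyword hierarchy (child keyword -> parent category).
# These entries are invented for illustration, not taken from the paper.
HIERARCHY = {
    "iran": "countries",
    "usa": "countries",
    "japan": "countries",
    "crude": "commodities",
    "gold": "commodities",
}

# Hypothetical documents, each reduced to the set of keywords annotating it.
DOCS = [
    {"iran", "crude"},
    {"usa", "crude"},
    {"usa", "gold"},
    {"iran", "usa", "crude"},
    {"japan"},
]


def cooccurrence_counts(docs):
    """Count, for each unordered keyword pair, the documents containing both."""
    counts = Counter()
    for keywords in docs:
        # Sorting gives every pair one canonical (alphabetical) order.
        for a, b in combinations(sorted(keywords), 2):
            counts[(a, b)] += 1
    return counts


def conditional_distribution(docs, hierarchy, category, given):
    """For each keyword under `category`, return the fraction of the
    documents annotated with `given` that also carry that keyword."""
    selected = [d for d in docs if given in d]
    if not selected:
        return {}
    members = [k for k, parent in hierarchy.items() if parent == category]
    return {k: sum(k in d for d in selected) / len(selected) for k in members}


pair_counts = cooccurrence_counts(DOCS)
country_share = conditional_distribution(DOCS, HIERARCHY, "countries", "crude")
```

Here `pair_counts[("crude", "iran")]` is the number of documents annotated with both keywords, and `country_share` maps each country keyword to its share among the documents annotated with `crude`. Grouping keywords by their parent category is what lets such a query be posed at the level of the hierarchy rather than one keyword at a time.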
Original language | American English |
---|---|
Title of host publication | Symposium on Document Analysis and Information Retrieval (SDAIR-96) |
State | Published - 1996 |