Abstract
TextVis is a visual data mining system for document collections. Such a collection represents an application domain, and the primary goal of the system is to derive patterns that provide knowledge about this domain. Additionally, the derived patterns can be used to browse the collection. TextVis takes a multi-strategy approach to text mining, and enables defining complex analysis schemas from basic components, provided by the system. An analysis schema is constructed by dragging functional icons from a tool-pallette onto the workspace and connecting them according to the desired flow of information. The system provides a large collection of basic analysis tools, including: frequent sets, associations, concept distributions, and concept correlations. The discovered patterns are presented in a visual interface allowing the user to operate on the results, and to access the associated documents. TextVis is a complete text mining system which uses agent technology to access various online information sources, text preprocessing tools to extract relevant information from the documents, a variety of data mining algorithms, and a set of visual browsers to view the results. This paper provides an overview on the TextVis system. We describe the system's architecture, the various tools, and discuss the advantages of our visual environment for mining large document collections.
Original language | American English |
---|---|
Title of host publication | Principles of Data Mining and Knowledge Discovery, Second European Symposium, PKDD '98 |
Publisher | Springer Berlin Heidelberg |
State | Published - 1998 |