Abstract
Knowledge inference from semi-structured data can utilize frequent substructures, in addition to the frequency of data items. In fact, the working assumption of the present study is that frequent sub-trees of XML data represent sets of tags (objects) that are meaningfully associated. A method for extracting frequent sub-trees from XML data is presented. It uses thresholds on the frequencies of paths and on the multiplicity of paths in the data. The frequent sub-trees are extracted and counted in a procedure that has O(n²) complexity. The data content of the extracted sub-trees, in the form of attribute values, is cast in tabular form. This enables a search for associations in the extracted data. Thus, the complete procedure uses both structure and content to extract association rules from semi-structured data. A large industrial example is used to demonstrate the operation of the proposed method.
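The pipeline outlined in the abstract (threshold on path frequency, then cast attribute values into a table) can be illustrated with a minimal Python sketch. This is not the authors' algorithm: it works on root-to-node tag paths rather than full sub-trees, it ignores the multiplicity threshold, and the function names `tag_paths`, `frequent_paths`, and `attribute_rows`, the `min_support` parameter, and the sample documents are all hypothetical, chosen only for illustration.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def tag_paths(xml_text):
    """Return the set of distinct root-to-node tag paths in one XML document."""
    root = ET.fromstring(xml_text)
    paths = set()
    stack = [(root, root.tag)]
    while stack:
        node, path = stack.pop()
        paths.add(path)
        for child in node:
            stack.append((child, path + "/" + child.tag))
    return paths


def frequent_paths(documents, min_support):
    """Keep paths found in at least a `min_support` fraction of the documents.

    Assumption: frequency is counted once per document, so repeated
    occurrences of a path inside one document are not weighted.
    """
    counts = Counter()
    for doc in documents:
        counts.update(tag_paths(doc))
    total = len(documents)
    return {path for path, count in counts.items() if count / total >= min_support}


def attribute_rows(documents, kept_paths):
    """Build one row per document, keyed by "path@attribute".

    If several nodes in one document share a path, a later value overwrites
    an earlier one; a fuller treatment would keep every occurrence.
    """
    rows = []
    for doc in documents:
        root = ET.fromstring(doc)
        row = {}
        stack = [(root, root.tag)]
        while stack:
            node, path = stack.pop()
            if path in kept_paths:
                for name, value in node.attrib.items():
                    row[path + "@" + name] = value
            for child in node:
                stack.append((child, path + "/" + child.tag))
        rows.append(row)
    return rows


# Hypothetical sample data, invented for this sketch.
DOCS = [
    '<order><item id="1"/><note/></order>',
    '<order><item id="2"/></order>',
    '<order><item id="3"/><ship to="x"/></order>',
]
KEPT = frequent_paths(DOCS, 0.6)
TABLE = attribute_rows(DOCS, KEPT)
```

With these sample documents and a support threshold of 0.6, only the paths `order` and `order/item` are kept, and each row of `TABLE` holds the `id` attribute of the `item` element. A table of this shape is the kind of input that a standard association-rule search could then consume.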
Original language | English |
---|---|
Title of host publication | WISE 2002 - Proceedings of the 3rd International Conference on Web Information Systems Engineering Workshops |
Editors | Bo Huang, Tok Wang Ling, Ji-Rong Wen, S.K. Gupta, Wee Keong Ng, Mukesh Mohania |
Publisher | Institute of Electrical and Electronics Engineers Inc. |
Pages | 178-183 |
Number of pages | 6 |
ISBN (Electronic) | 0769518133, 9780769518138 |
State | Published - 2002 |
Externally published | Yes |
Event | 3rd International Conference on Web Information Systems Engineering Workshops, WISE 2002 - Singapore, Singapore. Duration: 11 Dec 2002 → … |
Conference
Conference | 3rd International Conference on Web Information Systems Engineering Workshops, WISE 2002 |
---|---|
Country/Territory | Singapore |
City | Singapore |
Period | 11/12/02 → … |
Bibliographical note
Publisher Copyright: © 2002 IEEE.