Abstract
Decision forests are known to excel on tabular data once their hyperparameters are well tuned. In addition to being accurate and robust classifiers, these models can easily be converted into a collection of if-else rules that clearly describe the model's decision-making process. In practice, however, decision trees may reach enormous depth and size, so the resulting collection of rules is vast and complicated. The model may also consume a significant amount of memory on the machine. Compact models yield a modest set of rules that is easier to understand and requires less memory, and they may also improve decision-making ability and reduce overfitting. Previous studies attempted to reduce the size of the trees by altering their structure, but this undermines both their advantages and their simplicity, because the resulting structure (e.g., oblique trees) is much more complicated. In this research, we present FACET, a novel algorithm for keeping trees compact while treating the model as a black box. FACET achieves this with automated feature engineering methods, which generate a new feature set by manipulating the existing features of the dataset, resulting in a drastic reduction in the size of the decision trees while preserving, and even improving, the model's accuracy. FACET has been extensively tested on multiple datasets, models, and operators to demonstrate its effectiveness. On average, FACET achieves a 24% reduction in the size criteria of the tree-based model without sacrificing accuracy. This reduction in size leads to an average memory reduction of 44% on the dataset required for learning. These statistically significant results demonstrate the potential of FACET to enable more efficient and interpretable tree-based models in practical applications without compromising their accuracy.
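The intuition that an engineered feature can shrink a tree may be illustrated with a minimal sketch. This is not the FACET algorithm: the tiny Gini-based tree grower, the synthetic data, and the choice of subtraction as the operator are all assumptions made for illustration. The label depends on whether one feature exceeds the other, so an axis-aligned tree needs many splits on the raw features, while a single split suffices once the difference is added as a feature.

```python
import random

def gini(labels):
    """Gini impurity of a sequence of binary (0/1) labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n
    return 2.0 * p * (1.0 - p)

def best_split(rows, labels):
    """Return the (feature, threshold) that minimizes weighted Gini, or None."""
    best, best_score = None, gini(labels) * len(labels)
    for f in range(len(rows[0])):
        values = sorted(set(r[f] for r in rows))
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2.0
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            score = gini(left) * len(left) + gini(right) * len(right)
            if score < best_score - 1e-12:
                best, best_score = (f, t), score
    return best

def count_nodes(rows, labels):
    """Grow an unpruned tree until every leaf is pure; return its node count."""
    if len(set(labels)) <= 1:
        return 1
    split = best_split(rows, labels)
    if split is None:
        return 1
    f, t = split
    left = [(r, y) for r, y in zip(rows, labels) if r[f] <= t]
    right = [(r, y) for r, y in zip(rows, labels) if r[f] > t]
    return 1 + count_nodes(*zip(*left)) + count_nodes(*zip(*right))

# Hypothetical synthetic data: the class is 1 exactly when x0 > x1.
random.seed(0)
raw_rows = [(random.random(), random.random()) for _ in range(200)]
labels = [int(a > b) for a, b in raw_rows]

# Assumed operator for illustration: append the difference x0 - x1.
engineered_rows = [(a, b, a - b) for a, b in raw_rows]

raw_nodes = count_nodes(raw_rows, labels)
compact_nodes = count_nodes(engineered_rows, labels)
print("nodes with raw features:       ", raw_nodes)
print("nodes with engineered feature: ", compact_nodes)
```

With the difference feature available, the tree reduces to one root split and two leaves, whereas the raw-feature tree has to approximate the diagonal boundary with a staircase of axis-aligned splits. Real datasets rarely admit such a clean operator, which is presumably why an automated search over operators and features is needed.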
Original language | English |
---|---|
Article number | 120470 |
Journal | Expert Systems with Applications |
Volume | 229 |
State | Published - 1 Nov 2023 |
Externally published | Yes |
Bibliographical note
Publisher Copyright: © 2023 Elsevier Ltd
Keywords
- Decision-tree
- Feature-engineering
- Machine-learning
- Space-complexity