Defect prediction is a technique introduced to optimize the testing phase of the software development pipeline by predicting which components in the software may contain defects. Its methodology trains a classifier with data regarding a set of features measured on each component from the target software project to predict whether the component may be defective or not. However, suppose the defective information is not available in the training set. In that case, we need to rely on an alternate approach that uses the training set of external projects to train the classifier. This approached is called cross-project defect prediction. Bad code smells are a category of features that have been previously explored in defect prediction and have been shown to be a good predictor of defects. Code smells are patterns of poor development in the code and indicate flaws in its design and implementation. Although they have been previously studied in the context of defect prediction, they have not been studied as features for cross-project defect prediction. In our experiment, we train defect prediction models for 100 projects to evaluate the predictive performance of the bad code smells. We implemented four cross-project approaches known in the literature and compared the performance of 37 smells with 56 code metrics, commonly used for defect prediction. The results show that the cross-project defect prediction models trained with code smells significantly improved 6.50 % on the ROC AUC compared against the code metrics.
|Number of pages||11|
|State||Published - Nov 2021|
Bibliographical notePublisher Copyright:
© 2021, The Author(s), under exclusive licence to Springer-Verlag GmbH Germany, part of Springer Nature.
- Code smell
- Cross-project defect prediction
- Defect prediction
- Mining software repositories
- Software engineering
- Software quality