An adaptive cost-sensitive learning approach in neural networks to minimize local training–test class distributions mismatch

Ohad Volk, Gonen Singer

Research output: Contribution to journal › Article › peer-review

1 Scopus citation

Abstract

We design an adaptive learning algorithm for binary classification problems whose objective is to reduce the cost of misclassified instances, where that cost is derived from the consequences of errors. Our algorithm, Adaptive Cost-Sensitive Learning (AdaCSL), adaptively adjusts the loss function to bridge the difference in class distributions between subgroups of samples in the training and validation data sets. This adjustment is made for samples with similar predicted probabilities, in such a way that the local cost decreases. By reducing this local training–test class distribution mismatch, the process usually leads to a lower cost on the test data set. We present empirical evidence that neural networks used with the proposed algorithm yield better cost results on several data sets compared to other approaches. In addition, the proposed AdaCSL algorithm can optimize evaluation metrics other than cost: we present an experiment in which using the AdaCSL algorithm yields superior accuracy. The AdaCSL algorithm can be used for applications in which the training set is noisy or in which large variability may occur between the training and validation data sets, such as the classification of disease severity for a given subject based on other subjects. Our code is available at https://github.com/OhadVolk/AdaCSL.
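
As a rough illustration of the idea described in the abstract, the Python sketch below groups samples by predicted probability, compares the positive-class rate of each group between the training and validation sets, and defines a class-weighted binary cross-entropy whose weights could be tuned in response. It is not the AdaCSL update rule, which the abstract does not specify. The function names (`local_positive_rates`, `local_mismatch`, `weighted_bce`), the equal-width binning, and the weight parameters are all assumptions made for illustration; the authors' implementation is in the repository linked above.

```python
import numpy as np


def local_positive_rates(y_true, p_pred, n_bins=5):
    """Positive-class rate among samples grouped by predicted probability.

    Assumption: "samples with similar predicted probabilities" is modelled
    here as equal-width bins over [0, 1]. Empty bins are reported as NaN.
    """
    y = np.asarray(y_true, dtype=float)
    p = np.asarray(p_pred, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    # Interior edges map every probability to a bin index in 0..n_bins-1.
    idx = np.digitize(p, edges[1:-1])
    rates = np.full(n_bins, np.nan)
    for b in range(n_bins):
        mask = idx == b
        if mask.any():
            rates[b] = y[mask].mean()
    return rates


def local_mismatch(y_train, p_train, y_val, p_val, n_bins=5):
    """Per-bin difference (validation minus training) in positive-class rate."""
    return (local_positive_rates(y_val, p_val, n_bins)
            - local_positive_rates(y_train, p_train, n_bins))


def weighted_bce(y_true, p_pred, fn_weight=1.0, fp_weight=1.0, eps=1e-12):
    """Binary cross-entropy with separate weights for the two error types.

    Assumption: fn_weight scales the loss on positive instances and
    fp_weight the loss on negative instances (hypothetical parameters).
    """
    y = np.asarray(y_true, dtype=float)
    p = np.clip(np.asarray(p_pred, dtype=float), eps, 1.0 - eps)
    losses = -(fn_weight * y * np.log(p)
               + fp_weight * (1.0 - y) * np.log(1.0 - p))
    return float(losses.mean())
```

A positive entry in the output of `local_mismatch` means that, among samples with similar predicted probabilities, positives are more frequent in the validation set than in the training set. An adaptive scheme could respond by raising `fn_weight` for the next training epoch, though how AdaCSL actually performs its adjustment is defined in the paper and code, not here.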

Original language: English
Article number: 200316
Journal: Intelligent Systems with Applications
Volume: 21
State: Published - Mar 2024

Bibliographical note

Publisher Copyright:
© 2023 The Author(s)

Keywords

  • Adaptive loss function
  • Cost-sensitive
  • Misclassification costs
  • Training-test mismatch
