RANdom SAmple Consensus (RANSAC) algorithm for material-informatics: application to photovoltaic solar cells

Omer Kaspi, Abraham Yosipof, Hanoch Senderowitz

Research output: Contribution to journal, Article, peer-review

22 Scopus citations

Abstract

An important aspect of chemoinformatics and material-informatics is the use of machine learning algorithms to build Quantitative Structure Activity Relationship (QSAR) models. The RANdom SAmple Consensus (RANSAC) algorithm is a predictive modeling tool widely used in the image processing field for removing noise from datasets. RANSAC could be used as a “one stop shop” algorithm for developing and validating QSAR models: it performs outlier removal, descriptor selection, model development, and prediction for test set samples using an applicability domain. For “future” predictions (i.e., for samples not included in the original test set), RANSAC provides a statistical estimate of the probability of obtaining reliable predictions, i.e., predictions within a pre-defined number of standard deviations from the true values. In this work we describe the first application of RANSAC in material informatics, focusing on the analysis of solar cells. We demonstrate that, for three datasets representing different metal oxide (MO) based solar cell libraries, RANSAC-derived models select descriptors previously shown to correlate with key photovoltaic (PV) properties and lead to good predictive statistics for these properties. These models were subsequently used to predict the properties of virtual solar cell libraries, highlighting interesting dependencies of PV properties on MO compositions.
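To make the outlier-removal idea in the abstract concrete, the following is a minimal sketch of generic RANSAC applied to a one-descriptor linear fit. It is not the authors' implementation: the function names (`fit_line`, `ransac_line`), the parameters (`n_iter`, `sample_size`, `threshold`, `seed`), and the synthetic data are all assumptions made for illustration, and the descriptor selection and applicability domain steps described above are not covered.

```python
import random
import statistics


def fit_line(points):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("degenerate sample: all x values are identical")
    slope = sum((x - mean_x) * (y - mean_y) for x, y in points) / sxx
    return slope, mean_y - slope * mean_x


def ransac_line(points, n_iter=200, sample_size=2, threshold=1.0, seed=0):
    """Generic RANSAC: keep the largest consensus set, then refit on it.

    All parameter names and defaults are illustrative assumptions.
    """
    rng = random.Random(seed)
    best_inliers = []
    for _ in range(n_iter):
        # 1. Fit a candidate model to a small random subset.
        sample = rng.sample(points, sample_size)
        try:
            slope, intercept = fit_line(sample)
        except ValueError:
            continue  # skip degenerate subsets
        # 2. Collect every point the candidate model explains well.
        inliers = [
            (x, y) for x, y in points
            if abs(y - (slope * x + intercept)) <= threshold
        ]
        # 3. Remember the largest consensus set seen so far.
        if len(inliers) > len(best_inliers):
            best_inliers = inliers
    if len(best_inliers) < 2:
        raise ValueError("no usable consensus set found")
    # 4. Refit on the consensus set; everything else is flagged as an outlier.
    model = fit_line(best_inliers)
    outliers = [p for p in points if p not in best_inliers]
    return model, best_inliers, outliers


# Synthetic data (hypothetical): y = 2x + 1 with small alternating noise,
# plus three deliberately corrupted samples.
points = [
    (float(x), 2.0 * x + 1.0 + (0.1 if x % 2 == 0 else -0.1))
    for x in range(20)
]
for x, y in [(3, 40.0), (10, -15.0), (16, 80.0)]:
    points[x] = (float(x), y)

model, inliers, outliers = ransac_line(points)
print("slope, intercept:", model)
print("flagged outliers:", sorted(outliers))
```

In this sketch the residual `threshold` plays the role of the pre-defined tolerance mentioned in the abstract. A real QSAR workflow would use multi-descriptor models and a tolerance expressed in standard deviations of the modeled property.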

Original language: English
Article number: 34
Journal: Journal of Cheminformatics
Volume: 9
Issue number: 1
State: Published - 6 Jun 2017

Bibliographical note

Publisher Copyright:
© 2017 The Author(s).

Funding

The authors acknowledge financial support from the Israeli National Nanotechnology Initiative (INNI, FTA project).

Funders:
Israeli National Nanotechnology Initiative

    Keywords

    • Material-informatics
    • Photovoltaics
    • QSAR
    • RANSAC
    • Solar Cells
