Big Data of Materials Science -- Critical Role of the Descriptor

ORAL

Abstract

Statistical learning of materials properties or functions has so far started with a largely silent, unchallenged step: the introduction of a multidimensional descriptor. However, when the scientific relationship between the descriptor and the actuating mechanisms is unclear, the causality of the trained (learned) descriptor-property relation is uncertain. Consequently, scientific advancement, trustworthy prediction of promising new materials, and identification of anomalies are all doubtful. We discuss and analyse this issue and define requirements for a descriptor that is suited to statistical learning of materials properties and functions. We show how a meaningful descriptor can be found systematically by means of compressed-sensing techniques. These concepts are demonstrated for examples in materials science: predicting the relative stability of zincblende/wurtzite versus rocksalt octet binary semiconductors, and predicting their band gaps, using simple atomic input to build the descriptor.
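The idea of selecting a low-dimensional descriptor from a pool of candidate features can be illustrated with a toy sketch. It is not the authors' procedure: it uses a brute-force search over feature pairs (an l0-type selection, the sparse-recovery problem that compressed-sensing methods such as LASSO approximate when the pool is too large to enumerate). The atomic inputs `a` and `b`, the candidate feature list, the synthetic target, and the helper names `fit_pair` and `best_two_feature_descriptor` are all assumptions made for illustration.

```python
from itertools import combinations

# Hypothetical atomic inputs (a, b) for eight made-up materials; not real data.
samples = [(1.0, 2.0), (2.5, 0.5), (0.8, 1.9), (3.0, 1.2),
           (1.5, 1.4), (0.6, 2.8), (2.2, 2.9), (1.1, 0.3)]

# Synthetic target property built from two hidden features (an assumption).
y = [2.0 * abs(a - b) - 1.5 * a * b for a, b in samples]

# Candidate features formed from the atomic inputs.
candidates = {
    "a": [a for a, b in samples],
    "b": [b for a, b in samples],
    "a-b": [a - b for a, b in samples],
    "|a-b|": [abs(a - b) for a, b in samples],
    "a*b": [a * b for a, b in samples],
    "a^2": [a * a for a, b in samples],
    "b^2": [b * b for a, b in samples],
}


def fit_pair(f1, f2, y):
    """Least-squares fit y ~ c1*f1 + c2*f2 via the 2x2 normal equations."""
    s11 = sum(u * u for u in f1)
    s12 = sum(u * v for u, v in zip(f1, f2))
    s22 = sum(v * v for v in f2)
    t1 = sum(u * t for u, t in zip(f1, y))
    t2 = sum(v * t for v, t in zip(f2, y))
    det = s11 * s22 - s12 * s12
    if abs(det) < 1e-12:
        return None  # collinear pair, no unique fit
    c1 = (t1 * s22 - t2 * s12) / det
    c2 = (t2 * s11 - t1 * s12) / det
    rss = sum((t - c1 * u - c2 * v) ** 2 for u, v, t in zip(f1, f2, y))
    return (c1, c2), rss


def best_two_feature_descriptor(candidates, y):
    """Try every pair of candidate features and keep the lowest residual."""
    best = None
    for n1, n2 in combinations(candidates, 2):
        result = fit_pair(candidates[n1], candidates[n2], y)
        if result is None:
            continue
        coeffs, rss = result
        if best is None or rss < best[2]:
            best = ((n1, n2), coeffs, rss)
    return best


pair, coeffs, rss = best_two_feature_descriptor(candidates, y)
print(pair, coeffs, rss)
```

Because the synthetic target is an exact combination of two of the candidates, the search should recover that pair with a residual at the level of floating-point noise. With real materials data the pool would be far larger and noisy, which is where a convex relaxation such as LASSO becomes the practical choice.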

Authors

  • Luca M. Ghiringhelli

    • Fritz-Haber-Institut der MPG, Berlin, DE
  • Jan Vybiral

    • Charles University, Prague, CZ
  • Sergey V. Levchenko

    • Fritz-Haber-Institut der MPG, Berlin, DE
  • Claudia Draxl

    • Humboldt-Universität zu Berlin, DE
  • Matthias Scheffler

    • Fritz-Haber-Institut der MPG, Berlin, DE