Big Data of Materials Science -- Critical Role of the Descriptor
ORAL
Abstract
Statistical learning of materials properties or functions has so far started with a largely silent, unchallenged step: the introduction of a multidimensional descriptor. However, when the scientific relationship between the descriptor and the actuating mechanisms is unclear, the causality of the learned descriptor-property relation is uncertain. Scientific advancement, trustworthy prediction of promising new materials, and identification of anomalies are then doubtful. We discuss and analyse this issue and define requirements for a descriptor that is suited for statistical learning of materials properties and functions. We show how a meaningful descriptor can be found systematically by means of compressed sensing techniques. These concepts are demonstrated for examples from materials science: prediction of the relative stability of zincblende/wurtzite versus rocksalt octet binary semiconductors, and prediction of their band gaps, using simple atomic input to build the descriptor.
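As a rough illustration of how a sparse descriptor might be selected from many candidate features, the sketch below runs an l1-regularised least-squares fit (LASSO), a technique commonly used in compressed sensing. The abstract does not state which algorithm is used, so this choice is an assumption. The data are synthetic, and the function name `lasso_coordinate_descent`, the regularisation strength `lam`, and the two "relevant" feature indices are hypothetical; no real materials data or results are reproduced.

```python
import numpy as np


def lasso_coordinate_descent(X, y, lam, n_iter=500):
    """Minimise (1/2n)*||y - Xw||^2 + lam*||w||_1 by cyclic coordinate descent."""
    n, p = X.shape
    w = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n  # per-feature mean squared value
    residual = y - X @ w
    for _ in range(n_iter):
        for j in range(p):
            # Remove feature j's current contribution from the residual.
            residual += X[:, j] * w[j]
            rho = X[:, j] @ residual / n
            # Soft-thresholding sets weakly correlated features exactly to zero.
            w[j] = np.sign(rho) * max(abs(rho) - lam, 0.0) / col_sq[j]
            residual -= X[:, j] * w[j]
    return w


# Synthetic stand-in for "candidate features built from atomic input".
rng = np.random.default_rng(0)
n_materials, n_candidates = 80, 30
X = rng.normal(size=(n_materials, n_candidates))
X = (X - X.mean(axis=0)) / X.std(axis=0)  # standardise each candidate feature

# Hypothetical ground truth: only features 3 and 17 drive the property.
true_w = np.zeros(n_candidates)
true_w[[3, 17]] = [2.0, -1.5]
y = X @ true_w + 0.05 * rng.normal(size=n_materials)
y = y - y.mean()

w = lasso_coordinate_descent(X, y, lam=0.1)
selected = np.flatnonzero(np.abs(w) > 1e-8)
print("selected candidate features:", selected.tolist())
```

In this toy setting the l1 penalty drives the coefficients of irrelevant candidates to zero, so the surviving features play the role of a low-dimensional descriptor. Recovered coefficients are shrunk towards zero by the penalty, so a refit on the selected features alone would be a natural follow-up.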