Subgroup Discovery for Finding Interpretable Local Patterns in Data from Materials-Science
ORAL
Abstract
We demonstrate that subgroup discovery (SGD) can help find physically meaningful descriptors from materials-science data obtained by first-principles calculations. In contrast to global modelling algorithms, SGD finds descriptions of subpopulations in which, locally, the target property takes on an interesting distribution. First, the SGD algorithm is formulated for materials applications. Next, SGD is applied to gas-phase gold clusters (having 5 to 14 atoms) to discern patterns between their geometrical and physicochemical properties. Additionally, SGD is shown to identify subgroups that classify 79 of the 82 octet binary materials as either rock salt or zincblende using only information about their chemical composition. SGD is also used to find descriptors that predict both the formation and bandgap energies of transparent conducting oxides. Lastly, an efficient optimal solver using branch-and-bound is developed for dispersion-corrected objective functions to help find improved subgroups.
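To make the idea of a locally interesting subpopulation concrete, the following is a minimal sketch of subgroup discovery on toy data. It is an illustration under stated assumptions, not the method presented in this talk: the data values, the helper names (`make_selectors`, `quality`, `discover`), and the quality function (coverage times the upward shift of the subgroup mean) are all hypothetical, and the search is a plain exhaustive enumeration rather than the branch-and-bound solver or the dispersion-corrected objective mentioned above.

```python
from itertools import combinations
from statistics import mean

# Toy data (hypothetical values, not taken from the study):
# each row has two features (x1, x2) and a numeric target y.
DATA = [
    {"x1": 1.0, "x2": 1.0, "y": 0.0},
    {"x1": 1.0, "x2": 4.0, "y": 0.1},
    {"x1": 2.0, "x2": 2.0, "y": 0.0},
    {"x1": 2.0, "x2": 5.0, "y": 0.1},
    {"x1": 3.0, "x2": 1.0, "y": 1.0},
    {"x1": 3.0, "x2": 2.0, "y": 1.1},
    {"x1": 4.0, "x2": 1.0, "y": 0.9},
    {"x1": 4.0, "x2": 5.0, "y": 0.1},
    {"x1": 5.0, "x2": 2.0, "y": 1.0},
    {"x1": 5.0, "x2": 4.0, "y": 0.0},
]


def make_selectors(rows, features):
    """Build simple threshold propositions such as 'x1 <= 2.0' and 'x1 > 2.0'."""
    selectors = []
    for f in features:
        for t in sorted({r[f] for r in rows}):
            selectors.append((f"{f} <= {t}", lambda r, f=f, t=t: r[f] <= t))
            selectors.append((f"{f} > {t}", lambda r, f=f, t=t: r[f] > t))
    return selectors


def quality(subgroup, population, target="y"):
    """Assumed quality function: coverage times the positive mean shift."""
    if not subgroup:
        return 0.0
    coverage = len(subgroup) / len(population)
    shift = mean(r[target] for r in subgroup) - mean(r[target] for r in population)
    return coverage * max(shift, 0.0)


def discover(rows, features, max_depth=2):
    """Exhaustively score conjunctions of up to max_depth selectors."""
    selectors = make_selectors(rows, features)
    best = ("(empty)", 0.0, [])
    for depth in range(1, max_depth + 1):
        for combo in combinations(selectors, depth):
            members = [r for r in rows if all(pred(r) for _, pred in combo)]
            q = quality(members, rows)
            if q > best[1]:
                best = (" AND ".join(name for name, _ in combo), q, members)
    return best


if __name__ == "__main__":
    description, score, members = discover(DATA, ["x1", "x2"])
    print(description, round(score, 3), len(members))
```

The returned description is a conjunction of threshold conditions, which is what makes a subgroup interpretable: it describes only the part of the data where the target behaves differently, rather than fitting one global model. Exhaustive enumeration scales poorly with the number of selectors, which is the kind of cost a branch-and-bound search is meant to reduce.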
*This work is supported through the European Union’s Horizon 2020 research and innovation program under Grant agreement No. 676580 with The Novel Materials Discovery (NOMAD) Laboratory, a European Center of Excellence.
Presenters
Bryan Goldsmith
- University of Michigan
- Chemical Engineering, University of Michigan