Learning probabilities from random observables in high dimensions: the maximum entropy distribution and others

Tomoyuki Obuchi; Simona Cocco; Remi Monasson

ORAL

Abstract

We consider the problem of learning a target probability distribution over a set of $N$ binary variables from the knowledge of the expectation values (under this target distribution) of $M$ observables drawn uniformly at random. We study the version space, defined as the space of all probability distributions compatible with these $M$ expectation values within some fixed accuracy. We introduce a biased measure over the version space, which weights each distribution according to its entropy, with a strength set by an arbitrary `temperature'. The choice of the temperature allows us to interpolate between the flat measure over all the distributions and the pointwise measure concentrated at the maximum entropy distribution. Using the replica method, we compute the volume of the version space and other quantities of interest, such as the distance $R$ between the target distribution and the center-of-mass distribution over the version space. We find phase transitions that correspond to qualitative improvements in the learning of the target distribution and to a decrease of the distance $R$. However, the distance $R$ does not vary with the temperature, meaning that the maximum entropy distribution is not closer to the target distribution than any other distribution compatible with the observable values.
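
The setting described above can be illustrated with a small numerical sketch. It is not the replica calculation of the talk: it enumerates all $2^N$ configurations for a toy $N$, draws $M$ random observables, and fits the maximum entropy distribution that reproduces their expectation values by gradient ascent on the dual problem. The sizes, the choice of target distribution, the step size, the Euclidean distance, and the assumption that each observable takes an independent value uniform in $[-1, 1]$ on every configuration are illustrative choices, not taken from the abstract.

```python
import itertools
import math
import random

random.seed(0)  # fixed seed so the sketch is reproducible

N, M = 4, 3  # toy sizes: N binary variables, M random observables
configs = list(itertools.product([-1, 1], repeat=N))  # all 2^N configurations

# Hypothetical target distribution: random positive weights, normalized.
weights = [random.uniform(0.5, 1.5) for _ in configs]
total = sum(weights)
target = [w / total for w in weights]

# Assumption: each observable assigns an independent value, uniform in
# [-1, 1], to every configuration.
observables = [[random.uniform(-1.0, 1.0) for _ in configs] for _ in range(M)]


def expectations(p):
    """Expectation value of each observable under the distribution p."""
    return [sum(pc * oc for pc, oc in zip(p, obs)) for obs in observables]


def gibbs(lambdas):
    """Exponential-family distribution p(c) proportional to exp(sum_k lambda_k O_k(c))."""
    logits = [
        sum(lam * obs[c] for lam, obs in zip(lambdas, observables))
        for c in range(len(configs))
    ]
    top = max(logits)  # subtract the maximum for numerical stability
    unnorm = [math.exp(x - top) for x in logits]
    z = sum(unnorm)
    return [u / z for u in unnorm]


def entropy(p):
    """Shannon entropy in nats."""
    return -sum(pc * math.log(pc) for pc in p if pc > 0)


# Gradient ascent on the dual: move each Lagrange multiplier along the
# mismatch between the target expectation and the model expectation.
target_means = expectations(target)
lambdas = [0.0] * M
step = 0.3  # illustrative step size
for _ in range(5000):
    model_means = expectations(gibbs(lambdas))
    lambdas = [
        lam + step * (t - m)
        for lam, t, m in zip(lambdas, target_means, model_means)
    ]

maxent = gibbs(lambdas)
constraint_error = max(
    abs(t - m) for t, m in zip(target_means, expectations(maxent))
)
# Euclidean distance between the target and the fitted distribution.
distance = math.sqrt(sum((a - b) ** 2 for a, b in zip(target, maxent)))

print("largest constraint mismatch:", constraint_error)
print("entropy of target:", entropy(target))
print("entropy of maximum entropy fit:", entropy(maxent))
print("distance between target and fit:", distance)
```

Both the target and the fitted distribution reproduce the same $M$ expectation values, so both belong to the version space of this toy problem. The fit has the larger entropy by construction, but that alone says nothing about how close it lies to the target. The temperature independence of $R$ stated above is a high-dimensional result and is not expected to be visible at these sizes.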

March 15, 2016, 11:15 AM – March 15, 2016, 11:27 AM

Authors

Tomoyuki Obuchi
- Tokyo Institute of Technology
Simona Cocco
- Laboratoire de Physique Statistique de l’Ecole Normale Superieure
Remi Monasson
- Laboratoire de Physique Theorique de l’Ecole Normale Superieure