Gaussian Process vs. Log Gaussian Cox Process: Comparing Methods for High Energy Physics Data

POSTER

Abstract

High energy physics (HEP) data from particle colliders are studied using statistical methods to compare the data we observe with what we expect. HEP collider data are often arranged into histograms that count the number of events observed at a given feature value, and these counts can be modeled with the Poisson distribution. In many analyses, the signal appears as a localized excess on top of a smooth background, which must be modeled in order to observe the signal. For this project, we use toy data that mimic the falling exponential behavior of HEP data. Gaussian processes are an effective method for modeling smooth backgrounds in HEP data. However, this method requires the data to be binned, which loses information. Gaussian processes generally yield meaningful uncertainties, but they fail to capture the Poissonian uncertainties of HEP data. The log Gaussian Cox process is a novel method that we expect to improve on these shortcomings. We compare the ability of the Gaussian process and the log Gaussian Cox process to reproduce a known intensity function when modeling the smooth background events. We find that the log Gaussian Cox process is promising; however, further exploration is needed to create an optimal model and to develop a deeper understanding of the log Gaussian Cox process.
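As an illustration of the toy setup described above, the following is a minimal sketch, not the authors' actual code, of how unbinned background events with a falling exponential intensity might be generated and then binned into Poisson-distributed counts. The intensity form `amplitude * exp(-decay * x)`, the parameter values, the range, the binning, and the function names are all assumptions chosen for illustration. The unbinned `events` array is the kind of input a log Gaussian Cox process would use, while the binned `counts` are what a Gaussian process fit would see.

```python
import numpy as np


def sample_toy_events(amplitude, decay, x_max, rng):
    """Draw unbinned events from an inhomogeneous Poisson process with
    intensity lambda(x) = amplitude * exp(-decay * x) on [0, x_max].
    (Hypothetical toy model; the parameters are illustrative assumptions.)"""
    # Expected total number of events is the integral of the intensity.
    shape_norm = 1.0 - np.exp(-decay * x_max)
    expected_total = amplitude / decay * shape_norm
    # The observed total is itself Poisson distributed.
    n_events = rng.poisson(expected_total)
    # Inverse-CDF sampling of the exponential truncated to [0, x_max].
    u = rng.uniform(size=n_events)
    return -np.log(1.0 - u * shape_norm) / decay


def expected_bin_counts(amplitude, decay, edges):
    """Integrate the known intensity over each histogram bin."""
    antiderivative = -amplitude / decay * np.exp(-decay * edges)
    return np.diff(antiderivative)


rng = np.random.default_rng(0)
amplitude, decay, x_max = 500.0, 1.0, 5.0

# Unbinned events: every observed feature value is kept.
events = sample_toy_events(amplitude, decay, x_max, rng)

# Binned view: per-bin counts, which discard positions within each bin.
edges = np.linspace(0.0, x_max, 21)
counts, _ = np.histogram(events, bins=edges)

# Known truth that a fitted model can be compared against.
expected = expected_bin_counts(amplitude, decay, edges)
```

Each entry of `counts` fluctuates around the corresponding entry of `expected` with Poisson variance, so the relative uncertainty grows in the sparsely populated tail of the falling spectrum.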

*This work was supported in part by the U.S. Department of Energy, Office of Science, Office of Workforce Development for Teachers and Scientists (WDTS) under the Science Undergraduate Laboratory Internships (SULI) program.

Authors

  • Pavani Jairam

    • Duke University
  • Rachel Hyneman

    • SLAC National Accelerator Laboratory
  • Michael Kagan

    • SLAC National Accelerator Laboratory