Machine Learning-Powered Data Cleaning for LEGEND

ORAL

Abstract

The Large Enriched Germanium Experiment for Neutrinoless Double-Beta Decay (LEGEND) will operate in two phases in the search for neutrinoless double-beta decay (0νββ). The first (second) phase will employ up to 200 (1000) kg of ⁷⁶Ge semiconductor detectors to achieve a half-life sensitivity of 10²⁷ (10²⁸) years. In this study, we present a data-driven approach, powered by a novel artificial intelligence model, to remove non-physical events captured by ⁷⁶Ge detectors in LEGEND-200. We use Affinity Propagation to cluster events based on their shape and a Support Vector Machine to classify events into different categories. We demonstrate that our model efficiently classifies different categories of events, achieving a physical event sacrifice of < 0.001%. This method will provide an automated data cleaning mechanism for LEGEND, replacing traditional procedures that require significant time and human effort.
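The cluster-then-classify idea described above can be illustrated with a minimal sketch. This is not the LEGEND implementation: it assumes scikit-learn's AffinityPropagation and SVC, represents each event as a fixed-length array, and uses synthetic toy traces in place of detector data. The function names, the three toy categories, and the step of naming each cluster by majority toy label (a stand-in for a person inspecting clusters) are all hypothetical.

```python
import numpy as np
from sklearn.cluster import AffinityPropagation
from sklearn.svm import SVC


def make_waveforms(rng, n_per_class=40, n_samples=64):
    """Generate toy traces for three made-up categories (not LEGEND data)."""
    t = np.linspace(0.0, 1.0, n_samples)
    templates = {
        "step": 1.0 / (1.0 + np.exp(-(t - 0.5) / 0.02)),  # rising edge
        "flat": np.zeros_like(t),                          # baseline only
        "spike": np.exp(-((t - 0.5) / 0.05) ** 2),         # short burst
    }
    traces, labels = [], []
    for name, template in templates.items():
        noise = rng.normal(0.0, 0.02, size=(n_per_class, n_samples))
        traces.append(template + noise)
        labels += [name] * n_per_class
    return np.vstack(traces), np.array(labels)


def fit_cleaning_model(X, y_inspect, seed=0):
    """Cluster traces by shape, name each cluster, then train an SVM.

    y_inspect stands in for human inspection: each cluster takes the
    majority label of its members (an assumption of this sketch).
    """
    # Step 1: unsupervised clustering; the number of clusters is not fixed.
    ap = AffinityPropagation(damping=0.9, max_iter=1000, random_state=seed)
    cluster_ids = ap.fit_predict(X)

    # Step 2: assign one category name per cluster.
    y_cluster = np.empty(len(X), dtype=y_inspect.dtype)
    for k in np.unique(cluster_ids):
        members = cluster_ids == k
        names, counts = np.unique(y_inspect[members], return_counts=True)
        y_cluster[members] = names[np.argmax(counts)]

    # Step 3: supervised classifier trained on the cluster-derived labels.
    return SVC(kernel="rbf").fit(X, y_cluster)


rng = np.random.default_rng(0)
X_train, y_train = make_waveforms(rng)
X_new, y_new = make_waveforms(rng, n_per_class=10)

clf = fit_cleaning_model(X_train, y_train)
accuracy = np.mean(clf.predict(X_new) == y_new)
print("fraction of new toy events assigned to the expected category:", accuracy)
```

In a sketch like this, clustering only groups similar shapes. The category names enter in the second step, and the classifier then lets new events be sorted without re-running the clustering.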

*This work is supported by the U.S. DOE and the NSF, the LANL, ORNL and LBNL LDRD programs; the European ERC and Horizon programs; the German DFG, BMBF, and MPG; the Italian INFN; the Polish NCN and MNiSW; the Czech MEYS; the Slovak SRDA; the Swiss SNF; the UK STFC; the Russian RFBR; the Canadian NSERC and CFI; the LNGS, SNOLAB, and SURF facilities.

Publication: Machine Learning-Powered Data Cleaning for LEGEND

Presenters

  • Esteban A León

    • University of North Carolina at Chapel Hill

Authors

  • Esteban A León

    • University of North Carolina at Chapel Hill
  • Julieta Gruszko

    • University of North Carolina
    • University of North Carolina at Chapel Hill
  • Aobo Li

    • University of North Carolina at Chapel Hill
  • Miguel Angel Bahena Schott

    • University of North Carolina at Chapel Hill