Raman spectroscopy in open world learning settings using the Objectosphere approach
ORAL
Abstract
Raman spectroscopy, combined with machine learning techniques, holds great promise for many applications as a rapid, sensitive, and label-free identification method. Such approaches perform well when classifying spectra of chemical species that were encountered during the training phase, that is, species that are known to the neural network. However, in real-world settings such as clinical applications, there will always be substances whose spectra have not yet been recorded. When the neural network encounters these new species during the testing phase, the number of false positives becomes uncontrollable, limiting the usefulness of these techniques, especially in public safety applications.
To overcome these barriers, we implemented the recently introduced Entropic Open Set and Objectosphere loss functions. To demonstrate the efficacy and efficiency of this approach, we compiled a database of hyperspectral Raman images of 40 chemical species, separated into three classes. The known class consisted of 20 biologically relevant species composed of amino acids, the ignored class consisted of 10 "irrelevant" species composed of bio-related chemicals, and the never-seen-before class consisted of 10 other chemical species that the neural network had not encountered during training. We show not only that this approach enables the network to effectively separate the unknown species while preserving high accuracy on the known ones and reducing false positives, but also that it performs better than the current gold standards in machine learning techniques. This opens the door to using Raman spectroscopy, combined with our novel machine learning algorithm, in a variety of practical applications.
Availability and implementation: Freely available on the web at: https://github.com/BalytskyiJaroslaw/RamanOpenSet.git.
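The two loss functions named above can be illustrated with a minimal per-sample sketch. It follows the commonly cited form of the Entropic Open Set and Objectosphere losses and is not taken from the linked repository; the function names, the convention that label -1 marks an ignored-class spectrum, and the values of xi and lam are assumptions chosen for illustration.

```python
import math

def log_softmax(logits):
    # Numerically stable log-softmax over one list of logits.
    m = max(logits)
    log_z = m + math.log(sum(math.exp(v - m) for v in logits))
    return [v - log_z for v in logits]

def entropic_open_set_loss(logits, label):
    # label >= 0: known class, ordinary cross-entropy.
    # label == -1 (assumed convention): ignored-class sample, penalized
    # unless the softmax output is uniform over the known classes.
    logp = log_softmax(logits)
    if label >= 0:
        return -logp[label]
    return -sum(logp) / len(logp)

def objectosphere_loss(logits, features, label, xi=10.0, lam=0.01):
    # features: deep-feature vector feeding the logit layer.
    # Known samples are pushed to feature norm >= xi; ignored-class
    # samples are pushed toward zero norm. xi and lam are illustrative.
    norm = math.sqrt(sum(f * f for f in features))
    if label >= 0:
        penalty = max(xi - norm, 0.0) ** 2
    else:
        penalty = norm ** 2
    return entropic_open_set_loss(logits, label) + lam * penalty
```

In this sketch, a confident prediction on an ignored-class spectrum yields a larger loss than a uniform one, which is what lets a threshold on the softmax score or the feature norm reject unknown species at test time.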
*This work was supported by:
1) National Institute of General Medical Sciences of the National Institutes of Health [grant number 1R15GM128166-01]
2) UCCS BioFrontiers Center
3) U.S. Civilian Research & Development Foundation (CRDF Global)
Publication: 1) Our paper has been accepted by ACS Analytical Chemistry, DOI: 10.1021/acs.analchem.2c02666.
2) Preprint can be found here: https://arxiv.org/abs/2111.06268
3) ML models, code and data can be found here: https://github.com/BalytskyiJaroslaw/RamanOpenSet
Presenters
- Yaroslav Balytskyi, University of Colorado Colorado Springs