Automated Knowledge Graph Generation from Text for Synthesis of Energetic Materials

Connor P O'Ryan; Frank VanGessel; Zois Boukouvalas; Mark D Fuge; Peter W Chung; Ian Michel-Tyler; Ruth Doherty; William Wilson; Kevin Hayes

ORAL

Abstract

Within the past two decades, machine learning algorithms have been developed and applied in a wide variety of domains, including those related to shock compression. These developments include advances in computationally assisted synthesis planning and in natural language processing for text documents in the context of chemical energy. The objective of this work is to explore the intersection of these emergent research capabilities and to develop automatable approaches for extracting synthesis information for chemical storage from text documents, creating novel representations in the form of knowledge graphs. Knowledge graphs are composed of nodes and edges: the nodes represent entities, such as chemical compounds, and the edges represent relations between those entities, such as solubility. The knowledge graph is generated automatically through a pipeline that uses several open-source resources capable of identifying entities, such as the reaction product or other compounds, and linguistic features, including coreferences. As a result, the graph is heterogeneous, containing both natural language and chemical information. Additionally, to confirm a proportion of the information contained within the graph, the graph is linked to external databases. This step provides a means of checking edges between nodes that is not based on a probabilistic model. Following the creation of the graph, knowledge graph embedding techniques are implemented to recommend alternative synthesis pathways. While inspired by existing synthesis prediction frameworks, this work differs by applying information extraction algorithms to textual data to produce the database of synthesis information. The recommendation algorithm is then trained on both the chemical and the semantic features found within the graph.
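
To make the node-and-edge description above concrete, the following is a minimal sketch of a heterogeneous knowledge graph stored as (head, relation, tail) triples, with a flag for edges confirmed against an external source. It is not the authors' pipeline. The class `KnowledgeGraph`, the node types, the relation names, and the entities `compound_A` and `solvent_B` are all hypothetical placeholders. The abstract does not say which embedding method is used, so the scoring function shows TransE only as one well-known example of a knowledge graph embedding score.

```python
from collections import defaultdict
from dataclasses import dataclass, field
import math


@dataclass
class KnowledgeGraph:
    """A tiny heterogeneous graph stored as (head, relation, tail) triples."""
    node_types: dict = field(default_factory=dict)  # node name -> type label
    edges: dict = field(default_factory=lambda: defaultdict(set))  # head -> {(relation, tail)}
    verified: set = field(default_factory=set)  # triples confirmed by an external source

    def add_node(self, name, node_type):
        self.node_types[name] = node_type

    def add_edge(self, head, relation, tail, verified=False):
        # Both endpoints must already exist so every node carries a type.
        if head not in self.node_types or tail not in self.node_types:
            raise KeyError("both endpoints must be added as nodes first")
        self.edges[head].add((relation, tail))
        if verified:
            self.verified.add((head, relation, tail))

    def triples(self):
        # Sorted list of all (head, relation, tail) triples in the graph.
        return sorted((h, r, t) for h, pairs in self.edges.items() for r, t in pairs)


def transe_score(head_vec, rel_vec, tail_vec):
    """TransE-style plausibility: negative L2 distance between head + relation and tail.

    Assumption: TransE is an illustrative choice, not the method named in the abstract.
    """
    return -math.sqrt(sum((h + r - t) ** 2 for h, r, t in zip(head_vec, rel_vec, tail_vec)))


# Placeholder entities for illustration only; these are not data from the work.
kg = KnowledgeGraph()
kg.add_node("compound_A", "chemical")
kg.add_node("solvent_B", "chemical")
kg.add_node("the product", "text_mention")  # a natural-language mention
kg.add_edge("compound_A", "soluble_in", "solvent_B", verified=True)  # chemical relation
kg.add_edge("the product", "refers_to", "compound_A")  # coreference-style relation
```

In an embedding-based recommender, each node and relation would be assigned a learned vector, and candidate triples with higher scores would be ranked as more plausible. The `verified` set mirrors the idea of checking some edges against external databases rather than relying only on a probabilistic model.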

July 14, 2022, 11:30 AM – July 14, 2022, 11:45 AM

Presenters

Connor P O'Ryan
- University of Maryland, College Park

Authors

Connor P O'Ryan
- University of Maryland, College Park
Frank VanGessel
- U.S. Naval Surface Warfare Center, Indian Head Division, Indian Head, MD
Zois Boukouvalas
- Department of Mathematics and Statistics, American University, Washington, DC
Mark D Fuge
- Department of Mechanical Engineering, University of Maryland, College Park
Peter W Chung
- Department of Mechanical Engineering, University of Maryland, College Park
Ian Michel-Tyler
- Energetics Technology Center, Indian Head, MD
Ruth Doherty
- Energetics Technology Center, Indian Head, MD
William Wilson
- Energetics Technology Center, Indian Head, MD
Kevin Hayes
- University of Maryland, College Park