Quantifying Similarity between Polymer Ensembles

ORAL

Abstract

Synthetic polymers are typically stochastic in nature: rather than having a single well-defined structure, they are ensembles with distributions in molecular mass, topology, and sequence. Determining the similarity between polymers is therefore significantly more challenging than for small molecules, for which a number of established methods exist. For polymer ensembles, a typical approach is to compute the embedding vector or fingerprint for each monomer or polymer and then take the weighted average to yield a single embedding vector for the ensemble. This pre-averaging can obscure differences between ensembles and can erroneously predict perfect similarity for two dissimilar ensembles. Drawing inspiration from the informatics community, we adopt the Earth Mover's Distance (EMD) together with explicit calculation of the distances between all constituents. As its name implies, EMD computes the cost of moving an entity, whether earth or probability, from one location to another. We demonstrate the utility of this technique with a number of case studies ranging from copolymers to experimentally measured molecular mass distributions. Ultimately, similarity can be used for enhanced search in data resources such as a Community Resource for Innovation in Polymer Technology and to accelerate machine-learning-enabled polymer design.
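To make the pre-averaging pitfall concrete, the following is a minimal Python sketch and not the authors' implementation. It assumes each constituent is described by a single scalar descriptor (here a hypothetical molar mass in kg/mol) and uses the absolute difference as the ground distance. In that one-dimensional special case, EMD can be computed by matching mass in sorted order; general embedding vectors would instead require solving a transportation problem over the full pairwise distance matrix. The function names and the numerical values are illustrative assumptions.

```python
def weighted_mean(ensemble):
    """Pre-averaged descriptor: weight-averaged value over all constituents."""
    total = sum(w for _, w in ensemble)
    return sum(x * w for x, w in ensemble) / total


def emd_1d(ens_a, ens_b):
    """Earth Mover's Distance between two 1D weighted point sets.

    Each ensemble is a list of (descriptor_value, weight) pairs; weights are
    normalized to sum to 1. With ground distance |x - y| in one dimension,
    moving mass between constituents in sorted order is optimal.
    """
    total_a = sum(w for _, w in ens_a)
    total_b = sum(w for _, w in ens_b)
    a = sorted((x, w / total_a) for x, w in ens_a)
    b = sorted((x, w / total_b) for x, w in ens_b)

    i = j = 0
    rem_a, rem_b = a[0][1], b[0][1]  # mass still to be moved / received
    cost = 0.0
    while i < len(a) and j < len(b):
        moved = min(rem_a, rem_b)
        cost += moved * abs(a[i][0] - b[j][0])  # mass times ground distance
        rem_a -= moved
        rem_b -= moved
        if rem_a <= 1e-12:  # source constituent exhausted
            i += 1
            rem_a = a[i][1] if i < len(a) else 0.0
        if rem_b <= 1e-12:  # target constituent filled
            j += 1
            rem_b = b[j][1] if j < len(b) else 0.0
    return cost


# Hypothetical ensembles: (molar mass in kg/mol, weight fraction)
blend = [(10.0, 0.5), (30.0, 0.5)]  # equal-weight blend of short and long chains
uniform = [(20.0, 1.0)]             # every chain has the same molar mass

mean_gap = abs(weighted_mean(blend) - weighted_mean(uniform))
emd_gap = emd_1d(blend, uniform)
print(f"distance after pre-averaging: {mean_gap:.1f}")
print(f"Earth Mover's Distance:       {emd_gap:.1f}")
```

In this toy case the two ensembles share the same weighted mean, so the pre-averaged comparison reports zero distance, while the EMD is nonzero because half of the mass must be moved by 10 kg/mol in each direction.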

*This work was primarily funded by the National Science Foundation Convergence Accelerator award number ITE-2134795.

Presenters

  • Debra J Audus

    • NIST

Authors

  • Debra J Audus

    • NIST
  • Jiale Shi

    • Massachusetts Institute of Technology
  • Dylan Walsh

    • Massachusetts Institute of Technology
  • Weizhong Zou

    • Massachusetts Institute of Technology
  • Nathan J Rebello

    • Massachusetts Institute of Technology
  • Michael E Deagen

    • Rensselaer Polytechnic Institute
  • Katharina Fransen

    • Massachusetts Institute of Technology
  • Xian Gao

    • University of Notre Dame
  • Bradley D Olsen

    • Massachusetts Institute of Technology