Predicting Protein Developability via Convolutional Sequence Representation

Alexander Golinski; Bryce Johnson; Sidharth Laxminarayan; Diya Saha; Sandhya Appiah; Benjamin Hackel; Stefano Martiniani

ORAL

Abstract

Engineered proteins have emerged as novel diagnostics, therapeutics, and catalysts. Often, poor protein developability (quantified by expression, solubility, and stability) hinders commercialization. The ability to predict protein developability from amino acid sequence would reduce the experimental burden of selecting candidates. Recent advances in screening technologies enabled a high-throughput (HT) developability dataset for 10⁵ of the 10²⁰ possible variants of the protein scaffold Gp2. In this work, we evaluate the ability of neural networks to learn a developability representation from the HT dataset and transfer that knowledge to predict recombinant expression beyond the observed sequences. Mimicking protein theory, our model convolves learned amino acid properties to predict expression levels 42% closer to the experimental variance than a non-embedded control. Analysis of the learned amino acid embeddings highlights the uniqueness of cysteine, the importance of hydrophobicity and charge, and the unimportance of aromaticity when aiming to improve developability. We identify clusters of similar sequences with increased developability through nonlinear dimensionality reduction (UMAP) and explore the inferred developability landscape via nested sampling.
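
The idea of convolving learned amino acid properties can be illustrated with a minimal sketch. This is not the authors' architecture: the embedding size, filter count, kernel width, max-pooling step, random initialisation, and the toy sequence are all assumptions chosen for illustration, and no training is shown. Each residue is mapped to a small learned vector, a 1D convolution scans windows of neighbouring residues, and a pooled linear readout yields a single expression-like score.

```python
import numpy as np

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 canonical residues
AA_INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def encode(seq):
    """Map a sequence string to an array of integer residue indices."""
    return np.array([AA_INDEX[aa] for aa in seq], dtype=np.int64)


def init_params(rng, embed_dim=4, n_filters=8, kernel_size=3):
    """Randomly initialised parameters; every size here is an assumption."""
    return {
        # one learnable property vector per amino acid
        "embedding": rng.normal(0.0, 0.1, size=(len(AMINO_ACIDS), embed_dim)),
        # convolution filters spanning kernel_size neighbouring residues
        "conv_w": rng.normal(0.0, 0.1, size=(n_filters, kernel_size, embed_dim)),
        "conv_b": np.zeros(n_filters),
        # linear readout from pooled filter activations to one score
        "out_w": rng.normal(0.0, 0.1, size=n_filters),
        "out_b": 0.0,
    }


def predict(params, seq):
    """Embed, convolve along the sequence, max-pool, and read out a scalar."""
    x = params["embedding"][encode(seq)]                # (L, D)
    k = params["conv_w"].shape[1]
    n_windows = x.shape[0] - k + 1                      # needs len(seq) >= k
    windows = np.stack([x[i:i + k] for i in range(n_windows)])  # (W, k, D)
    conv = np.einsum("wkd,fkd->wf", windows, params["conv_w"])
    conv = np.maximum(conv + params["conv_b"], 0.0)     # ReLU, (W, F)
    pooled = conv.max(axis=0)                           # max over positions, (F,)
    return float(pooled @ params["out_w"] + params["out_b"])


params = init_params(np.random.default_rng(0))
score = predict(params, "ACDEFGHIKL")  # hypothetical toy sequence, not Gp2
```

In a trained model of this kind, the rows of the embedding matrix are what an analysis of amino acid embeddings would inspect, for example to see whether cysteine sits apart from the other residues.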

March 15, 2021, 4:36 PM – March 15, 2021, 4:48 PM

Presenters

Alexander Golinski
- University of Minnesota
- Department of Chemical Engineering and Materials Science, University of Minnesota

Authors

Alexander Golinski
- University of Minnesota
- Department of Chemical Engineering and Materials Science, University of Minnesota
Bryce Johnson
- University of Minnesota
- School of Physics and Astronomy, University of Minnesota
Sidharth Laxminarayan
- University of Minnesota
Diya Saha
- University of Minnesota
- Department of Chemical Engineering and Materials Science, University of Minnesota
Sandhya Appiah
- University of Minnesota
- Department of Chemical Engineering and Materials Science, University of Minnesota
Benjamin Hackel
- University of Minnesota
Stefano Martiniani
- University of Minnesota
- Department of Chemical Engineering and Materials Science, University of Minnesota