Bioprocesses play a critical role in producing enzymes, hormones, antibodies, and other high-value proteins for sectors ranging from pharmaceuticals to food and consumer goods. One major challenge in this field is predicting whether host organisms will efficiently export these engineered proteins out of the cell, a process known as protein secretion. Secretion is a key factor in reducing purification costs and scaling production.
Traditionally, identifying the right signal peptides (short amino acid sequences that trigger secretion) has required costly, time-consuming screening campaigns. Now, researchers at the Sargent Centre are using advanced AI tools to predict secretion computationally and reduce the need for such screening.
Dr. Laura Helleckes, a Sargent Centre Postdoctoral Research Fellow, leads a project that uses Graph Neural Networks (GNNs) and Protein Language Models (PLMs) to predict protein secretion efficiency in silico. These models can capture complex structural and sequence features of proteins, creating molecular “fingerprints” that can be used to forecast secretion outcomes, even before lab experiments begin.
In partnership with dsm-firmenich, this AI-driven approach is being applied to real-world high-throughput experiments. The goal: to streamline bioprocess development pipelines by rapidly identifying which candidates are likely to be secreted effectively in specific host systems. This capability not only speeds up product development but also reduces experimental burden, enabling faster iteration and innovation.
Dr. Helleckes’ work integrates cutting-edge machine learning with newly available high-throughput secretion data. By combining predictions from the graph neural networks and protein language models with experimental metadata, the models can be trained to recognize which signal peptides work best. The research is currently being expanded to additional proteins and has the potential to be applied to a wide range of host organisms.
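To make the idea of combining model outputs with experimental metadata concrete, the following is a minimal sketch, not the project's actual method. It assumes each candidate is described by a small embedding from a protein language model, a small embedding from a graph neural network, and a host label, and it uses a simple nearest-neighbour average as a stand-in for the real predictor. All names (`HOSTS`, `build_features`, `predict_secretion`) and all numbers are hypothetical and made up for illustration.

```python
from math import sqrt

# Hypothetical host labels used as experimental metadata.
HOSTS = ["host_a", "host_b"]


def build_features(plm_embedding, gnn_embedding, host):
    """Concatenate both embeddings with a one-hot encoding of the host."""
    one_hot = [1.0 if host == h else 0.0 for h in HOSTS]
    return list(plm_embedding) + list(gnn_embedding) + one_hot


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def predict_secretion(query, training_set, k=2):
    """Average the measured secretion of the k nearest training examples.

    This k-nearest-neighbour rule is an illustrative stand-in, not the
    model used in the project.
    """
    ranked = sorted(training_set, key=lambda item: euclidean(query, item[0]))
    nearest = ranked[:k]
    return sum(score for _, score in nearest) / len(nearest)


# Toy training data: (feature vector, measured secretion score). Made up.
training_set = [
    (build_features([0.9, 0.1], [0.8], "host_a"), 0.90),
    (build_features([0.8, 0.2], [0.7], "host_a"), 0.80),
    (build_features([0.1, 0.9], [0.2], "host_b"), 0.10),
    (build_features([0.2, 0.8], [0.1], "host_b"), 0.20),
]

# A new signal peptide and protein combination to score before any lab work.
query = build_features([0.85, 0.15], [0.75], "host_a")
prediction = predict_secretion(query, training_set, k=2)
print(f"Predicted secretion score: {prediction:.2f}")
```

In this toy setting the query sits closest to the two `host_a` examples, so its predicted score is their average. A real pipeline would replace the hand-written vectors with learned embeddings and the nearest-neighbour rule with a trained model.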
This project exemplifies how AI can transform biomanufacturing by moving from empirical trial-and-error to intelligent, data-driven design. As the demand for biotechnological products continues to grow, predictive tools like these can dramatically improve efficiency, reduce waste, and enable scalable, sustainable innovation across multiple sectors.









