Boosting enzyme engineering with DNA recorders and epistasis-aware AI

Machine learning (ML) has revolutionized the prediction and design of protein structure, but creating new protein function remains very challenging. Researchers from the LSAM, ETH Zurich and the MPI Munich have now developed a pipeline that boosts the engineering of enzymes with new activities by combining ultrahigh-throughput experiments with ML that leverages epistatic trends.
In the study, published in Nature Communications, the team introduces a new experimental approach for the ultradeep characterization of proteases. Proteases are key enzymes in all forms of life and are responsible for the targeted degradation of proteins. The ability to reprogram them by engineering so that they specifically inactivate, for instance, disease-related proteins would have a tremendous impact across the life sciences.
Spearheaded by Lukas Huber, the team developed a so-called “DNA recorder”, a molecular device that records the activity of proteases in DNA within cells of the bacterium Escherichia coli. Read out by next-generation sequencing, this DNA recorder makes it possible to precisely measure the activity of hundreds of thousands of protease variants simultaneously on many different potential protein targets in a highly parallelized fashion. This not only allows variants with activity on new targets to be identified, but also allows potential side effects arising from the unwanted degradation of off-target proteins to be excluded at a very early stage of development.
Crucially, the new approach makes it possible to generate extremely large sequence-function datasets on proteases. These can be effectively mined by machine learning to build computer models capable of designing new protease functions without extensive experimental work. Led by Tim Kucera, the authors show that such data-driven approaches can drastically speed up the development process for designer proteases. Furthermore, they show that epistatic trends (i.e. amino acid changes in the protease whose combined effect differs from the sum of their individual effects, for example by acting synergistically) can be leveraged to further boost our ability to reengineer the function of proteases, and likely other proteins, in a smart and resourceful fashion.
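To make the notion of epistasis concrete, the following is a minimal sketch of a common textbook definition of pairwise epistasis: the deviation of a double mutant's log-activity from what two single mutants would predict if their effects were simply additive. It is an illustration only and not the model used in the study, which this article does not describe; the function name and all activity values are hypothetical.

```python
import math


def pairwise_epistasis(f_wt: float, f_a: float, f_b: float, f_ab: float) -> float:
    """Return the pairwise epistasis on a log scale.

    f_wt, f_a, f_b, f_ab are (hypothetical) positive activity measurements of
    the wild type, the two single mutants, and the double mutant.
    A value of 0 means the mutations combine additively in log space,
    a positive value means they act synergistically, and a negative
    value means they act antagonistically.
    """
    # Log-activity expected for the double mutant if effects were independent.
    expected = math.log(f_a) + math.log(f_b) - math.log(f_wt)
    # Epistasis is the observed log-activity minus that expectation.
    return math.log(f_ab) - expected


# Hypothetical example: each single mutant improves activity modestly,
# while the double mutant improves it more than their product would suggest.
eps = pairwise_epistasis(f_wt=1.0, f_a=2.0, f_b=3.0, f_ab=12.0)
label = "synergistic" if eps > 0 else "antagonistic" if eps < 0 else "additive"
print(f"epistasis = {eps:.3f} ({label})")
```

A model that only sums single-mutation effects would miss such deviations, which is one reason large sequence-function datasets covering many combinations of mutations can be valuable for training.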
Swiss National Science Foundation
European Research Council
Huber L, Kucera T, Höllerer S, Borgwardt K, Panke S, Jeschek M. Data-driven protease engineering by DNA-recording and epistasis-aware machine learning. Nat Commun. 2025 Jul 1;16(1):5466. doi: 10.1038/s41467-025-60622-7. PMID: 40593579; PMCID: PMC12217912.