The quiet revolution in protein design: Enzymes and binders made by algorithms, built in yeast
Protein design is shifting from craft to workflow. Algorithms propose new enzymes and binding proteins, lab teams build them in microbial hosts, often yeast, and test what actually works. The results are promising in medicine and industry, but validation, safety and governance will decide what lasts.
A protein is, at heart, a folded chain of amino acids that does a job. Some jobs are brute, like holding tissue together. Others are delicate, like catalysing a reaction that would otherwise take years, or recognising one molecular target among billions. For decades, making a new protein was like commissioning a bespoke suit without ever seeing the tailor’s hands. You could tweak an existing design and hope. You could let evolution do the searching. You could sometimes compute your way to improvements, slowly.
That has changed. Modern protein engineering increasingly looks like an assembly line that alternates between software and wet lab: design on a screen, build in living cells, measure, redesign, repeat. The biggest shift is not a single tool but a convergence: de novo protein design that creates proteins not found in nature; directed evolution that “breeds” better variants in the lab; and computational approaches, including generative methods, that propose structures and sequences at a pace that human intuition cannot match.
In 2024, that convergence was effectively stamped into the scientific mainstream when the Nobel Prize in Chemistry recognised work spanning computational protein design and protein structure prediction with artificial intelligence (AI). The prize was not a proof of safety or clinical value. It was a signal that protein design has become a core technology, and that its consequences will not be confined to biology.
De novo design, in plain English
De novo design means starting from the goal, not the template. Instead of asking “how do we tweak this natural protein?”, designers ask “what shape would do this job?”, then search for amino acid sequences that fold into that shape and behave as required.
A modern wave of de novo work uses generative models to propose protein backbones and binders that match a specification, such as “wrap around this surface” or “hold this catalytic motif in the right geometry”. A highly cited Nature paper in 2023 described a diffusion-based approach, RoseTTAFold diffusion (RFdiffusion), used across several tasks including binder design and enzyme active-site scaffolding.
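For readers who like to see the machinery, the logic can be caricatured in a few lines of Python. This is a toy sketch, not how RFdiffusion works: the scoring function is invented, and the greedy hill climb stands in for far richer generative models. What it shows is the shape of goal-first design: propose, score, keep.

```python
# Toy illustration of goal-first design: search sequence space against an
# objective. Everything here is hypothetical; real pipelines generate
# backbones with learned models and score candidates with structure predictors.
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def score(sequence: str) -> float:
    """Invented stand-in for a real objective (predicted fold quality,
    binding energy, stability). Here: reward hydrophobic/polar alternation,
    purely to give the search something to climb."""
    hydrophobic = set("AILMFVWY")
    return sum(
        1.0 for i in range(len(sequence) - 1)
        if (sequence[i] in hydrophobic) != (sequence[i + 1] in hydrophobic)
    )

def design(length: int = 30, steps: int = 5000) -> str:
    """Greedy point-mutation hill climb: propose a mutation, keep it if the
    score does not drop. The loop shape, propose / score / keep, is the point."""
    seq = [random.choice(AMINO_ACIDS) for _ in range(length)]
    best = score("".join(seq))
    for _ in range(steps):
        i = random.randrange(length)
        old = seq[i]
        seq[i] = random.choice(AMINO_ACIDS)
        new = score("".join(seq))
        if new >= best:
            best = new
        else:
            seq[i] = old  # revert a mutation that made things worse
    return "".join(seq)

if __name__ == "__main__":
    print(design())
```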
The sober point is that “designed” does not mean “correct”. Proteins are fussy. They misfold. They clump. They do unexpected chemistry. The lab is where the software meets its audit.
Directed evolution: when you let biology do the debugging
Directed evolution is the counterweight to pure computation. It is a laboratory method that mimics natural selection: make many variants, screen or select the ones that perform best, then repeat. It is faster than waiting for nature and less reliant on understanding every detail of mechanism.
This approach is not new, but it has become newly complementary. Computational design can now propose a plausible starting point, and directed evolution can improve what the design gets wrong, often by finding solutions that a human would not have predicted. Reviews on computational enzyme design have long noted that de novo enzyme activities are often modest at first, and that directed evolution can substantially increase them.
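The cycle itself is simple enough to caricature in code. In the sketch below, the assay and the hidden “optimum” are invented stand-ins for a real laboratory screen, and random point mutation plays the role of error-prone PCR; the loop, mutate, screen, select, repeat, is the substance.

```python
# Toy round-based directed evolution. The assay is a hypothetical stand-in
# for a wet-lab measurement; in reality each round means building and
# screening physical variants.
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
TARGET = "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ"  # invented hidden optimum

def assay(variant: str) -> int:
    """Stand-in fitness: positions matching the hidden optimum. A real
    screen would measure binding, activity, or stability."""
    return sum(a == b for a, b in zip(variant, TARGET))

def mutate(parent: str, rate: float = 0.03) -> str:
    """Random point mutagenesis, the in-silico analogue of error-prone PCR."""
    return "".join(
        random.choice(AMINO_ACIDS) if random.random() < rate else aa
        for aa in parent
    )

def evolve(start: str, rounds: int = 20, library_size: int = 500, keep: int = 10) -> str:
    parents = [start]
    for _ in range(rounds):
        library = [mutate(random.choice(parents)) for _ in range(library_size)]
        library.sort(key=assay, reverse=True)  # "screen" the whole library
        parents = library[:keep]               # carry the best variants forward
    return parents[0]

if __name__ == "__main__":
    start = "".join(random.choice(AMINO_ACIDS) for _ in range(len(TARGET)))
    best = evolve(start)
    print(f"{assay(best)} / {len(TARGET)} positions match the hidden optimum")
```

No wet lab behaves this cleanly, but the sketch captures why directed evolution needs no mechanistic theory: selection only needs a measurable difference between variants.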
If de novo design is the blueprint, directed evolution is the wind tunnel.
Why “made-to-order” proteins matter
Most medicines and many industrial processes already rely on proteins: antibodies, hormones, enzymes, and biologics of every flavour. What changes with modern design is not the existence of protein drugs but the scope of what can be created on demand.
Three use cases matter most.
- Binders: proteins that latch onto a target and block it, escort it, or label it.
- Enzymes: proteins that speed up reactions, ideally in greener ways than petrochemical processes.
- Molecular tools: proteins that detect signals, assemble structures, or control biological systems.
The temptation is to say “anything is possible”. The more accurate statement is: many things are now testable, and testing is getting cheaper.
Medical uses: neutralising toxins, targeted binders, antivirals
The most dramatic medical examples are binders that neutralise toxins.
A 2024 study described de novo proteins, designed with deep learning methods, that bind and neutralise lethal snake venom toxins. Another 2024 Nature Communications paper reported de novo designed mini-protein binders against major subtypes of Clostridioides difficile toxin B (TcdB), aiming for broad binding across variants.
These are not replacements for approved therapies by default. They are demonstrations of a capability: designing small, stable proteins that bind difficult targets with high affinity.
Antivirals are another natural fit because viruses present conserved surfaces that antibodies sometimes struggle to hit consistently. Computational binder design has a history here, including designed proteins that bind a conserved patch on influenza haemagglutinin. RFdiffusion-era work reports binder designs with structural validation against influenza haemagglutinin, suggesting the design models can, at least sometimes, land close to reality.
Targeted binders can also be used as diagnostics, delivery escorts, or “molecular Velcro” in engineered therapies. The field is moving quickly, but its most credible successes tend to be those that can be measured cleanly: binding strength, structural match, neutralisation in controlled systems.
Interview placeholder (paraphrase): [Protein engineer explains what counts as a genuine success in binder design, and why many failures are simply proteins refusing to behave.]
Industrial uses: greener chemistry and waste breakdown
Industry already uses enzymes at scale because they can be selective and operate under milder conditions than traditional chemistry. Directed evolution has been a major driver of industrial biocatalysts, enabling enzymes to work faster, withstand heat, or accept new substrates.
Modern computational design aims to expand the menu: enzymes tailored to reactions that industry cares about, not only the ones biology happened to evolve. Reviews outline how computational approaches can design catalytic sites and scaffolds, although achieving high activity remains challenging and often benefits from subsequent optimisation.
Waste breakdown has become a visible test case. Poly(ethylene terephthalate) (PET) enzyme engineering has accelerated, with recent reviews describing advances in engineering PET-degrading enzymes and the outstanding bottlenecks, including stability and efficiency. This is not a single miracle enzyme that eats landfills. It is a set of incremental improvements that make enzymatic recycling more plausible in defined processes.
In the best industrial scenarios, made-to-order proteins mean cleaner reactions, lower energy inputs, and fewer toxic by-products. In the worst ones, they mean promising papers that do not survive scale-up.
Built in yeast: the quiet workhorse behind the headlines
Design is cheap. Building and testing are where the cost sits.
A great deal of protein engineering relies on expression systems that can manufacture proteins rapidly and reproducibly. Yeast is a major platform because it is easy to grow, genetically tractable, and capable of secreting proteins. A widely used system is Komagataella phaffii (formerly Pichia pastoris), which is common in industrial biotechnology and biopharmaceutical contexts, with ongoing work on improving secretion, folding, and glycosylation.
Yeast has strengths and quirks. It can produce high yields and perform some post-translational modifications, but glycosylation patterns differ from mammalian systems, which can matter for therapeutic proteins. Reviews discuss these trade-offs and note efforts to engineer yeast glycosylation towards more human-like profiles.
This is where the phrase “built in yeast” earns its keep. Yeast is not the glamour. It is the production reality that turns a designed sequence into milligrams of material you can actually test.
How it works: from idea to protein, step by step
- Pick a job: bind this toxin, catalyse this reaction, block this viral surface.
- Specify constraints: target site, stability requirements, size, solubility, and any safety constraints.
- Generate candidates: computational design proposes sequences and structures, often in large numbers.
- Filter and score: predict folding, stability, binding, and potential liabilities, using modelling and heuristics.
- Build: synthesise DNA encoding top candidates and express proteins, often in bacteria or yeast.
- Measure: test binding, activity, stability, and specificity, then solve structures for the best hits when possible.
- Optimise: improve with directed evolution or targeted redesign, and repeat.
- Translate: if it is a drug, assess safety, immunogenicity risk, manufacturability, and regulatory pathway.
The central feature is iteration. The field is learning, systematically, what designs survive contact with biology.
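Rendered as control flow, the loop looks like the sketch below. Every function is a trivial stand-in, random numbers where models and assays would be, and all names are hypothetical; the point is the shape of a campaign: generate many, filter hard in silico, build and measure few, iterate.

```python
# Skeletal design-build-test-learn loop. All stages are invented stand-ins;
# only the control flow is meant seriously.
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def generate_candidates(n: int, length: int = 60) -> list[str]:
    # A generative design model would go here; we sample random sequences.
    return ["".join(random.choices(AMINO_ACIDS, k=length)) for _ in range(n)]

def in_silico_score(seq: str) -> float:
    # Stand-in for predicted folding, binding, and liability checks.
    return random.random()

def wet_lab_assay(seq: str) -> float:
    # Stand-in for expression (for example, in yeast) plus measurement.
    return random.random()

def campaign(cycles: int = 3, designs_per_cycle: int = 10_000, built_per_cycle: int = 50):
    hits = []
    for cycle in range(cycles):
        designs = generate_candidates(designs_per_cycle)
        shortlist = sorted(designs, key=in_silico_score, reverse=True)[:built_per_cycle]
        measured = sorted(((wet_lab_assay(s), s) for s in shortlist), reverse=True)
        hits = measured[:5]  # the best hits feed redesign or directed evolution
        print(f"cycle {cycle + 1}: best assay value {hits[0][0]:.3f}")
    return hits

if __name__ == "__main__":
    campaign()
```

Note the asymmetry the sketch encodes: thousands of designs are scored for every handful that is built, because the expensive step is the wet lab, not the model.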
Risks: dual use, unexpected interactions, immunogenicity
Protein design is a dual-use technology in the plain sense that tools that accelerate beneficial design could also lower barriers to harmful design. Recent reviews on AI-enabled protein design discuss biosecurity concerns and argue for governance and evaluation approaches that take dual-use seriously.
There are also more prosaic risks.
Unexpected interactions: a binder that latches onto the wrong protein, or an enzyme that does unplanned chemistry in a biological environment. These risks are not unique to designed proteins, but novelty increases uncertainty.
Immunogenicity: all therapeutic proteins can trigger immune responses, potentially reducing efficacy or causing harm. Regulators have published guidance for immunogenicity assessment, including risk-based approaches to evaluating and mitigating immune responses.
Design does not remove immunogenicity risk. It can either reduce it, by avoiding known problematic motifs, or increase it, by creating unfamiliar structures that the immune system flags as suspicious. Either way, the risk has to be measured, not assumed.
The bottlenecks: validation, manufacturing, regulation
The field’s bottlenecks are less glamorous than the algorithms.
Validation: Many designs look plausible until they are expressed and purified. Even then, a protein that binds in vitro may fail in vivo. High-quality validation includes orthogonal assays and structural confirmation where feasible. The RFdiffusion work highlights the value of structural confirmation for at least a subset of designs.
Manufacturing: A protein that works at microgram scale may be hard to produce at kilogram scale with consistent quality. Yeast systems such as K. phaffii can help, but process development, folding, secretion, and glycosylation remain non-trivial.
Regulation: For therapeutics, “engineered” does not change the basics. Safety, efficacy, quality, and immunogenicity must be established, and regulators will focus on manufacturing control as much as molecular novelty.
The practical bottleneck is often time. The design loop can be fast, but clinical evidence moves at clinical speed.
What would make the field trustworthy: standards, transparency, oversight
Trustworthiness will not come from more impressive demos. It will come from boring discipline.
- Transparent reporting: clear descriptions of design objectives, training data assumptions where relevant, and what was actually tested, including failures. Calls for reporting guidelines in protein modelling reflect this broader need.
- Benchmarking and reproducibility: continuous, public benchmarks and reproducible pipelines help prevent a field from drifting into untestable claims. Work on benchmark frameworks in macromolecular modelling illustrates how sustained benchmarking can be institutionalised.
- Safety and biosecurity governance: dual-use awareness, evaluation before deployment of high-capability models, and clear escalation paths for risks. Recent scholarship argues for prioritising evaluation of high-consequence dual-use capabilities in biological AI contexts.
- Regulatory alignment on immunogenicity and quality: consistent expectations across agencies and robust risk-based immunogenicity assessment.
The quiet revolution in protein design is real, but it is not magic. Algorithms can draft an enormous number of molecular blueprints. Biology still decides which ones stand up, which ones are safe, and which ones deserve a place outside the lab.