Jake Feala is a cofounder at Lila Sciences, a company unveiled in March 2025 aiming to use AI and autonomous labs to accelerate scientific discovery.
We invited him to contribute to our historical timeline, from PDB to AlphaFold. Specifically, we asked him why the protein folding problem was solved by a VC-backed company, DeepMind, rather than by an academic group, and whether he thinks this achievement provides a general recipe for the future of AI in science.
Thanks for the opportunity to contribute!
Asking why the protein structure prediction problem was solved by a company and not an academic group isn't quite the right question. A better question is why DeepMind solved it and not some other entity, academic or otherwise. At the time, DeepMind was a totally unique company, and it beat not only academia but also the entire biopharma industry, which had plenty of resources and interest in the problem.
After the competition, by far the best take on what had happened, and better than I could provide, was in a blog post from Mohammed AlQuraishi. He poses the question of "why DeepMind" as well, and points out that AlphaFold's success was not only a win over academia, but just as much an indictment of big pharma's inability to innovate.
Some context on my own perspective at the time: we were working on deep learning for proteins at Generate Biomedicines in 2018, just before the CASP competition where AlphaFold was unveiled. We were trying to solve a different problem -- generate protein sequences for a given structure or function, rather than predict structure from sequence -- but we were still watching the field closely. Here are the best answers I heard at the time for why DeepMind won.
Engineers accelerate the researchers
AlQuraishi points out that "...competitively-compensated research engineers with software and computer science expertise are almost entirely absent from academic labs, despite the critical role they play in industrial research labs. Much of AlphaFold’s success likely stems from the team’s ability to scale up model training to large systems, which in many ways is primarily a software engineering challenge."
This is completely true, but it is too generous to "industrial research labs." Lack of engineering investment has been a problem across the industry for my whole career. I've been lucky enough to be part of some computationally well-resourced biotech companies, but they were the exception. Relative to the tech industry, most biopharma companies and nearly all academic research groups are starving for talented software engineers. This has started to change recently, but we still have a lot of catching up to do in culture, compensation, and technological maturity.
Protein folding as a game
Demis Hassabis is a master at picking problems. Recognizing that reinforcement learning (RL), DeepMind's bread and butter, worked best on games at the time, he strategically went after both literal games (chess, Go, video games) and problems that could be "gamified." Keep in mind that there was already essentially a massively multiplayer game for protein folding (Foldit).
Typically a game has a fixed environment with known rules, which an RL algorithm can self-play toward mastery. While that's not exactly the case with protein structure prediction, and there was much more to the solution than RL, there is nevertheless a game-like aspect to the problem. It has a very clear objective where you know when you've won (a "finite game"). It has rules and, while not all of them are known, there is enough prior understanding (symmetries, 3D distances, bond angles, etc.) to get a head start. And as they learned with MuZero, an algorithm can learn a strategy for playing a game while simultaneously learning its rules.
Protein structure, of course, also has lots of data: the PDB provided a vast, clean dataset. While there may be other problems in biology that can be gamified, none had a dataset so perfectly matched to the objective of the game.
Finally, games invite competition, and competitions have winners. Over and over, DeepMind chose AI problems that they could objectively win in a loud, splashy way. The existence of the CASP competition was likely very enticing, if not a key reason they chose to work on this problem.
The machine
Once the perfect game was identified, the specifics of the problem almost didn't matter. DeepMind applied the same formula: recruit incredible talent, compensate them well, supply them with endless resources, and leave them alone to win the game. They could point this machine at any well-posed game and have a great chance of winning.
I know very little about the internal culture of DeepMind except that John Jumper was widely respected and a fantastic pick to lead the group, and that the team applied more compute and more software and data engineering resources to the problem than any other competing group.
I would add that DeepMind is not at all a typical VC-backed venture, especially for its time. During the 2010s, VC-backed software startups mostly followed the "Lean Startup" tradition of customer obsession and finding early product-market fit. DeepMind is pretty much the opposite of that. Their success, and that of OpenAI, SpaceX, etc., is part of why we now see many more heavily funded, long-horizon "moonshot" companies than we did back then. At the time, though, they were completely unique in their huge upfront funding, lack of attention to near-term products or revenue, and grand long-term vision.
The future of AI for science
I have much to say about the future of AI in science that I won't get into here. Suffice it to say that it will obviously be a major driver of progress, but not the whole story. For one thing, I'm skeptical of aspirations to build ground-up simulations of biology, or to train a superintelligent "oracle" that can answer any scientific question. Nature is too complex, and we'll always need deep and constant contact with reality through experimentation.
For this reason, while I highly respect DeepMind, I have doubts about their further aspirations in biology. There may be other problems in the field that can be similarly "gamified" with existing data, but I think the opportunities are limited. In interviews, Demis has hinted at building a "virtual cell," which is a worthy but wildly underspecified challenge, especially for his approach. The datasets in cellular biology are messier and harder to interpret than protein structures, and there are so many potential objective functions to choose from that any single successful solution captures a much narrower slice of the value.
For example, you might build a model that perfectly predicts gene expression profiles from single-cell sequencing data. Great! Extremely cool and useful. Or you might extend AlphaFold to predict the structure of multi-protein complexes and their binding to other molecules such as RNA or metabolites. That would be truly amazing! But while these would be incredible capabilities, both are far from a "virtual cell," which would require dozens or even hundreds of such models spanning all of the metabolic, structural, and information-processing systems of the cell, integrated and trained over every possible context (e.g. cell type, tissue, or organ) and perturbation (e.g. a drug or mechanical stimulus).
Another problem is that there are few competitions to win in these areas, so your model will instead be judged in the less glamorous arena of peer review, with citations as the signal of superiority. Or worse, you'll have to compete for market share as a tool for the struggling drug industry. The incentives start to trail off, both for investors and for talent.
I hope that I'm wrong and genuinely wish them the best, since we're all working toward the same long-term goal of improving human health, but we are pursuing a different approach at Lila. You can read more on our website, but essentially we are integrating AI with automated experimentation in a continuous loop, in which the AI autonomously learns, proposes, and carries out the best next experiment. We believe science is a process that can be accelerated, not a game that can be won.
REFERENCES
- Steve Lohr, "The Quest for A.I. 'Scientific Superintelligence'," The New York Times, March 10, 2025. https://www.nytimes.com/2025/03/10/technology/ai-science-lab-lila.html