AI and Collective Intelligence

​

AI in Science: From PDB (the Protein Data Bank) to virtual cells and organisms

​

We are currently discussing the following message to foundations about virtual cells and organisms. The most useful virtual organism is clearly a digital human model. Building it is a worthwhile task that will benefit all of us, but also a challenge that will require the effort of a large part of the biomedical scientific community.

Message draft:

Subject: A non-financial role for [Foundation Name] in advancing virtual cells and organisms.

Dear [Foundation Name],

We know from your support for [example] that you care about research related to virtual cells and organisms. A core constraint today is data, and progress hinges on organizing knowledge. Much of biology cannot be compressed into simple equations; it lives in detailed, distributed expertise, often held by small groups. If captured and structured by AI and human scientists working together, that expertise becomes data. As LLMs demonstrate, ideas are data, and they can help define what is known, what is uncertain, and what to measure next.

Our request is straightforward and non-financial: please agree to speak with the leaders of the scientific institutions whose researchers contribute the most compelling ideas to a shared, community-run discussion about the components of digital cells and organisms. Institutional leaders have told us they welcome foundation engagement and can ensure proper recognition for their scientists. The discussion platform will be co-governed by participating institutions and foundations, ensuring transparency and scientific stewardship.

This is an opportunity to help design a responsible path for the field, position your foundation for constructive leadership, and accelerate an outcome with global benefit.

As a precedent, consider the community pathway from the Protein Data Bank to AlphaFold, an advance enabled by shared datasets and collective norms: https://www.cellcomm.org/from-pdb-to-alphafold

If you are interested, please let us know. If not, could you share your main concerns? We will circulate them to the hundreds of scientists, economists, and historians engaged at cellcomm.org and report back with proposed solutions.

 

A diagram shows the positive feedback loop generated by the message:

[Diagram: loop diagram.png]

The diagram shows a high-level architecture. The sub-problems that compose it will be defined by the participants.

--------------------------------------

Summary of previous discussions about AI in Science: From PDB to virtual cells.

AlphaFold is a major advance for AI in science, recognized by the 2024 Nobel Prize in Chemistry. It was developed by a company, DeepMind, and made possible by the data of the Protein Data Bank (PDB).

A recent initiative is a collective history called "From PDB to AlphaFold":

https://www.cellcomm.org/from-pdb-to-alphafold

This history points to data as key to the synergy between AI companies and academia. Appreciating the implications of this potential synergy could lead to a joint effort to encourage the rest of society to increase support for science. It could also lead to support from the AI industry (for example, in the form of open-source software) for transparent, nonprofit AI dedicated to basic science, which would not compete with commercial applications.

As pointed out by Baruch Blumberg, a Nobel laureate in Physiology or Medicine whom Helen Berman also mentioned in her interview as an inspiration, the collection of large scientific datasets necessarily starts from broad working hypotheses, and therefore from scientific thinking:

https://www.cellcomm.org/forum/from-pdb-to-alphafold/reflections-from-barry-blumberg

At the end of the history, we mention the statement by Demis Hassabis that the virtual cell is the next big challenge DeepMind is working on. Several foundations and recent scientific reviews also view virtual cells as a major opportunity for AI in science.

It has been suggested that AI simulations of working cells should also include the multicellular scale, where cell-cell interactions determine how cells combine into tissues, organs and organisms (Bunne et al., 2024). Many biomedically relevant aspects of cell function can be understood only when the organism level is included.

The history we have assembled is not only a demonstration of the importance of data for AI but also an example of a method by which the history contributors shared ideas and discussed their significance. This was done while preserving all the individual points of view, so that alternative conclusions could also be reached. For datasets useful for major scientific questions, we will need a similar collective reflection to evaluate which data should be collected, and how. The contributors to the history were motivated by the recognition of their peers and by their belief in the utility of the historical analysis.

We now wish to encourage the sharing of a dataset of ideas about the components of virtual cells and organisms (knowledge possessed by the scientific community as a whole, not all of it in written form) and also, in parallel, of ideas on how best to recognize and therefore motivate the contributors.

The platform containing this knowledge, and its discussion, would be most effective if controlled by the entire scientific community rather than by a particular group. We therefore invite current and former leaders of scientific institutions (including departments, foundations, associations, science-based companies) and winners of major science prizes to oversee this initiative. Several have already accepted. Oversight implies sight, and therefore recognition. The oversight group will therefore not only oversee the initiative but also help motivate the scientists.

This dataset of scientific ideas can be the foundation of a new form of scientific collective intelligence. It will allow the scientific community to reflect on which experimental datasets should be collected to solve the broadest scientific problems, those that are beyond the capabilities of one or a few labs and do not overlap with the aims of a standard grant. As LLMs have shown, ideas are also data that can be used by AI. The scientific community can be in control of this process.

We might consider how the scaling of AI systems, which involved more data, more model parameters and more computational resources, led to the emergence of new capabilities. A similar association between scale and capability, especially with cortical neuron counts, has been observed in biological brains. We do not know if this scaling analogy will hold for new types of scientific collective intelligence, but it is an exciting prospect.

The next effort, on virtual cells and organisms, will benefit from the methods developed during the historical research. For example, we found that potential participants are more likely to contribute if they are sent a specific question, based on their work and coming from a group of their peers. This process is partly similar to the requests scientists receive to review papers and grants, but with more explicit recognition of the intellectual contributions.

History participants benefited from the provision of a preliminary timeline, which served as a structural framework for their contributions. For virtual cells and organisms, the most effective starting frameworks are likely to be defined by chemical and morphological spaces. LLMs will be used, in a transparent manner, to assist human-led efforts in finding interdisciplinary connections.

 

REFERENCES

Bunne, Charlotte, Yusuf Roohani, Yanay Rosen, Ankit Gupta, Xikun Zhang, Marcel Roed, Theo Alexandrov et al. "How to build the virtual cell with artificial intelligence: Priorities and opportunities." Cell 187, no. 25 (2024): 7045-7063.

 

Callaway, Ewen. "Can AI build a virtual cell? Scientists race to model life’s smallest unit." Nature 643, no. 8070 (2025): 13-14.

 

Chan Zuckerberg Initiative. "Virtual cells." https://chanzuckerberg.com/science/technology/virtual-cells/
