AI and Collective Intelligence


June 3, 2026

Previous discussion summary - from PDB to virtual cells

AlphaFold is a major advance for AI in science, recognized by the 2024 Nobel Prize in Chemistry. It was made possible by the data in the Protein Data Bank (PDB) and was developed by a company, DeepMind.

A recent initiative is a collective history called "From PDB to AlphaFold":

https://www.cellcomm.org/from-pdb-to-alphafold

This history points to data as key to the synergy between AI companies and academia. Appreciating the implications of this potential synergy could lead to a joint effort to encourage the rest of society to increase support for science. It could also lead to support from the AI industry (for example, in the form of open-source software) for transparent, nonprofit AI dedicated to basic science, which would not compete with commercial applications.

As pointed out by Medicine Nobel Prize winner Baruch Blumberg, who was also mentioned by Helen Berman in her interview as an inspiration, the collection of large scientific datasets necessarily starts from broad working hypotheses, and therefore from scientific thinking:

https://www.cellcomm.org/forum/from-pdb-to-alphafold/reflections-from-barry-blumberg

At the end of the history, we mention the statement by Demis Hassabis about the virtual cell as the next big challenge DeepMind is working on. Several foundations and recent scientific reviews also view virtual cells as a major opportunity for AI in science.

It has been suggested that AI simulations of working cells should also include the multicellular scale, where cell-cell interactions determine how cells combine into tissues, organs and organisms (Bunne et al., 2024). It is certainly the case that many biomedically relevant aspects of cell function can only be understood by including the organism level.

The history we have assembled is not only a demonstration of the importance of data for AI but also an example of a method for the sharing of ideas by the history contributors and of discussions about their significance. This was done while preserving all the individual points of view, so that alternative conclusions could also be reached. In the case of datasets useful for major scientific questions, we will need a similar collective reflection to evaluate which data should be collected, and how. The contributors to the history were motivated by the recognition of their peers and by their belief in the utility of the historical analysis.

We now wish to encourage the sharing of a dataset of ideas about the components of virtual cells and organisms (knowledge possessed by the scientific community as a whole, not all of it in written form) and also, in parallel, of ideas on how best to recognize and therefore motivate the contributors.

The platform containing this knowledge, and its discussion, would be most effective if controlled by the entire scientific community rather than by a particular group. We therefore invite current and former leaders of scientific institutions (including departments, foundations, associations, science-based companies) and winners of major science prizes to oversee this initiative. Several have already accepted. Oversight implies sight, and therefore recognition. The oversight group will therefore contribute not only by overseeing the initiative but also by motivating the scientists.

This dataset of scientific ideas can be the foundation of a new form of scientific collective intelligence. It will allow the scientific community to reflect on which experimental datasets should be collected to solve the broadest scientific problems, those that are beyond the capabilities of one or a few labs and do not overlap with the aims of a standard grant. As LLMs have shown, ideas, when shared with language, are also data that can be used by AI. The scientific community can be in control of this process.

We might consider how the scaling of AI systems, which involved more data, more model parameters and more computational resources, led to the emergence of new capabilities. A similar association between scale and capabilities, especially for cortical neuron counts, has been observed in biological brains. We do not know if this scaling analogy will hold for new types of scientific collective intelligence, but it is an exciting prospect.

The next effort on virtual cells and organisms will benefit from the methods developed during the historical research. For example, we found that potential participants are more likely to contribute if they are sent a specific question, based on their work and coming from a group of their peers. This process is partially similar to the requests scientists receive to review papers and grants, but with more explicit recognition of their intellectual contributions.

History participants benefited from the provision of a preliminary timeline, which served as a structural framework for their contributions. For virtual cells and organisms, the most effective starting frameworks are likely to be defined by chemical and morphological spaces. LLMs will be used, in a transparent manner, to assist human-led efforts in finding interdisciplinary connections.

REFERENCES

Bunne, Charlotte, Yusuf Roohani, Yanay Rosen, Ankit Gupta, Xikun Zhang, Marcel Roed, Theo Alexandrov et al. "How to build the virtual cell with artificial intelligence: Priorities and opportunities." Cell 187, no. 25 (2024): 7045-7063.

Callaway, Ewen. "Can AI build a virtual cell? Scientists race to model life’s smallest unit." Nature 643, no. 8070 (2025): 13-14.

Chan Zuckerberg Initiative. "Virtual cells." https://chanzuckerberg.com/science/technology/virtual-cells/

Jacob, Margaret C. "Scientific culture and the making of the industrial West." Oxford University Press (1997).

Mokyr, Joel. "A culture of growth: The origins of the modern economy." Princeton University Press (2016).

Members

Giovanni Paternostro