cellcomm.org

Surveys

Background information about the Artificial Intelligence survey and initial results

In 2020, some of us organized a survey of all NIH grantees about the COVID-19 response. We received more than 4,000 replies, and Nature wrote a piece about it:

https://www.nature.com/articles/d41586-020-01154-6

This is an example of how community-wide surveys can have positive effects in science.

The current survey promotes a discussion about the control of Artificial Intelligence (AI) by the scientific community. The discussion focuses on recent AI developments important for science, including Large Language Models (LLMs) like ChatGPT and Bard and scientific tools like AlphaFold and AlphaMissense. We have already obtained opinions from biomedical scientists and from economists and we plan to extend the discussion to specialists in other scientific disciplines and to historians. This is not a problem that a single individual or even a single scientific field can solve in isolation. Most likely, it will require a form of collective thinking in which opinions are first shared and then revised after reading what specialists in other fields have said.

In addition to the survey we also present here research based on published documents, interviews and perspectives from experts.

Background information

Can we really expect scientists to act as a community?

There are several reasons why this is a realistic prospect:

- We show historical sources and an analysis describing times when scientists in molecular biology were part of a community that freely exchanged ideas. This later changed, in part due to increased competitive concerns. In the case of the topic of this survey, however, there would be no individual advantage in secrecy. In these historical sections we also present descriptions of the beginning of open science at the time of the Scientific Revolution, in the seventeenth century, and of its effect on economic development.

- Another change compared to the early times of molecular biology is the increase in size of the scientific community. There is however an example of a large scientific field, particle physics, that has developed a mechanism for community-wide discussions of ideas, when faced with problems that require concerted efforts.

- Global collaborations in biology are emerging when the problem to be addressed requires it. One of these is the Human Cell Atlas. You can read an interview with Aviv Regev and Sarah Teichmann, the leaders of this initiative, which now involves more than 3,000 scientists from 95 different countries. Another example is the response to the COVID-19 pandemic, described by Alessandro Sette.

- The development of Artificial Intelligence challenges human understanding. Historians and social scientists have found that external threats often make groups more cohesive, strengthening group identity, in this case human identity. We are obtaining opinions from historians. An example is the interview with Carlo Ginzburg.

Why would more transparency be helpful?

We cannot always explain the reasons on which AI statements are based. As discussed in the case of medical applications by Ghassemi et al (1), this is in part due to the nature of the algorithms, and in some cases we can mainly expect validation rather than explanation. Obstacles to understanding, however, are also due to incomplete disclosure by the developers of the data used for training, of the details of the methods and of the validation steps. Burnell et al (2) show how published reports of AI system performance often do not provide sufficient information for a complete evaluation. In a recent article (3) Melanie Mitchell stated that:

"To scientifically evaluate claims of humanlike and even superhuman machine intelligence, we need more transparency on the ways these models are trained, and better experimental methods and benchmarks. Transparency will rely on the development of open-source (rather than closed, commercial) AI models."

How does interacting with humans contribute to the training of AI systems?

An example is "reinforcement learning from human feedback". This method has been mentioned as one of the main factors for the success of ChatGPT (4). It has been developed by scientists at OpenAI with the aim of training AI systems "to do what a given set of humans want them to do." (5) Similar approaches are likely to be used in more advanced systems (6).
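To make the idea more concrete, here is a minimal sketch of the first step of reinforcement learning from human feedback: learning a "reward model" from human comparisons between two answers. It is only an illustration under stated assumptions. The two numeric features, the toy comparison data and the function names are hypothetical, and the linear reward is a stand-in for the neural network reward model described in (5). The later step, in which the language model itself is adjusted to obtain higher rewards, is not shown.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def reward(weights, features):
    # Hypothetical linear reward: a stand-in for a neural network reward model.
    return sum(w * f for w, f in zip(weights, features))

def preference_loss(weights, preferred, rejected):
    # Pairwise loss: small when the human-preferred answer gets the higher reward.
    margin = reward(weights, preferred) - reward(weights, rejected)
    return -math.log(sigmoid(margin))

def train_reward_model(comparisons, n_features, lr=0.5, epochs=200):
    # Plain gradient descent on the pairwise preference loss.
    weights = [0.0] * n_features
    for _ in range(epochs):
        for preferred, rejected in comparisons:
            margin = reward(weights, preferred) - reward(weights, rejected)
            scale = 1.0 - sigmoid(margin)  # size of the correction still needed
            for i in range(n_features):
                weights[i] += lr * scale * (preferred[i] - rejected[i])
    return weights

# Hypothetical human feedback: each pair is (answer chosen, answer rejected),
# with each answer described by two made-up features [helpfulness, rudeness].
comparisons = [
    ([0.9, 0.1], [0.2, 0.8]),
    ([0.7, 0.0], [0.6, 0.9]),
    ([0.8, 0.2], [0.1, 0.3]),
]

weights = train_reward_model(comparisons, n_features=2)
for preferred, rejected in comparisons:
    print(reward(weights, preferred) > reward(weights, rejected))
```

In this toy setting each human choice nudges the reward model toward the answers people preferred, which is one way that interacting with a system can contribute to its training.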

Who should control AI systems?

Different parts of society can play a role in AI control (7). The scientific community can participate in this control and make sure that fundamental scientific knowledge remains a public resource.

The case of drug development shows how private companies can contribute to specific applications but also benefit from openly shared fundamental biological knowledge. AI systems contain scientific knowledge in a form that is not equivalent to that available from articles, books or individual human experts.

You can read the responses to the survey given by two AI systems, ChatGPT and Bard.

----------------

Survey results (July-August 2023) - biomedical scientists

The survey of biomedical scientists received 187 responses. 80% of respondents had previously received an NIH grant (among them a Nobel Prize winner). Most of the other respondents were international scientists who have published papers included in PubMed.

After a short introduction, the survey asked three questions and pointed to this page for those who wanted more background information. Here are the questions and a summary of the responses:

Artificial Intelligence (AI) systems contain a large and increasing amount of fundamental knowledge about human biology. Interacting with AI systems contributes to training them; an example of this is reinforcement learning from human feedback.

Q1. Do you think that it is useful for the biomedical community to discuss the control of AI systems?

96% of respondents answered YES to this fixed-choice question.

Q2. Which are the advantages and disadvantages of promoting the development of AI systems that are more transparent, and where the biomedical community participates in the control?

You can see here all the individual responses to this question.

Scientists provided a wider range of ideas compared to the responses from AI systems mentioned above.

Q3. Should biomedical scientists interact preferentially with AI systems that have these features?

You can see here all the individual replies to this question.

55% of respondents answered YES or gave an equivalent response, 4% answered NO, 29% made a more nuanced comment and 12% did not answer this question.

You can also read a summary of the replies given by biomedical scientists to both free text questions.

------------

Several Editors of scientific journals have shared with us their perspectives about the recent developments of AI in science.

We have also received perspectives from computational biologists and bioengineers.

Demis Hassabis is the CEO of Google DeepMind and he has won the 2023 Lasker Award in Basic Medical Science for his contribution to AlphaFold, an AI system that predicts the three-dimensional structure of proteins from the amino acid sequence. His vision is that large language models (LLMs) could use language to call one or more AI tools, like AlphaFold, after receiving a request from a user. Another AI tool developed by his group, published in Science in September 2023, is AlphaMissense, which predicts the pathogenicity of all possible human single amino acid substitutions. All the components of the AlphaFold AI model were shared openly, but in the case of AlphaMissense the trained weights, a set of parameters essential for running the model, were not shared.

When AlphaFold 3 was published in Nature in May 2024, the code was not provided (8). A server was offered for non-commercial use, but the number and types of queries allowed were limited (8).

These decisions, and similar ones adopted for several LLMs, are examples of the limitations to open science that are emerging in the AI field, even when AI is used for fundamental science.

------------

Survey results (October 2023) - economists

The survey of economists received 33 replies. 62% of the respondents are affiliated with NBER (National Bureau of Economic Research), one of the foremost economic think tanks. These are primarily well-known academics from North American universities. We also received replies from an international group of economists affiliated with universities in the US, Canada, the UK, Norway, Italy, Denmark, India, Japan and New Zealand, and with the World Bank and the IMF.

We asked the same questions that had been answered by biomedical scientists, now directed to scientists in general, and a few additional questions for economists:

Artificial Intelligence (AI) systems contain a large and increasing amount of fundamental knowledge about science. Interacting with AI systems contributes to training them; an example of this is reinforcement learning from human feedback.

General questions for all scientists:

Q1. Do you think that it is useful for the scientific community to discuss the control of AI systems?

Q2. Which are the advantages and disadvantages of promoting the development of AI systems that are more transparent, and where the scientific community participates in the control?

Q3. Should scientists interact preferentially with AI systems that have these features (more transparency and control participation)?

Additional introduction and questions for economists:

Since the beginning of the Scientific Revolution in the seventeenth century, fundamental scientific knowledge of Nature has been a public resource, while applied knowledge of commercial value has either been protected by patents, which require disclosure in exchange for temporary exclusivity, or kept as a secret. Several large AI systems important for both fundamental and applied science, including Large Language Models (LLMs) and other AI tools, are now being developed by a few companies that keep key quantitative parameters as a proprietary secret. These AI systems require large investments.

Q4. How might the rise of these private AI systems affect the flow of ideas? What does this imply about data and scientific integrity in the future? What does this imply about the pace and distribution of economic growth?

Q5. What role should the public sector play in regulating these AI models, to maximize their benefits and minimize their risks?

Q6. What would be the economic consequences if public AI systems, dedicated to fundamental scientific knowledge, would co-exist with private AI efforts, focused on specific applications?

All economist respondents answered YES to Question 1. You can read a summary of the responses and all the individual replies. We used AI (ChatGPT 3.5) to assist in the survey summaries, with final human editing.

REFERENCES

1- Ghassemi M, Oakden-Rayner L, Beam AL. "The false hope of current approaches to explainable artificial intelligence in health care." The Lancet Digital Health. 2021 Nov 1;3(11):e745-50.

2- Burnell R, Schellaert W, Burden J, Ullman TD, Martinez-Plumed F, Tenenbaum JB, Rutar D, Cheke LG, Sohl-Dickstein J, Mitchell M, Kiela D. "Rethink reporting of evaluation results in AI." Science. 2023 Apr 14;380(6641):136-8.

3- Mitchell M. "How do we know how smart AI systems are?". Science. 2023 Jul 13;381(6654):adj5957.

4- Heaven WD. "The inside story of how ChatGPT was built from the people who made it." MIT Technology Review. 2023.
https://www.technologyreview.com/2023/03/03/1069311/inside-story-oral-history-how-chatgpt-built-openai/

5- Ouyang L, Wu J, Jiang X, Almeida D, Wainwright C, Mishkin P, Zhang C, Agarwal S, Slama K, Ray A, Schulman J. "Training language models to follow instructions with human feedback." Advances in Neural Information Processing Systems. 2022 Dec 6;35:27730-44.

6- Thirunavukarasu AJ, Ting DS, Elangovan K, Gutierrez L, Tan TF, Ting DS. "Large language models in medicine." Nature Medicine. 2023 Jul 17:1-1

7- Taddeo M, Floridi L. "How AI can be a force for good." Science. 2018 Aug 24;361(6404):751-2.

8- Callaway E. "Major AlphaFold upgrade offers boost for drug discovery." Nature News. 2024 May 8.
https://www.nature.com/articles/d41586-024-01383-z
