Artificial intelligence can revolutionise science

silviot · on Sept 15, 2023

gustavus · on Sept 15, 2023

Here's the heart of how they say it can be done

> Two areas in particular look promising. The first is “literature-based discovery” (LBD), which involves analysing existing scientific literature, using ChatGPT-style language analysis, to look for new hypotheses, connections or ideas that humans may have missed. LBD is showing promise in identifying new experiments to try—and even suggesting potential research collaborators. This could stimulate interdisciplinary work and foster innovation at the boundaries between fields. LBD systems can also identify “blind spots” in a given field, and even predict future discoveries and who will make them.

I have to question to the value of looking over existing papers given the current replication crises, is using an LLM to review existing papers just going to enable us to do more bad science faster? Does that do anything for us?

> The second area is “robot scientists”, also known as “self-driving labs”. These are robotic systems that use AI to form new hypotheses, based on analysis of existing data and literature, and then test those hypotheses by performing hundreds or thousands of experiments, in fields including systems biology and materials science. Unlike human scientists, robots are less attached to previous results, less driven by bias—and, crucially, easy to replicate. They could scale up experimental research, develop unexpected theories and explore avenues that human investigators might not have considered.

This sounds basically of just asking a GPT to suggest ideas for experiments?

Honestly based on what I understand of science right now the biggest way generative AI could help advanced things is by assisting in the writing of grant applications faster.

EDIT: Relevant XKCD https://xkcd.com/2341/

vharuck · on Sept 15, 2023

>Honestly based on what I understand of science right now the biggest way generative AI could help advanced things is by assisting in the writing of grant applications faster.

This matches what I was told earlier this year when looking into what LLMs could do. I work in public health, and asked staff from a large registry what they'd like to me try. I expected something like generating reports from aggregate data (with some of that automatic exploration mentioned in the article). What they really wanted was:

1. A nice chat bot to answer questions from data submitters about reporting policies.

2. A tool to ingest federal grant announcements and filter down to those the registry could apply for.

#1 is useful because that's a lot of their job. Getting people to send accurate data in the correct format on time is exhausting. #2 helps with everything, because it could mean hiring an additional staff member. That extra person can write reports and apply for grants, clean data, answer calls, and cover when somebody's out sick. Humans are still really useful.

pocw · on Sept 15, 2023

I've seen a scientific researcher do literature review manually. Search, refine, print, read, highlight, collate.

Lots of time can be saved by automating those steps (and many researchers don't enjoy it so their job satisfaction could be increased). Also the resulting output could be improved if the researcher had a well structured summary to use as the foundation of their outline.

Improve search with semantic search (search by concept not keyword) Improve refinement by preprocessing and summarizing Don't print, display clean and concise data. Summarize, cite and display.

https://studyrecon.ai

This stops short of literature based discovery, you have to bring your own research question.

We've also had some luck finding a gap in existing research. We did a POC where we scraped pubmed and graphed study results by concept. We then used the graphed concepts to explore the conceptual space.

It seems that vitamin D protects against cancer and heart disease. It seems that vitamin D supplementation protects against cancer but not heart disease. Is this because of some previously unknown effect of sun exposure (the primary natural source of vitamin D) or is it just that people with adequate vitamin D go outside a lot more and therefore also get more exercise? Don't know, would love to read the paper if someone studies it ;- )

ben_w · on Sept 15, 2023

> I have to question to the value of looking over existing papers given the current replication crises, is using an LLM to review existing papers just going to enable us to do more bad science faster? Does that do anything for us?

I think the problems with replication are a separate axis to the problem that there is too much being published for anyone to actually read.

AI in general — never mind LLMs, even the much ones running search engines — help with the content overload.

> This sounds basically of just asking a GPT to suggest ideas for experiments?

I think the car analogy here is that if what you've suggesting was google maps, what the article is suggesting is Level 5 autonomy with no steering wheel and an opaque wall instead of a windscreen.

I have absolutely no idea how hard such a level of automation might be to actually implement, not least because of Moravec's paradox: https://en.wikipedia.org/wiki/Moravec's_paradox

jdeaton · on Sept 15, 2023

> This sounds basically of just asking a GPT to suggest ideas for experiments?

I think actually the second idea extends quite a bit further than that. The idea is that you enable the agent to interact with an entire collection of laboratory equipment through tool-use. The agent not only generates hypotheses and designs experiments to test them, but then also actually executes the experiments through tool-use and iterates.

m3kw9 · on Sept 15, 2023

Game developers should be hired to make an UI that allows scientists to accept “quests” and will show the public who accepted what and is collaborating with who on which problems with rankings. The reason to use game is to make it more engaging for everyone and also an incentive to go thru with it when in public

gustavus · on Sept 15, 2023

I'm now getting an idea for a dystopian cyberpunk novel about a world in which everyone is addicted to an awesome game that they think they are obsessed with that provides gift cards and credits that allow you to purchase real things, but it turns out it's all actually secretly controlled by a massive shadow government/corporation and everything everyone thinks they are doing in the "game" is actually having massive real world impacts.... Hey hollywood if your writers are still on strike HMU.

samr71 · on Sept 15, 2023

Just formalize it and make it a corporation. Perhaps companies should be designed explicitly as games these days.

pixl97 · on Sept 15, 2023

Enders Capitalism?

RandomLensman · on Sept 15, 2023

Both things sound like massive false positive generators when used at scale. Also, where do those robots come from that can do all these experiments beyond "trivial" setups? The "robot scientists" sound very sci-fi outside of standardized high volume automation.

randcraw · on Sept 15, 2023

Yeah, I've found that generating (scientific) hypotheses is easy. But generating more than combinatoric variants of existing hypotheses is hard, and harder still is to propose an insightful interpretation of results or a creative model for a mechanism of action that better explains the observed outcomes.

I see nothing in today's 'deep AI' that addresses these desiderata. And I don't see the current AI strategy of accumulating only existing knowledge as the means to those ends either. Optimization of learning can only learn more facts or do it faster, not think with more creativity or innovation. Memory is but a small part of genius.

eli_gottlieb · on Sept 15, 2023

>The second area is “robot scientists”, also known as “self-driving labs”. These are robotic systems that use AI to form new hypotheses, based on analysis of existing data and literature, and then test those hypotheses by performing hundreds or thousands of experiments, in fields including systems biology and materials science. Unlike human scientists, robots are less attached to previous results, less driven by bias—and, crucially, easy to replicate.

This is half a real insight and half total bullshit. Honestly, what we need robotics and AI for in laboratory science is just old-fashioned standardization and labor-saving.

c7b · on Sept 15, 2023

> I have to question to the value of looking over existing papers given the current replication crises, is using an LLM to review existing papers just going to enable us to do more bad science faster?

It might be worse than that, not only might ChatGPT uncritically accept authors' claims or reduce the effort to produce low-quality research - given the current state of the art, there seems to be no guarantee that LLMs won't claim things about a paper that was never even said.

amelius · on Sept 15, 2023

> Honestly based on what I understand of science right now the biggest way generative AI could help advanced things is by assisting in the writing of grant applications faster.

Yeah, and then the funding agencies summarize the applications by running them through an LLM.

jwuphysics · on Sept 15, 2023

> The idea that AI might transform scientific practice is therefore feasible. But the main barrier is sociological: it can happen only if human scientists are willing and able to use such tools.

As a tenure-track scientist who works in ML applications for astrophysics, I disagree with this sentiment. The main issue isn't that enough scientists are using tools to search through literature or form new hypotheses, the main issue is that scientists now have to validate and sift through AI-generated outputs in order to find useful signals, rather than validate and sift through experimentally derived or observed signals.

AI can be useful for hypothesis generation in my field [0], and I think that there are lots of great use cases where it can be used to summarize information. However, it always comes with the possibility that it might output complete nonsense [1], so scientists who adopt these tools will have to spend some of their time verifying their outputs.

[0] https://arxiv.org/abs/2306.11648

[1] https://web.archive.org/web/20230913230733/https://www.msn.c...

timdellinger · on Sept 15, 2023

As someone who probably puts way too much time into literature reviews, my big hope is that literature reviews at the beginning if a research project will be revolutionized.

Things are re-discovered across adjacent fields of study all the time. There are also a bunch of times I've come across a paper when I was a year into a project, and wished that I'd had it at the beginning of the project.

crucialfelix · on Sept 15, 2023

Check out https://elicit.org/

I think it might be exactly what you are looking for.

folli · on Sept 15, 2023

Not the OP, but thanks for the suggestion! Looks like a cool tool!

c7b · on Sept 15, 2023

Not sure the current generation of confidently-wrong language models will be of overall good use in summarizing literature, as suggested in the article. Sounds like the perfect disaster recipe for the "citogenesis" ((C) xkxcd) of spurious facts. Not that they won't be used like that (they probably already are), but that sounds like an expectable outcome, and a net negative to me. There is enough of a replication crisis with things that are stated in published papers already [0], we don't need another one with things that were never originally stated in any papers.

One idea that I find interesting is to combine LLMs with formal verification and theorem prover tools like Coq, Lean, etc. Any mistakes by the LLM should be detectable by the verification engine. Maybe this could be useful in automating the currently ongoing efforts of 're-proving' the existing body of mathematical knowledge with theorem provers ('ChatGPT, please take this paper and verify the proofs with Lean'). And who knows, maybe one day the machines will produce some interesting mathematics on their own. Would be curious if anyone has links to works in this direction, or blogs/content by mathematicians discussing this.

[0] https://en.wikipedia.org/wiki/Replication_crisis

reallyeli · on Sept 15, 2023

> Luminaries in the field such as Demis Hassabis and Yann LeCun believe that AI can turbocharge scientific progress and lead to a golden age of discovery. Could they be right?... Such claims are worth examining, and may provide a useful counterbalance to fears about large-scale unemployment and killer robots.

But will we maintain control of the above-human-ability, autonomous AI systems these companies are racing to build? This is the AI control problem.

If not, then "AI can automate science" isn't much of a counterpoint or reason to be optimistic -- science may be automated, but not under any human's control and not for any human's benefit. In fact, if we're in this situation, the ability of AI systems to automate science is worse news than otherwise, in the same way that the invention of science by humans was bad (or at best, very mixed) news for the animals of Earth.

Vecr · on Sept 16, 2023

I'm pretty sure Yann LeCun at one point didn't really care if AIs replaced humans, but I think people got through to him that what happens after AIs take over would almost certainly look really boring to a hypothetical intelligent human observer, even assuming the AI system survives into the long term. In the set of possible AIs that kill all humans, I'd suggest that almost all of them are not properly aligned to their own long-term survival either.

soperj · on Sept 15, 2023

> It can identify promising candidates for analysis, such as molecules with particular properties in drug discovery, or materials with the characteristics needed in batteries or solar cells. It can sift through piles of data such as those produced by particle colliders or robotic telescopes, looking for patterns. And AI can model and analyse even more complex systems, such as the folding of proteins and the formation of galaxies. AI tools have been used to identify new antibiotics, reveal the Higgs boson and spot regional accents in wolves, among other things.

Wonder how much AI is actually doing here, and how much it's just paper hype, in the same way that AI companies have been shown to actually just use human resources (until they figured out the AI part ;]), I wonder how much these were just to juice up the paper a little bit.

jonhohle · on Sept 15, 2023

Bias, or lack of bias, in experimental results and hypotheses was an interesting angle I hadn’t considered before, especially if it’s combined with the ability to conduct the experiment mechanically.

Would researchers have less issue publishing papers that contradict sensitive findings in earlier papers if they can pass responsibility back to the robot? (e.g. robot hypothesizes and disproves widely accepted result that years of other research is based and that anyone who challenged in the past has been dismissed as a quack.)

swayvil · on Sept 15, 2023

As far as modeling relatively easy to model phenomena is concerned. I could imagine automating model-creation and the drawing of logical connections between models.

In fact I'd be surprised if a bunch of scientists weren't already doing that. But I'm not sure of the utility of that.

In fact, using something like our new chat-ai, we wouldn't even need to understand the connection.

n3storm · on Sept 15, 2023

Cloud computing, Big Data and AI and ML has been revolutionising science for more than 10 years, but we still publish news about how they can revolutionise science and instead we actually can lolifakes your neighbour thanks to Cloud computing, Big Data, ML and AI.

tgbugs · on Sept 15, 2023

We are still quite far from being able to implement these kinds of things at scale.

For the literature review piece the key problem is that LLMs are exquisitely bad at working with even the simplest kind of scientific evidence: citations [1, 2]. They will get better, but it is not clear that LLMs can deal effectively with the very sparse kind of evidence that appears in the literature. Also, generating hypotheses isn't exactly the rate limiting step, the bigger issue tends to be when you get people with pet projects/hypotheses in positions of power that dictate funding priorities (e.g. the decades long Alzheimer's Aβ disaster).

For automation and instrumentation of labs the vision is on point and there is interest, and active work, if not large amounts of funding, to bring that vision to reality [3, 4, 5]. However, we simply don't have the tooling needed to be able to express the full complexity of experimental protocols in a way that can be verified. Sure you can write a python script to control a robot, but it is exceptionally difficult to extract the scientific meaning from that.

My PhD work was to develop a formal language for scientific protocols, and I'll be continuing to develop it, but there is still a long way to go.

1. https://doi.org/10.7759/cureus.39238 2. https://doi.org/10.1016/j.mcpdig.2023.05.004 3. https://www.youtube.com/watch?v=_gXiVOmaVSo&t=865s 4. https://doi.org/10.1109/JIOT.2020.2995323 5. https://ccc.ucsf.edu/sites/ccc.ucsf.edu/files/Marshall_W_CCC...

pocw · on Sept 15, 2023

It's actually possible to use LLMs to assist literature reviews. We built a product that works smashingly. The key is to keyword extract, use a vector database and do search based generation.

Our key insight is that the process of citation needs to be handled outside the LLM. They're good for text processing and summarization but as you said, the LLM itself is poor at citation.

https://studyrecon.ai

tgbugs · on Sept 15, 2023

> Our key insight is that the process of citation needs to be handled outside the LLM.

I can imagine taking the citation tree and using the LLM to compact the hypotheses, results, etc. for each node in the tree and sticking that in the vector database could get you pretty far.

carapace · on Sept 15, 2023

Schmidhuber literally says that his goal is to "create an automatic scientist and then retire."

hiddencost · on Sept 15, 2023

We can already automate Schmidhuber.

All he does is write fantastical reviews to grind an axe and harass speakers at conferences.

carapace · on Sept 15, 2023

Character assassination is against site guidelines.

rangerelf · on Sept 15, 2023

I wish it would revolutionize spelling.

zvmaz · on Sept 15, 2023

According to Merriam Webster [1], "revolutionise" is the British spelling of "revolutionize" (I did not know).

[1] https://www.merriam-webster.com/dictionary/revolutionise

johnnyworker · on Sept 15, 2023

Same for civilisation and many other words, see for example the leftmost table in the second row here:

https://www.studyenglishtoday.net/british-american-spelling....

jonas21 · on Sept 15, 2023

Nothing a little revolution can't fix.

j7ake · on Sept 15, 2023

Any specific examples? What new understanding do we have of our world because of AI?

TaupeRanger · on Sept 15, 2023

TLDR: Article starts by mentioning a bunch of current/past applications of AI to science, many of which are in the medical field where (the article fails to mention) no AI system has ever been shown to improve the quality or length of human life in a randomized clinical trial. It takes a very long time to then mention just 2 things that could "revolutionize" science: LLM style systems could help you find collaborators or generate hypotheses, or "self-driving labs" could generate new hypotheses on their own, and run experiments with (presumably) minimal human intervention.

But it remains totally unclear whether those things are achievable, and to what extent they can actually do real, useful science, rather than just exist as a novelty. Of course AI "can" revolutionize science. But the proof is in the pudding. Write an article when something has happened, rather than predicting that it will (and being wrong, like every such article written for the past 70 years).

febra_ · on Sept 15, 2023

It's just the Economist. I wouldn't expect more from them.

tegmark · on Sept 15, 2023

these upsides are a sirens song. it wont matter how advanced science is or whatever benefit you can imagine if human society is destroyed. the upsides will be pointless. destruction of human society is inevitable and intrinsic to artificial intelligence.

brutusborn · on Sept 16, 2023

Got an argument for that conjecture?

AI existential risk isn’t “inevitable” in the same way that eg heat death of the universe is.

Vecr · on Sept 16, 2023

I think it's less popular that in was at one point, but lots of people (thousands, not that many people think much on the subject on an absolute level) essentially define the "win condition" for humanity as trillions of humans on hundreds of planets across many star systems, with suffering minimized as much as possible, in some people's conceptions even to the extent of using highly coercive methods against people who would prefer some level of suffering (e.g. effectively being sterilized and unable to have children unless living in an approved minimal suffering environment, in many conceptions ruling out any common lifestyle in any country on earth, including the US and Norway). In many of these people's conceptions, not reaching the "win condition" as soon as reasonably achievable is a moral wrong along the same lines as murdering multiple people would be to a standard person, so ASI (artificial super intelligence) must be developed as soon as possible to solve all the implementation and coordination problems required to reach the "win state". This viewpoint does not lead to the same decisions as standard AI risk mitigation theories, as those have much lower upsides to developing ASI, but share the same catastrophically massive downsides.

tegmark · on Sept 16, 2023

i do. do you have one for the conjecture of your camp? i certainly havent seen it. if you want to hash it out then lets do a twitter space. im not going to waste my time typing it out for you to gloss over. and unlike you, i take this matter seriously.

brutusborn · on Sept 17, 2023

I don’t have a camp, only curiosity and skepticism.

seydor · on Sept 15, 2023

... and take it private. In the end, the role of scientist itself is being put in question

kurthr · on Sept 15, 2023

Just remember you privatize the profits, and socialize the losses!

The costs (drug research and healthcare) should be born by by the public without negotiation, while the profits (drug pricing and patents) should be monopolized.

Syzygies · on Sept 15, 2023

No byline? Oh yeah, The Economist tends to be written by interns.

When my college frets that ChatGPT writes better than our B students, there are various reactions. So I had to see. I signed up for ChatGTP-3.5 and asked it "How can artificial intelligence revolutionise science?"

Huh. It didn't simply plagiarize The Economist. It gave a better answer I found easier to read.