We ourselves are enthusiastic users of AI in our scientific workflows. On a day-to-day basis, it all feels very exciting. But the impact of AI on science as an institution, rather than individual scientists, is a different question that demands a different kind of analysis. Writing this essay required fighting our own intuitions in many cases. If you are a scientist who is similarly excited about using these tools, we urge you to keep this difference in mind. Could AI slow science? Confronting the production-progress paradox The tragedy is that there are many AI-for-science tools that would make a real difference, such as AI for flagging potential errors in scientific code. But the labs are fixated on other things like literature review / "deep research". This is not an actual bottleneck, so it doesn't matter how much faster you make it. Meanwhile the risks of short-circuiting human understanding are enormous. Evals are part of the problem. There are three kinds of questions one can ask about a lit-review tool: Does it save a researcher time and produce results of comparable quality to existing tools? How does the use of the tool impact the researcher’s understanding of the literature compared to traditional search? What will the collective impacts on the community be if the tool were widely adopted? For example, will everyone end up citing the same few papers? Currently, only the first question is considered part of what evaluation means. The latter two are out of scope, and there aren’t even established methods or metrics for such measurement. That means that AI-for-science evaluation is guaranteed to provide a highly incomplete and biased picture of the usefulness of these tools and minimize their potential harms. Ultimately it comes down to the messed-up incentives of the AI-for-science labs, especially the ones in big AI companies. They want flashy “AI discovers X!” headlines so that they can sustain the narrative that AI will solve humanity’s problems, which buys them favorable policy treatment. We are not holding our breath for this to change. But an awareness that something is seriously wrong with the current trajectory is a good first step, and that's the goal of this essay by @sayashk and me.
6,37K