🔬 Research Summary by Maria del Rio-Chanona, Nadzeya Laurentsyeva, and Johannes Wachs.
MdRC is a JSMF research fellow at the Complexity Science Hub in Vienna and a visiting scholar at the Harvard Kennedy School.
NL is an Assistant Professor at the Faculty of Economics, LMU Munich, working at the Chair of Organizational Economics.
JW is an associate professor at Corvinus University of Budapest and a senior research fellow at the Hungarian Centre for Economic and Regional Studies.
[Original paper by R. Maria del Rio-Chanona, Nadzeya Laurentsyeva, and Johannes Wachs]
Overview: Large language models can substitute for the public knowledge sharing that takes place in online communities. This paper finds that the release of ChatGPT led to a significant decrease in content creation on Stack Overflow, the largest question-and-answer (Q&A) community for computer programming. We argue that this increasingly displaced content is an important public good that provides essential information for learners, both human and artificial.
Introduction
Have you ever wondered how the rise of AI language models like ChatGPT might change how we share and access information online? In our recent study, we measured the impact of ChatGPT on Stack Overflow. On this popular online platform, computer programmers ask and answer questions, forming a library of content that anyone with an internet connection can learn from.
To investigate the consequences of AI adoption for digital public goods, we analyzed activity on Stack Overflow before and after the release of ChatGPT. We compared this with activity on Mathematics Stack Exchange and MathOverflow, two Stack Exchange platforms whose questions ChatGPT is less able to answer, as well as with the Russian- and Chinese-language versions of Stack Overflow, whose users found ChatGPT harder to access.
We observe a 16% decrease in activity on Stack Overflow relative to the less affected platforms following the release of ChatGPT. The effect’s magnitude increases over time, reaching 25% by the end of May 2023. While we do not find major differences in the quality of displaced content, we note significant heterogeneity between programming languages, with more popular ones being more strongly affected. This suggests that as more people turn to AI for answers, less knowledge is publicly shared.
Key Insights
The Impact of AI on Digital Public Goods
Large language models (LLMs) like ChatGPT can provide users with information on various topics, making them a convenient alternative to traditional web searches or online Q&A communities. But what happens to the wealth of human-generated data on the web when more people start turning to AI for answers?
Our research focused on this question, investigating the potential impact of AI language models on digital public goods. Digital public goods, in this context, refer to the vast library of human-generated data and knowledge freely available on the web. The information shared on platforms like Stack Overflow, Wikipedia, or Reddit serves as a crucial resource for learning, problem-solving, and even training future AI models.
The Case of Stack Overflow
We focused on Stack Overflow because it is a rich source of human-generated data, with tens of millions of posts since its launch in 2008, covering a wide range of programming languages and topics. Moreover, LLMs are, in general, relatively good at coding. We analyzed posting activity on Stack Overflow before and after the release of ChatGPT, comparing the changes on Stack Overflow against similar platforms where ChatGPT was less likely to make an impact. Specifically, we considered the Russian- and Chinese-language counterparts to Stack Overflow, because ChatGPT is not available in Russia or China, and question-and-answer communities focused on advanced mathematics, with which ChatGPT cannot (yet) provide much help.
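To make this design concrete, here is a minimal sketch of one way such a difference-in-differences comparison of weekly posting activity could be set up; the input file, column names, and cutoff date are illustrative assumptions, not our actual data pipeline.

```python
# A minimal sketch of the comparison, assuming a table of posts with a
# platform name and creation timestamp (file and column names are hypothetical).
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

posts = pd.read_csv("posts.csv", parse_dates=["creation_date"])  # hypothetical input
CHATGPT_RELEASE = pd.Timestamp("2022-11-30")

# Aggregate to weekly post counts per platform.
weekly = (
    posts.assign(week=posts["creation_date"].dt.to_period("W").dt.start_time)
    .groupby(["platform", "week"])
    .size()
    .reset_index(name="n_posts")
)

# Treated platform: Stack Overflow; controls: the math Q&A sites and the
# Russian- and Chinese-language counterparts.
weekly["treated"] = (weekly["platform"] == "stackoverflow").astype(int)
weekly["post_release"] = (weekly["week"] >= CHATGPT_RELEASE).astype(int)
weekly["log_posts"] = np.log(weekly["n_posts"])

# Difference-in-differences on log weekly posts: the interaction coefficient
# captures the relative change on Stack Overflow after ChatGPT's release.
model = smf.ols("log_posts ~ treated * post_release", data=weekly).fit()
print(model.summary())
```

Because the outcome is in logs, the interaction coefficient can be read, approximately, as a percentage change on Stack Overflow relative to the control platforms.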
Findings: A Decrease in Activity
Our analysis revealed a significant decrease in activity on Stack Overflow following the release of ChatGPT compared to the control platforms. We estimate a 16% relative decrease in weekly posts since the release of ChatGPT, with the effect’s magnitude reaching a 25% decrease by June 2023. Interestingly, the decrease in activity was not limited to duplicate or low-quality content. We found that posts made after ChatGPT’s release received similar positive and negative voting scores to those made before, indicating that high-quality content was also being displaced.
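As a rough illustration of this quality check, the sketch below compares the vote scores of posts created before and after the release date; the file and column names are again assumptions for illustration, not our actual code.

```python
# Illustrative sketch: compare vote scores of posts created before vs. after
# ChatGPT's release (file and column names are assumed, not the paper's).
import pandas as pd
from scipy import stats

posts = pd.read_csv("stackoverflow_posts.csv", parse_dates=["creation_date"])  # hypothetical input
CHATGPT_RELEASE = pd.Timestamp("2022-11-30")

before = posts.loc[posts["creation_date"] < CHATGPT_RELEASE, "score"]
after = posts.loc[posts["creation_date"] >= CHATGPT_RELEASE, "score"]

print(f"mean score before: {before.mean():.2f}, after: {after.mean():.2f}")

# One simple way to check whether the score distributions differ; similar
# distributions suggest the displaced posts were not merely low-quality ones.
print(stats.mannwhitneyu(before, after, alternative="two-sided"))
```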
Furthermore, the impact of ChatGPT varied across programming languages: ChatGPT is a better substitute for Stack Overflow for languages with more available training data. Accordingly, posting activity for popular languages, such as Python and JavaScript, decreased significantly more than the site-wide average.
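To illustrate the kind of per-language comparison involved, a simple sketch might compute the relative change in average weekly posts for each language tag; the input, the single-tag assumption, and the column names are illustrative, not our actual code.

```python
# Illustrative sketch: relative change in average weekly posts per language
# tag before vs. after ChatGPT's release (assumes one primary tag per post).
import pandas as pd

posts = pd.read_csv("stackoverflow_posts.csv", parse_dates=["creation_date"])  # hypothetical input
CHATGPT_RELEASE = pd.Timestamp("2022-11-30")

posts["period"] = (posts["creation_date"] >= CHATGPT_RELEASE).map(
    {False: "before", True: "after"}
)

# Number of distinct weeks observed in each period, to normalize post counts.
weeks = posts.groupby("period")["creation_date"].apply(
    lambda s: s.dt.to_period("W").nunique()
)

# Average weekly posts per tag in each period, then the relative change.
counts = posts.groupby(["tag", "period"]).size().unstack(fill_value=0)
weekly_rate = counts.div(weeks, axis=1)
weekly_rate["relative_change"] = weekly_rate["after"] / weekly_rate["before"] - 1

print(weekly_rate.sort_values("relative_change").head(10))
```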
Implications: A Shift in Information Exchange
Why is this displacement important? In our paper, we discuss four implications for the field of artificial intelligence. First, if language models crowd out open data, they limit their own future training data. A growing body of literature suggests that LLMs cannot learn effectively from the content they themselves generate. In this way, successful LLMs may ironically be cutting off their future training sources.
Second, current leaders like OpenAI are accumulating a significant advantage over competitors: their models can keep learning from user inputs and feedback even as they drain the pool of open data. Third, the shift from a public provision of information on the web to a private one may have significant economic consequences, for instance, by amplifying inequalities or limiting the ability of people and firms to signal their abilities.
Finally, while AI models like ChatGPT offer efficiency and convenience, we know that centralized information sources have drawbacks. For instance, when the web and search engines made it easier to search the scientific literature, researchers began citing more recent papers drawn from a narrower set of journals. Such a narrowing of our collective focus and attention, even if it moves towards relatively high-quality information, limits the diversity of signals we are exposed to and may lead to suboptimal conformity. More generally, it is unclear how LLMs will help us deal with new problems as the world changes.
Between the lines
Our findings present a mixed picture of the interplay between AI and human-generated digital content. While language models like ChatGPT offer undeniable benefits in terms of efficiency and convenience, our research suggests that their widespread adoption could have unintended consequences for the richness and diversity of our shared digital knowledge base. More work is needed to tease out heterogeneity in the impact of LLMs on digital public goods and how different platforms and communities are affected.
But the big open question is what we should do about this. Can we better incentivize or give credit for contributions to digital public goods? Can we empower the people who create the data that platforms, firms, and models build on to capture some of that value themselves? It seems to us that for the sake of an open and intellectually diverse web, we must address these questions.