What lies behind AGI: ethical concerns related to LLMs

🔬 Research Summary by Giada Pistilli, Ph.D. Candidate in Philosophy, specializing in Conversational AI Ethics. Sorbonne Université – CNRS.

[Original paper by Giada Pistilli]

Overview: Through the lens of moral philosophy, this paper raises questions about AI systems’ capabilities and goals, the treatment of humans hiding behind them, and the risk of perpetuating a monoculture through the English language.

Introduction

The term “Artificial General Intelligence” (AGI) is a source of confusion, and its meaning is disputed between the marketing and research fields. To answer the question “is a machine capable of thinking?”, the American philosopher John Searle published an article in 1980 in which he argued against what was then called “strong AI”. Searle argues that although an AI system can provide answers in Chinese, it has no actual understanding of the language. In other words, syntax is not a sufficient condition for semantics.

More recently, machine learning engineer Shane Legg has described AGI as “AI systems that aim to be quite general, for example, as general as human intelligence”. This definition seems to be more of a philosophical position than an engineering argument. As the term “Artificial General Intelligence” is interpreted here, it concerns AI systems that are becoming increasingly specialized in precise tasks, specifically in processing natural language. According to Goertzel and Pennachin, by contrast, AGI is a broad, general-purpose AI capability that is not limited to any one task or domain. The idea is to multiply (scale) exponentially the capabilities of a given AI system. However, this idea has philosophical and moral implications that need to be considered.

Key Insights

Natural Language Processing

To address these philosophical implications of AGI, we will focus on a specific branch of AI: Natural Language Processing (NLP). NLP is a field of AI that focuses on human language, and machine learning algorithms are increasingly popular in this field. A well-known example is GPT-3, an autoregressive Large Language Model (LLM) developed by OpenAI that uses deep learning to produce human-like text. According to OpenAI, its API can be applied to virtually any task that involves understanding or generating natural language. Its API web page presents a spectrum of models with different power levels suitable for various tasks.

Ethical concerns

Three particular ethical concerns, among others, can be raised when addressing the problem of AGI related to NLP.

1. Large Language Models (LLMs) have the potential to be very powerful, but their impact is difficult to control and assess because their capabilities are often open-ended. Additionally, there is general confusion about the difference between LLMs and Human-Level AI, which raises questions about how to control and limit their learning. It is difficult to assess and make value judgments about something whose full range of capabilities is still unknown. It will also be challenging to control possible malicious uses, such as spam, fake news, automated bots, and homework cheating.

2. The training of Large Language Models has come under fire for its potential to perpetuate wage inequality and the exploitation of workers. Crowdworkers are often poorly paid and lack benefits or protections, making them vulnerable to exploitation. The famous ImageNet dataset was labeled by workers on Amazon’s Mechanical Turk, a platform that has been criticized for its lack of transparency and its potential to exploit workers. This set of issues reflects the logic of what the French philosopher Éric Sadin calls the “technoeconomy”. According to this logic, the economy drives technical and technological developments, seeking to minimize their costs in order to produce maximum benefits.

3. Two additional ethical problems are directly related to language: the difficulty of controlling the text generated by the model and the lack of diversity in the training data. First, because text generation is a probabilistic calculation, the model can produce different results depending on the input data. This way of operating can lead to the generation of toxic content, even if the input data is not harmful per se. The second problem is that the training data is overwhelmingly in English, which means that other languages are not well represented. This characteristic of language model training can lead to the propagation of a monoculture in which American values predominate.
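The first point in item 3, that text generation is a probabilistic calculation, can be illustrated with a minimal sketch. It assumes a toy three-word vocabulary with made-up scores (the names `TOY_LOGITS`, `softmax`, `sample_token`, and `sample_many` are hypothetical and do not come from the paper or from any real model). It is not how GPT-3 is implemented; it only shows that when the next word is drawn at random from a probability distribution, an unwanted word with a low but nonzero probability will still appear from time to time, even though nothing in the prompt asks for it.

```python
import math
import random

# Hypothetical next-word scores for one fixed prompt (illustrative values only).
TOY_LOGITS = {"helpful": 2.0, "neutral": 1.5, "rude": 0.2}


def softmax(logits, temperature=1.0):
    """Turn raw scores into a probability distribution over words."""
    scaled = {tok: score / temperature for tok, score in logits.items()}
    max_score = max(scaled.values())  # subtract the max for numerical stability
    exps = {tok: math.exp(s - max_score) for tok, s in scaled.items()}
    total = sum(exps.values())
    return {tok: e / total for tok, e in exps.items()}


def sample_token(logits, rng, temperature=1.0):
    """Draw one word at random, weighted by its probability."""
    probs = softmax(logits, temperature)
    tokens = list(probs)
    weights = [probs[t] for t in tokens]
    return rng.choices(tokens, weights=weights, k=1)[0]


def sample_many(logits, n, seed, temperature=1.0):
    """Draw n words for the same prompt, using a seeded generator."""
    rng = random.Random(seed)
    return [sample_token(logits, rng, temperature) for _ in range(n)]


if __name__ == "__main__":
    probs = softmax(TOY_LOGITS)
    print({tok: round(p, 3) for tok, p in probs.items()})
    draws = sample_many(TOY_LOGITS, n=1000, seed=0)
    print({tok: draws.count(tok) for tok in TOY_LOGITS})
```

In this sketch, lowering the `temperature` parameter concentrates probability on the highest-scoring word, which reduces but does not eliminate the chance of drawing the unwanted one.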

Possible solutions

The purpose of AGI is still unclear, although there is general agreement that the automation it promises would be desirable once safeguards have been put in place. However, limits on its capabilities would need to be defined ex ante to ensure that AGI does not become a danger to society, because without a clear purpose it will be difficult to evaluate the technology morally.

Moreover, there is a growing demand for human labor to produce the datasets needed to run Large Language Models. Unfortunately, this work is often done in the poorest parts of the world under exploitative conditions. National and international institutions need to start asking questions in order to provide answers and a clear legislative framework for these new “data labeler-proletarians”.

In addition, regarding the ethical issue related to language, creating a truly “universal” AGI does not look desirable. Rather than creating one monolithic AI system, it may be better to build many smaller systems tailored (fine-tuned) to a specific context. By doing so, AI systems development could avoid the pitfall of creating LLMs biased towards a single culture. Instead, ML practitioners could focus on a greater diversity of values and perspectives relevant to the model’s social context.

Between the lines

These few examples of the ethical concerns raised by Large Language Models and by the idea of developing AGI show how closely technical and philosophical problems are intertwined. We will only be able to solve those problems if engineers and computer scientists start working alongside philosophers and social scientists, because if science describes reality, ethics suggests how reality should be tomorrow.