Communicative AI

Name: Communicative AI | A Critical Introduction to Large Language Models
Brand: Wiley
Price: 17.99 EUR
Availability: OnlineOnly

A Critical Introduction to Large Language Models

Mark Coeckelbergh, David J. Gunkel (Authors)

Wiley (Publisher)

1st Edition

Published on 17 April 2025

189 pages

E-Book

ePUB with Adobe-DRM

978-1-5095-6761-4 (ISBN)

€17.99 incl. 7% VAT

E-Book Single Licence

Available for download

Content

Foreword

Introduction
1. LLM 101
2. Ethical, Legal and Societal Challenges
3. Intelligence, Consciousness and the Problem of Other Minds
4. Language, Meaning and Communication
5. Authorship and Authority
6. Truth, Lies and Hallucinations
7. Does Writing Have a Future?

References

1
LLM 101

The large language model (LLM) is a recent innovation in natural language processing (NLP) that employs transformer architectures pre-trained on massive amounts of digital text scraped from the internet. As a result, applications such as OpenAI's generative pre-trained transformer (GPT) series, Google's LaMDA (Language Model for Dialogue Applications), Google's Bidirectional Encoder Representations from Transformers (BERT), and Anthropic's Claude can generate original text content that is in many cases indistinguishable from human-written material.

This chapter aims to demystify the technology of LLMs by (1) situating LLMs within the larger context of NLP artificial intelligence (AI), (2) providing a high-level explanation of the technical operations and features of LLM applications, and (3) identifying and explaining some of the important technical challenges of LLM AI. In effect, this chapter pops the hood on the black box, in an effort to provide those with little or no background in the subject with the basic knowledge they need to make sense of and engage with the critical analyses of communicative AI and LLMs that follow in subsequent chapters.

Natural Language Processing

Creating machines that can talk or communicate with human users in and by employing what is called "natural language" has not only been prototyped in decades of science fiction; it has also been one of the principal objectives of the science and engineering practice of AI from the very beginning. It was, for instance, the first item on the list of proposed tasks to be addressed and accomplished during the Dartmouth summer seminar of 1956 - the pivotal event that first gave us the term "artificial intelligence." It was the defining condition and test case for "machine intelligence" in Alan Turing's agenda-setting paper from 1950. And it was implemented and demonstrated in some of the earliest applications of AI technology, for instance Joseph Weizenbaum's ELIZA chatbot program and Terry Winograd's SHRDLU. For this reason, working with, processing, and reproducing natural human language content is not one application among others; it is the definitive application of AI.

But computers, which process numeric data, do not understand language, at least not in the sense in which we understand the understanding of language. Consequently, developing algorithms that can work with and simulate the understanding of natural language content needs to proceed in a manner that is radically different from the ways in which we deploy and make sense of language. And the key linguistic insight that makes all this possible is the fact that human languages are probabilistic systems.

Saying that natural human languages are probabilistic simply means that, for any language, some sequences of characters that make up words and some sequences of words that make up phrases are more likely to occur than other sequences. In English, for example, the characters that constitute the word "toaster" are more likely to occur in the sequence that produces this word than in any other sequence - say, "rttaore." Likewise, a word sequence like "the book is on the table" is more likely to occur and be used by English speakers than any other sequence of the same words, say "on is the table book the."
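The idea that some word sequences are more likely than others can be made concrete with a small sketch. It is an illustration only, not a description of how any LLM works: it assumes a tiny invented corpus (`CORPUS`), counts adjacent word pairs, and scores a sentence by multiplying add-one-smoothed pair probabilities. The function names are hypothetical.

```python
from collections import Counter

# Toy corpus, invented for this sketch (an assumption, not from the book).
CORPUS = [
    "the book is on the table",
    "the cup is on the table",
    "the book is on the shelf",
    "the cat is on the chair",
]


def train_bigrams(sentences):
    """Count single words and adjacent word pairs, using a start marker."""
    unigrams, bigrams = Counter(), Counter()
    for sentence in sentences:
        words = ["<s>"] + sentence.split()
        unigrams.update(words)
        bigrams.update(zip(words, words[1:]))
    return unigrams, bigrams


def sequence_probability(sentence, unigrams, bigrams):
    """Multiply add-one-smoothed pair probabilities along the sentence."""
    vocab_size = len(unigrams)
    words = ["<s>"] + sentence.split()
    prob = 1.0
    for prev, curr in zip(words, words[1:]):
        prob *= (bigrams[(prev, curr)] + 1) / (unigrams[prev] + vocab_size)
    return prob


unigrams, bigrams = train_bigrams(CORPUS)
likely = sequence_probability("the book is on the table", unigrams, bigrams)
unlikely = sequence_probability("on is the table book the", unigrams, bigrams)
print(likely > unlikely)  # True for this toy corpus
```

The scrambled sequence uses the same six words, yet almost none of its adjacent pairs occur in the corpus, so its score is far lower.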

For this reason, producing legible texts can be accomplished through a process of selecting and arranging the right sequences of words, specifically those sequences that have a high probability of actually occurring in the language. Consequently, if we are given a particular set of words, it is possible to generate sequences that would be considered intelligible and meaningful - such as "this sentence is meaningful" or "is this sentence meaningful?" - and a number, typically greater, of sequences that would be considered incorrect or nonsense - "sentence this meaningful is," "meaningful this is sentence," "this meaningful is sentence," and so on. This means that it is entirely possible to generate valid word sequences at random - that is, without the generating system needing to "know" or "understand" anything about what is being produced. This insight can be illustrated with the infinite monkey theorem, which states that an infinite number of monkeys operating an infinite number of typewriters for an infinite period of time will eventually produce all the great works of literature - for example William Shakespeare's Hamlet. The theorem is theoretically true - meaning that the probability is not zero - but practically impossible, given the sheer magnitude of the task.
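The arithmetic behind this paragraph can be sketched briefly. The first part enumerates every ordering of the four example words and sets aside the two the text names as intelligible (ignoring punctuation). The second part assumes, purely for illustration, a 26-key typewriter with independent, uniformly random keystrokes and computes the chance of typing one short word; `chance_of_typing` is a hypothetical helper.

```python
from fractions import Fraction
from itertools import permutations

WORDS = ["this", "sentence", "is", "meaningful"]

# The two orderings the text names as intelligible (punctuation ignored).
INTELLIGIBLE = {"this sentence is meaningful", "is this sentence meaningful"}

orderings = [" ".join(p) for p in permutations(WORDS)]
nonsense = [o for o in orderings if o not in INTELLIGIBLE]
print(len(orderings), len(nonsense))  # 24 orderings, 22 outside the set


def chance_of_typing(text, keys=26):
    """Probability that len(text) uniform random keystrokes spell text."""
    return Fraction(1, keys) ** len(text)


print(chance_of_typing("hamlet"))  # 1/308915776
```

Even a six-letter word has roughly a one-in-309-million chance per attempt under these assumptions, which suggests why the full play is "theoretically true" but practically out of reach.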

To gain some control over the problem represented by this theorem, one can use labeled data and prewritten assembly rules or templates. We could, for example, assemble a database of words where each word is categorized (or labeled) as a noun, verb, preposition, article, and so on. Then we could write an algorithm - a set of instructions - that selects a word from each category at random and arranges these randomly selected elements according to some predefined assembly rule (see Table 1.1).

This approach to language generation employs a standard method in AI development called "symbolic reasoning." In this case, the method by which the words or linguistic tokens come to be arranged is something that is predefined by a human programmer and then applied to the labeled data. This way of proceeding - which is also called "good old-fashioned AI" (GOFAI), since it was the method initially employed in the first several decades of AI research and development - is obviously a more effective manner of producing legible content, but is not necessarily the best when it comes to automatically generating actually useable content such as stories about a sporting event, weather and financial reports, or personalized correspondence.

For these kinds of applications we can go one step further and combine labeled data with prefabricated templates. You probably already have some familiarity with this approach, because it has been around for a number of decades, in the form of the form letter. Consider a situation where you have a number of individuals contributing money to charity, and you want to send each one a personalized thank-you message. You could obviously write each letter individually, or you could write a basic letter template with blanks or open slots that can be filled with specific pieces of information stored on a spreadsheet (Table 1.2).

Database:

Noun     Verb      Article   Preposition
man      talked    the       to
woman    walked    a         with
dog      thought             about
city                         in
robot

Assembly Rule:

Article + Noun + Verb + Preposition + Article + Noun

Example Results:

The woman talked about a man.
The city thought to a dog.
The woman walked with a robot.
The woman thought to a man.
The city talked with a robot.

Table 1.1 Random sentence generation using labeled data and a predefined assembly rule. From Gunkel 2020, 175.
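The procedure in Table 1.1 can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the word lists and the assembly rule are read from the table, while the names `DATABASE`, `ASSEMBLY_RULE`, and `generate_sentence` are hypothetical.

```python
import random

# Labeled data: each word is filed under its category, as in Table 1.1.
DATABASE = {
    "Noun": ["man", "woman", "dog", "city", "robot"],
    "Verb": ["talked", "walked", "thought"],
    "Article": ["the", "a"],
    "Preposition": ["to", "with", "about", "in"],
}

# Predefined assembly rule: the order in which categories are filled.
ASSEMBLY_RULE = ["Article", "Noun", "Verb", "Preposition", "Article", "Noun"]


def generate_sentence(rng=random):
    """Pick one random word per slot in the rule and join them."""
    words = [rng.choice(DATABASE[category]) for category in ASSEMBLY_RULE]
    sentence = " ".join(words)
    return sentence[0].upper() + sentence[1:] + "."


for _ in range(5):
    print(generate_sentence())
```

Every output is grammatical in form because the rule guarantees it, but nothing in the program checks whether a city can think or a dog can talk. The structure comes entirely from the programmer, not from the system.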

By combining structured data with a predefined template, one can automatically generate a number of different kinds of texts, all of which are a kind of variation on a common theme. But, like many GOFAI approaches, this produces NLP applications that are brittle and static and do not scale. It does not take much to produce an error with these systems, as a result of missing or mislabeled data. Additionally, hand-coded templates often cannot handle new kinds of data without human programmers going into the code and reworking the template. This means not only that these systems have difficulty accommodating new data and tasks but also that maintaining them can be labor-intensive and expensive over time. Finally and perhaps most importantly for users, because the assembly rules in the template are fixed and static, the resulting texts often feel artificial, mechanical, or "canned." After reading a handful of texts produced by one of these NLP systems, everything starts to sound the same and there is virtually no variation in the way the content is formulated and presented.
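Both the form-letter approach and its brittleness can be sketched with Python's standard `string.Template`. The donor values come from Table 1.2, but the letter wording is hypothetical (the source template is truncated), the `$name` placeholder syntax is the library's rather than the book's, and the second row has its charity deliberately removed to simulate missing data.

```python
from string import Template

# Hypothetical letter wording, invented for this sketch.
LETTER = Template(
    "Dear $name,\n"
    "We received your gift of $amount EUR for $charity."
)

donors = [
    {"name": "Grace Hopper", "email": "ghopper@gmail.com",
     "amount": "450.00", "charity": "The March of Dimes"},
    # Charity deliberately omitted to simulate a gap in the spreadsheet.
    {"name": "John McCarthy", "email": "jmccarthy@aol.com",
     "amount": "500.00"},
]


def render_letters(rows):
    """Fill the template per row; collect rows that cannot be filled."""
    letters, errors = [], []
    for row in rows:
        try:
            letters.append(LETTER.substitute(row))
        except KeyError as missing:
            errors.append(f"{row['name']}: missing field {missing}")
    return letters, errors


letters, errors = render_letters(donors)
print(letters[0])
print(errors)
```

A single empty cell is enough to stop a letter from being produced, and every letter that does get produced shares exactly the same phrasing.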

name             email               amount   charity
Grace Hopper     ghopper@gmail.com   450.00   The March of Dimes
John McCarthy    jmccarthy@aol.com   500.00   The Sierra Club
Ada Lovelace     ada@itc.co.uk       525.00   Doctors without Borders
Tristan Tzara    ttzara@dada.org     795.00   The World Wildlife Fund
Claude Shannon   cshannon@att.com    475.00   The American Red Cross

<name>
<email>
Dear <name>,
Thank you for your...
