
Deep Learning Enabled Semantic Communications
Description
Comprehensive overview of the principles, theories, and techniques behind deep learning enabled semantic communications
Deep Learning Enabled Semantic Communications explores the synergy between deep learning and semantic communication, particularly in the context of advancing 6G networks. It provides a focused introduction to the subject, systematically covering deep learning enabled semantic communication systems and task-oriented semantic transmission paradigms in wireless communication.
The book reviews various aspects of semantic communications, including information theory, multimodal technologies, semantic noise, and semantic sensing. It explores cutting-edge semantic communication architectures, highlighting their advantages over traditional approaches and their potential to drive the future of the intelligent information industry. The book also details applications of deep learning-based semantic communication systems across various sources, including text, speech, images, and videos, comprehensively addressing system design, performance optimization, and measurement metrics.
The book is divided into eight main parts, which cover foundational knowledge, system design, multimodal and multitask-oriented semantic communication systems, joint semantic sensing and sampling, semantic noise suppression, and generative AI enabled systems.
Written by a diverse group of experts in academia and research institutions, Deep Learning Enabled Semantic Communications includes information on:
- Fundamental knowledge about deep learning and semantic communications, including the history, neural networks, and semantic information theory
- Compression of multimodal inputs, extraction of global semantic information, and the design of neural networks to boost the capability of handling lengthy speech
- Incorporation of different sources to extract semantic features and serve diverse intelligent tasks at the receiver
- Introduction of semantic impairments in communications to uncover how to design robust systems
- Joint design of data sampling, compression, and coding schemes under the guidance of semantic information
- Framework of generative semantic communications to detail the principles of incorporating generative models into semantic communications
Deep Learning Enabled Semantic Communications is an essential learning resource and reference for graduate and undergraduate students pursuing degrees in wireless communications, signal processing, or deep learning as well as engineers in the telecommunications and IT industries focusing on wireless communication techniques.
Zhijin Qin is an Associate Professor with Tsinghua University, China. She is an Associate Editor for IEEE Transactions on Communications, IEEE Transactions on Cognitive Networking, and IEEE Communications Letters.
Huiqiang Xie, PhD, is an Associate Professor at Jinan University, Guangzhou, Guangdong, China.
Zhenzi Weng is a Postdoctoral researcher at Imperial College London, UK.
Xiaoming Tao is a Professor with the Department of Electronic Engineering at Tsinghua University. She is also a Senior Member of the IEEE.

Content
Foreword
Preface
Acknowledgments
Acronyms
Notation
1 Introduction
1.1 Conventional Communications versus Semantic Communications
1.1.1 Three-level Communications
1.1.2 History of Semantic Communications
1.2 Introducing Deep Learning to Semantic Communications
1.2.1 Deep Learning Basics
1.2.2 Deep Learning Enabled Semantic Communications
1.3 Semantic Communications for Further Networks
References
2 Semantic Information Theory
2.1 Semantic Entropy
2.1.1 Logical Probability Based
2.1.2 Synonymous Mapping Based
2.1.3 Fuzzy Theory Based
2.1.4 Task Based
2.2 Semantic Channel Capacity
2.2.1 Logical Probability Based
2.2.2 Synonymous Mapping Based
2.3 Semantic Source Coding Theorem
2.3.1 Logical Probability Based
2.3.2 Synonymous Mapping Based
2.4 Semantic Channel Coding Theorem
2.4.1 Logical Probability Based
2.4.2 Synonymous Mapping Based
2.5 Information Bottleneck
2.5.1 Classical Information Bottleneck
2.5.2 Knowledge Collision-based Information Bottleneck
References
3 Joint Semantic-channel Coding for Source Reconstruction
3.1 Semantic Communications for Text
3.1.1 Joint Semantic-channel Coding for Text
3.2 Semantic Communications for Speech
3.2.1 Joint Semantic-channel Coding for Speech
3.3 Semantic Communications for Image
3.3.1 Joint Semantic-channel Coding for Image
3.4 Performance Metrics
3.4.1 Performance Metrics for Text Accuracy
3.4.2 Performance Metrics for Speech Quality
3.4.3 Performance Metrics for Image Quality
References
4 Task-oriented Semantic Communications
4.1 Single-modal Task-oriented Semantic Communications
4.1.1 Semantic Communications for Machine Translation
4.1.2 Semantic Communications for Speech Recognition and Synthesis
4.2 Multimodal Task-oriented Semantic Communications
4.2.1 Semantic Communication Systems for Visual Question Answering
References
5 Joint Sensing and Semantic Communications
5.1 Introduction and Framework of Joint Sampling and Coding
5.1.1 Semantic Sampling
5.1.2 Semantic Reconstruction
5.2 Joint Semantic Sampling and Coding for Image
5.2.1 Semantic-aware Image Compressed Sensing
5.2.2 Adaptive Sampling and Semantic-channel Coding
5.3 Joint Semantic Sampling and Coding for Video
5.3.1 Semantic-based Video Sampling and Reconstruction
References
6 Semantic Impairments in Communications
6.1 JSCC Framework with Semantic Impairments
6.2 Source Semantic Impairments Suppression
6.2.1 Robust Semantic Communications for Text
6.2.2 Robust Semantic Communications for Speech
6.2.3 Robust Semantic Communications for Image
6.3 Knowledge Base Semantic Impairments Suppression
6.3.1 Robust SKB
References
7 Generative AI-enabled Semantic Communications
7.1 Introducing Generative Models to Semantic Communications
7.2 Framework of Generative Semantic Communications
7.2.1 Main Components of Generative Semantic Communication System
7.2.2 Key Interactions and Processes
7.3 Demonstration of Generative Semantic Communication for Video Conferencing
7.4 Applications of Semantic Communications in Other Scenarios
7.4.1 Immersive Communications
7.4.2 Autonomous Driving
7.4.3 Smart Cities
7.4.4 Satellite Networks
References
8 Conclusion and Challenges
Index
Chapter 1
Introduction
Over the past few decades, wireless communication systems have advanced rapidly from the first generation (1G) to the fifth generation (5G), driven by numerous coding algorithms and channel models designed to recover sources accurately at the bit level. In recent years, however, the flourishing of artificial intelligence (AI) has revolutionized various industries and given rise to a wide range of intelligent tasks, which pushes the volume of transmitted data to the zettabyte level and requires massive machine connectivity with low transmission latency and energy consumption. In this context, conventional communication systems face severe challenges imposed by ubiquitous AI tasks, and a new communication paradigm is needed. Semantic communications have been proposed to address these challenges by extracting the semantic information inherent in source data while omitting irrelevant and redundant information, thereby reducing the amount of transmitted data, lowering the consumption of communication resources, and facilitating transmission with high semantic fidelity. Nevertheless, research on semantic communications stagnated for decades after the concept was first identified because adequate mathematical models for semantic information were lacking. Inspired by the success of AI, deep learning (DL)-enabled semantic communications have been investigated as a promising solution to the bottlenecks in conventional communications, leveraging the learning and fitting capabilities of neural networks (NNs) to bypass explicit mathematical models for semantic extraction and representation.
To this end, this book introduces DL-enabled semantic communications, including concepts, applications, and challenges. In particular, it comprehensively covers semantic communications for source reconstruction, task-oriented semantic communications, semantic impairments, semantic knowledge bases (SKBs), and large model-driven semantic communications. The objective of this book is to help readers gain a fundamental appreciation and understanding of the emerging DL-enabled semantic communications and their implications for future intelligent wireless networks.
1.1 Conventional Communications versus Semantic Communications
In this section, we introduce the basic yet important theory of wireless communications and briefly present the history of semantic communications.
1.1.1 Three-level Communications
In 1949, Shannon and Weaver [1] categorized communications into three levels:
- Level A: how accurately can the symbols of communication be transmitted? (The technical problem)
- Level B: how precisely do the transmitted symbols convey the desired meaning? (The semantic problem)
- Level C: how effectively does the received meaning affect conduct in the desired way? (The effectiveness problem)
The first level of communications, that is, syntactic communications, aims to transmit symbols accurately and improves the data rate by expanding the bandwidth, increasing the transmission power, and adding transmit antennas. The 1G to 5G wireless communication networks belong to syntactic communications, and their system capacity has been shown to approach the Shannon limit through efficient source coding and channel coding algorithms [2]. The coding modules in conventional communication systems are designed and optimized separately; they convert the input message into a bit sequence and focus on performance at the bit or symbol level, taking the bit-error rate (BER) or symbol-error rate (SER) as the metric of information loss. However, the bit-oriented transmission framework requires precise alignment between the input and recovered messages while neglecting the underlying meaning behind the bits. This increases the likelihood of transmitting unnecessary data beyond user requirements, which accelerates the consumption of communication resources and increases transmission latency. Moreover, with the wide deployment of Internet-of-things (IoT) applications, conventional communications are no longer ideal because they may transmit information that is irrelevant to the downstream intelligent tasks at the receiver.
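As a rough illustration of the bit-level and symbol-level metrics mentioned above, the following Python sketch computes BER and SER for two short bit sequences. The function names, the grouping of bits into fixed-size symbols, and the sample data are assumptions made for this example and are not taken from the book.

```python
def bit_error_rate(sent_bits, received_bits):
    """Fraction of bit positions where the received bit differs from the sent bit."""
    if len(sent_bits) != len(received_bits) or not sent_bits:
        raise ValueError("bit sequences must be non-empty and of equal length")
    errors = sum(s != r for s, r in zip(sent_bits, received_bits))
    return errors / len(sent_bits)


def symbol_error_rate(sent_bits, received_bits, bits_per_symbol):
    """Fraction of symbols (fixed-size groups of bits) containing at least one wrong bit."""
    if len(sent_bits) != len(received_bits) or not sent_bits:
        raise ValueError("bit sequences must be non-empty and of equal length")
    if len(sent_bits) % bits_per_symbol != 0:
        raise ValueError("sequence length must be a multiple of bits_per_symbol")
    n_symbols = len(sent_bits) // bits_per_symbol
    errors = 0
    for start in range(0, len(sent_bits), bits_per_symbol):
        end = start + bits_per_symbol
        if sent_bits[start:end] != received_bits[start:end]:
            errors += 1
    return errors / n_symbols


# Illustrative data: two of the eight bits are flipped (positions 2 and 6).
sent = [0, 1, 1, 0, 1, 0, 0, 1]
received = [0, 1, 0, 0, 1, 0, 1, 1]

print(bit_error_rate(sent, received))        # 0.25
print(symbol_error_rate(sent, received, 2))  # 0.5 (two of four 2-bit symbols are wrong)
```

Both metrics count mismatches only; neither says anything about whether the meaning of the message survived, which is the limitation the paragraph above points out.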
To this end, the second level of communications, that is, semantic communications, considers the inherent semantics of input information to tackle the technical problems in the bit-oriented paradigm. Semantics take into account the meaning and veracity of source information because they can be both informative and factual [3]. In addition, semantic data can be compressed to a proper size for transmission by using a lossless method [4] that leverages the semantic relationship between different messages, whereas traditional lossless source coding represents a signal with the minimum number of binary bits by exploiting the dependencies or statistical properties of the input message. Moreover, the semantic information varies with the transmission purpose and can take various forms, for example, age of information [5] or more complicated semantic features. Inspired by this, semantic communications deviate from the bit-oriented paradigm by extracting semantics from the input message with minimal ambiguity to facilitate semantic exchange between the transmitter and the receiver. They aim to reduce network traffic by transmitting low-dimensional semantic information and to optimize the system performance by minimizing the semantic error instead of the BER or SER. It is worth mentioning that there is currently no consensus on the definition of semantic error in the field of semantic communications. In some works, semantic errors refer to inaccurate semantics that lead to misunderstanding and ambiguity, such as spelling errors in the text.
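The contrast between symbol-level error and semantic error can be made concrete with a toy Python sketch. Since the text notes that there is no agreed definition of semantic error, the metric below is purely hypothetical: the `CONCEPTS` table, the function names, and the sentences are assumptions chosen for illustration, and real systems would typically learn such representations with neural networks rather than use a lookup table.

```python
# Hypothetical concept table: words that map to the same label are treated as
# having the same meaning. This is an illustrative assumption only.
CONCEPTS = {
    "car": "vehicle",
    "automobile": "vehicle",
    "fast": "quick",
    "quick": "quick",
}


def _aligned_words(sent, received):
    """Split both sentences into words and check that they can be compared position-wise."""
    sent_words, received_words = sent.split(), received.split()
    if not sent_words or len(sent_words) != len(received_words):
        raise ValueError("this sketch only handles non-empty sentences of equal length")
    return sent_words, received_words


def word_mismatch_rate(sent, received):
    """Symbol-level view: fraction of positions where the exact words differ."""
    sent_words, received_words = _aligned_words(sent, received)
    mismatches = sum(s != r for s, r in zip(sent_words, received_words))
    return mismatches / len(sent_words)


def toy_semantic_error(sent, received):
    """Toy meaning-level view: fraction of positions where the concepts differ."""
    sent_words, received_words = _aligned_words(sent, received)
    mismatches = sum(
        CONCEPTS.get(s, s) != CONCEPTS.get(r, r)
        for s, r in zip(sent_words, received_words)
    )
    return mismatches / len(sent_words)


sent = "the car is fast"
paraphrase = "the automobile is quick"  # different words, same meaning
corrupted = "the cat is fast"           # one letter changed, different meaning

print(word_mismatch_rate(sent, paraphrase), toy_semantic_error(sent, paraphrase))  # 0.5 0.0
print(word_mismatch_rate(sent, corrupted), toy_semantic_error(sent, corrupted))    # 0.25 0.25
```

Under these assumptions, the symbol-level metric ranks the corrupted sentence as the better reception, while the toy semantic metric ranks the paraphrase as error-free, which is the kind of mismatch that motivates optimizing for semantic error.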
Furthermore, the third level of communications, that is, pragmatic communications, concerns the effectiveness of semantic transmission and condenses the input message to retain only the semantics associated with user requests. Inspired by this mechanism, goal/task-oriented semantic communications have been developed to perform the specific downstream tasks required by users at the receiver by delivering task-related semantic features [6]. Conventional communication systems must reconstruct the source in the same modality before performing any modality conversion at the receiver. In goal/task-oriented semantic communications, however, the recovered message is no longer confined to the same modality as the input message, facilitating flexible cross-modal or single-modal to multimodal transmissions, such as speech-to-text and text-to-image/speech, to satisfy different user requests and improve the user quality of experience (QoE).
1.1.2 History of Semantic Communications
Decades ago, the concept of semantic communication was pioneered by engineers and philosophers. As far back as 1925, Dewey [7] asserted that "communication should be regarded as a mechanism for attaining purposes," while Wittgenstein [8] emphasized the centrality of meaning in philosophical discourse. Later, in 1938, Morris went on to define semantics in terms of signs, the study of which falls into the following categories:
- Syntax: studies the signs and their relations to other signs
- Semantics: studies the signs and their relation to the world
- Pragmatics: studies the signs and their relations to users
Potentially influenced by the aforementioned definition, in 1949 Weaver [1] formulated semantics from an engineering perspective and outlined three communication levels, that is, the technical, semantic, and effectiveness levels, as noted earlier. Shannon defined communication as the precise or approximate reconstruction of information from one point to another, which falls under the technical level. Consequently, Shannon applied statistical methods to quantify the uncertainty of information while setting aside the semantic and effectiveness levels. Although Shannon clearly indicated that his framework did not address meaning, many researchers still tried to adapt the statistical framework to interpret or evaluate semantics. Carnap and Bar-Hillel [3] first attempted to develop a "theory of semantic communication" in 1952 with the concept of logical probability. Since then, explorations into semantic communication have continued steadily.
In recent decades, semantic communication has gained increasing attention as a possible way to exchange information more efficiently. In 2003, Jeong et al. [9] first employed a knowledge graph in semantic communications. One year later, Floridi et al. [10] outlined the "theory of strongly semantic information." In 2011, Bao et al. [11] proposed a generic model of semantic communications. After a period of relative dormancy, the rise of DL paved the way for further progress in semantic communications thanks to significant advances in semantic extraction and understanding. For example, the works in [12, 13] designed the first deep joint source-channel coding (JSCC) scheme for text transmission in 2018 and proposed a single-user joint semantic-channel coding scheme in 2021, respectively. Since then, research on semantic communications has mainly focused on the DL-enabled paradigm.
1.2 Introducing Deep Learning to Semantic Communications
We first introduce the concept of DL and present some cutting-edge algorithms relevant to the following chapters. Then, we discuss the motivations and details of applying deep learning in semantic communication systems.
1.2.1 Deep Learning...
System requirements
File format: ePUB
Copy protection: Adobe-DRM (Digital Rights Management)
System requirements:
- Computer (Windows; MacOS X; Linux): Install the free reader Adobe Digital Editions prior to download (see eBook Help).
- Tablet/smartphone (Android; iOS): Install the free app Adobe Digital Editions or the app PocketBook before downloading (see eBook Help).
- E-reader: Bookeen, Kobo, Pocketbook, Sony, Tolino and many more (not Kindle).
The file format ePub works well for novels and non-fiction books, i.e., "flowing" text without complex layout. On an e-reader or smartphone, line and page breaks automatically adjust to fit the small displays.
This eBook uses Adobe-DRM, a "hard" copy protection. If the necessary requirements are not met, unfortunately you will not be able to open the eBook. You will therefore need to prepare your reading hardware before downloading.
Please note: We strongly recommend that you authorise using your personal Adobe ID after installation of any reading software.
For more information, see our ebook Help page.