Preface
Today, Arm processors are used in a wide range of systems, such as smartphones, AI SoCs, automotive systems (for autonomous driving and infotainment), cloud servers, and MacBooks. Most of these processors are based on the 64-bit Armv8-A architecture, including popular cores such as Cortex-A53, Cortex-A57, and Cortex-A78.
In system software development, the Armv8-A architecture is now one of the most important topics that engineers should understand.
This book, Reverse Engineering Armv8-A Systems, was written to share practical ways to analyze binaries on Armv8-A systems. My goal is to help readers learn how Armv8-A architecture works and also build real skills through hands-on experience.
The book covers practical content that can be used directly in real-world projects. It is designed for readers who want to start learning binary analysis from the basics and move toward a deeper understanding of low-level behavior on Armv8-A systems.
Why I wrote this book
Reverse engineering means analyzing a system without access to the original source code. When you hear the term "reverse engineering," you might think of binary analysis, security research, or exploit development. These are important and are often seen as the core skills of the field. Many blog posts and articles discuss reverse engineering from a security point of view, as a way to create exploits.
However, for many system software developers, reverse engineering is usually used for a different purpose: to find bugs, analyze crashes, or investigate system failures, rather than to develop exploits. This book focuses on binary analysis skills that are useful for firmware developers and system software engineers. This book is not written for offensive security. Instead, it explains detailed binary analysis methods and practical debugging techniques.
I believe that the ability to analyze binaries is a core skill for becoming an advanced engineer in embedded systems. With this book, you will learn about the key concepts of the Armv8-A architecture and gain practical experience in analyzing binaries on Armv8-A systems.
Who this book is for
If you are interested in binary analysis, reverse engineering, or debugging on Armv8-A devices, this book is for you. It is especially helpful for system software engineers, security consultants, and ethical hackers who want to expand their binary analysis expertise. To get the most value, you should have a basic understanding of C programming. Familiarity with computer architecture, Linux systems, and security concepts will also help you follow the material in this book more effectively.
What this book covers
Chapter 1, Learning Fundamentals of Arm Architecture, introduces the basic concepts of the Armv8-A architecture, such as exception levels, register usage, AAPCS, and exception handling. These fundamentals will help you understand system behavior and prepare you for binary analysis.
Chapter 2, Understanding the ELF Binary Format, introduces you to the ELF binary format, including the file header, section header, and program header. You will learn how to use the readelf command to check binary structure and how each header helps during reverse engineering.
Chapter 3, Manipulating Data with Arm Data Processing Instructions, explains data processing instructions for arithmetic, logic, and bit shifts. You will learn how to reconstruct assembly instructions into C. These skills are key background knowledge for binary analysis.
Chapter 4, Reading and Writing with Memory Access Instructions, covers how memory access works in Armv8-A using LDR and STR. You will learn how they move data between registers and memory.
Chapter 5, Controlling Execution with Flow Control Instructions, explains how comparison and branch instructions change the way a program runs based on conditions.
Chapter 6, Introducing Reverse Engineering, introduces reverse engineering, a way to understand software without source code. You will learn about static and dynamic analysis, as well as the compilation process, which are important for binary analysis.
Chapter 7, Setting Up a Practice Environment with an Arm Device, covers how to set up a practice environment using an Arm device such as the Raspberry Pi or an emulator such as QEMU. With these tools, you will perform binary analysis.
Chapter 8, Unpacking the Kernel with Linux Fundamentals, focuses on Linux basics: user space, kernel space, system calls, and process management. You will also learn about memory management and security features such as LSM and KASLR.
Chapter 9, Understanding Basic Static Analysis, covers basic static analysis for reverse engineering by using binary utilities. You will learn how to check the type of a binary file and how to examine a corrupted object file. You will also learn how to read assembly code and understand how to convert it into C code.
Chapter 10, Going Deeper with Advanced Static Analysis, covers advanced static analysis for kernel binaries such as *.ko and vmlinux. You will learn about the ELF structure, typical kernel binary patterns, and how to recognize elements such as the .modinfo section.
Chapter 11, Analyzing Program Behavior with Basic Dynamic Analysis, discusses basic dynamic analysis, including its benefits and limitations. You will use tools such as GDB and GEF to debug various user-space binaries. This chapter also provides case studies related to memory corruption issues.
Chapter 12, Expert Techniques in Advanced Dynamic Analysis, covers advanced dynamic analysis of kernel binaries using the Crash utility. You will learn how to identify kernel structures such as task_struct using stack patterns and memory addresses. These skills are a key feature of this book.
Chapter 13, Tracing Execution with uftrace, explores uftrace, a powerful open source tool to monitor process execution. You will learn how to install and use uftrace with a simple example that traces function calls and return values.
Chapter 14, Securing Execution with Armv8-A TrustZone, explores TrustZone in Armv8-A. You will learn how software switches between the non-secure and secure worlds using the SMC instruction. This chapter also explains hardware features that support TrustZone, such as the AxPROT signal.
Chapter 15, Building Defenses with Key Security Features of Armv8-A, explains the latest security features in Armv8-A, including PAN, PAC, BTI, and MTE. These features are used to protect systems by controlling memory access and verifying addresses.
To get the most out of this book
To get the most benefit from this book, we recommend the following:
- You should be familiar with using Linux, especially with the command-line interface (shell).
- You should have basic knowledge of the C programming language.
We have tested all the example code in this book using the following platforms:
- x86_64 Ubuntu 22.04 LTS as a guest OS (running on Oracle VirtualBox 7.0)
- Raspberry Pi 4 Model B (64-bit Arm), tested with both the standard distribution kernel and our custom 6.6 kernel (lightly tested)
If you are using the digital version of this book, we advise you to type the code yourself or access the code from the book's GitHub repository (a link is available in the next section). Doing so will help you avoid any potential errors related to the copying and pasting of code.
Everything will be explained step by step. Whether you are a beginner or an experienced developer, this book will guide you through the interesting and practical world of binary analysis for reverse engineering.
Download the example code files
The code bundle for the book is hosted on GitHub at https://github.com/PacktPublishing/Reverse-Engineering-Armv8-A-Systems. We also have other code bundles from our rich catalog of books and videos available at https://github.com/PacktPublishing. Check them out!
Download the color images
We also provide a PDF file that has color images of the screenshots/diagrams used in this book. You can download it here: https://packt.link/gbp/9781835088920.
Conventions used
There are a number of text conventions used throughout this book.
CodeInText: Indicates code words in text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, user input, and Twitter handles. For example: "The MOV instruction is used to copy the value of an operand into the destination register."
A block of code is set as follows:
struct task_struct {
    int flags;
    int state;
    char task_name[15];
};
Any command-line input or output is written as follows:
crash> rd ffffff8040238018
ffffff8040238018: ffffffc008028000
Bold: Indicates a new term, an important word, or words that you see on the screen. For instance, words in menus or dialog boxes appear in the...