
Inside the Machine
An Illustrated Introduction to Microprocessors and Computer Architecture
Jon Stokes (Author)
No Starch Press
Published on 1 December 2006
320 pages
978-1-59327-132-9 (ISBN)
System requirements
for ePUB without DRM
E-Book Single Licence
You are acquiring a single-user licence for this eBook, which you may not transfer.
Available for download
Description
Computers perform countless tasks ranging from the business critical to the recreational, but regardless of how differently they may look and behave, they're all amazingly similar in basic function. Once you understand how the microprocessor—or central processing unit (CPU)—works, you'll have a firm grasp of the fundamental concepts at the heart of all modern computing.
Inside the Machine, from the co-founder of the highly respected Ars Technica website, explains how microprocessors operate—what they do and how they do it. The book uses analogies, full-color diagrams, and clear language to convey the ideas that form the basis of modern computing. After discussing computers in the abstract, the book examines specific microprocessors from Intel, IBM, and Motorola, from the original models up through today's leading processors. It contains the most comprehensive and up-to-date information available (online or in print) on Intel's latest processors: the Pentium M, Core, and Core 2 Duo. Inside the Machine also explains technology terms and concepts that readers often hear but may not fully understand, such as "pipelining," "L1 cache," "main memory," "superscalar processing," and "out-of-order execution."
Includes discussion of:
- Parts of the computer and microprocessor
- Programming fundamentals (arithmetic instructions, memory accesses, control flow instructions, and data types)
- Intermediate and advanced microprocessor concepts (branch prediction and speculative execution)
- Intermediate and advanced computing concepts (instruction set architectures, RISC and CISC, the memory hierarchy, and encoding and decoding machine language instructions)
- 64-bit computing vs. 32-bit computing
- Caching and performance
Inside the Machine is perfect for students of science and engineering, IT and business professionals, and the growing community of hardware tinkerers who like to dig into the guts of their machines.
More details
Language
English
Place of publication
New York
United States
Product notice
Reflowable
File size
7.87 MB
ISBN-13
978-1-59327-132-9 (9781593271329)
Other editions

Jon Stokes
Inside the Machine
An Illustrated Introduction to Microprocessors and Computer Architecture
Book
12/2006
No Starch Press
€67.00
Shipment within 3-4 weeks

Jon M. Stokes
Inside the Machine
An Illustrated Introduction to Microprocessors and Computer Architecture
Book
01/2006
1st Edition
No Starch Press
€50.00
Out of stock; check for a different edition
Person
Jon "Hannibal" Stokes is co-founder and Senior CPU Editor of Ars Technica. He has written for a variety of publications on microprocessor architecture and the technical aspects of personal computing. Stokes holds a degree in computer engineering from Louisiana State University and two advanced degrees in the humanities from Harvard. He is currently pursuing a Ph.D. at the University of Chicago.
Content
- Intro
- Inside the Machine
- Preface
- Acknowledgments
- Introduction
- 1. Basic Computing Concepts
- The Calculator Model of Computing
- The File-Clerk Model of Computing
- The Stored-Program Computer
- Refining the File-Clerk Model
- The Register File
- RAM: When Registers Alone Won't Cut It
- The File-Clerk Model Revisited and Expanded
- An Example: Adding Two Numbers
- A Closer Look at the Code Stream: The Program
- General Instruction Types
- The DLW-1's Basic Architecture and Arithmetic Instruction Format
- The DLW-1's Arithmetic Instruction Format
- The DLW-1's Memory Instruction Format
- An Example DLW-1 Program
- A Closer Look at Memory Accesses: Register vs. Immediate
- Immediate Values
- Register-Relative Addressing
- 2. The Mechanics of Program Execution
- Opcodes and Machine Language
- Machine Language on the DLW-1
- Binary Encoding of Arithmetic Instructions
- Binary Encoding of Memory Access Instructions
- The load Instruction
- The store Instruction
- Translating an Example Program into Machine Language
- The Programming Model and the ISA
- The Programming Model
- The Instruction Register and Program Counter
- The Instruction Fetch: Loading the Instruction Register
- Running a Simple Program: The Fetch-Execute Loop
- The Clock
- Branch Instructions
- Unconditional Branch
- Conditional Branch
- Branch Instructions and the Fetch-Execute Loop
- The Branch Instruction as a Special Type of Load
- Branch Instructions and Labels
- Excursus: Booting Up
- 3. Pipelined Execution
- The Lifecycle of an Instruction
- Basic Instruction Flow
- Pipelining Explained
- Applying the Analogy
- A Non-Pipelined Processor
- A Pipelined Processor
- Shrinking the Clock
- Shrinking Program Execution Time
- The Speedup from Pipelining
- Program Execution Time and Completion Rate
- The Relationship Between Completion Rate and Program Execution Time
- Instruction Throughput and Pipeline Stalls
- Instruction Throughput
- Pipeline Stalls
- Instruction Latency and Pipeline Stalls
- Limits to Pipelining
- Clock Period and Completion Rate
- The Cost of Pipelining
- 4. Superscalar Execution
- Superscalar Computing and IPC
- Expanding Superscalar Processing with Execution Units
- Basic Number Formats and Computer Arithmetic
- Arithmetic Logic Units
- Memory-Access Units
- Microarchitecture and the ISA
- A Brief History of the ISA
- Moving Complexity from Hardware to Software
- Challenges to Pipelining and Superscalar Design
- Data Hazards
- Structural Hazards
- The Register File
- Control Hazards
- 5. The Intel Pentium and Pentium Pro
- The Original Pentium
- Caches
- The Pentium's Pipeline
- The Branch Unit and Branch Prediction
- The Pentium's Back End
- The Integer ALUs
- The Floating-Point ALU
- x86 Overhead on the Pentium
- Summary: The Pentium in Historical Context
- The Intel P6 Microarchitecture: The Pentium Pro
- Decoupling the Front End from the Back End
- The Issue Phase
- The Completion Phase
- The P6's Issue Phase: The Reservation Station
- The P6's Completion Phase: The Reorder Buffer
- The Instruction Window
- The P6 Pipeline
- Branch Prediction on the P6
- The P6 Back End
- CISC, RISC, and Instruction Set Translation
- The P6 Microarchitecture's Instruction Decoding Unit
- The Cost of x86 Legacy Support on the P6
- Summary: The P6 Microarchitecture in Historical Context
- The Pentium Pro
- The Pentium II
- The Pentium III
- Conclusion
- 6. PowerPC Processors: 600 Series, 700 Series, and 7400
- A Brief History of PowerPC
- The PowerPC 601
- The 601's Pipeline and Front End
- The PowerPC Instruction Queue
- Instruction Scheduling on the 601
- The 601's Back End
- The Integer Unit
- The Floating-Point Unit
- The Branch Execution Unit
- The Sequencer Unit
- Latency and Throughput Revisited
- Summary: The 601 in Historical Context
- The PowerPC 603 and 603e
- The 603e's Back End
- The 603e's Front End, Instruction Window, and Branch Prediction
- Summary: The 603 and 603e in Historical Context
- The PowerPC 604
- The 604's Pipeline and Back End
- The 604's Front End and Instruction Window
- The Issue Phase: The 604's Reservation Stations
- The Four Rules of Instruction Dispatch
- The Completion Phase: The 604's Reorder Buffer
- Summary: The 604 in Historical Context
- The PowerPC 604e
- The PowerPC 750 (aka the G3)
- The 750's Front End, Instruction Window, and Branch Instruction
- Summary: The PowerPC 750 in Historical Context
- The PowerPC 7400 (aka the G4)
- The G4's Vector Unit
- Summary: The PowerPC G4 in Historical Context
- Conclusion
- 7. Intel's Pentium 4 vs. Motorola's G4e: Approaches and Design Philosophies
- The Pentium 4's Speed Addiction
- The General Approaches and Design Philosophies of the Pentium 4 and G4e
- An Overview of the G4e's Architecture and Pipeline
- Stages 1 and 2: Instruction Fetch
- Stage 3: Decode/Dispatch
- Stage 4: Issue
- Stage 5: Execute
- Stages 6 and 7: Complete and Write-Back
- Branch Prediction on the G4e and Pentium 4
- An Overview of the Pentium 4's Architecture
- Expanding the Instruction Window
- The Trace Cache
- Shortening Instruction Execution Time
- The Trace Cache's Operation
- An Overview of the Pentium 4's Pipeline
- Stages 1 and 2: Trace Cache Next Instruction Pointer
- Stages 3 and 4: Trace Cache Fetch
- Stage 5: Drive
- Stages 6 Through 8: Allocate and Rename (ROB)
- Stage 9: Queue
- Stages 10 Through 12: Schedule
- Stages 13 and 14: Issue
- Stages 15 and 16: Register Files
- Stage 17: Execute
- Stage 18: Flags
- Stage 19: Branch Check
- Stage 20: Drive
- Stages 21 and Onward: Complete and Commit
- The Pentium 4's Instruction Window
- 8. Intel's Pentium 4 vs. Motorola's G4e: The Back End
- Some Remarks About Operand Formats
- The Integer Execution Units
- The G4e's IUs: Making the Common Case Fast
- The Pentium 4's IUs: Make the Common Case Twice as Fast
- The Floating-Point Units (FPUs)
- The G4e's FPU
- The Pentium 4's FPU
- Concluding Remarks on the G4e's and Pentium 4's FPUs
- The Vector Execution Units
- A Brief Overview of Vector Computing
- Vectors Revisited: The AltiVec Instruction Set
- AltiVec Vector Operations
- Intra-Element Arithmetic and Non-Arithmetic Instructions
- Inter-Element Arithmetic and Non-Arithmetic Instructions
- The G4e's VU: SIMD Done Right
- Intel's MMX
- SSE and SSE2
- The Pentium 4's Vector Unit: Alphabet Soup Done Quickly
- Increasing Floating-Point Performance with SSE2
- Conclusions
- 9. 64-Bit Computing and x86-64
- Intel's IA-64 and AMD's x86-64
- Why 64 Bits?
- What Is 64-Bit Computing?
- Current 64-Bit Applications
- Dynamic Range
- The Benefits of Increased Dynamic Range, or, How the Existing 64-Bit Computing Market Uses 64-Bit Integers
- Virtual Address Space vs. Physical Address Space
- The Benefits of a 64-Bit Address
- The 64-Bit Alternative: x86-64
- Extended Registers
- More Registers
- Switching Modes
- Out with the Old
- Conclusion
- 10. The G5: IBM's PowerPC 970
- Overview: Design Philosophy
- Caches and Front End
- Branch Prediction
- The Trade-Off: Decode, Cracking, and Group Formation
- Dispatching and Issuing Instructions on the PowerPC 970
- The 970's Dispatch Rules
- Predecoding and Group Dispatch
- Some Preliminary Conclusions on the 970's Group Dispatch Scheme
- The PowerPC 970's Back End
- Integer Unit, Condition Register Unit, and Branch Unit
- The Integer Units Are Not Fully Symmetric
- Integer Unit Latencies and Throughput
- The CRU
- The PowerPC Condition Register
- Preliminary Conclusions About the 970's Integer Performance
- Load-Store Units
- Front-Side Bus
- The Floating-Point Units
- Vector Computing on the PowerPC 970
- Floating-Point Issue Queues
- Integer and Load-Store Issue Queues
- BU and CRU Issue Queues
- Vector Issue Queues
- The Performance Implications of the 970's Group Dispatch Scheme
- Conclusions
- 11. Understanding Caching and Performance
- Caching Basics
- The Level 1 Cache
- The Level 2 Cache
- Example: A Byte's Brief Journey Through the Memory Hierarchy
- Cache Misses
- Locality of Reference
- Spatial Locality of Data
- Spatial Locality of Code
- Temporal Locality of Code and Data
- Locality: Conclusions
- Cache Organization: Blocks and Block Frames
- Tag RAM
- Fully Associative Mapping
- Direct Mapping
- N-Way Set Associative Mapping
- Four-Way Set Associative Mapping
- Two-Way Set Associative Mapping
- Two-Way vs. Direct-Mapped
- Two-Way vs. Four-Way
- Associativity: Conclusions
- Temporal and Spatial Locality Revisited: Replacement/Eviction Policies and Block Sizes
- Types of Replacement/Eviction Policies
- Block Sizes
- Write Policies: Write-Through vs. Write-Back
- Conclusions
- 12. Intel's Pentium M, Core Duo, and Core 2 Duo
- Code Names and Brand Names
- The Rise of Power-Efficient Computing
- Power Density
- Dynamic Power Density
- Static Power Density
- The Pentium M
- The Fetch Phase
- The Hardware Loop Buffer
- The Decode Phase: Micro-ops Fusion
- Fused Stores
- Fused Loads
- The Impact of Micro-ops Fusion
- Branch Prediction
- The Loop Detector
- The Indirect Predictor
- The Stack Execution Unit
- Pipeline and Back End
- Summary: The Pentium M in Historical Context
- Core Duo/Solo
- Intel's Line Goes Multi-Core
- Processor Organization and Core Microarchitecture
- Multiprocessing and Chip Multiprocessing
- Core Duo's Improvements
- Micro-ops Fusion of SSE and SSE2 store and load-op Instructions
- Micro-ops Fusion and Lamination of SSE and SSE2 Arithmetic Instructions
- Micro-ops Fusion of Miscellaneous Non-SSE Instructions
- Improved Loop Detector
- SSE3
- Floating-Point Improvement
- Integer Divide Improvement
- Virtualization Technology
- Summary: Core Duo in Historical Context
- Core 2 Duo
- The Fetch Phase
- Macro-Fusion
- The Decode Phase
- Core's Pipeline
- Core's Back End
- Integer Units
- Floating-Point Units
- Vector Processing Improvements
- 128-bit Vector Execution on the P6 Through Core Duo
- 128-bit Vector Execution on Core
- Memory Disambiguation: The Results Stream Version of Speculative Execution
- The Lifecycle of a Memory Access Instruction
- The Memory Reorder Buffer
- Memory Aliasing
- Memory Reordering Rules
- False Aliasing
- Memory Disambiguation
- Summary: Core 2 Duo in Historical Context
- A. Bibliography and Suggested Reading
- Online Resources
- Index
- About the Author
- Colophon
- B. Updates
System requirements
File format: ePUB
Copy protection: without DRM (Digital Rights Management)
System requirements:
- Computer (Windows; MacOS X; Linux): Use a reader that supports the ePUB file format, such as Adobe Digital Editions or FBReader, both of which are free (see eBook Help).
- Tablet/Smartphone (Android; iOS): Install the free app Adobe Digital Editions or the app PocketBook (see eBook Help).
- E-reader: Bookeen, Kobo, Pocketbook, Sony, Tolino and many more (not Kindle).
The file format ePUB works well for novels and non-fiction books – i.e., 'flowing' text without complex layout. On an e-reader or smartphone, line and page breaks automatically adjust to fit the small displays.
This eBook does not use copy protection or Digital Rights Management (DRM).
For more information, see our eBook Help page.