Inside the Machine

Name: Inside the Machine | An Illustrated Introduction to Microprocessors and Computer Architecture
Brand: No Starch Press
Price: 46.99 EUR
Availability: OnlineOnly

An Illustrated Introduction to Microprocessors and Computer Architecture

Jon Stokes (Author)

No Starch Press

Published on 1 December 2006

320 pages

E-Book

ePUB without DRM

978-1-59327-132-9 (ISBN)

€46.99 incl. 7% VAT

System requirements

for ePUB without DRM

E-Book Single Licence

Available for download

Description

Computers perform countless tasks ranging from the business critical to the recreational, but regardless of how differently they may look and behave, they're all amazingly similar in basic function. Once you understand how the microprocessor, or central processing unit (CPU), works, you'll have a firm grasp of the fundamental concepts at the heart of all modern computing.

Inside the Machine, from the co-founder of the highly respected Ars Technica website, explains how microprocessors operate: what they do and how they do it. The book uses analogies, full-color diagrams, and clear language to convey the ideas that form the basis of modern computing. After discussing computers in the abstract, the book examines specific microprocessors from Intel, IBM, and Motorola, from the original models up through today's leading processors. It contains the most comprehensive and up-to-date information available (online or in print) on Intel's latest processors: the Pentium M, Core, and Core 2 Duo.

Inside the Machine also explains technology terms and concepts that readers often hear but may not fully understand, such as "pipelining," "L1 cache," "main memory," "superscalar processing," and "out-of-order execution."

Includes discussion of:

- Parts of the computer and microprocessor
- Programming fundamentals (arithmetic instructions, memory accesses, control flow instructions, and data types)
- Intermediate and advanced microprocessor concepts (branch prediction and speculative execution)
- Intermediate and advanced computing concepts (instruction set architectures, RISC and CISC, the memory hierarchy, and encoding and decoding machine language instructions)
- 64-bit computing vs. 32-bit computing
- Caching and performance

Inside the Machine is perfect for students of science and engineering, IT and business professionals, and the growing community of hardware tinkerers who like to dig into the guts of their machines.

Content

Intro
Inside the Machine
Preface
Acknowledgments
Introduction
1. Basic Computing Concepts
The Calculator Model of Computing
The File-Clerk Model of Computing
The Stored-Program Computer
Refining the File-Clerk Model
The Register File
RAM: When Registers Alone Won't Cut It
The File-Clerk Model Revisited and Expanded
An Example: Adding Two Numbers
A Closer Look at the Code Stream: The Program
General Instruction Types
The DLW-1's Basic Architecture and Arithmetic Instruction Format
The DLW-1's Arithmetic Instruction Format
The DLW-1's Memory Instruction Format
An Example DLW-1 Program
A Closer Look at Memory Accesses: Register vs. Immediate
Immediate Values
Register-Relative Addressing
2. The Mechanics of Program Execution
Opcodes and Machine Language
Machine Language on the DLW-1
Binary Encoding of Arithmetic Instructions
Binary Encoding of Memory Access Instructions
The load Instruction
The store Instruction
Translating an Example Program into Machine Language
The Programming Model and the ISA
The Programming Model
The Instruction Register and Program Counter
The Instruction Fetch: Loading the Instruction Register
Running a Simple Program: The Fetch-Execute Loop
The Clock
Branch Instructions
Unconditional Branch
Conditional Branch
Branch Instructions and the Fetch-Execute Loop
The Branch Instruction as a Special Type of Load
Branch Instructions and Labels
Excursus: Booting Up
3. Pipelined Execution
The Lifecycle of an Instruction
Basic Instruction Flow
Pipelining Explained
Applying the Analogy
A Non-Pipelined Processor
A Pipelined Processor
Shrinking the Clock
Shrinking Program Execution Time
The Speedup from Pipelining
Program Execution Time and Completion Rate
The Relationship Between Completion Rate and Program Execution Time
Instruction Throughput and Pipeline Stalls
Instruction Throughput
Pipeline Stalls
Instruction Latency and Pipeline Stalls
Limits to Pipelining
Clock Period and Completion Rate
The Cost of Pipelining
4. Superscalar Execution
Superscalar Computing and IPC
Expanding Superscalar Processing with Execution Units
Basic Number Formats and Computer Arithmetic
Arithmetic Logic Units
Memory-Access Units
Microarchitecture and the ISA
A Brief History of the ISA
Moving Complexity from Hardware to Software
Challenges to Pipelining and Superscalar Design
Data Hazards
Structural Hazards
The Register File
Control Hazards
5. The Intel Pentium and Pentium Pro
The Original Pentium
Caches
The Pentium's Pipeline
The Branch Unit and Branch Prediction
The Pentium's Back End
The Integer ALUs
The Floating-Point ALU
x86 Overhead on the Pentium
Summary: The Pentium in Historical Context
The Intel P6 Microarchitecture: The Pentium Pro
Decoupling the Front End from the Back End
The Issue Phase
The Completion Phase
The P6's Issue Phase: The Reservation Station
The P6's Completion Phase: The Reorder Buffer
The Instruction Window
The P6 Pipeline
Branch Prediction on the P6
The P6 Back End
CISC, RISC, and Instruction Set Translation
The P6 Microarchitecture's Instruction Decoding Unit
The Cost of x86 Legacy Support on the P6
Summary: The P6 Microarchitecture in Historical Context
The Pentium Pro
The Pentium II
The Pentium III
Conclusion
6. PowerPC Processors: 600 Series, 700 Series, and 7400
A Brief History of PowerPC
The PowerPC 601
The 601's Pipeline and Front End
The PowerPC Instruction Queue
Instruction Scheduling on the 601
The 601's Back End
The Integer Unit
The Floating-Point Unit
The Branch Execution Unit
The Sequencer Unit
Latency and Throughput Revisited
Summary: The 601 in Historical Context
The PowerPC 603 and 603e
The 603e's Back End
The 603e's Front End, Instruction Window, and Branch Prediction
Summary: The 603 and 603e in Historical Context
The PowerPC 604
The 604's Pipeline and Back End
The 604's Front End and Instruction Window
The Issue Phase: The 604's Reservation Stations
The Four Rules of Instruction Dispatch
The Completion Phase: The 604's Reorder Buffer
Summary: The 604 in Historical Context
The PowerPC 604e
The PowerPC 750 (aka the G3)
The 750's Front End, Instruction Window, and Branch Instruction
Summary: The PowerPC 750 in Historical Context
The PowerPC 7400 (aka the G4)
The G4's Vector Unit
Summary: The PowerPC G4 in Historical Context
Conclusion
7. Intel's Pentium 4 vs. Motorola's G4e: Approaches and Design Philosophies
The Pentium 4's Speed Addiction
The General Approaches and Design Philosophies of the Pentium 4 and G4e
An Overview of the G4e's Architecture and Pipeline
Stages 1 and 2: Instruction Fetch
Stage 3: Decode/Dispatch
Stage 4: Issue
Stage 5: Execute
Stages 6 and 7: Complete and Write-Back
Branch Prediction on the G4e and Pentium 4
An Overview of the Pentium 4's Architecture
Expanding the Instruction Window
The Trace Cache
Shortening Instruction Execution Time
The Trace Cache's Operation
An Overview of the Pentium 4's Pipeline
Stages 1 and 2: Trace Cache Next Instruction Pointer
Stages 3 and 4: Trace Cache Fetch
Stage 5: Drive
Stages 6 Through 8: Allocate and Rename (ROB)
Stage 9: Queue
Stages 10 Through 12: Schedule
Stages 13 and 14: Issue
Stages 15 and 16: Register Files
Stage 17: Execute
Stage 18: Flags
Stage 19: Branch Check
Stage 20: Drive
Stages 21 and Onward: Complete and Commit
The Pentium 4's Instruction Window
8. Intel's Pentium 4 vs. Motorola's G4e: The Back End
Some Remarks About Operand Formats
The Integer Execution Units
The G4e's IUs: Making the Common Case Fast
The Pentium 4's IUs: Make the Common Case Twice as Fast
The Floating-Point Units (FPUs)
The G4e's FPU
The Pentium 4's FPU
Concluding Remarks on the G4e's and Pentium 4's FPUs
The Vector Execution Units
A Brief Overview of Vector Computing
Vectors Revisited: The AltiVec Instruction Set
AltiVec Vector Operations
Intra-Element Arithmetic and Non-Arithmetic Instructions
Inter-Element Arithmetic and Non-Arithmetic Instructions
The G4e's VU: SIMD Done Right
Intel's MMX
SSE and SSE2
The Pentium 4's Vector Unit: Alphabet Soup Done Quickly
Increasing Floating-Point Performance with SSE2
Conclusions
9. 64-Bit Computing and x86-64
Intel's IA-64 and AMD's x86-64
Why 64 Bits?
What Is 64-Bit Computing?
Current 64-Bit Applications
Dynamic Range
The Benefits of Increased Dynamic Range, or, How the Existing 64-Bit Computing Market Uses 64-Bit Integers
Virtual Address Space vs. Physical Address Space
The Benefits of a 64-Bit Address
The 64-Bit Alternative: x86-64
Extended Registers
More Registers
Switching Modes
Out with the Old
Conclusion
10. The G5: IBM's PowerPC 970
Overview: Design Philosophy
Caches and Front End
Branch Prediction
The Trade-Off: Decode, Cracking, and Group Formation
Dispatching and Issuing Instructions on the PowerPC 970
The 970's Dispatch Rules
Predecoding and Group Dispatch
Some Preliminary Conclusions on the 970's Group Dispatch Scheme
The PowerPC 970's Back End
Integer Unit, Condition Register Unit, and Branch Unit
The Integer Units Are Not Fully Symmetric
Integer Unit Latencies and Throughput
The CRU
The PowerPC Condition Register
Preliminary Conclusions About the 970's Integer Performance
Load-Store Units
Front-Side Bus
The Floating-Point Units
Vector Computing on the PowerPC 970
Floating-Point Issue Queues
Integer and Load-Store Issue Queues
BU and CRU Issue Queues
Vector Issue Queues
The Performance Implications of the 970's Group Dispatch Scheme
Conclusions
11. Understanding Caching and Performance
Caching Basics
The Level 1 Cache
The Level 2 Cache
Example: A Byte's Brief Journey Through the Memory Hierarchy
Cache Misses
Locality of Reference
Spatial Locality of Data
Spatial Locality of Code
Temporal Locality of Code and Data
Locality: Conclusions
Cache Organization: Blocks and Block Frames
Tag RAM
Fully Associative Mapping
Direct Mapping
N-Way Set Associative Mapping
Four-Way Set Associative Mapping
Two-Way Set Associative Mapping
Two-Way vs. Direct-Mapped
Two-Way vs. Four-Way
Associativity: Conclusions
Temporal and Spatial Locality Revisited: Replacement/Eviction Policies and Block Sizes
Types of Replacement/Eviction Policies
Block Sizes
Write Policies: Write-Through vs. Write-Back
Conclusions
12. Intel's Pentium M, Core Duo, and Core 2 Duo
Code Names and Brand Names
The Rise of Power-Efficient Computing
Power Density
Dynamic Power Density
Static Power Density
The Pentium M
The Fetch Phase
The Hardware Loop Buffer
The Decode Phase: Micro-ops Fusion
Fused Stores
Fused Loads
The Impact of Micro-ops Fusion
Branch Prediction
The Loop Detector
The Indirect Predictor
The Stack Execution Unit
Pipeline and Back End
Summary: The Pentium M in Historical Context
Core Duo/Solo
Intel's Line Goes Multi-Core
Processor Organization and Core Microarchitecture
Multiprocessing and Chip Multiprocessing
Core Duo's Improvements
Micro-ops Fusion of SSE and SSE2 store and load-op Instructions
Micro-ops Fusion and Lamination of SSE and SSE2 Arithmetic Instructions
Micro-ops Fusion of Miscellaneous Non-SSE Instructions
Improved Loop Detector
SSE3
Floating-Point Improvement
Integer Divide Improvement
Virtualization Technology
Summary: Core Duo in Historical Context
Core 2 Duo
The Fetch Phase
Macro-Fusion
The Decode Phase
Core's Pipeline
Core's Back End
Integer Units
Floating-Point Units
Vector Processing Improvements
128-bit Vector Execution on the P6 Through Core Duo
128-bit Vector Execution on Core
Memory Disambiguation: The Results Stream Version of Speculative Execution
The Lifecycle of a Memory Access Instruction
The Memory Reorder Buffer
Memory Aliasing
Memory Reordering Rules
False Aliasing
Memory Disambiguation
Summary: Core 2 Duo in Historical Context
A. Bibliography and Suggested Reading
Online Resources
Index
About the Author
Colophon
B. Updates


Schweitzer Fachinformationen
