
Computer Architecture
Description
This book constitutes the thoroughly refereed post-conference proceedings of the workshops held at the 37th International Symposium on Computer Architecture, ISCA 2010, in Saint-Malo, France, in June 2010. The 28 revised full papers presented were carefully reviewed and selected from the lectures given at 5 of these workshops. The papers address topics ranging from novel memory architectures to emerging application design and performance analysis, and come from the following workshops: A4MMC, Applications for Multi- and Many-Cores; AMAS-BT, the 3rd Workshop on Architectural and Micro-Architectural Support for Binary Translation; EAMA, the 3rd Workshop for Emerging Applications and Many-Core Architectures; WEED, the 2nd Workshop on Energy Efficient Design; and WIOSCA, the annual Workshop on the Interaction between Operating Systems and Computer Architecture.

Content
- Title Page
- Preface
- ISCA Workshops Committees
- A4MMC Foreword
- EAMA Foreword
- AMAS-BT Foreword
- WEED Foreword
- WIOSCA Foreword
- Table of Contents
- A4MMC: Applications for Multi- and Many-Cores
- Accelerating Agent-Based Ecosystem Models Using the Cell Broadband Engine
- Introduction
- Background
- Hardware
- Implementation
- Performance Evaluation
- More on Particle Management
- Future Work
- Conclusion
- References
- Performance Impact of Task Mapping on the Cell BE Multicore Processor
- Introduction
- Cell BE and Benchmark Application
- Cell Broadband Engine
- Synthetic Benchmark Application
- Experiments
- Conclusions and Future Work
- References
- Parallelization Strategy for CELL TV
- Introduction
- Applications
- Parallelization Strategies
- Inter Module Parallelization
- Inner Module Parallelization and SIMD
- Discussion
- Conclusion
- References
- Towards User Transparent Parallel Multimedia Computing on GPU-Clusters
- Introduction
- Parallel-Horus
- GPU-Based Extensions to Parallel-Horus
- A Line Detection Application
- Curvilinear Structure Detection
- Evaluation
- Future Work
- Conclusions
- References
- Implementing a GPU Programming Model on a Non-GPU Accelerator Architecture
- Introduction and Background
- CUDA
- MCUDA
- Rigel
- RCUDA
- Source Code Transformations
- Runtime Library
- Kernel Execution
- RCUDA Optimizations
- Kernel Code Transformations
- Runtime Optimizations
- Evaluation
- Simulation Infrastructure Methodology
- Benchmarks
- Baseline Performance
- RCUDA Runtime Overhead
- Optimizations
- DMM Case Study of Performance Portability
- Related Work
- Conclusion
- References
- On the Use of Small 2D Convolutions on GPUs
- Introduction
- Electromagnetic Diffraction at Nano-structures
- NVIDIA CUDA GPU Platform
- Parallel Implementation on the GPU
- Initial CUDA Implementation
- Increasing Independent Work
- Tuning the Execution Configuration
- Optimizing the 2D Convolution
- Optimizing Transfers between CPU and GPU
- Experiments and Results
- Experimental Setup
- Performance Measurements
- Discussion
- Conclusion
- References
- Can Manycores Support the Memory Requirements of Scientific Applications?
- Introduction
- Analysis Methodology
- Application Analysis
- Memory Bandwidth
- Memory Footprint
- Related Work
- Conclusions
- References
- Parallelizing an Index Generator for Desktop Search
- Introduction
- What to Parallelize and How?
- Filename Generation
- Term Extraction
- Index Update
- Parallelization
- Performance Results
- Lessons Learned and Conclusion
- References
- AMAS-BT: 3rd Workshop on Architectural and Micro-Architectural Support for Binary Translation
- Computation vs. Memory Systems: Pinning Down Accelerator Bottlenecks
- Introduction
- Our Vision for Accelerators
- Applying Our Methodology to a Trivial Application: Image Rotation
- Checking for Any Dependence on Problem Size
- Locating Hotspots of Computation and Communication
- Hitting the Memory Wall
- Distinguishing Local from Global Communication
- Understanding Detailed Dataflow Behavior
- Our Pintool
- Exploring JPEG Acceleration
- Related Work
- Conclusions
- References
- Trace Execution Automata in Dynamic Binary Translation
- Introduction
- Motivation
- From Traces to TEA
- Building TEA Out of Traces
- Recording TEA Instead of Traces
- Experimental Results
- Implementation Challenges
- Analyzing TEA's Performance
- Previous and Related Work
- Conclusions and Future Work
- References
- ISAMAP: Instruction Mapping Driven by Dynamic Binary Translation
- Introduction
- Related Work
- ISAMAP
- Models
- System Overview
- Translator Generation
- Translator
- Endianness
- Run-Time
- System Calls Mapping
- Mapping Improvements
- Conditional Mapping
- Run-Time Optimizations
- Experimental Results
- Evaluation
- Conclusion
- Future Works
- References
- EAMA: 3rd Workshop for Emerging Applications and Many-Core Architectures
- Parallelization of Particle Filter Algorithms
- Introduction
- Particle Filter Algorithm
- MATLAB Implementation
- Conversion from MATLAB to C
- OpenMP Implementation
- Naïve CUDA Implementation
- Naïve versus Thrust
- CUDA Optimizations
- Tree Reductions
- GPU Linear Congruential Generator
- Results
- Integration with MATLAB
- Related Work
- Conclusions
- Recommendations for Further Work
- References
- What Kinds of Applications Can Benefit from Transactional Memory?
- Introduction
- Types of TM and How They Can Be Useful
- TM Myths and Misconceptions
- Evaluating TM Prototypes
- Concluding Remarks
- References
- Characteristics of Workloads Using the Pipeline Programming Model
- Introduction
- Pipeline Programming Model
- Motivation for Pipelining
- Uses of the Pipeline Model
- Implementations
- Methodology
- Workloads
- Program Characteristics
- Experimental Setup
- Principal Component Analysis
- Experimental Results
- Related Work
- Conclusions
- References
- WEED: 2nd Workshop on Energy Efficient Design
- The Search for Energy-Efficient Building Blocks for the Data Center
- Introduction
- Related Work
- System Overview
- Hardware
- Benchmark Details
- Measurement Infrastructure
- Evaluation
- Single-Machine Benchmarks
- Multi-machine Dryad Benchmarks
- Discussion
- Energy Efficiency
- The Missing Links
- Conclusions
- References
- KnightShift: Shifting the I/O Burden in Datacenters to Management Processor for Energy Efficiency
- Introduction
- Overview of Intelligent Platform Management Interface
- Enhancing IPMI to Act as Knight
- Experimental Setup and Results
- Trace Overview
- Energy Proportionality Impact on Energy Consumption
- Discussion of Energy Proportionality in Future
- Related Work
- Conclusions
- References
- Guarded Power Gating in a Multi-core Setting
- Introduction
- Problem Background
- Power-Gating Scenarios
- Modeling Methodology
- Results
- Inter-core Power Gating Results
- Intra-core Power Gating Results
- Hybrid Power Gating
- Discussion
- A Case for Guard Mechanism
- Conclusions and Future Work
- References
- Using Partial Tag Comparison in Low-Power Snoop-Based Chip Multiprocessors
- Introduction
- Background
- Motivation
- S-PTC: Cache Optimizations
- S-PTC
- S-PTC Updating
- Methodology and Results
- Methodology
- Performance
- Bandwidth Utilization Reduction
- Tag Lookup Power
- Area
- Related Work
- Conclusion
- References
- Achieving Power-Efficiency in Clusters without Distributed File System Complexity
- Introduction
- Motivation
- Problems
- Exploiting System-Level Power States
- Design
- Power-Efficient Nodes
- Implementation
- Experimental Evaluation
- I/O Performance
- Cluster Power-Efficiency
- Related Work
- Conclusions and Future Work
- References
- What Computer Architects Need to Know about Memory Throttling
- Memory Throttling
- Overview
- Comparison to CPU Clock Throttling
- Power and Performance
- Infrastructure
- System
- Workloads
- Measurements
- Throttling Characterization
- Bandwidth
- Bandwidth-Limited
- Transition
- Bandwidth-Saturated
- Performance
- Power
- Conclusion
- References
- Predictive Power Management for Multi-core Processors
- Introduction
- Prediction for Power Management
- Program Phase Characterization
- Methodology
- Power Measurement
- Performance Counter Measurement
- Core Activity Predictor
- Core-Level CPU Power Model
- Results
- Performance Metric Definitions
- Quantitative Comparison
- Predictive Frequency Boosting
- Conclusion
- References
- WIOSCA: 6th Annual Workshop on the Interaction between Operating Systems and Computer Architecture
- IOMMU: Strategies for Mitigating the IOTLB Bottleneck
- Introduction
- IOMMU Performance Analysis
- Virtual I/O Memory Access Patterns
- vIOMMU
- Analysis of Virtual I/O Memory Access Patterns
- IOTLB Miss-Rate Reduction Approaches
- Streams Entries Eager Eviction
- Non-overlapping Coherent Frames
- Large TLB and Higher TLB Associativity
- Super-Pages
- Prefetching Techniques
- Adjacent Mappings Prefetch
- Explicit Caching of Mapped Entries
- Evaluation of Strategies
- Discussion
- Related Work
- Conclusions
- References
- Improving Server Performance on Multi-cores via Selective Off-Loading of OS Functionality
- Introduction
- Background and Motivation
- Hardware-Based Decision-Making
- Hardware Prediction of OS Syscall Length
- Dynamic Estimation of N
- Experimental Methodology
- Results
- Impact of Design Parameters
- Comparing Instrumentation and Hardware Prediction
- Scalability of Off-Loading
- TLB Impact
- Related Work
- Impact of OS on System Throughput
- Hardware Support for Efficient OS Execution
- Conclusions
- References
- Performance Characteristics of Explicit Superpage Support
- Introduction
- Limitations of Transparent Support
- POWER®
- Itanium® (IA-64)
- X86 Variants (Intel® EM64T, AMD® X86-64)
- Related Work
- Page Allocation
- Page Reclaim
- Superpage Reservation
- Shared Mapping Accounting
- Private Mapping Accounting
- Explicit Superpage Support
- RAM-Based Filesystem
- System V Shared Memory
- Anonymous mmap() Mappings
- Explicit Programming API
- Backing Memory Sections with Superpages
- Heap
- Mapping Text/Data/BSS
- Stack
- Evaluation
- STREAM (Memory Throughput)
- SysBench (Database Workload)
- SPECcpu 2006 v1.1 (Computational)
- SPECjvm 2008 (Java)
- Conclusions
- References
- Interfacing Operating Systems and Polymorphic Computing Platforms Based on the MOLEN Programming Paradigm
- Introduction
- Related Work
- Background Overview
- MOLEN Programming Paradigm
- The Runtime Environment
- MOLEN Runtime Primitives
- MOLEN SET
- MOLEN EXECUTE
- Dynamic Binding Implementation
- Evaluation
- Conclusion
- References
- Extrinsic and Intrinsic Text Cloning
- Introduction
- Text Cloning: Causes, Implications, Remedies
- Extrinsic Text Cloning
- Intrinsic Text Cloning
- How Important Is ETC and ITC
- How to Eliminate ETC and ITC
- Grid Computing Systems
- Grid Architecture
- Extrinsic Text Cloning in Grid
- Evaluation Using Simulation
- Experimental Framework
- Results
- Related Work
- Conclusions
- References
- A Case for Coordinated Resource Management in Heterogeneous Multicore Platforms
- Introduction
- Implementation
- The IXP Island of Cores
- The x86 Island of Cores
- x86-IXP Coordination
- Evaluation
- RUBiS
- MPlayer Benchmark
- Discussion of Results - A Case for Coordination
- Related Work
- Conclusions and Future Work
- References
- Topology-Aware Quality-of-Service Support in Highly Integrated Chip Multiprocessors
- Introduction
- Topology-Aware Quality-of-Service
- Preliminaries
- Topology-Aware Architecture
- Shared Region Organization
- QOS Support
- Topologies
- Experimental Methodology
- Evaluation Results
- Area
- Performance
- QOS and Preemption Impact
- Energy Efficiency
- Related Work
- Conclusion
- References
- Author Index
System requirements
File format: PDF
Copy protection: Watermark-DRM (Digital Rights Management)
System requirements:
- Computer (Windows; MacOS X; Linux): Use the free software Adobe Reader, Adobe Digital Editions, or any other PDF viewer of your choice (see eBook Help).
- Tablet/Smartphone (Android; iOS): Install the free app Adobe Digital Editions or another reading app for eBooks, e.g., PocketBook (see eBook Help).
- E-reader: Bookeen, Kobo, PocketBook, Sony, Tolino and many more (Kindle: limited support only).
The PDF file format always displays a book page identically on any hardware. This makes PDF suitable for complex layouts such as those used in textbooks and reference books (images, tables, columns, footnotes). On the small screens of e-readers or smartphones, however, PDFs are inconvenient to read because they require a lot of scrolling.
This eBook uses Watermark-DRM, a "soft" form of copy protection. This means that there are no technical restrictions to prevent illegal distribution. However, a personalised watermark is embedded in the eBook; it can be used to identify the purchaser in the event of misuse and to provide evidence for legal purposes.
For more information, see our eBook Help page.