Algorithms and Architectures for Parallel Processing, Part I

Name: Algorithms and Architectures for Parallel Processing, Part I | 11th International Conference, ICA3PP 2011, Melbourne, Australia, October 24-26, 2011, Proceedings, Part I
Brand: Springer
Price: 53.49 EUR
Availability: OnlineOnly

11th International Conference, ICA3PP 2011, Melbourne, Australia, October 24-26, 2011, Proceedings, Part I

Yang Xiang, Alfredo Cuzzocrea, Michael Hobbs, Wanlei Zhou (Editors)

Springer (Publisher)

Published on 23 October 2011

XVIII, 497 pages

E-Book

PDF with digital watermarking

978-3-642-24650-0 (ISBN)

€53.49 incl. 7% VAT

E-Book Single Licence

Available for download

Content

Title
Table of Contents
ICA3PP 2011 Keynote
Keynote: Assertion Based Parallel Debugging
ICA3PP 2011 Regular Papers
Secure and Energy-Efficient Data Aggregation with Malicious Aggregator Identification in Wireless Sensor Networks
Introduction
Related Work
System Model
Secure and Energy-Efficient Data Aggregation with Malicious Aggregator Identification
Aggregation Commitment
Aggregation Verification
Theoretical Analysis on Communication Overhead
Discussion
Simulation Evaluation
Conclusions
References
Dynamic Data Race Detection for Correlated Variables
Introduction
Problem Description
Related Work
Race Detection for Correlated Variables
Inferring Correlated Sets and Computational Units
Adapting the Lockset Algorithm
Calculating a Computational Unit's Lockset
Happens-Before Analysis - Hybrid Race Detector
Evaluation
Detecting Extended Data Races
Conclusion and Future Work
References
Improving the Parallel Schnorr-Euchner LLL Algorithm
Introduction
Preliminaries
Parallel LLL Reduction
Motivation and Approach
Improved Parallel LLL Reduction
Scalar Product Part
Orthogonalization Part
µ-Update and Size-Reduction Part
Using Sequences of Reduction Parameters
Experiments
Setup
Results
Future Work
References
Distributed Mining of Constrained Frequent Sets from Uncertain Data
Introduction
Background
Mining Frequent Sets from Uncertain Data
Mining Frequent Sets That Satisfy User Constraints
Our Proposed Distributed Mining System
Finding Constrained Locally Frequent Sets
Finding Constrained Globally Frequent Sets
Experimental Results
Conclusions
References
Set-to-Set Disjoint-Paths Routing in Recursive Dual-Net
Introduction
Recursive Dual-Net
Set-to-Set Disjoint-Path Routing in RDN
Concluding Remarks
References
Redflag: A Framework for Analysis of Kernel-Level Concurrency
Introduction
Design
Instrumentation and Logging
Lockset Algorithm
Block-Based Algorithms
Algorithm Enhancements
Filtering False Positives and Benign Warnings
Evaluation
Related Work
Conclusions
References
Exploiting Parallelism in the H.264 Deblocking Filter by Operation Reordering
Introduction
Background
H.264 Deblocking Filter
Related Work
Algorithm
Analysis of Data Dependencies
Proposed Deblocking Order
Analysis
Architectural Requirements
Conclusion and Future Work
References
Compiler Support for Concurrency Synchronization
Introduction
Related Work
Background
The Proposed Priority Scheduling Algorithm
Algorithm
Priority Assignment
PS on TM Pathologies
Experimental Results
Setup
Results
Conclusion
References
Fault-Tolerant Routing Based on Approximate Directed Routable Probabilities for Hypercubes
Introduction
Related Works
Preliminaries
Directed Routable Probabilities
Fault-Tolerant Routing Algorithms
Naive Algorithm ADRP0
Improved Algorithm ADRP1
Performance Evaluation
Time Complexity
Computer Experiment
Discussion
Conclusion
References
Finding a Hamiltonian Cycle in a Hierarchical Dual-Net with Base Network of p-Ary q-Cube
Introduction
The Hierarchical Dual-Net
Topological Properties of HDN
Hamiltonian Cycle Embedding
Concluding Remarks
References
Adaptive Resource Remapping through Live Migration of Virtual Machines
Introduction
Related Work
Heterogeneity-Aware Schedulers
Heterogeneity-Aware Applications
Comparison with ARRIVE-F
Framework
Assumptions
Performance Modeling
Migration Prediction
Migration Decisions
Implementation
Experimental Results
Experimental Platform
Applications
Comparison of Predicted and Actual Execution Times
Framework Overheads
Compute Farm Throughput
Experiment 1
Further Experiments
Conclusions and Future Work
References
LUTS: A Lightweight User-Level Transaction Scheduler
Introduction
Related Work
Overview
Scheduling Transactions
LUTS-Based Heuristics
CILUTS
CTLUTS
HASHLUTS
Experimental Results
Speedup
Overhead
Discussion
Conclusion
References
Verification of Partitioning and Allocation Techniques on Teradata DBMS
Introduction
Related Work
Background
Validation on Teradata
Teradata Description
Experiments
Implementation and Testing Joint and Sequential Approaches
Conclusion
References
Memory Performance and SPEC OpenMP Scalability on Quad-Socket x86_64 Systems
Introduction
Related Work
Test Systems
Benchmarks
Microbenchmarks
SPEC OMPM2001
Results
Memory Latency
Memory Bandwidth
SPEC OMPM2001 Scaling with Multiple Cores
SPEC OMPM2001 Scaling with Multiple Sockets or NUMA Nodes
SPEC OMPM2001 Performance Comparison
Conclusions
References
Anonymous Communication over Invisible Mix Rings
Introduction
Related Work
Our Solution
Structure
Topology
Anonymous Communications
Attacks and Defense
Anonymity Evaluation
Performance Analysis
Conclusion
References
Game-Based Distributed Resource Allocation in Horizontal Dynamic Cloud Federation Platform
Introduction
Mathematical Problem Formulation
Resource Allocation Games in a HDCF Platform
Non-cooperative Resource Allocation Game
Cooperative Resource Allocation Game
Simulation and Discussion
Conclusions
References
Stream Management within the CloudMiner
Introduction
Challenges of Data Stream Management Applications
Cloud-Enabled Stream Processing
Contributions and Organization of the Paper
StreamMiner Framework
Architecture
StreamMiner Services
Technology and Application
Cloud Services and Stream Transmission
A Real-World Application: The StreamMiner Cloud
Performance Experiments
Experiment 1: Transmission Delays
Experiment 2: Transmission and Processing
Experiment 3: Limitation on Transmission Speed
Related Work
Conclusions and Future Work
References
Security Architecture for Virtual Machines
Introduction
Our Model
Overview
Operation at Source
Operation at Destination
Analysis
Slammer Analysis
Performance Analysis
Related Work
Conclusion
References
Fast and Accurate Similarity Searching of Biopolymer Sequences with GPU and CUDA
Introduction
Theoretical Background
Alignment Algorithm
CUDA Programming Model and Architecture of Hardware Accelerator
Related Works
Implementation of Parallel Alignment on GPU and CUDA
Calculation of Alignment Matrix
Reducing the Number of Transactions
Reducing Idle Time of Threads
Data Arrangement
Storing the Input Sequence and the Substitution Matrix
Efficiency Tests
Concluding Remarks
References
Read Invisibility, Virtual World Consistency and Probabilistic Permissiveness are Compatible
Introduction
Software Transactional Memory (STM) Systems
Consistency Criteria for STM Systems
Desirable Properties for STM Systems
Content of the Paper
STM Computation Model and Base Definitions
Consistency Conditions: Opacity and Virtual World Consistency
Invisible Reads, Opacity and Permissiveness are Incompatible
Step 1: Ensuring Virtual World Consistency with Read Invisibility
STM Interface, Incremental Reads and Deferred Updates
The Underlying Data Structures
The readT() and writeT() Operations
The try_to_commitT() Operation
Step 2: Adding Probabilistic Permissiveness to the Protocol
Conclusion
References
Parallel Implementations of Gusfield's Cut Tree Algorithm
Introduction
Related Work
Definitions
Cut Tree Algorithms
Gusfield's Algorithm - Sequential Version
Parallelization of Cut Tree Algorithms
MPI Version
OpenMP Version
Experimental Setup
Experimental Results
Conclusion
References
Efficient Parallel Implementations of Controlled Optimization of Traffic Phases
Introduction
Related Work
Serial Intersection Algorithm
Shared Memory Parallel Intersection Control Algorithm
Message Passing Implementation
Data Parallel Implementation
Experimental Study
Conclusion
References
Scheduling Concurrent Workflows in HPC Cloud through Exploiting Schedule Gaps
Introduction
Related Work
PCH and Gap Search
Distributed Gap Search
Experiments and Discussions
Conclusions
References
Efficient Decoding of QC-LDPC Codes Using GPUs
Introduction
Review of QC-LDPC Codes and the Belief Propagation Decoding Algorithm
QC-LDPC Codes
Belief Propagation Algorithm
Parallel Computations Using GPUs
GPU Architecture
Implementation of a LDPC Decoder on GPUs
Data Structure to Represent the Messages
Decoding Procedures in GPU with the Use of Shared Memory
Results and Discussions
Simulation Results
Conclusion
References
ICA3PP 2011 Short Papers
A Combined Arithmetic Logic Unit and Memory Element for the Design of a Parallel Computer
Introduction
Theoretical Basis for Parallel Self-Timed Adder
Design of ARAM Memory-cum-Logic Units
Logical Operations
Memory Access Operations
Control and Floating Point Operations
Instruction Cycles
Implementation of ARAM
Performance Evaluation
Conclusion
References
Parallel Implementation of External Sort and Join Operations on a Multi-core Network-Optimized System on a Chip
Introduction
Algorithm Implementation
External Sort Algorithm Implementation
External Hash Join Algorithm Implementation
DB2 Sort and Hash Join Accelerator
Simulation Results
Conclusion
References
STM with Transparent API Considered Harmful
Introduction
Benchmarks for STMs
JWormBench: A Port of WormBench to Java
Annotations to Avoid Over-Instrumentation
Over-Instrumented Tasks
New Java Annotations for the Deuce API
Performance Evaluation
Related Work
Conclusions
References
A Global Snapshot Collection Algorithm with Concurrent Initiators with Non-FIFO Channel
Introduction
Related Works
Assumption and System Model
Concurrent Snapshot Collection Strategy
Algorithm
Data Structure
Messages
Algorithms
Proof of Correctness
Complexity Analysis
Conclusion
References
An Approach for Code Compression in Run Time for Embedded Systems - A Preliminary Results
Introduction
Architectures for Code Compression
Code Compression in Run Time
Algorithm for Compression/Decompression with MIC Method
Simulations with Benchmark MiBench
Related Work
Conclusions and Future Work
References
Optimized Two Party Privacy Preserving Association Rule Mining Using Fully Homomorphic Encryption
Introduction
Privacy Preserving Association Rule Mining
Motivation and Problem Formulation
Background
Notations
Some Binary Operations
Fully Homomorphic Encryption (FHE)
Proposed Solution
ARM with Privacy Preservation
Performance and Security
Performance Analysis
Security Analysis
Conclusion
References
SLA-Based Resource Provisioning for Heterogeneous Workloads in a Virtualized Cloud Datacenter
Introduction
Related Work
System Model
Datacenter Model
SLA and Application Models
Admission Control and Scheduling Policy
Forecasting Model
Admission Control and Scheduling
SLA Enforcement and Rescheduling of VMs
Performance Evaluation
Workload Data
Performance Metrics
Analysis of Results
Conclusions and Future Directions
References
SC: A Programming Model and Language for Embedded Manycores
Introduction
Related Work
The SC Programming Model
Components
Behavior
Designing for Execution Guarantees
The SC Programming Language
Components
System Agents
Input / Output
Software Architecture
A Sketch of the SC Compilation Process
Evaluation
Conclusion
References
Provisioning Spot Market Cloud Resources to Create Cost-Effective Virtual Clusters
Introduction
Related Work
Cloud-Based Virtual Clusters
Use of Variable Pricing Resources
System Model of a Cloud-Based Virtual Cluster
Cost-Effective Resource Provisioning and Scheduling Policy
Estimating Job Runtimes
Estimation Correction and Rescheduling
Job Moldability and Speedup Considerations
Performance Evaluation
Comparison with Best-Case and Worst-Case Scenarios
Effect of Different Runtime Estimation Methods
Conclusions and Future Work
References
A Principled Approach to Grid Middleware
Introduction
Background and Motivation
Grid Vision and Grid Practice
The Cloud Vision - A Better Grid?
Overview of the Minimum Intrusion Grid
System Architecture
MiG Design Principles and Rationale
Browser-Based Interface
How MiG Executes Grid Jobs
Software Deployment
MiG Features beyond the "Job-Shop"
VGrids: Virtual Organisations in MiG
Resource Management
Storage in MiG
Advanced VGrid Features for Group Collaboration
Current Status and Future Directions
Conclusions
References
Performance Analysis of Preemption-Aware Scheduling in Multi-cluster Grid Environments
Introduction
Analytical Queuing Model
Preemption-Aware Scheduling Policy
Performance Evaluation
Experimental Setup
Experimental Results
Related Work
Conclusions and Future Work
References
Performance Evaluation of Open Source Seismic Data Processing Packages
Introduction
Related Work
Sequence of Seismic Functions
Performance Evaluation
SU and Madagascar Seismic Data Format
Seismic Data Conversion
Tests and Results
Conclusion
References
Reputation-Based Resource Allocation in Market-Oriented Distributed Systems
Introduction
Related Work
Models
System Model
Application Model
Market Model
Reputation-Based Resource Allocation
Formation of Resource Reputation
The Suitability between Resource and Task
Incorporation of Data Staging
Experimental Methodology
Experimental Settings
Performance Metrics
Experimental Results
Impact of Scheduling with Variability of Network Capacity
Comparing Market-Based Resource Allocation Approaches
Conclusion
References
Cooperation-Based Trust Model and Its Application in Network Security Management
Introduction
Related Work
User Cooperation Trust Model (UCTM)
Design Principle
The Reputation Model
Simulation Detail
Simulation Results
Conclusion
References
Performance Evaluation of the Three-Dimensional Finite-Difference Time-Domain (FDTD) Method on Fermi Architecture GPUs
Introduction
Background
A Brief Overview of the FDTD Method
Parallelization of FDTD Method on Fermi Architecture GPUs
Method
Simulation Model
Implementation in CUDA
Performance Analysis and Comparison between Using Shared Memory and L1 Cache
Conclusions and Future Work
References
The Probability Model of Peer-to-Peer Botnet Propagation
Introduction
Related Work
Theoretical Propagation Model
Propagation Ability and Quarantine Ability
Simulation Experiments
Conclusion and Future Work
References
A Parallelism Extended Approach for the Enumeration of Orthogonal Arrays
Introduction
Major Contribution
Background
Serial MCS Algorithm
Two Key Properties of Serial MCS Algorithm
A Step by Step Extending Parallelism Approach for EOA
Parallel Computing for Each Small Extending Step
Drawback of Previous Work with Master-Slave Load Balancing Method
Convert MCS Algorithm by Using Stack Data Structure
Work Sharing Method for Dynamic Load Balancing
MPI Implementation
Experimental Evaluation
Experiments for Small Extending Step
Dynamic Load Balancing
Impact of Duration Time d and Chunk Size c
Conclusion and Future Work
References
Author Index

Schweitzer Fachinformationen
