About This Book...
This book outlines a set of issues that are critical to all of parallel architecture across modern designs: communication latency, communication bandwidth, and coordination of cooperative work. It describes the set of techniques available in hardware and in software to address each issue and explores how the various techniques interact.
Table of Contents...
Foreword
Preface
1 Introduction
1.1 Why Parallel Architecture
1.1.1 Application Trends
1.1.2 Technology Trends
1.1.3 Architectural Trends
1.1.4 Supercomputers
1.1.5 Summary
1.2 Convergence of Parallel Architectures
1.2.1 Communication Architecture
1.2.2 Shared Address Space
1.2.3 Message Passing
1.2.4 Convergence
1.2.5 Data Parallel Processing
1.2.6 Other Parallel Architectures
1.2.7 A Generic Parallel Architecture
1.3 Fundamental Design Issues
1.3.1 Communication Abstraction
1.3.2 Programming Model Requirements
1.3.3 Communication and Replication
1.3.4 Performance
1.3.5 Summary
1.4 Concluding Remarks
1.5 Historical References
1.6 Exercises
2 Parallel Programs
2.1 Parallel Application Case Studies
2.1.1 Simulating Ocean Currents
2.1.2 Simulating the Evolution of Galaxies
2.1.3 Visualizing Complex Scenes Using Ray Tracing
2.1.4 Mining Data for Associations
2.2 The Parallelization Process
2.2.1 Steps in the Process
2.2.2 Parallelizing Computation versus Data
2.2.3 Goals of the Parallelization Process
2.3 Parallelization of an Example Program
2.3.1 The Equation Solver Kernel
2.3.2 Decomposition
2.3.3 Assignment
2.3.4 Orchestration under the Data Parallel Model
2.3.5 Orchestration under the Shared Address Space Model
2.3.6 Orchestration under the Message-Passing Model
2.4 Concluding Remarks
2.5 Exercises
3 Programming for Performance
3.1 Partitioning for Performance
3.1.1 Load Balance and Synchronization Wait Time
3.1.2 Reducing Inherent Communication
3.1.3 Reducing the Extra Work
3.1.4 Summary
3.2 Data Access and Communication in a Multimemory System
3.2.1 A Multiprocessor as an Extended Memory Hierarchy
3.2.2 Artifactual Communication in the Extended Memory Hierarchy
3.2.3 Artifactual Communication and Replication: The Working Set Perspective
3.3 Orchestration for Performance
3.3.1 Reducing Artifactual Communication
3.3.2 Structuring Communication to Reduce Cost
3.4 Performance Factors from the Processor's Perspective
3.5 The Parallel Application Case Studies: An In-Depth Look
3.5.1 Ocean
3.5.2 Barnes-Hut
3.5.3 Raytrace
3.5.4 Data Mining
3.6 Implications for Programming Models
3.6.1 Naming
3.6.2 Replication
3.6.3 Overhead and Granularity of Communication
3.6.4 Block Data Transfer
3.6.5 Synchronization
3.6.6 Hardware Cost and Design Complexity
3.6.7 Performance Model
3.6.8 Summary
3.7 Concluding Remarks
3.8 Exercises
4 Workload-Driven Evaluation
4.1 Scaling Workloads and Machines
4.1.1 Basic Measures of Multiprocessor Performance
4.1.2 Why Worry about Scaling?
4.1.3 Key Issues in Scaling
4.1.4 Scaling Models and Speedup Measures
4.1.5 Impact of Scaling Models on the Equation Solver Kernel
4.1.6 Scaling Workload Parameters
4.2 Evaluating a Real Machine
4.2.1 Performance Isolation Using Microbenchmarks
4.2.2 Choosing Workloads
4.2.3 Evaluating a Fixed-Size Machine
4.2.4 Varying Machine Size
4.2.5 Choosing Performance Metrics
4.3 Evaluating an Architectural Idea or Trade-off
4.3.1 Multiprocessor Simulation
4.3.2 Scaling Down Problem and Machine Parameters for Simulation
4.3.3 Dealing with the Parameter Space: An Example Evaluation
4.3.4 Summary
4.4 Illustrating Workload Characterization
4.4.1 Workload Case Studies
4.4.2 Workload Characteristics
4.5 Concluding Remarks
4.6 Exercises
5 Shared Memory Multiprocessors
5.1 Cache Coherence
5.1.1 The Cache Coherence Problem
5.1.2 Cache Coherence through Bus Snooping
5.2 Memory Consistency
5.2.1 Sequential Consistency
5.2.2 Sufficient Conditions for Preserving Sequential Consistency
5.3 Design Space for Snooping Protocols
5.3.1 A Three-State (MSI) Write-Back Invalidation Protocol
5.3.2 A Four-State (MESI) Write-Back Invalidation Protocol
5.3.3 A Four-State (Dragon) Write-Back Update Protocol
5.4 Assessing Protocol Design Trade-offs
5.4.1 Methodology
5.4.2 Bandwidth Requirement under the MESI Protocol
5.4.3 Impact of Protocol Optimizations
5.4.4 Trade-Offs in Cache Block Size
5.4.5 Update-Based versus Invalidation-Based Protocols
5.5 Synchronization
5.5.1 Components of a Synchronization Event
5.5.2 Role of the User and System
5.5.3 Mutual Exclusion
5.5.4 Point-to-Point Event Synchronization
5.5.5 Global (Barrier) Event Synchronization
5.5.6 Synchronization Summary
5.6 Implications for Software
5.7 Concluding Remarks
5.8 Exercises
6 Snoop-Based Multiprocessor Design
6.1 Correctness Requirements
6.2 Base Design: Single-Level Caches with an Atomic Bus
6.2.1 Cache Controller and Tag Design
6.2.2 Reporting Snoop Results
6.2.3 Dealing with Write Backs
6.2.4 Base Organization
6.2.5 Nonatomic State Transitions
6.2.6 Serialization
6.2.7 Deadlock
6.2.8 Livelock and Starvation
6.2.9 Implementing Atomic Operations
6.3 Multilevel Cache Hierarchies
6.3.1 Maintaining Inclusion
6.3.2 Propagating Transactions for Coherence in the Hierarchy
6.4 Split-Transaction Bus
6.4.1 An Example Split-Transaction Design
6.4.2 Bus Design and Request-Response Matching
6.4.3 Snoop Results and Conflicting Requests
6.4.4 Flow Control
6.4.5 Path of a Cache Miss
6.4.6 Serialization and Sequential Consistency
6.4.7 Alternative Design Choices
6.4.8 Split-Transaction Bus with Multilevel Caches
6.4.9 Supporting Multiple Outstanding Misses from a Processor
6.5 Case Studies: SGI Challenge and Sun Enterprise
6.5.1 SGI Powerpath-2 System Bus
6.5.2 SGI Processor and Memory Subsystems
6.5.3 SGI I/O Subsystems
6.5.4 SGI Challenge Memory System Performance
6.5.5 Sun Gigaplane System Bus
6.5.6 Sun Processor and Memory Subsystem
6.5.7 Sun I/O Subsystem
6.5.8 Sun Enterprise Memory System Performance
6.5.9 Application Performance
6.6 Extending Cache Coherence
6.6.1 Shared Cache Designs
6.6.2 Coherence for Virtually Indexed Caches
6.6.3 Translation Lookaside Buffer Coherence
6.6.4 Snoop-Based Cache Coherence on Rings
6.6.5 Scaling Data and Snoop Bandwidth in Bus-Based Systems
6.7 Concluding Remarks
6.8 Exercises
7 Scalable Multiprocessors
7.1 Scalability
7.1.1 Bandwidth Scaling
7.1.2 Latency Scaling
7.1.3 Cost Scaling
7.1.4 Physical Scaling
7.1.5 Scaling in a Generic Parallel Architecture
7.2 Realizing Programming Models
7.2.1 Primitive Network Transactions
7.2.2 Shared Address Space
7.2.3 Message Passing
7.2.4 Active Messages
7.2.5 Common Challenges
7.2.6 Communication Architecture Design Space
7.3 Physical DMA
7.3.1 Node-to-Network Interface
7.3.2 Implementing Communication Abstractions
7.3.3 A Case Study: nCUBE/2
7.3.4 Typical LAN Interfaces
7.4 User-Level Access
7.4.1 Node-to-Network Interface
7.4.2 Case Study: Thinking Machines CM-5
7.4.3 User-Level Handlers
7.5 Dedicated Message Processing
7.5.1 Case Study: Intel Paragon
7.5.2 Case Study: Meiko CS-2
7.6 Shared Physical Address Space
7.6.1 Case Study: CRAY T3D
7.6.2 Case Study: CRAY T3E
7.6.3 Summary
7.7 Clusters and Networks of Workstations
7.7.1 Case Study: Myrinet SBUS Lanai
7.7.2 Case Study: PCI Memory Channel
7.8 Implications for Parallel Software
7.8.1 Network Transaction Performance
| Title: | Parallel Computer Architecture: A Hardware/Software Approach PB |
| Publisher: | Elsevier |
| Edition: | Paperback |
| Language: | English |
| ISBN: | 818147189X |
| EAN: | 9788181471895 |