Contents
Foreword
Preface
Acknowledgments
Fundamentals of Computer Design
1.1 Introduction
1.2 The Task of a Computer Designer
1.3 Technology and Computer Usage Trends
1.4 Cost and Trends in Cost
1.5 Measuring and Reporting Performance
1.6 Quantitative Principles of Computer Design
1.7 Putting It All Together: The Concept of Memory Hierarchy
1.8 Fallacies and Pitfalls
1.9 Concluding Remarks
1.10 Historical Perspective and References
Exercises
Instruction Set Principles and Examples
2.1 Introduction
2.2 Classifying Instruction Set Architectures
2.3 Memory Addressing
2.4 Operations in the Instruction Set
2.5 Type and Size of Operands
2.6 Encoding an Instruction Set
2.7 Crosscutting Issues: The Role of Compilers
2.8 Putting It All Together: The DLX Architecture
2.9 Fallacies and Pitfalls
2.10 Concluding Remarks
2.11 Historical Perspective and References
Exercises
Pipelining
3.1 What Is Pipelining?
3.2 The Basic Pipeline for DLX
3.3 The Major Hurdle of Pipelining-Pipeline Hazards
3.4 Data Hazards
3.5 Control Hazards
3.6 What Makes Pipelining Hard to Implement?
3.7 Extending the DLX Pipeline to Handle Multicycle Operations
3.8 Crosscutting Issues: Instruction Set Design and Pipelining
3.9 Putting It All Together: The MIPS R4000 Pipeline
3.10 Fallacies and Pitfalls
3.11 Concluding Remarks
3.12 Historical Perspective and References
Exercises
Advanced Pipelining and Instruction-Level Parallelism
4.1 Instruction-Level Parallelism: Concepts and Challenges
4.2 Overcoming Data Hazards with Dynamic Scheduling
4.3 Reducing Branch Penalties with Dynamic Hardware Prediction
4.4 Taking Advantage of More ILP with Multiple Issue
4.5 Compiler Support for Exploiting ILP
4.6 Hardware Support for Extracting More Parallelism
4.7 Studies of ILP
4.8 Putting It All Together: The PowerPC 620
4.9 Fallacies and Pitfalls
4.10 Concluding Remarks
4.11 Historical Perspective and References
Exercises
5.1 Introduction
5.2 The ABCs of Caches
5.3 Reducing Cache Misses
5.4 Reducing Cache Miss Penalty
5.5 Reducing Hit Time
5.6 Main Memory
5.7 Virtual Memory
5.8 Protection and Examples of Virtual Memory
5.9 Crosscutting Issues in the Design of Memory Hierarchies
5.10 Putting It All Together: The Alpha AXP 21064 Memory Hierarchy
5.11 Fallacies and Pitfalls
5.12 Concluding Remarks
5.13 Historical Perspective and References
Storage Systems
6.1 Introduction
6.2 Types of Storage Devices
6.3 Buses-Connecting I/O Devices to CPU/Memory
6.4 I/O Performance Measures
6.5 Reliability, Availability, and RAID
6.6 Crosscutting Issues: Interfacing to an Operating System
6.7 Designing an I/O System
6.8 Putting It All Together: UNIX File System Performance
6.9 Fallacies and Pitfalls
6.10 Concluding Remarks
6.11 Historical Perspective and References
Exercises
Interconnection Networks
7.1 Introduction
7.2 A Simple Network
7.3 Connecting the Interconnection Network to the Computer
7.4 Interconnection Network Media
7.5 Connecting More Than Two Computers
7.6 Practical Issues for Commercial Interconnection Networks
7.7 Examples of Interconnection Networks
7.8 Crosscutting Issues for Interconnection Networks
7.9 Internetworking
7.10 Putting It All Together: An ATM Network of Workstations
7.11 Fallacies and Pitfalls
7.12 Concluding Remarks
7.13 Historical Perspective and References
Exercises
8 Multiprocessors
8.1 Introduction
8.2 Characteristics of Application Domains
8.3 Centralized Shared-Memory Architectures
8.4 Distributed Shared-Memory Architectures
8.5 Synchronization
8.6 Models of Memory Consistency
8.7 Crosscutting Issues
8.8 Putting It All Together: The SGI Challenge Multiprocessor
8.9 Fallacies and Pitfalls
8.10 Concluding Remarks
8.11 Historical Perspective and References
Exercises
Appendix A: Computer Arithmetic
by DAVID GOLDBERG
Xerox Palo Alto Research Center
A.1 Introduction
A.2 Basic Techniques of Integer Arithmetic
A.3 Floating Point
A.4 Floating-Point Multiplication
A.5 Floating-Point Addition
A.6 Division and Remainder
A.7 More on Floating-Point Arithmetic
A.8 Speeding Up Integer Addition
A.9 Speeding Up Integer Multiplication and Division
A.10 Putting It All Together
A.11 Fallacies and Pitfalls
A.12 Historical Perspective and References
Exercises
Appendix B: Vector Processors
B.1 Why Vector Processors?
B.2 Basic Vector Architecture
B.3 Two Real-World Issues: Vector Length and Stride
B.4 Effectiveness of Compiler Vectorization
B.5 Enhancing Vector Performance
B.6 Putting It All Together: Performance of Vector Processors
B.7 Fallacies and Pitfalls
B.8 Concluding Remarks
B.9 Historical Perspective and References
Exercises
Appendix C: Survey of RISC Architectures
C.1 Introduction
C.2 Addressing Modes and Instruction Formats
C.3 Instructions: The DLX Subset
C.4 Instructions: Common Extensions to DLX
C.5 Instructions Unique to MIPS
C.6 Instructions Unique to SPARC
C.7 Instructions Unique to PowerPC
C.8 Instructions Unique to PA-RISC
C.9 Concluding Remarks
C.10 References
Appendix D: An Alternative to RISC: The Intel 80x86
D.1 Introduction
D.2 80x86 Registers and Data Addressing Modes
D.3 80x86 Integer Operations
D.4 80x86 Floating-Point Operations
D.5 80x86 Instruction Encoding
D.6 Putting It All Together: Measurements of Instruction Set Usage
D.7 Concluding Remarks
D.8 Historical Perspective and References
E.1 Implementation Issues for the Snooping Coherence Protocol
E.2 Implementation Issues in the Distributed Directory Protocol
Exercises
References
Index