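Book Details
Computer Organization and Design: The Hardware/Software Interface (English Edition, 4th Edition, ARM Edition)
Author: Patterson (USA) et al.
Publisher: China Machine Press (机械工业出版社)
Publication date: 2010-04-01
ISBN: 9787111302889
List price: ¥95.00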
Synopsis
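Computer Organization and Design: The Hardware/Software Interface (English Edition, 4th Edition, ARM Edition) uses a MIPS processor to present the fundamentals of computer hardware technology, pipelining, memory hierarchies, and I/O. It also includes some coverage of the x86 architecture. This best-selling computer organization text has been thoroughly updated to focus on the revolutionary change now under way in computer architecture: the move from uniprocessors to multicore microprocessors. The ARM edition was published to emphasize the importance of embedded systems to the computing industry across Asia, and it uses the ARM processor to discuss the instruction set and arithmetic of a real computer. ARM is the most popular instruction set architecture for embedded devices, and about 4 billion embedded devices are sold worldwide each year.
The book follows the approach of previous editions. Its features:
Uses ARMv6 (the ARM11 family) as the primary architecture to present the fundamentals of instruction sets and computer arithmetic.
Covers the revolutionary change from sequential to parallel computing, with a new chapter on parallelism and sections in every chapter that highlight parallel hardware and software topics.
Adds a new appendix, written by the Chief Scientist and the Director of Architecture of NVIDIA, on the emergence and importance of the modern GPU. It gives the first detailed description of this highly parallel, multithreaded, multicore processor optimized for visual computing.
Describes a unique approach to measuring multicore performance, the "Roofline model", with benchmarks and analysis of the AMD Opteron X4, Intel Xeon 5000, Sun UltraSPARC T2, and IBM Cell.
Includes new material on flash memory and virtual machines.
Provides a large set of thought-provoking exercises, more than 200 pages in all.
Uses the AMD Opteron X4 and Intel Nehalem as running examples throughout the book.
Updates all processor performance examples with the SPEC CPU2006 suite.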
About the Authors
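David A. Patterson is a professor of computer science at the University of California, Berkeley, a member of the National Academy of Engineering, and a Fellow of both the IEEE and the ACM. The IEEE awarded him the James H. Mulligan, Jr. Education Medal for his successful, inspirational teaching methods. He received the 1995 IEEE Technical Achievement Award for his contributions to RISC technology, and his work on RAID earned him the 1999 IEEE Reynold Johnson Information Storage Award. In 2000 he shared the John von Neumann Medal with John L. Hennessy.
John L. Hennessy is the president of Stanford University, a Fellow of the IEEE and the ACM, a member of the National Academy of Engineering, and a fellow of the American Academy of Arts and Sciences. Professor Hennessy received the 2001 Eckert-Mauchly Award for his outstanding contributions to RISC technology and the 2001 Seymour Cray Computer Engineering Award, and he shared the 2000 John von Neumann Medal with David A. Patterson.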
Table of Contents
Preface
CHAPTERS
1 Computer Abstractions and Technology
1.1 Introduction
1.2 Below Your Program
1.3 Under the Covers
1.4 Performance
1.5 The Power Wall
1.6 The Sea Change: The Switch from Uniprocessors to Multiprocessors
1.7 Real Stuff: Manufacturing and Benchmarking the AMD Opteron X4
1.8 Fallacies and Pitfalls
1.9 Concluding Remarks
1.10 Historical Perspective and Further Reading
1.11 Exercises
2 Instructions: Language of the Computer
2.1 Introduction
2.2 Operations of the Computer Hardware
2.3 Operands of the Computer Hardware
2.4 Signed and Unsigned Numbers
2.5 Representing Instructions in the Computer
2.6 Logical Operations
2.7 Instructions for Making Decisions
2.8 Supporting Procedures in Computer Hardware
2.9 Communicating with People
2.10 ARM Addressing for 32-Bit Immediates and More Complex Addressing Modes
2.11 Parallelism and Instructions: Synchronization
2.12 Translating and Starting a Program
2.13 A C Sort Example to Put It All Together
2.14 Arrays versus Pointers
2.15 Advanced Material: Compiling C and Interpreting Java
2.16 Real Stuff: MIPS Instructions
2.17 Real Stuff: x86 Instructions
2.18 Fallacies and Pitfalls
2.19 Concluding Remarks
2.20 Historical Perspective and Further Reading
2.21 Exercises
3 Arithmetic for Computers
3.1 Introduction
3.2 Addition and Subtraction
3.3 Multiplication
3.4 Division
3.5 Floating Point
3.6 Parallelism and Computer Arithmetic: Associativity
3.7 Real Stuff: Floating Point in the x86
3.8 Fallacies and Pitfalls
3.9 Concluding Remarks
3.10 Historical Perspective and Further Reading
3.11 Exercises
4 The Processor
4.1 Introduction
4.2 Logic Design Conventions
4.3 Building a Datapath
4.4 A Simple Implementation Scheme
4.5 An Overview of Pipelining
4.6 Pipelined Datapath and Control
4.7 Data Hazards: Forwarding versus Stalling
4.8 Control Hazards
4.9 Exceptions
4.10 Parallelism and Advanced Instruction-Level Parallelism
4.11 Real Stuff: the AMD Opteron X4 (Barcelona) Pipeline
4.12 Advanced Topic: an Introduction to Digital Design Using a Hardware Design Language to Describe and Model a Pipeline and More Pipelining Illustrations
4.13 Fallacies and Pitfalls
4.14 Concluding Remarks
4.15 Historical Perspective and Further Reading
4.16 Exercises
5 Large and Fast: Exploiting Memory Hierarchy
5.1 Introduction
5.2 The Basics of Caches
5.3 Measuring and Improving Cache Performance
5.4 Virtual Memory
5.5 A Common Framework for Memory Hierarchies
5.6 Virtual Machines
5.7 Using a Finite-State Machine to Control a Simple Cache
5.8 Parallelism and Memory Hierarchies: Cache Coherence
5.9 Advanced Material: Implementing Cache Controllers
5.10 Real Stuff: the AMD Opteron X4 (Barcelona) and Intel Nehalem Memory Hierarchies
5.11 Fallacies and Pitfalls
5.12 Concluding Remarks
5.13 Historical Perspective and Further Reading
5.14 Exercises
6 Storage and Other I/O Topics
6.1 Introduction
6.2 Dependability, Reliability, and Availability
6.3 Disk Storage
6.4 Flash Storage
6.5 Connecting Processors, Memory, and I/O Devices
6.6 Interfacing I/O Devices to the Processor, Memory, and Operating System
6.7 I/O Performance Measures: Examples from Disk and File Systems
6.8 Designing an I/O System
6.9 Parallelism and I/O: Redundant Arrays of Inexpensive Disks
6.10 Real Stuff: Sun Fire x4150 Server
6.11 Advanced Topics: Networks
6.12 Fallacies and Pitfalls
6.13 Concluding Remarks
6.14 Historical Perspective and Further Reading
6.15 Exercises
7 Multicores, Multiprocessors, and Clusters
7.1 Introduction
7.2 The Difficulty of Creating Parallel Processing Programs
7.3 Shared Memory Multiprocessors
7.4 Clusters and Other Message-Passing Multiprocessors
7.5 Hardware Multithreading
7.6 SISD, MIMD, SIMD, SPMD, and Vector
7.7 Introduction to Graphics Processing Units
7.8 Introduction to Multiprocessor Network Topologies
7.9 Multiprocessor Benchmarks
7.10 Roofline: A Simple Performance Model
7.11 Real Stuff: Benchmarking Four Multicores Using the Roofline Model
7.12 Fallacies and Pitfalls
7.13 Concluding Remarks
7.14 Historical Perspective and Further Reading
7.15 Exercises
Index
CD-ROM CONTENT
A Graphics and Computing GPUs
A.1 Introduction
A.2 GPU System Architectures
A.3 Scalable Parallelism – Programming GPUs
A.4 Multithreaded Multiprocessor Architecture
A.5 Parallel Memory System
A.6 Floating Point Arithmetic
A.7 Real Stuff: The NVIDIA GeForce 8800
A.8 Real Stuff: Mapping Applications to GPUs
A.9 Fallacies and Pitfalls
A.10 Concluding Remarks
A.11 Historical Perspective and Further Reading
B1 ARM and Thumb Assembler Instructions
B1.1 Using This Appendix
B1.2 Syntax
B1.3 Alphabetical List of ARM and Thumb Instructions
B1.4 ARM Assembler Quick Reference
B1.5 GNU Assembler Quick Reference
B2 ARM and Thumb Instruction Encodings
B2.1 ARM Instruction Set Encodings
B2.2 Thumb Instruction Set Encodings
B2.3 Program Status Registers
B3 Instruction Cycle Timings
B3.1 Using the Instruction Set Cycle Timing Tables
B3.2 ARM7TDMI Instruction Cycle Timings
B3.3 ARM9TDMI Instruction Cycle Timings
B3.4 StrongARM1 Instruction Cycle Timings
B3.5 ARM9E Instruction Cycle Timings
B3.6 ARM10E Instruction Cycle Timings
B3.7 Intel XScale Instruction Cycle Timings
B3.8 ARM11 Cycle Timings
C The Basics of Logic Design
C.1 Introduction
C.2 Gates, Truth Tables, and Logic Equations
C.3 Combinational Logic
C.4 Using a Hardware Description Language
C.5 Constructing a Basic Arithmetic Logic Unit
C.6 Faster Addition: Carry Lookahead
C.7 Clocks
C.8 Memory Elements: Flip-Flops, Latches, and Registers
C.9 Memory Elements: SRAMs and DRAMs
C.10 Finite-State Machines
C.11 Timing Methodologies
C.12 Field Programmable Devices
C.13 Concluding Remarks
C.14 Exercises
D Mapping Control to Hardware
D.1 Introduction
D.2 Implementing Combinational Control Units
D.3 Implementing Finite-State Machine Control
D.4 Implementing the Next-State Function with a Sequencer
D.5 Translating a Microprogram to Hardware
D.6 Concluding Remarks
D.7 Exercises
ADVANCED CONTENT
Section 2.15 Compiling C and Interpreting Java
Section 4.12 An Introduction to Digital Design Using a Hardware Design Language to Describe and Model a Pipeline and More Pipelining Illustrations
Section 5.9 Implementing Cache Controllers
Section 6.11 Networks
HISTORICAL PERSPECTIVES & FURTHER READING
Chapter 1 Computer Abstractions and Technology: Section 1.10
Chapter 2 Instructions: Language of the Computer: Section 2.20
Chapter 3 Arithmetic for Computers: Section 3.10
Chapter 4 The Processor: Section 4.15
Chapter 5 Large and Fast: Exploiting Memory Hierarchy: Section 5.13
Chapter 6 Storage and Other I/O Topics: Section 6.14
Chapter 7 Multicores, Multiprocessors, and Clusters: Section 7.14
Appendix A Graphics and Computing GPUs: Section A.11
TUTORIALS
VHDL
Verilog
SOFTWARE
Xilinx FPGA Design, Simulation and Synthesis Software
QEMU http://www.nongnu.org/qemu/about.html
Glossary
Index
Further Reading