eca contents

Table of Contents For

Essentials Of

Computer Architecture

Preface xxi

Chapter 1 Introduction And Overview 3

1.1	The Importance Of Architecture 3
1.2	Learning The Essentials 3
1.3	The Cycle Of New Hardware And New Software 4
1.4	What We Will Cover 5
1.5	What We Will Omit 6
1.6	What We Will Emphasize 6
1.7	Architecture, Design, and Implementation 7
1.8	Hardware Designs And Unexpected Constraints 8
1.9	Summary 8

Chapter 2 Program Interpretation And Transformation 11

2.1	Introduction 11
2.2	Specification Of Computation 11
2.3	Automated Program Interpretation 12
2.4	What Level Of Abstraction Should Be Used For Programs? 13
2.5	Transforming A Source Program To Machine Language 13
2.6	Using Memory Addresses Instead Of Variable Names 15
2.7	Global And Local Variables 16
2.8	Segments And Their Location In Memory 17
2.9	Relocation And Memory Addresses 18
2.10	The ELF Standard for Storing Object Programs 20
2.11	Translation Of An Example C Program 21
	Exercises 23

Chapter 3 Data And Program Representation 27

3.1	Introduction 27
3.2	Definitions Of Bit And Byte 27
3.3	Possible Values In A Byte 28
3.4	Binary Weighted Positional Representation 29
3.5	Bit Ordering 31
3.6	Hexadecimal Notation Used By Humans 31
3.7	Notation For Hexadecimal And Binary Constants 32
3.8	Character Sets 33
3.9	Unicode 34
3.10	Unsigned Integers And Endianness 35
3.11	Signed Binary Integers 36
3.12	Quirks Of Signed Representations 37
3.13	An Example Of Two's Complement Numbers 37
3.14	Sign Extension 38
3.15	Casting Integers 39
3.16	Floating Point 40
3.17	Range Of IEEE Floating Point Values 42
3.18	Biased Exponent Values 43
3.19	An Example Floating Point Number 43
3.20	Special Values And NaN 44
3.21	Binary Coded Decimal Representation 45
3.22	Signed, Fractional, And Packed BCD Representations 46
3.23	Data Aggregates 46
3.24	Instructions And Their Representation 47
3.25	Summary 48
	Exercises 49

Chapter 4 A High-Level Overview Of Processors 53

4.1	Introduction 53
4.2	The Two Basic Architectural Approaches 53
4.3	The Harvard And Von Neumann Architectures 54
4.4	Definition Of A Processor 55
4.5	The Range Of Processors 56
4.6	Hierarchical Structure And Computational Engines 57
4.7	Structure Of A Conventional Processor 58
4.8	Processor Categories And Roles 59
4.9	Processor Technologies 61
4.10	Stored Programs 61
4.11	The Fetch-Execute Cycle 62
4.12	Instructions In Memory 63
4.13	Variable-length and Fixed-length Instructions 63
4.14	Clock Rate And Instruction Rate 65
4.15	Control: Getting Started And Stopping 65
4.16	Starting The Fetch-Execute Cycle 66
4.17	Summary 67
	Exercises 67

Chapter 5 Instruction Sets And Operands 71

5.1	Introduction 71
5.2	Mathematics, Convenience, And Cost 71
5.3	Instruction Sets 72
5.4	Instruction Set Architecture 73
5.5	Opcodes, Operands, And Results 74
5.6	Typical Instruction Format 74
5.7	General-Purpose Registers 74
5.8	Floating Point Registers And Register Identification 75
5.9	Programming With Registers 75
5.10	Register Banks 76
5.11	Terminology: Complex And Reduced Instruction Sets 77
5.12	RISC Design And The Execution Pipeline 78
5.13	Pipelines And Instruction Stalls 80
5.14	Other Causes Of Pipeline Stalls 81
5.15	Consequences For Programmers 82
5.16	Programming, Stalls, And No-Op Instructions 82
5.17	Forwarding 83
5.18	Types Of Operations 83
5.19	Program Counter, Fetch-Execute, And Branching 84
5.20	Condition Codes And Conditional Branching 85
5.21	Subroutine And Function Calls 86
5.22	Argument Passing And Return Values 87
5.23	An Example Instruction Set 88
5.24	The Principle Of Minimalism 90
5.25	The Principles Of Elegance And Orthogonality 91
5.26	Summary 91
	Exercises 92

Chapter 6 Operand Addressing And Operand Types 97

6.1	Introduction 97
6.2	Zero-, One-, Two-, And Three-Address Designs 97
6.3	Zero Operands Per Instruction 98
6.4	One Operand Per Instruction 99
6.5	Two Operands Per Instruction 99
6.6	Three Operands Per Instruction 100
6.7	Operand Sources And Immediate Values 100
6.8	The Von Neumann Bottleneck 101
6.9	Implicit And Explicit Operand Encoding 101
	6.9.1	Implicit Operand Encoding 101
	6.9.2	Explicit Operand Encoding 102
6.10	Operands That Combine Multiple Values 103
6.11	Tradeoffs In The Choice Of Operands 103
6.12	Direct And Indirect Operands In Memory 105
6.13	Illustration Of Operand Addressing Modes 105
6.14	Summary 106
	Exercises 107

Chapter 7 Assembly Languages And Programming Paradigm 111

7.1	Introduction 111
7.2	The Reason To Learn Assembly Language 111
7.3	Characteristics Of High-Level And Low-Level Languages 112
7.4	Assembly Languages 113
7.5	Assembly Language Syntax And Opcodes 114
	7.5.1	Statement Format 114
	7.5.2	Opcode Names 114
	7.5.3	Commenting Conventions 114
7.6	Operand Order 115
7.7	Register Names 116
7.8	Operand Syntax 117
7.9	Assembly Code For If-Then 118
7.10	Assembly Code For If-Then-Else 118
7.11	Assembly Code For Definite Iteration (For Loop) 119
7.12	Assembly Code For Indefinite Iteration (While Loop) 120
7.13	Assembly Code For A Function Call 120
7.14	Variable Declarations In Assembly Code 121
7.15	Example Assembly Language Code 122
	7.15.1	The Fibonacci Example In C 122
	7.15.2	The Fibonacci Example In x86 Assembly Language 124
	7.15.3	The Fibonacci Example In ARM Assembly Language 125
7.16	How An Assembler Works: Two-Pass Translation 127
7.17	Assembly Language Macros 129
7.18	Summary 130
	Exercises 130

Chapter 8 Main Memory And Memory Addressing 135

8.1	Introduction 135
8.2	Characteristics Of Computer Memory 135
8.3	Static And Dynamic RAM Technologies 136
8.4	Memory Performance And Higher Data Rate Technologies 137
8.5	Memory Addresses For An Array Of Bytes 139
8.6	The Fetch-Store Paradigm, Pointers, And Dereferencing 139
8.7	Memory Dumps 141
8.8	Aligned Memory Access And Aggregates In Memory 143
8.9	Ordering Items To Optimize Space With Aligned Access 144
8.10	Memory Organization And Aligned Memory Access 145
8.11	Memory Hardware, Words, And Word Operations 146
8.12	Byte Addressing, Word Addressing, And Alignment 147
8.13	Calculating A Word Address 148
8.14	Calculating Word Addresses Using Powers of Two 148
8.15	Multiple Memories With Separate Controllers 149
8.16	Memory Banks 150
8.17	Interleaving 151
8.18	Memory Sizes, Powers Of Two, And Prefixes 152
8.19	Content Addressable Memory 153
8.20	Ternary CAM 154
8.21	Summary 154
	Exercises 155

Chapter 9 Virtual Memory Technologies And Virtual Addressing 159

9.1	Introduction 159
9.2	Definition Of Virtual Memory 159
9.3	Memory Management Unit And A Virtual Address Space 160
9.4	Address Translation Using Powers Of Two 161
9.5	Virtual Address Spaces For Concurrent Applications 163
9.6	Mechanisms Used To Create Virtual Address Spaces 164
9.7	Base-Bound Registers 164
9.8	Virtual Memory Isolation And Protection 165
9.9	Loading Program Pieces And Demand Paging 165
9.10	Hardware And Software For Demand Paging 166
9.11	Page Replacement 167
9.12	Paging Terminology And Data Structures 167
9.13	Address Translation With A Demand Paging System 168
9.14	Using Powers Of Two For Address Translation 169
9.15	Presence, Use, And Modified Bits 170
9.16	Page Table Storage 171
9.17	Efficient Paging With A Translation Lookaside Buffer 172
9.18	Consequences For Programmers 172
9.19	Single-Level Page Tables And Page Table Size 174
9.20	The Advantage Of Multi-Level Page Tables 174
9.21	Multi-Level Page Tables For 64-Bit Architectures 176
9.22	Sparseness And Address Space Layout Randomization 177
9.23	Page Table Walks And The Importance Of A TLB 177
9.24	Summary 177
	Exercises 178

Chapter 10 Caches And Caching 183

10.1	Introduction 183
10.2	How Data Propagates Through A Storage Hierarchy 183
10.3	Definition of Caching 184
10.4	Characteristics Of A Cache 184
10.5	Cache Terminology 185
10.6	Best-Case And Worst-Case Cache Performance 186
10.7	Cache Performance On A Typical Sequence 187
10.8	Cache Replacement Policy 187
10.9	LRU Replacement 188
10.10	Multilevel Cache Hierarchy 188
10.11	Preloading Caches 189
10.12	Caches Used With Memory 190
10.13	Main Memory Cache 190
10.14	Write Through And Write Back 191
10.15	Cache Coherence 192
10.16	L1, L2, and L3 Caches 193
10.17	Sizes Of L1, L2, And L3 Caches 194
10.18	Instruction And Data Caches 194
10.19	Modified Harvard Architecture 195
10.20	Implementation Of Memory Caching 196
10.21	Direct Mapped Memory Cache 196
10.22	Using Powers Of Two For Efficiency 198
10.23	Hardware Implementation Of A Direct Mapped Cache 199
10.24	Set Associative Memory Cache 201
10.25	Consequences For Programmers 202
10.26	The Relationship Between Virtual Memory And Caching 202
10.27	Virtual Memory Caching And Cache Flush 203
10.28	A Note About Cache Performance 204
10.29	Summary 205
	Exercises 205

Chapter 11 Storage: File Systems, Blocks, And SSDs 209

11.1	Introduction 209
11.2	Persistent Storage Mechanisms And Files 209
11.3	The History Of Persistent Storage 210
11.4	Solid-State Drives (SSDs) 211
11.5	The Block-Oriented Interface 211
11.6	DMA Hardware Used For Block Transfer 212
11.7	Storing Files In 512-Byte Blocks 213
11.8	Blocks Of Cells, Logical Blocks, And Erase-Before-Write 214
11.9	Flash Lifetime 215
11.10	The Internal Structure Of An SSD 215
11.11	Logical Block Storage And Update 216
11.12	Repeated Wear, Location Mapping, And Wear Leveling 217
11.13	Overprovisioning And Bad Block Management 218
11.14	Garbage Collection 218
11.15	Assessment Of SSD Technology 219
11.16	Summary 220
	Exercises 220

Chapter 12 A Programmer's View Of Devices, I\^/\^O, And Buffering 223

12.1	Introduction 223
12.2	Devices And Device Hardware 223
12.3	Device-Independent I\^/\^O And Encapsulation 224
12.4	Conceptual Parts Of A Device Driver 225
12.5	Two Main Categories Of Devices 226
12.6	Example Flow Through A Device Driver 227
12.7	An Input Queue In A Device Driver 228
12.8	An Output Queue In A Device Driver 228
12.9	Isolating I\^/\^O Devices From Applications 229
12.10	The Motivation For A Standard I\^/\O Library 230
12.11	Reducing System Call Overhead 231
12.12	Standard I\^/\^O Functions 231
12.13	Buffered Input 232
12.14	Buffered Output 232
12.15	Flushing A Buffer 233
12.16	Using Buffered I\^/\^O With Devices 233
12.17	The Relationship Between Buffering And Caching 234
12.18	Summary 234
	Exercises 235

Chapter 13 Buses And Bus Architectures 239

13.1	Introduction 239
13.2	Definition Of A Bus 239
13.3	Processors, I\^/\^O Devices, And Buses 240
	13.3.1	Proprietary And Standardized Buses 240
	13.3.2	Shared Buses And An Access Protocol 241
	13.3.3	Multiple Buses 241
	13.3.4	A Parallel Vs. Serial Bus 241
13.4	Physical Bus Connections And Sockets 242
13.5	Control, Address, And Data Lines In A Bus 242
13.6	The Fetch-Store Paradigm 243
13.7	Fetch-Store Operations On A Parallel Bus 244
13.8	Data Transfer Rate, Bus Width, And Serial Buses 244
13.9	Large Data Transfers And Direct Memory Access 246
13.10	Bus Transfer Size And The Size Of Data Items 246
13.11	Bus Address Space 247
13.12	Invalid Addresses And Bus Errors 248
13.13	Memory Addresses And Sockets 249
13.14	The Question Of Multiple Buses 249
13.15	Using Fetch-Store With Devices 250
13.16	Operation Of An Interface 251
13.17	Asymmetric Specifications And Bus Errors 252
13.18	An Example Bus Address Space And Address Map 252
13.19	Holes In A Bus Address Space And Utilization 253
13.20	The Program Interface To A Bus 254
13.21	Bridging Between Two Buses 255
13.22	An Example Bridge Mapping 256
13.23	Consequences For Device Driver Programmers 257
13.24	Switching Fabrics As An Alternative To Buses 257
13.25	Summary 259
	Exercises 259

Chapter 14 Programming Devices And Interrupt-Driven I/O 263

14.1	Introduction 263
14.2	The Two I\^/\^O Paradigms 263
14.3	Programmed I\^/\^O 264
14.4	Synchronization 265
14.5	Synchronization Using Polling 265
14.6	Code For Polling 266
14.7	Control And Status Registers 268
14.8	Using A Struct To Define CSRs 269
14.9	Processor Use And Polling 270
14.10	Interrupt-Driven I\^/\^O 270
14.11	An Interrupt Mechanism And Fetch-Execute 271
14.12	Handling An Interrupt 272
14.13	Interrupt Vectors 273
14.14	Interrupt Initialization And Disabled Interrupts 274
14.15	Interrupting An Interrupt Handler 274
14.16	Configuration Of Interrupts 275
14.17	Dynamic Bus Connections And Pluggable Devices 276
14.18	Interrupts, Performance, And Smart Devices 276
14.19	Smart Devices, DMA, And Offloading 278
14.20	Extending DMA With Buffer Chaining 278
14.21	Scatter Read And Gather Write Operations 279
14.22	Operation Chaining 280
14.23	Summary 280
	Exercises 281

Chapter 15 Data Paths And Instruction Execution 285

15.1	Introduction 285
15.2	Data Paths 285
15.3	An Example Instruction Set 286
15.4	Instructions In Memory 288
15.5	Moving To The Next Instruction 290
15.6	Fetching An Instruction 292
15.7	Decoding An Instruction 292
15.8	Connections To A Register Unit 294
15.9	Control And Coordination 294
15.10	Arithmetic Operations And Multiplexing 295
15.11	Operations Involving Data In Memory 296
15.12	Example Execution Sequences 297
15.13	Summary 298
	Exercises 299

Chapter 16 CPUs: Microcode, Protection, And Processor Modes 303

16.1	Introduction 303
16.2	A Central Processor 303
16.3	CPU Complexity 304
16.4	Modes Of Execution 305
16.5	Backward Compatibility 305
16.6	Changing Modes 306
16.7	Privilege And Protection 306
16.8	Multiple Levels Of Protection 306
16.9	Microcoded Instructions 307
16.10	Microcode Variations 309
16.11	The Advantage Of Microcode 309
16.12	FPGAs And Changes To An Instruction Set 310
16.13	Vertical Microcode 311
16.14	Horizontal Microcode 311
16.15	Example Horizontal Microcode 313
16.16	A Horizontal Microcode Example 314
16.17	Operations That Require Multiple Cycles 315
16.18	Horizontal Microcode And Parallel Execution 316
16.19	Look-Ahead And High-Performance Execution 317
16.20	Parallelism And Execution Order 318
16.21	Out-Of-Order Instruction Execution 318
16.22	Conditional Branches And Branch Prediction 319
16.23	Consequences For Programmers 320
16.24	Summary 320
	Exercises 321

Chapter 17 Parallelism 325

17.1	Introduction 325
17.2	Parallel And Pipelined Architectures 325
17.3	Characterizations Of Parallelism 326
17.4	Microscopic Vs. Macroscopic 326
17.5	Examples Of Microscopic Parallelism 327
17.6	Examples Of Macroscopic Parallelism 327
17.7	Symmetric Vs. Asymmetric 328
17.8	Fine-Grain Vs. Coarse-Grain Parallelism 328
17.9	Explicit Vs. Implicit Parallelism 329
17.10	Types Of Parallel Architectures (Flynn Classification) 329
17.11	Single Instruction Single Data (SISD) 330
17.12	Single Instruction Multiple Data (SIMD) 330
17.13	Multiple Instructions Multiple Data (MIMD) 332
17.14	Communication, Coordination, And Contention 334
17.15	Performance Of Multiprocessors 335
17.16	Consequences For Programmers 337
	17.16.1	Locks And Mutual Exclusion 337
	17.16.2	Programming Explicit And Implicit Parallel Computers 339
	17.16.3	Programming Symmetric Vs. Asymmetric Multiprocessors 340
17.17	Redundant Parallel Architectures 340
17.18	Distributed And Cluster Computers 341
17.19	A Modern Supercomputer 342
17.20	Summary 342
	Exercises 343

Chapter 18 Data Pipelining 347

18.1	Introduction 347
18.2	The Concept Of Pipelining 347
18.3	Software Pipelining 349
18.4	Hardware Pipelining 349
18.5	How Hardware Pipelining Increases Performance 350
18.6	When Pipelining Can Be Used 352
18.7	The Conceptual Division Of Processing 353
18.8	Pipeline Architectures 354
18.9	Pipeline Setup, Stall, And Flush Times 355
18.10	Definition Of Superpipeline Architecture 355
18.11	Summary 356
	Exercises 356

Chapter 19 Assessing Performance 361

19.1	Introduction 361
19.2	Measuring Computational Power And Performance 361
19.3	Measures Of Computational Power 362
19.4	Application-Specific Instruction Counts 363
19.5	Instruction Mix 364
19.6	Standardized Benchmarks 365
19.7	I\^/\^O And Memory Bottlenecks 366
19.8	Moving The Boundary Between Hardware And Software 366
19.9	Choosing Items To Optimize, Amdahl's Law 367
19.10	Amdahl's Law And Parallel Systems 368
19.11	Summary 368
	Exercises 369

Chapter 20 Multicore Processors 373

20.1	Introduction 373
20.2	The Move To Multicore Processor Chips 373
20.3	The Multicore Concept: Parallelism And Shared Memory 374
20.4	Multicore Processor Vs. Multiprocessor 375
20.5	Asymmetry 375
20.6	Direct Communication Among Cores 376
20.7	Tight Coupling And Extremely Low Latency 376
20.8	Shared Access To All I\^/\^O Devices 377
20.9	The Ability To Associate Interrupts With Specific Cores 377
20.10	The Ability To Start And Stop Cores 378
20.11	Shared Memory And Multicore Caching 379
20.12	Cache Inconsistency 379
20.13	Preventing Inconsistency: Cache Coherence 380
20.14	Programming Multicore: Threads And Scheduling 381
20.15	Thread Scheduling And Affinity 382
20.16	Simultaneous Multi-Threading (SMT) 383
20.17	Inter-core Communication Via Messages And Software Interrupts 384
20.18	Mutual Exclusion Among Cores 385
20.19	Locks, And Busy Waiting 386
20.20	Using A Memory Location As A Lock (Test-And-Set) 386
20.21	An Example Of Test-And-Set 387
20.22	Atomic Update And Cache Coherence 388
20.23	Summary 388

Chapter 21 Power And Energy 391

21.1	Introduction 391
21.2	Definition Of Power 391
21.3	Definition Of Energy 392
21.4	Power Consumption By A Digital Circuit 393
21.5	Switching Power Consumed By A CMOS Digital Circuit 394
21.6	Cooling, Power Density, And The Power Wall 395
21.7	Energy Use 395
21.8	Power Management 396
	21.8.1	Voltage And Delay 396
	21.8.2	Decreasing Clock Frequency 397
	21.8.3	Slower Clock Frequency And Multicore Processors 398
21.9	Software Control Of Energy Use 399
21.10	Choosing When To Sleep And When To Awaken 400
21.11	Sleep Modes And Network Devices 402
21.12	Summary 402
	Exercises 403

Chapter 22 Building Blocks: Transistors, Gates, And Clocks 407

22.1	Introduction 407
22.2	The History Of Digital Technologies 407
22.3	Electrical Terminology: Voltage And Current 408
22.4	The Transistor 408
22.5	Logic Gates 410
22.6	Implementation Of A Nand Logic Gate Using Transistors 412
22.7	Symbols Used For Logic Gates 413
22.8	Example Interconnection Of Gates 413
22.9	A Digital Circuit For Binary Addition 416
22.10	Multiple Gates Per Integrated Circuit 418
22.11	The Need For More Than Combinatorial Circuits 418
22.12	Circuits That Maintain State 419
22.13	Feedback And Propagation Delay 420
22.14	Using Latches To Create A Memory 420
22.15	Flip-Flops And Transition Diagrams 421
22.16	Binary Counters 423
22.17	Clocks, Feedback, And Sequences 424
22.18	The Importance Of Feedback 427
22.19	Starting A Sequence 428
22.20	Iteration In Software Vs. Replication In Hardware 429
22.21	Gate And Chip Minimization 430
22.22	Using Spare Gates 430
22.23	Power Distribution And Heat Dissipation 431
22.24	Timing And Clock Zones 431
22.25	Clockless Logic 433
22.26	Circuit Size And Moore's Law 434
22.27	Circuit Boards And Layers 435
22.28	Levels Of Abstraction 435
22.29	Summary 436
	Exercises 437

Chapter 23 Hardware Modularity 441

23.1	Introduction 441
23.2	Motivations For Modularity 441
23.3	Software Modularity 442
23.4	Parameterization 442
23.5	Forms Of Hardware Modularity 442
23.6	Modular Chip Construction 443
	23.6.1	SoC Design 443
	23.6.2	Chiplet Technology 443
	23.6.3	Chiplets And Fabrication Flaws 444
23.7	Replication And Parallelism 444
23.8	Basic Block Replication 445
23.9	An Example Modular Design: A Rebooter 445
23.10	High-Level Rebooter Design 445
23.11	A Building Block To Accommodate A Range Of Sizes 446
23.12	Parallel Interconnection 447
23.13	Module Selection 448
23.14	Summary 448
	Exercises 449

Appendix 1 Rules For Boolean Algebra Simplification 451

A1.1	Introduction 451
A1.2	Notation Used 451
A1.3	Rules Of Boolean Algebra 452

Appendix 2 A Quick Introduction To x86 Assembly Language 453

A2.1	Introduction 453
A2.2	The x86 General-Purpose Registers 454
A2.3	Allowable Operands 455
A2.4	Intel And AT&T Forms Of x86 Assembly Language 456
	A2.4.1	Characteristics Of Intel Assembly Language 457
	A2.4.2	Characteristics Of AT&T Assembly Language 457
A2.5	Arithmetic Instructions 458
A2.6	Logical Operations 459
A2.7	Basic Data Types 460
A2.8	Data Blocks, Arrays, And Strings 462
A2.9	Memory References 463
A2.10	Data Size Inference And Explicit Size Directives 464
A2.11	Computing An Address 465
A2.12	The Stack Operations Push And Pop 466
A2.13	Flow Of Control And Unconditional Branch 467
A2.14	Conditional Branch And Condition Codes 467
A2.15	Subprogram Call And Return 468
A2.16	C Calling Conventions And Argument Passing 469
A2.17	Function Calls And A Return Value 471
A2.18	Extensions To Sixty-Four Bits (x64) 471
A2.19	Summary 472

Appendix 3 ARM Register Definitions And Calling Sequence 475

A3.1	Introduction 475
A3.2	Registers On An ARM Processor 475
A3.3	ARM Calling Conventions 476

Appendix 4 Lab Exercises For A Computer Architecture Course 481

A4.1	Introduction 481
A4.2	Hardware Required for Digital Logic Experiments 482
A4.3	Solderless Breadboard 482
A4.4	Using A Solderless Breadboard 483
A4.5	Power And Ground Connections 484
A4.6	Building And Testing Circuits 484
A4.7	Lab Exercises 485

Table of Contents For Essentials Of Computer Architecture

Preface xxi

Chapter 1 Introduction And Overview 3

Chapter 2 Program Interpretation And Transformation 11

Chapter 3 Data And Program Representation 27

Chapter 4 A High-Level Overview Of Processors 53

Chapter 5 Instruction Sets And Operands 71

Chapter 6 Operand Addressing And Operand Types 97

Chapter 7 Assembly Languages And Programming Paradigm 111

Chapter 8 Main Memory And Memory Addressing 135

Chapter 9 Virtual Memory Technologies And Virtual Addressing 159

Chapter 10 Caches And Caching 183

Chapter 11 Storage: File Systems, Blocks, And SSDs 209

Chapter 12 A Programmer's View Of Devices, I\^/\^O, And Buffering 223

Chapter 13 Buses And Bus Architectures 239

Chapter 14 Programming Devices And Interrupt-Driven I/O 263

Chapter 15 Data Paths And Instruction Execution 285

Chapter 16 CPUs: Microcode, Protection, And Processor Modes 303

Chapter 17 Parallelism 325

Chapter 18 Data Pipelining 347

Chapter 19 Assessing Performance 361

Chapter 20 Multicore Processors 373

Chapter 21 Power And Energy 391

Chapter 22 Building Blocks: Transistors, Gates, And Clocks 407

Chapter 23 Hardware Modularity 441

Appendix 1 Rules For Boolean Algebra Simplification 451

Appendix 2 A Quick Introduction To x86 Assembly Language 453

Appendix 3 ARM Register Definitions And Calling Sequence 475

Appendix 4 Lab Exercises For A Computer Architecture Course 481

Index 507

Table of Contents For

Essentials Of

Computer Architecture