Table of Contents
1 Introduction to the Linux Kernel
Along Came Liuns:Introduction to Linux
Overview of Operation Systems and Kernels
Linux Versrs Classic Unix Kernels
Linux Kerner Verisions
The Linux Kernel Development Community
Before We Begin
2 Gettion Started with the Kernel
Obtaining the Kernel Source
Installing the Kernel Source
Using Patches
The Kernel Source Tree
Building the Kernel
Mini8mixing Build Noise
Spawning Multiple Build Jobs
Installing the Kernel
A Beast of a Different Nature
No Libc
GNUC
No Memory Protection
No(Easy)Use of Floating Point
Small,Fixed-Size Stack
Synchronization and Concurrency
Portability Is Important
So Here We Are
3 Process Management
Process Descriptor and the Task Structure
Alloacting the Process Descriptor
Storing the Process Descriptor
Process State
Manipulation the Current Process State
Process Context
The Process Family Tree
Process Creation
Copy-on-Write
fork()
vford()
The Linux Implementation of Threads
Kernel Threads
Process Termination
Removal of the Process Descriptor
The Dilemma of the Parentless Task
Process Wrap Up
4 Process Scheduling
Policy
I/O-Bound Versus Processor-Bound Processes
Process Priority
Timeslice
Process Preemption
The Scheduling Policy in Action
The Linux Scheduling Algorithm
Rnnqueues
schedule()
Calculating Priority and Timeslice
Sleeping and Waking Up
The Load Balancer
Preemption and Context Switching
User Preemption
Kernel Preemption
Real-Time
Scheduler-Related System Calls
Scheduling Policy and Priority-Related System Calls
Yielding Processor Time 62
Scheduler Finale 62
System Calls 63
APIs, POSIX, and the C Library 64
Syscalls 65
System Call Numbers 65
System Call Performance 66
System Call Handler 66
Denoting the Correct System Call 67
Parameter Passing 67
System Call Implementation 68
Verifying the Parameters 68
System Call Context 70
Final Steps in Binding a System Call 71
Accessing the System Call from User-Space 72
Why Not to Implement a System Call 73
System Calls in Conclusion 74
Interrupts and Interrupt Handlers 75
Interrupts 75
Interrupt Handlers 76
Top Halves Versus Bottom Halves 77
Registering an Interrupt Handler 77
Freeing an Interrupt Handler 79
Writing an Interrupt Handler 80
Shared Handlers 81
A Real-Life Interrupt Handler 82
Interrupt Context 84
Implementation of Interrupt Handling 85
/proc/interrupts 87
Interrupt Control 88
Disabling and Enabling Interrupts 89
Disabling a Specific Interrupt Line 90
Status of the Interrupt System 91
Don't Interrupt Me;We're Almost Done! 92
Bottom Halves and Deferring Work 93
Bottom Halves 94
Why Bottom Halves? 94
A World of Bottom Halves 95
Softirqs 97
Implementation of Softirqs 97
Using Softirqs 100
Tasklets 101
Implementation of Tasklets 101
UsingTasklets 104
ksoftirqd 105
The Old BH Mechanism 107
Work Queues 108
Implementation of Work Queues 108
UsingWork Queues 111
The Old Task Queue Mechanism 114
Which Bottom Half Should I Use? 115
Locking Between the Bottom Halves 116
Disabling Bottom Halves 116
The Bottom of Bottom-Half Processing 118
Kernel Synchronization Introduction 119
Critical Regions and Race Conditions 120
Why Do We Need Protection? 120
Locking 122
What Causes Concurrency, Anyway? 124
So, How Do I Know What Needs Protecting?125
Deadlocks 126
Contention and Scalability 128
Locking andYour Code 130
9 Kernel Synchronization Methods 131
Atomic Operations 131
Atomic Integer Operations 132
Atomic Bitwise Operations 134
Spin Locks 137
Other Spin Lock Methods 140
Spin Locks and Bottom Halves 140
Reader-Writer Spin Locks 141
Semaphores 143
Creating and Initializing Semaphores 145
Using Semaphores 145
Reader-Writer Semaphores 146
Spin Locks Versus Semaphores 148
Completion Variables 148
BKL:The Big Kernel Lock 149
Seq Locks 150
Preemption Disabling 151
Ordering and Barriers 153
Synchronization Summarization 156
10 Timers and Time Management 157
Kernel Notion of Time 158
The Tick Rate: HZ 158
The Ideal HZValue 160
Jiffies 162
Internal Representation of Jiffies 163
Jiffies Wraparound 164
User-Space and HZ 165
Hardware Clocks and Timers 166
Real-Time Clock 166
System Timer 167
The Timer Interrupt Handler 167
TheTime of Day 169
Timers 171
Using Timers 172
Timer Race Conditions 173
The Timer Implementation 174
Delaying Execution 174
Busy Looping 175
Small Delays 176
schedule_timeout0 177
Out of Time 179
11 Memory Management 181
Pages 381
Zones 183
Getting Pages 185
Getting Zeroed Pages 186
Freeing pages 186
kmalloc() 187
gfp_mask Flags 188
kfree0 192
vmalloc() 193
Slab Layer 194
Design of the Slab Layer 195
Slab Allocator Interface 198
Statically Allocating on the Stack 201
Playing Fair on the Stack 202
High Memory Mappings 202
Permanent Mappings 203
Temporary Mappings 203
Per-CPU Allocations 204
The New percpu Interface 205
Per-CPU Data at Compile-Time 205
Per-CPU Data at Runtime 206
Reasons for Using Per-CPU Data 207
Which Allocation Method Should I Use? 208
12 The Virtual Filesystem 209
Common Filesystem Interface 209
Filesystem Abstraction Layer 210
Unix Filesystems 211
VFS Objects andTheir Data Structures 212
OtherVFS Objects 213
The Superblock Object 213
Superblock Operations 214
The lnode Object 217
Inode Operations 219
The Dentry Object 222
Dentry State 223
The Dentry Cache 223
Dentry Operations 224
The File Object 226
File Operations 227
Data Structures Associated with Filesystems 231
Data Structures Associated with a Process 232
Filesystems in Linux 234
13 The Block I/O Layer 235
Anatomy of a Block Device 236
Buffers and Buffer Heads 237
The bio structure 239
The OldVersus the New 242
Request Queues 242
Requests 243
I/O Schedulers 243
The Job of an I/O Scheduler 244
The Linus Elevator 244
The Deadline I/O Scheduler 245
The Anticipatory I/O Scheduler 247
The Complete Fair Queuing I/O Scheduler
248
The Noop I/O Scheduler 249
I/O Scheduler Selection 249
Summary 250
14 The Process Address Space 251
The Memory Descriptor 252
Allocating a Memory Descriptor 254
Destroying a Memory Descriptor 255
The mm_struct and KernelThreads 255
Memory Areas 255
VMA Flags 257
VMA Operations 258
Lists and Trees of Memory Areas 259
Memory Areas in ILeal Life 260
Manipulating Memory Areas 261
find_vma0 262
find_vma_prev0 263
find_vma_intersection0 263
mmap0 and do_mmap0: Creating an Address Interval
264
The mmap0 System Call 265
munmap0 and do_munmap0: P,.emoving an Address
Interval 266
The munmap0 System Call 266
Page Tables 266
Conclusion 268
15 The Page Cache and Page Writeback 269
Page Cache 270
The address_space Object 270
P~adix Tree 273
The Old Page Hash Table 273
The Buffer Cache 273
The pdflush Daemon 274
Laptop Mode 275
bdflush and kupdated 276
Congestion Avoidance: Why We Have Multiple
Threads 276
To Make a Long Story Short 277
16 Modules 279
Hello, World! 279
Building Modules 281
At Home in the Source Tree 281
Living Externally 282
Installing Modules 283
Generating Module Dependencies 283
Loading Modules 283
Managing Configuration Options 284
Module Parameters 286
Exported Symbols 288
Wrapping Up Modules 289
17 kobjects and sysfs 291
kobjects 292
ktypes 293
ksets 293
Subsystems 294
Structure Confusio 294
Managing and Manipulating kobjects 295
Reference Counts 296
krefs 297
sysfs 298
Adding and Removing kobjects from sysfs 300
Adding Files to sysfs 300
The Kernel Events Layer 303
kobjects and sysfs in a Nutshell 305
18 Debugging 307
WhatYou Need to Start 307
Bugs in the Kernel 308
printk0 308
The Robustness of printk0 309
Loglevels 309
The Log Buffer 310
syslogd and ldogd 311
A Note About printk0 and Kernel Hacking311
Oops 311
ksymoops 313
kallsyms 313
Kernel Debugging Options 313
Atomicity Debugging 314
Asserting Bugs and Dumping Information 314
Magic SysRq Key 315
The Saga of a Kernel Debugger 316
gdb 316
kgdb 317
kdb 317
Poking and Probing the System 318
Using UID as a Conditional 318
Using Condition Variables 318
Using Statistics 318
Rate LimitingYour Debugging 319
Binary Searching to Find the Culprit Change 320
When M1 Else Fails:The Community 320
19 Portability 321
History of Portability in Linux 322
Word Size and Data Types 323
Opaque Types 325
Special Types 326
Explicitly Sized Types 326
Signedness of Chars 327
Data Alignment 328
Avoiding Alignment Issues 328
Alignment of Nonstandard Types 329
Structure Padding 329
Byte Order 330
History of Big- and Little-Endian 332
Byte Ordering in the Kernel 332
Time 332
Page Size 333
Processor Ordering 334
SME Kernel Preemption, and High Memory 334
Portability Is Fun 334
20 Patches, Hacking, and the Community 335
The Community 335
Linux Coding Style 33.6
Indention 336
Braces 336
Line Size 337
Naming 338
Functions 338
Comments 338
Typedefs 339
Using What Is Already Provided 339
No ifdefs in the Source 340
Structure Initializers 340
Fixing Code Up Ex Post Facto 340
Chain of Command 341
Submitting Bug Reports 341
Generating Patches 342
Submitting Patches 343
Conclusion 343
A Linked Lists 345
Circular Linked Lists 346
Moving Through a Linked List 347
The Linux Kernel's Implementation 347
The Linked-List Structure 348
Manipulating Linked Lists 349
Traversing Linked Lists 351
B Kernel Random Number Generator 353
Design and Implementation 354
The Dilemma of System Startup 356
Interfaces to Input Entropy 356
Interfaces to Output Entropy 357
C Algorithmic Complexity 359
Algorithms 359
Big-O Notation 359
Big Theta Notation 360
Putting It All Together 360
Perils of Time Complexity 361
Bibliography and Reading List 363
Books on Operating System Design 363
Books on Unix Kernels 364
Books on Linux Kernels 364
Books on Other Kernels 364
Books on the UnixAPI 365
Books on the C Programnming Language 365
Other Works 365
Websites 366
Index 367