Table of Contents

This document provides an overview of the Solaris 10 and OpenSolaris kernel architecture and internals. It covers key features of Solaris versions, the kernel architecture, processes and threading, scheduling, interprocess communication, and more. The book is intended to help readers understand the internal structures and algorithms used in the Solaris kernel.

Solaris™ Internals

Second Edition
Solaris 10 and OpenSolaris Kernel Architecture

Richard McDougall
Jim Mauro

Sun Microsystems Press

PRENTICE
HALL

Upper Saddle River, NJ • Boston • Indianapolis • San Francisco
New York • Toronto • Montreal • London • Munich • Paris • Madrid
Capetown • Sydney • Tokyo • Singapore • Mexico City
Contents

Foreword xxvii
Preface xxix
About the Authors xxxvii
Acknowledgments xxxix

PART ONE
Introduction to Solaris Internals 1

Chapter 1 Introduction 3
1.1 Key Features of Solaris 10, Solaris 9, and Solaris 8 4
1.1.1 Solaris 10 5
1.1.2 Solaris 9 8
1.1.3 Solaris 8 10
1.2 Key Differentiators 12
1.3 Kernel Overview 15
1.3.1 Solaris Kernel Architecture 16
1.3.2 Modular Implementation 17
1.4 Processes, Threads, and Scheduling 18
1.4.1 A New Threads Model 20
1.4.2 Global Process Priorities and Scheduling 22
1.5 Interprocess Communication 23
1.5.1 Traditional UNIX IPC 24
1.5.2 System V IPC 24
1.5.3 POSIX IPC 25
1.5.4 Solaris Doors: Advanced Solaris IPC 25
1.6 Signals 25
1.7 Memory Management 26
1.7.1 Global Memory Allocation 27
1.7.2 The Cyclic Page Cache 28
1.7.3 Kernel Memory Management 28
1.8 Files and File Systems 29
1.9 Resource Management 30
1.9.1 Processor Controls and Domains 33
1.9.2 Solaris Resource Management 35
1.9.3 Internet Protocol Quality of Service 38
1.9.4 Resource Management and Observability 38

PART TWO
The Process Model 41

Chapter 2 The Solaris Process Model 43


2.1 Components of a Process 44
2.1.1 Thread Objects 44
2.1.2 Core Process Components 47
2.2 Process Model Evolution 48
2.2.1 Thread Model Evolution 49
2.2.2 Unified Process Model 50
2.3 Executable Objects 52
2.4 Process Structures 55
2.4.1 The proc Structure 56
2.4.2 User Area 66
2.4.3 Lightweight Processes (LWPs) 69
2.4.4 Kernel Threads 73
2.5 Kernel Process Table 79
2.5.1 Process Limits 80
2.5.2 Thread Limits 83
2.6 Process Resource Attributes 84
2.7 Process Creation 89
2.8 System Calls 98
2.8.1 System Calls on SPARC Architectures 99
2.8.2 A Tour through a System Call 101
2.9 Process Termination 106
2.9.1 LWP and Kernel Thread Exit 108
2.9.2 Deathrow List 109
2.10 The Process File System 110
2.10.1 Procfs Implementation 113
2.10.2 Process Resource Usage 123
2.10.3 Microstate Accounting 125
2.11 Signals 129
2.11.1 Signals Implementation 135
2.11.2 Observing Signal Activity 148
2.11.3 Summary 149
2.12 Sessions and Process Groups 150
2.13 MDB Reference 156

Chapter 3 Scheduling Classes and the Dispatcher 157


3.1 Fundamentals 157
3.2 Processor Abstractions 162
3.2.1 Processor Observability 168
3.3 Dispatcher Queues, Structures, and Variables 171
3.3.1 Dispatcher Structures 172
3.3.2 Dispatcher Structure Linkage 175
3.3.3 Examining Dispatcher Structures 177
3.4 Dispatcher Locks 183
3.4.1 Dispatcher Lock Functions 186
3.4.2 Thread Locks 187
3.4.3 Thread Lock Functions 188
3.4.4 Lock Statistics 189
3.5 Dispatcher Initialization 190
3.6 Scheduling Classes 192
3.6.1 Scheduling Class Data 193
3.6.2 Scheduling Class Functions 198
3.6.3 Scheduling Class Dispatcher Tables 202
3.7 Thread Priorities 207
3.7.1 Global Priorities 208
3.7.2 User Priorities 209
3.7.3 Setting Thread Priorities 211
3.8 Dispatcher Functions 234
3.8.1 Dispatcher Queue Management 234
3.8.2 The Heart of the Dispatcher: swtch() 242
3.9 Preemption 246
3.10 The Kernel Sleep/Wakeup Facility 253
3.10.1 Condition Variables 253
3.10.2 Sleep Queues 255
3.10.3 The Sleep Process 257
3.10.4 The Wakeup Mechanism 261
3.11 Interrupts 262
3.11.1 Interrupt Priorities 264
3.11.2 Interrupts as Threads 264
3.11.3 Interrupt Thread Priorities 266
3.11.4 High-Priority Interrupts 266
3.11.5 Interrupt Management 267
3.11.6 Interrupt Monitoring 267
3.11.7 Interprocessor Interrupts and Cross-Calls 268
3.12 Summary 270
3.13 MDB Reference 271

Chapter 4 Interprocess Communication 273


4.1 The System V IPC Framework 274
4.1.1 IPC Objects 274
4.1.2 IPC Framework Design 275
4.1.3 Locking 277
4.1.4 Module Creation 280
4.2 System V IPC Resource Controls 282
4.2.1 The Solution 283
4.3 Configuring IPC Tuneables on Solaris 10 285
4.4 System V Shared Memory 286
4.4.1 Shared Memory Kernel Implementation 288
4.4.2 Intimate Shared Memory (ISM) 291
4.4.3 Dynamic ISM Shared Memory 294
4.5 System V Semaphores 295
4.5.1 Semaphore Kernel Resources 296
4.5.2 Kernel Implementation of System V Semaphores 297
4.5.3 Semaphore Operations 297
4.6 System V Message Queues 299
4.6.1 Kernel Resources for Message Queues 299
4.6.2 Kernel Implementation of Message Queues 301
4.7 POSIX IPC 303
4.7.1 POSIX Shared Memory 304
4.7.2 POSIX Semaphores 305
4.7.3 POSIX Message Queues 309
4.8 Solaris Doors 312
4.8.1 Doors Overview 313
4.8.2 Doors Implementation 314
4.9 MDB Reference 321

Chapter 5 Process Rights Management 323


5.1 Then and Now 323
5.2 Least Privilege in Solaris 324
5.3 Process Privilege Models 325
5.3.1 The Traditional Solaris Superuser Model 326
5.3.2 Extending Solaris with Process Privileges 327
5.3.3 How the Solaris 10 Least Privilege Model Was Chosen 328
5.3.4 Other UNIX Implementations 331
5.4 Privilege Awareness: The Details 334
5.4.1 Per-Process State 334
5.4.2 Privilege Awareness State Transitions 334
5.4.3 Privilege State Manipulation 335
5.4.4 Privilege Escalation Prevention 340
5.4.5 The Trouble with uid 0 340
5.4.6 Basic Privileges 342
5.4.7 Privileges and the Runtime Environment 342
5.4.8 Privileges and NFS 343
5.4.9 Privileges and Third-Party File Systems 344
5.5 Least Privilege Interfaces 344
5.5.1 The Conspiracy of Bit Sets and Constants 345
5.5.2 Privilege Names and Constants 346
5.5.3 Kernel Data Structures 346
5.5.4 Kernel Interfaces 349
5.5.5 System Call Interfaces 351
5.5.6 Library Interfaces 353
5.5.7 Using Privileges with Role-Based Access Control 357
5.5.8 Using Privileges with Role-Based Access Control 359
5.5.9 Using DTrace for Tracking Privileges 360
5.5.10 Enhancements to proc(4) and Core Dumps 360
5.5.11 Privilege Debugging 361
5.5.12 Privilege Auditing 362
5.5.13 Device Protection 362

PART THREE
Resource Management 365

Chapter 6 Zones 367


6.1 Introduction 367
6.1.1 Zone Basics 368
6.1.2 Zone Principles 370
6.2 Zone Runtime 371
6.2.1 Zone State Model 371
6.2.2 Zone Names and Numeric IDs 372
6.2.3 Zone Runtime Support 373
6.2.4 Listing Zone Information 374
6.3 Booting Zones 375
6.4 Security 379
6.4.1 Credential Handling 380
6.4.2 Fine-Grained Privileges 380
6.4.3 Role-Based Access Control 385
6.4.4 chroot Interactions 385
6.5 Process Model 386
6.5.1 Signals and Process Control 386
6.5.2 Global Zone Visibility and Access 387
6.5.3 /proc 387
6.5.4 Core Files 389
6.6 File Systems 389
6.6.1 Configuration 389
6.6.2 Size Restrictions 390
6.6.3 File System-Specific Issues 390
6.6.4 File System Traversal Issues 392
6.7 Networking 393
6.7.1 Partitioning 394
6.7.2 Interfaces 395
6.7.3 IPv6 396
6.7.4 IPsec 397
6.7.5 Raw IP Socket Access 397
6.7.6 DLPI Access 398
6.7.7 Routing 398
6.7.8 TCP Connection Teardown 398
6.8 Devices 398
6.8.1 Device Categories 399
6.8.2 /dev and /devices Namespace 400
6.8.3 Device Management: Zone Configuration 401
6.8.4 Device Management: Zone Runtime 401
6.8.5 Zone Console Design 402
6.8.6 ftpd 404
6.9 Interprocess Communication 405
6.9.1 Pipes, STREAMS, and Sockets 405
6.9.2 Doors 405
6.9.3 Loopback Transport Providers 406
6.9.4 System V IPC 406
6.9.5 POSIX IPC 407
6.10 Resource Management and Observability 407
6.10.1 Performance 409
6.10.2 Solaris Resource Management Interactions 410
6.10.3 Kstats 412
6.11 MDB Reference 414

Chapter 7 Projects, Tasks, and Resource Controls 415


7.1 Projects and Tasks Framework 415
7.1.1 Introduction 415
7.1.2 Projects 416
7.1.3 Tasks 416
7.1.4 Why We Added Tasks to Solaris 417
7.2 The Project Database 418
7.3 Project and Task APIs 419
7.3.1 Interfaces for Projects and Tasks 419
7.4 Kernel Infrastructure for Projects and Tasks 420
7.4.1 System Call Interaction with Projects 421
7.4.2 proc(4) 421
7.4.3 In-Kernel Project Data Structures 421
7.5 Resource Controls 423
7.5.1 Introduction to Resource Controls 424
7.5.2 What Is an rctl? 424
7.5.3 Numeric Values of Resource Controls 426
7.5.4 Resource Control Definitions 426
7.5.5 Policy 428
7.5.6 Consequences of Exceeding an rctl 429
7.5.7 Signal and siginfo Semantics for Exceeded Controls 430
7.5.8 Generalizing Hard and Soft Limits 431
7.5.9 Resource Controls and the Task 431
7.5.10 Visibility through /proc; Privileges and Ownership 432
7.6 Interfaces for Resource Controls 432
7.6.1 Project Name-Service Attributes 433
7.6.2 Attributes Originating within Solaris 433
7.6.3 Grammar for Attributes 433
7.6.4 Interpretation of rctl Attributes 433
7.6.5 An Example /etc/project 435
7.6.6 System Calls and Private Kernel Interfaces 436
7.6.7 Library Functions 436
7.7 Kernel Interfaces for Resource Controls 437
7.7.1 Data Structures 438
7.7.2 Operations Vector 439
7.7.3 Interface Overview 440
7.7.4 Interface Definitions 441
7.7.5 An Example Resource Control 442

PART FOUR
Memory 445

Chapter 8 Introduction to Solaris Memory 447


8.1 Virtual Memory Primer 447
8.2 Two Levels of Memory 448
8.3 Memory Sharing and Protection 448
8.4 Pages: Basic Units of Physical Memory 448
8.5 Virtual-to-Physical Translation 449
8.6 Physical Memory Management: Paging and Swapping 450
8.7 Virtual Memory as a File System Cache 450
8.8 New Features of the Virtual Memory Implementation 451

Chapter 9 Virtual Memory 455


9.1 Design Overview 455
9.2 Virtual Address Spaces 457
9.2.1 Sharing Executables and Libraries 458
9.2.2 Address Spaces on SPARC Systems 459
9.2.3 x86 and x64 Address Space Layout 461
9.2.4 Growing the Heap 461
9.2.5 The Stack 462
9.2.6 Using pmap to Look at Mappings 465
9.3 Tracing the VM System 466
9.4 Virtual Address Space Management 467
9.4.1 Address Space Management 467
9.4.2 Address Space Callbacks 472
9.4.3 Virtual Memory Protection Modes 473
9.4.4 Page Faults in Address Spaces 473
9.5 Segment Drivers 476
9.5.1 The vnode Segment: seg_vn 481
9.5.2 Copy-on-Write 484
9.5.3 Page Protection and Advice 484
9.6 Anonymous Memory 485
9.7 The Anonymous Memory Layer 487
9.8 The swapfs Layer 489
9.8.1 swapfs Implementation 489
9.9 Virtual Memory Watchpoints 492
9.10 Changes to Support Large Pages 494
9.10.1 System View of a Large Page 494
9.10.2 Free List Organization 495
9.10.3 Large-Page Faulting 495
9.10.4 Large-Page Freeing 499
9.10.5 Operations That Interfere with Large Pages 499
9.10.6 HAT Support 500
9.10.7 procfs Changes 501
9.11 MDB Reference 501

Chapter 10 Physical Memory 503


10.1 Physical Memory Allocation 503
10.1.1 The Allocation Cycle of Physical Memory 503
10.2 Pages: The Basic Unit of Solaris Memory 506
10.2.1 The Page Hash List 507
10.2.2 Page Structures 508
10.2.3 Free List and Cache List 509
10.2.4 Physical Page "memseg" Lists 509
10.2.5 The Page-Level Interfaces 510
10.2.6 The Page Throttle 512
10.2.7 Page Coloring 512
10.3 The Page Scanner 516
10.3.1 Page Scanner Operation 517
10.3.2 Page-Out Algorithm and Parameters 518
10.3.3 Shared Library Optimizations 520
10.3.4 Parameters That Limit Pages Paged Out 521
10.3.5 Page Scanner Implementation 522
10.3.6 The Memory Scheduler 524
10.4 MDB Reference 525

Chapter 11 Kernel Memory 527


11.1 Kernel Virtual Memory Layout 527
11.1.1 Kernel Address Space 528
11.1.2 Kernel Text and Data Segments 528
11.1.3 Virtual Memory Data Structures 530
11.1.4 UltraSPARC Kernel Nucleus 531
11.1.5 Loadable Kernel Module Text and Data 531
11.1.6 The Kernel Address Space and Segments 533
11.2 Kernel Memory Allocation 534
11.2.1 The Kernel Heap 534
11.2.2 The Kernel Memory Segment Driver 535
11.2.3 The Kernel Memory Slab Allocator 537
11.3 The Vmem Allocator 552
11.3.1 Background 552
11.3.2 Vmem Objectives 553
11.3.3 Interface Description 553
11.3.4 Vmem Implementation 556
11.3.5 Vmem Performance 560
11.3.6 Summary 561
11.4 Kernel Memory Allocator Tracing 562
11.4.1 Enabling KMA DEBUG Flags 562
11.4.2 Examining Kernel Memory Allocations with MDB 563
11.4.3 Detecting Memory Corruption 565
11.4.4 Checking a Freed Buffer: 0xdeadbeef 566
11.4.5 Debugging with the Redzone Indicator: 0xfeedface 566
11.4.6 Detecting Uninitialized Data: 0xbaddcafe 569
11.4.7 Associating Panic Messages with Failures 570
11.4.8 Memory Allocation Logging 570
11.4.9 Analyzing Memory with Advanced Techniques 573
11.4.10 Finding Corrupt Buffers with ::kmem_verify 575
11.4.11 Using the Allocator Logging Facility 576
11.5 MDB Reference 578

Chapter 12 Hardware Address Translation 581


12.1 HAT Overview 581
12.2 The UltraSPARC HAT Layer 583
12.2.1 Introduction 583
12.2.2 struct hat 585
12.2.3 The Translation Table 588
12.2.4 The Translation Storage Buffer (TSB) 601
12.2.5 Intimate Shared Memory (ISM) 613
12.2.6 Synchronization in the HAT Layer 616
12.2.7 SPARC HAT Layer Kernel Tunables 620
12.2.8 SPARC HAT Layer kstats 621
12.3 The x64 HAT Layer 625
12.3.1 MMU Configuration 625
12.3.2 struct mmu Variable 627
12.3.3 Virtual Address Space Layout 628
12.3.4 64-Bit Address Space Layout 629
12.3.5 32-Bit Address Space Layout 629
12.3.6 HAT Implementation 631
12.4 MDB Reference 636

Chapter 13 Working with Multiple Page Sizes in Solaris 639


13.1 Determining When to Use Large Pages 639
13.2 Measuring Application Performance 640
13.2.1 Determining Allocated Page Sizes 642
13.2.2 Discovery of Supported Page Sizes 644
13.3 Configuring for Multiple Page Sizes 645
13.3.1 Enabling Large Pages 646
13.3.2 Advising Page-Size Preferences with ppgsz(1M) 646
13.3.3 Interposing Shared Libraries with libmpss.so 647
13.3.4 Request Larger Page Sizes with the Compiler 648
13.3.5 Interfaces to Request Larger Page Sizes 649
13.3.6 CPU Specific Large Page Support 652

PART FIVE
File Systems 655

Chapter 14 File System Framework 657


14.1 File System Framework 657
14.2 Process-Level File Abstractions 658
14.2.1 File Descriptors 660
14.2.2 The open Code Path 661
14.2.3 Allocating and Deallocating File Descriptors 662
14.2.4 File Descriptor Limits 665
14.2.5 File Structures 666
14.3 Solaris File System Framework 668
14.3.1 Evolution of the File System Framework 669
14.3.2 The Solaris File System Interface 672
14.4 File System Modules 672
14.4.1 Interfaces for Mount Options 673
14.4.2 Module Initialization 674
14.5 The Virtual File System (vfs) Interface 675
14.5.1 vfs Methods 676
14.5.2 vfs Support Functions 679
14.5.3 The mount Method 681
14.5.4 The umount Method 683
14.5.5 Root vnode Identification 683
14.5.6 vfs Information Available with MDB 684
14.6 The Vnode 685
14.6.1 Object Interface 686
14.6.2 vnode Types 688
14.6.3 vnode Method Registration 688
14.6.4 vnode Methods 690
14.6.5 Support Functions for Vnodes 696
14.6.6 The Life Cycle of a Vnode 696
14.6.7 vnode Creation and Destruction 698
14.6.8 The vnode Reference Count 698
14.6.9 Interfaces for Paging vnode Cache 698
14.6.10 Block I/O on vnode Pages 700
14.6.11 vnode Information Obtainable with mdb 701
14.6.12 DTrace Probes in the vnode Layer 703
14.7 File System I/O 707
14.7.1 Memory Mapped I/O 708
14.7.2 read() and write() System Calls 709
14.7.3 The seg_kpm Driver 710
14.7.4 The seg_map Driver 710
14.7.5 Interaction between segmap and segkpm 716
14.8 File Systems and Memory Allocation 718
14.8.1 Solaris 8—Cyclic Page Cache 718
14.8.2 The Old Allocation Algorithm 719
14.8.3 The New Allocation Algorithm 720
14.8.4 Putting It All Together: The Allocation Cycle 720
14.9 Path-Name Management 722
14.9.1 The lookuppn() Method 722
14.9.2 The vop_lookup() Method 723
14.9.3 The vop_readdir() Method 723
14.9.4 Path-Name Traversal Functions 724
14.10 The Directory Name Lookup Cache 726
14.10.1 DNLC Operation 726
14.10.2 Primary DNLC Support Functions 728
14.10.3 DNLC Negative Cache 729
14.10.4 DNLC Directory Cache 729
14.10.5 DNLC Housekeeping Thread 733
14.10.6 DNLC Statistics 733
14.11 The File System Flush Daemon 734
14.12 File System Conversion to Solaris 10 734
14.13 MDB Reference 736

Chapter 15 The UFS File System 737


15.1 UFS Development History 737
15.2 UFS On-Disk Format 739
15.2.1 On-Disk UFS Inodes 739
15.2.2 UFS Directories 742
15.2.3 UFS Hard Links 744
15.2.4 Shadow Inodes 745
15.2.5 The Boot Block 746
15.2.6 The Superblock 747
15.2.7 The Cylinder Group 748
15.2.8 Summary of UFS Architecture 749
15.3 The UFS Inode 751
15.3.1 In-Core UFS Inodes 751
15.3.2 Inode Cache 752
15.3.3 Block Allocation 754
15.3.4 Methods to Read and Write UFS Files 760
15.4 Access Control in UFS 764
15.5 Extended Attributes in UFS 767
15.6 Locking in UFS 768
15.6.1 UFS Lock Descriptions 769
15.6.2 Inode Lock Ordering 772
15.6.3 UFS Lockfs Protocol 773
15.7 Logging 775
15.7.1 On-Disk Log Data Structures 776
15.7.2 In-Core Log Data Structures 779
15.7.3 Summary Information 782
15.7.4 Transactions 783
15.7.5 Rolling the Log 787
15.7.6 Redirecting Reads and Writes to the Log 789
15.7.7 Failure Recovery 790
15.8 MDB Reference 790

PART SIX
Platform Specifics 793

Chapter 16 Support for NUMA and CMT Hardware 795


16.1 Memory Hierarchy Designs 796
16.1.1 What Is NUMA? 796
16.1.2 What Is CMT? 797
16.2 Memory Placement Optimization Framework 799
16.2.1 Latency Model 800
16.2.2 More Complex Models 801
16.3 Initial Thread Placement 802
16.4 Scheduling 802
16.5 Memory Allocation 803
16.6 Lgroup Implementation 804
16.6.1 Parameters Affecting MPO 805
16.7 MPO APIs 807
16.7.1 Informational 807
16.7.2 Verifying the Interface Version 810
16.7.3 Initialization of the Locality Group Interface 810
16.8 Locality Group Hierarchy 811
16.8.1 Locality Group Characteristics 812
16.8.2 Locality Groups and Thread and Memory Placement 812
16.9 MPO Statistics 813
16.10 MDB Reference 814

Chapter 17 Locking and Synchronization 815


17.1 Synchronization 815
17.2 Parallel Systems Architectures 816
17.3 Hardware Considerations for Locks and Synchronization 819
17.4 Introduction to Synchronization Objects 824
17.4.1 Synchronization Process 825
17.4.2 Synchronization Object Operations Vector 826
17.5 Mutex Locks 827
17.5.1 Overview 828
17.5.2 Solaris Mutex Lock Implementation 830
17.6 Reader/Writer Locks 835
17.6.1 Solaris Reader/Writer Locks 836
17.7 Turnstiles and Priority Inheritance 840
17.7.1 Turnstiles Implementation 841
17.8 Kernel Semaphores 844
17.9 DTrace Lockstat Provider 846
17.9.1 Overview 846
17.9.2 Adaptive Lock Probes 847
17.9.3 Spin Lock Probes 848
17.9.4 Thread Locks 849
17.9.5 Readers/Writer Lock Probes 849

PART SEVEN
Networking 853

Chapter 18 The Solaris Network Stack 855


18.1 STREAMS and the Network Stack 855
18.1.1 The STREAMS Model 856
18.1.2 Network Stack as STREAMS Module 859
18.1.3 Issues with STREAMS-Based Stacks 862
18.2 Solaris 10 Stack: Design Goals 862
18.3 Solaris 10 Network Stack Framework 863
18.3.1 Vertical Perimeter 864
18.3.2 IP Classifier 868
18.3.3 Synchronization Mechanism 870
18.4 TCP as an Implementation of the New Framework 870
18.4.1 The Interface between TCP and IP 872
18.4.2 TCP Loopback 874
18.5 UDP 875
18.5.1 UDP Packet Drop within the Stack 876
18.5.2 UDP Module 876
18.5.3 UDP and Socket Interaction 878
18.6 Synchronous STREAMS 878
18.6.1 TCP Synchronous STREAMS 878
18.6.2 STREAMS Fallback 879
18.7 IP 880
18.7.1 Plumbing NICs 880
18.7.2 IP Network Multipathing 881
18.7.3 Multicast 881
18.8 Solaris Device Driver Framework 882
18.8.1 GLDv2 and DLPI Drivers (Solaris 9 and Prior) 882
18.8.2 A New Architecture: GLDv3 883
18.8.3 GLDv3 Link Aggregation Architecture 888
18.8.4 Checksum Offload 890
18.9 Interrupt Model and NIC Speeds 891
18.9.1 Solaris 9 and Earlier Releases 891
18.9.2 Dynamic Switch between Interrupt vs. Polling Mode 892
18.9.3 Interrupt Load Spreading 894
18.10 Summary 895
18.11 MDB Reference 895

PART EIGHT
Kernel Services 899

Chapter 19 Clocks and Timers 901


19.1 The System Clock Thread 901
19.1.1 Thread Tick Processing 903
19.1.2 DTrace Providers for Tick Processing 904
19.2 Callouts and Callout Tables 904
19.3 System Time Facilities 910
19.3.1 High-Resolution Timer 910
19.3.2 Time-of-Day Clock 910
19.4 The Cyclic Subsystem 912
19.4.1 Cyclic Subsystem Interface Overview 912
19.4.2 Cyclic Subsystem Implementation Overview 913
19.4.3 Clients of the Cyclic Subsystem 922
19.4.4 Cyclic Kernel At-Large Interfaces 923
19.4.5 Cyclic Kernel Inter-Subsystem Interfaces 924
19.4.6 Cyclic Backend Interfaces 924
19.4.7 Cyclic Subsystem Backend-Supplied Interfaces 924

Chapter 20 Task Queues 927


20.1 Overview of Task Queues 927
20.2 Dynamic Task Queues 928
20.2.1 Why a Dynamic Task Queue? 928
20.2.2 Problems Addressed by Dynamic Task Queues 929
20.2.3 Task Pool Model 930
20.2.4 Interface Changes to Support Dynamic Task Queues 931
20.3 Task Queues Kernel Programming Interfaces 932
20.4 Device Driver Interface for Task Queues 934
20.5 Task Queue Observability 935
20.5.1 Kstat Counters 935
20.5.2 DTrace SDT Probes 936
Contents XXV

20.6 Task Queue Implementation Notes 937


20.6.1 Use of Kmem Caches 937
20.6.2 Use of Vmem Arenas 937
20.6.3 Hashed Vmem Arenas 938
20.6.4 Cached List of Entries 939
20.6.5 Problems with Task Pool Implementation 940
20.6.6 Use of Dynamic Task Pools in STREAMS 940

Chapter 21 kmdb Implementation 943


21.1 Introduction 943
21.1.1 MDB Components 943
21.1.2 Major kmdb Design Decisions 946
21.1.3 The Structure of kmdb 949
21.1.4 MDB Components and Their Implementation in kmdb 952
21.1.5 Conclusion 959
21.1.6 Remaining Components 959

APPENDICES 963

Appendix A Kernel Virtual Address Maps 965

Appendix B Adding a System Call to Solaris 971

Appendix C A Sample Procfs Utility 975

Bibliography 979
Index 983
