Foreword by Craig Mundie,
Chief Research and Strategy Officer, Microsoft
Concurrent Programming on Windows
Joe Duffy
Microsoft .NET Development Series

Praise for Concurrent Programming on Windows
"I have been fascinated with concurrency ever since I added threading support
to the Common Language Runtime a decade ago. That's also where I met Joe,
who is a world expert on this topic. These days, concurrency is a first-order
concern for practically all developers. Thank goodness for Joe's book. It is a tour
de force and I shall rely on it for many years to come."- Chris Brumme, Distinguished Engineer, Microsoft
"I first met Joe when we were both working with the Microsoft CLR team. At that
time, we had several discussions about threading and it was apparent that
he was as passionate about this subject as I was. Later, Joe transitioned to
Microsoft's Parallel Computing Platform team where a lot of his good ideas
about threading could come to fruition. Most threading and concurrency books
that I have come across contain information that is incorrect and explains how
to solve contrived problems that good architecture would never get you into in
the first place. Joe's book is one of the very few books that I respect on the
matter, and this respect comes from knowing Joe's knowledge, experience, and
his ability to explain concepts."- Jeffrey Richter, Wintellect
"There are few areas in computing that are as important, or shrouded in mystery,
as concurrency. It's not simple, and Duffy doesn't claim to make it so-but armed
with the right information and excellent advice, creating correct and highly
scalable systems is at least possible. Every self-respecting Windows developer
should read this book."- Jonathan Skeet, Software Engineer, Clearswift
"What I love about this book is that it is both comprehensive in its coverage of
concurrency on the Windows platform, as well as very practical in its presentation of techniques immediately applicable to real-world software development. Joe's book is a 'must have' resource for anyone building native or
managed code Windows applications that leverage concurrency!"- Steve Teixeira, Product Unit Manager, Parallel Computing Platform, Microsoft Corporation
"This book is a fabulous compendium of both theoretical knowledge and
practical guidance on writing effective concurrent applications. Joe Duffy is not
only a preeminent expert in the art of developing parallel applications for
Windows, he's also a true student of the art of writing. For this book, he has
combined those two skill sets to create what deserves and is destined to be a
long-standing classic in developers' hands everywhere."- Stephen Toub, Program Manager Lead, Parallel Computing Platform, Microsoft
"As chip designers run out of ways to make the individual chip faster, they have
moved towards adding parallel compute capacity instead. Consumer PCs with
multiple cores are now commonplace. We are at an inflection point where
improved performance will no longer come from faster chips but rather from
our ability as software developers to exploit concurrency. Understanding the
concepts of concurrent programming and how to write concurrent code has
therefore become a crucial part of writing successful software. With Concurrent
Programming on Windows, Joe Duffy has done a great job explaining concurrent
concepts from the fundamentals through advanced techniques. The detailed
descriptions of algorithms and their interaction with the underlying hardware
turn a complicated subject into something very approachable. This book is the
perfect companion to have at your side while writing concurrent software for
Windows."- Jason Zander, General Manager, Visual Studio, Microsoft
Concurrent Programming on Windows

Microsoft .NET Development Series
John Montgomery, Series Advisor
Don Box, Series Advisor
Brad Abrams, Series Advisor
The award-winning Microsoft .NET Development Series was established in 2002 to provide professional
developers with the most comprehensive and practical coverage of the latest .NET technologies. It is
supported and developed by the leaders and experts of Microsoft development technologies, including
Microsoft architects, MVPs, and leading industry luminaries. Books in this series provide a core resource of
information and understanding every developer needs to write effective applications.
For more information go to informit.com/msdotnetseries/

Concurrent Programming on Windows
Joe Duffy
Addison-Wesley
Upper Saddle River, NJ • Boston • Indianapolis • San Francisco
New York • Toronto • Montreal • London • Munich • Paris • Madrid
Cape Town • Sydney • Tokyo • Singapore • Mexico City
Many of the designations used by manufacturers and sellers to distinguish their products are claimed as
trademarks. Where those designations appear in this book, and the publisher was aware of a trademark claim,
the designations have been printed with initial capital letters or in all capitals.

The .NET logo is either a registered trademark or trademark of Microsoft Corporation in the United States and/or other countries and is used under license from Microsoft.
The author and publisher have taken care in the preparation of this book, but make no expressed or implied warranty
of any kind and assume no responsibility for errors or omissions. No liability is assumed for incidental or consequential damages in connection with or arising out of the use of the information or programs contained herein.
The publisher offers excellent discounts on this book when ordered in quantity for bulk purchases or special sales,
which may include electronic versions and/or custom covers and content particular to your business, training
goals, marketing focus, and branding interests. For more information, please contact:

U.S. Corporate and Government Sales
(800) 382-3419
corpsales@pearsontechgroup.com

For sales outside the United States please contact:

International Sales
international@pearsoned.com

Visit us on the Web: informit.com/aw
Library of Congress Cataloging-in-Publication Data
Duffy, Joe, 1980-
Concurrent programming on Windows / Joe Duffy.
p. cm.
Includes bibliographical references and index.
ISBN 978-0-321-43482-1 (pbk. : alk. paper) 1. Parallel programming (Computer science)
QA76.642.D84 2008
005.2'75-dc22
2008033911

Copyright © 2009 Pearson Education, Inc.
All rights reserved. Printed in the United States of America. This publication is protected by copyright, and
permission must be obtained from the publisher prior to any prohibited reproduction, storage in a retrieval
system, or transmission in any form or by any means, electronic, mechanical, photocopying, recording, or likewise. For information regarding permissions, write to:

Pearson Education, Inc.
Rights and Contracts Department
501 Boylston Street, Suite 900
Boston, MA 02116
Fax (617) 671-3447
ISBN-13: 978-0-321-43482-1
ISBN-10: 0-321-43482-X

Text printed in the United States on recycled paper at Edwards Brothers in Ann Arbor, Michigan.
First printing, October 2008
For Mom
& Dad
Contents at a Glance

Contents xi
Foreword xix
Preface xxiii
Acknowledgments xxvii
About the Author xxix
PART I Concepts 1
1 Introduction 3
2 Synchronization and Time 13
PART II Mechanisms 77
3 Threads 79
4 Advanced Threads 127
5 Windows Kernel Synchronization 183
6 Data and Control Synchronization 253
7 Thread Pools 315
8 Asynchronous Programming Models 399
9 Fibers 429
PART III Techniques 475
10 Memory Models and Lock Freedom 477
11 Concurrency Hazards 545
12 Parallel Containers 613
13 Data and Task Parallelism 657
14 Performance and Scalability 735
PART IV Systems 783
15 Input and Output 785
16 Graphical User Interfaces 829
PART V Appendices 863
A Designing Reusable Libraries for Concurrent .NET Programs 865
B Parallel Extensions to .NET 887
Index 931

Contents

Foreword xix
Preface xxiii
Acknowledgments xxvii
About the Author xxix
PART I Concepts 1

1 Introduction 3
Why Concurrency? 3
Program Architecture and Concurrency 6
Layers of Parallelism
Where Are We? 11

2 Synchronization and Time 13
Managing Program State 14
Identifying Shared vs. Private State 15
State Machines and Time 19
Isolation 31
Immutability 34
Synchronization: Kinds and Techniques 38
Data Synchronization 60
Where Are We? 73
PART II Mechanisms 77
3 Threads 79
Threading from 10,001 Feet 80
What Is a Windows Thread? 81
What Is a CLR Thread? 85
Explicit Threading and Alternatives 87
The Life and Death of Threads 89
Thread Creation 89
Thread Termination 101
DllMain 115
Thread Local Storage 117
Where Are We? 124
4 Advanced Threads 127
Thread State 127
User-Mode Thread Stacks 127
Internal Data Structures (KTHREAD, ETHREAD, TEB) 145
Contexts 151
Inside Thread Creation and Termination 152
Thread Creation Details 152
Thread Termination Details 153
Thread Scheduling 154
Thread States 155
Priorities 159
Quantums 163
Priority and Quantum Adjustments 164
Sleeping and Yielding 167
Suspension 168
Affinity: Preference for Running on a Particular CPU 170
Where Are We? 180
5 Windows Kernel Synchronization 183
The Basics: Signaling and Waiting 184
Why Use Kernel Objects? 186
Waiting in Native Code 189
Managed Code 204
Asynchronous Procedure Calls (APCs) 208
Using the Kernel Objects 211
Mutex 211
Semaphore 219
A Mutex/Semaphore Example: Blocking/Bounded Queue 224
Auto- and Manual-Reset Events 226
Waitable Timers 234
Signaling an Object and Waiting Atomically 241
Debugging Kernel Objects 250
Where Are We? 251
6 Data and Control Synchronization 253
Mutual Exclusion 255
Win32 Critical Sections 256
CLR Locks 272
Reader/Writer Locks (RWLs) 287
Windows Vista Slim Reader/Writer Lock 289
.NET Framework Slim Reader/Writer Lock (3.5) 293
.NET Framework Legacy Reader/Writer Lock 300
Condition Variables 304
Windows Vista Condition Variables 304
.NET Framework Monitors 309
Guarded Regions 311
Where Are We? 312
7 Thread Pools 315
Thread Pools 101 316
Three Ways: Windows Vista, Windows Legacy, and CLR 317
Common Features 319
Windows Thread Pools 323
Windows Vista Thread Pool 323
Legacy Win32 Thread Pool 353
CLR Thread Pool 364
Work Items 364
I/O Completion Ports 368
Timers 371
Registered Waits 374
Remember (Again): You Don't Own the Threads 377
Thread Pool Thread Management 377
Debugging 386
A Case Study: Layering Priorities and Isolation on Top of the Thread Pool 387
Performance When Using the Thread Pools 391
Where Are We? 398
8 Asynchronous Programming Models 399
Asynchronous Programming Model (APM) 400
Rendezvousing: Four Ways 403
Implementing IAsyncResult 413
Where the APM Is Used in the .NET Framework 418
ASP.NET Asynchronous Pages 420
Event-Based Asynchronous Pattern 421
The Basics 421
Supporting Cancellation 425
Supporting Progress Reporting and Incremental Results 425
Where the EAP Is Used in the .NET Framework 426
Where Are We? 427
9 Fibers 429
An Overview of Fibers 430
Upsides and Downsides 431
Using Fibers 435
Creating New Fibers 435
Converting a Thread into a Fiber 438
Determining Whether a Thread Is a Fiber 439
Switching Between Fibers 440
Deleting Fibers 441
An Example of Switching the Current Thread 442
Additional Fiber-Related Topics 445
Fiber Local Storage (FLS) 445
Thread Affinity 447
A Case Study: Fibers and the CLR 449
Building a User-Mode Scheduler 453
The Implementation 455
A Word on Stack vs. Stackless Blocking 472
Where Are We? 473
PART III Techniques 475
10 Memory Models and Lock Freedom 477
Memory Load and Store Reordering 478
What Runs Isn't Always What You Wrote 481
Critical Regions as Fences 484
Data Dependence and Its Impact on Reordering 485
Hardware Atomicity 486
The Atomicity of Ordinary Loads and Stores 487
Interlocked Operations 492
Memory Consistency Models 506
Hardware Memory Models 509
Memory Fences 511
.NET Memory Models 516
Lock Free Programming 518
Examples of Low-Lock Code 520
Lazy Initialization and Double-Checked Locking 520
A Nonblocking Stack and the ABA Problem 534
Dekker's Algorithm Revisited 540
Where Are We? 541
11 Concurrency Hazards 545
Correctness Hazards 546
Data Races 546 Recursion and Reentrancy 555 Locks and Process Shutdown 561
Liveness Hazards 572
Deadlock 572
Missed Wake-Ups (a.k.a. Missed Pulses) 597
Livelocks 601
Lock Convoys 603
Stampeding 605
Two-Step Dance 606
Priority Inversion and Starvation 608
Where Are We? 609
12 Parallel Containers 613
Fine-Grained Locking 616
Arrays 616
FIFO Queue 617
Linked Lists 621
Dictionary (Hashtable) 626
Lock Free 632
General-Purpose Lock Free FIFO Queue 632
Work Stealing Queue 636
Coordination Containers 640
Producer/Consumer Data Structures 641
Phased Computations with Barriers 650
Where Are We? 654
13 Data and Task Parallelism 657
Data Parallelism 659
Loops and Iteration 660
Task Parallelism 684
Fork/Join Parallelism 685
Dataflow Parallelism (Futures and Promises) 689
Recursion 702
Pipelines 709
Search 718
Message-Based Parallelism 719
Cross-Cutting Concerns 720
Concurrent Exceptions 721
Cancellation 729
Where Are We? 732
14 Performance and Scalability 735
Parallel Hardware Architecture 736
SMP, CMP, and HT 736
Superscalar Execution 738
The Memory Hierarchy 739
A Brief Word on Profiling in Visual Studio 754
Speedup: Parallel vs. Sequential Code 756
Deciding to "Go Parallel" 756
Measuring Improvements Due to Parallelism 758
Amdahl's Law 762
Critical Paths and Load Imbalance 764
Garbage Collection and Scalability 766
Spin Waiting 767
How to Properly Spin on Windows 769 A Spin-Only Lock 772 Mellor-Crummey-Scott (MCS) Locks 778
Where Are We? 781
PART IV Systems 783

15 Input and Output 785
Overlapped I/O 786
Overlapped Objects 788
Win32 Asynchronous I/O 792
.NET Framework Asynchronous I/O 817
I / O Cancellation 822
Asynchronous I/O Cancellation for the Current Thread 823 Synchronous I/O Cancellation for Another Thread 824 Asynchronous I/O Cancellation for Any Thread 825
Where Are We? 826
16 Graphical User Interfaces 829
GUI Threading Models 830
Single Threaded Apartments (STAs) 833 Responsiveness: What Is It, Anyway? 836
.NET Asynchronous GUI Features 837
.NET GUI Frameworks 837 Synchronization Contexts 847 Asynchronous Operations 855 A Convenient Package: BackgroundWorker 856
Where Are We? 860
PART V Appendices 863

A Designing Reusable Libraries for Concurrent .NET Programs 865
The 20,000-Foot View 866
The Details 867
Locking Models 867 Using Locks 870 Reliability 875 Scheduling and Threads 879 Scalability and Performance 881 Blocking 884
B Parallel Extensions to .NET 887
Task Parallel Library 888
Unhandled Exceptions 893
Parents and Children 895
Cancellation 897
Futures 898
Continuations 900
Task Managers 902
Putting It All Together: A Helpful Parallel Class 904
Self-Replicating Tasks 909
Parallel LINQ 910
Buffering and Merging 912
Order Preservation 914
Synchronization Primitives 915
ISupportsCancelation 915
CountdownEvent 915
LazyInit&lt;T&gt; 917
ManualResetEventSlim 919
SemaphoreSlim 920
SpinLock 921
SpinWait 923
Concurrent Collections 924
BlockingCollection&lt;T&gt; 925
ConcurrentQueue&lt;T&gt; 928
ConcurrentStack&lt;T&gt; 929
Index 931

Foreword
The computer industry is once again at a crossroads. Hardware concurrency, in the form of new manycore processors, together with growing software complexity, will require that the technology industry fundamentally rethink both the architecture of modern computers and the resulting software development paradigms.
For the past few decades, the computer has progressed comfortably along the path of exponential performance and capacity growth without any fundamental changes in the underlying computation model. Hardware followed Moore's Law, clock rates increased, and software was written to exploit this relentless growth in performance, often ahead of the hardware curve. That symbiotic hardware-software relationship continued unabated until very recently. Moore's Law is still in effect, but gone is the unnamed law that said clock rates would continue to increase commensurately.
The reasons for this change in hardware direction can be summarized by a simple equation, formulated by David Patterson of the University of California at Berkeley:

Power Wall + Memory Wall + ILP Wall = A Brick Wall for Serial Performance
Power dissipation in the CPU increases proportionally with clock frequency, imposing a practical limit on clock rates. Today, the ability to dissipate heat has reached a practical physical limit. As a result, a significant increase in clock speed without heroic (and expensive) cooling (or materials technology breakthroughs) is not possible. This is the "Power Wall" part of the equation. Improvements in memory performance increasingly lag behind gains in processor performance, causing the number of CPU cycles required to access main memory to grow continuously. This is the "Memory Wall." Finally, hardware engineers have improved the performance of sequential software by speculatively executing instructions before the results of current instructions are known, a technique called instruction level parallelism (ILP). ILP improvements are difficult to forecast, and their complexity raises power consumption. As a result, ILP improvements have also stalled, resulting in the "ILP Wall."
We have, therefore, arrived at an inflection point. The software ecosystem must evolve to better support manycore systems, and this evolution will take time. To benefit from rapidly improving computer performance and to retain the "write once, run faster on new hardware" paradigm, the programming community must learn to construct concurrent applications. Broader adoption of concurrency will also enable Software + Services through asynchrony and loose coupling, client-side parallelism, and server-side cloud computing.
The Windows and .NET Framework platforms offer rich support for concurrency. This support has evolved over more than a decade, since the introduction of multiprocessor support in Windows NT. Continued improvements in thread scheduling performance, synchronization APIs, and memory hierarchy awareness-particularly those added in Windows Vista-make Windows the operating system of choice for maximizing the use of hardware concurrency. This book covers all of these areas. When you begin using multithreading throughout an application, the importance of clean architecture and design is critical to reducing software complexity and improving maintainability. This places an emphasis on understanding not only the platform's capabilities but also emerging best practices. Joe does a great job interspersing best practice alongside mechanism throughout this book.
Manycore provides improved performance for the kinds of applications we already create. But it also offers an opportunity to think completely differently about what computers should be able to do for people. The continued increase in compute power will qualitatively change the applications that we can create in ways that make them a lot more interesting and helpful to people, and able to do new things that have never been possible in the past. Through this evolution, software will enable more personalized and humanistic ways for us to interact with computers. So enjoy this book. It offers a lot of great information that will guide you as you take your first steps toward writing concurrent, manycore-aware software on the Windows platform.
Craig Mundie
Chief Research and Strategy Officer
Microsoft Corporation
June 2008
Preface
I BEGAN WRITING this book toward the end of 2005. At the time, dual-core processors were becoming standard on the mainstream PCs that ordinary (nonprogrammer) consumers were buying, and a small number of people in industry had begun to make noise about the impending concurrency problem. (Herb Sutter's paper, "The Free Lunch Is Over," immediately comes to mind.) The problem people were worried about, of course, was that the software of the past was not written in a way that would allow it to naturally exploit that additional compute power. Contrast that with the never-ending increase in clock speeds. No more free lunch, indeed.
It seemed to me that concurrency was going to eventually be an important part of every software developer's job and that a book such as this one would be important and useful. Just two years later, the impact is beginning to ripple up from the OS, through the libraries, and on up to applications themselves.
This was about the same time I had just wrapped up prototyping a small side project I had been working on for six months, called Parallel Language Integrated Query (PLINQ). The PLINQ project was a conduit for me to explore the intricacies of concurrency, multicore, and specifically how parallelism might be used in real-world, everyday programs. I used it as a tool to figure out where the platform was lacking. This was in addition to spending my day job at Microsoft focused on software transactional memory (STM), a technology that in the intervening two years has become somewhat of an industry buzzword. Needless to say, I had become pretty entrenched in all topics concurrency. What better way to get entrenched even further than to write a book on the subject? As I worked on all of these projects, and eventually PLINQ grew into the
Parallel Extensions to the .NET Framework technology, I was amazed at how few good books on Windows concurrency were available. I remember time and time again being astonished or amazed at some intricate and esoteric bit of concurrency-related information, jotting it down, and earmarking it for inclusion in this book. I only wished somebody had written it down before me, so that I didn't need to scour it from numerous sources: hallway conversations, long nights of poring over Windows and CLR source code, and reading and rereading countless Microsoft employee blogs. But the best books on the topic dated back to the early '90s and, while still really good, focused too much on the mechanics and not enough on how to structure parallel programs, implement parallel algorithms, deal with concurrency hazards, and all those important concepts. Everything else targeted academics and researchers, rather than application, system, and library developers.
I set out to write a book that I'd have found fascinating and a useful way to shortcut all of the random bits of information I had to learn throughout. Although it took me a surprisingly long two-and-a-half years to finish this book, the state of the art has evolved slowly, and the state of good books on the topic hasn't changed much either. The result of my efforts, I hope, is a new book that is down to earth and useful, but still full of very deep technical information. It is for any Windows or .NET developer who believes that concurrency is going to be a fundamental requirement of all software somewhere down the road, as all industry trends seem to imply.
I look forward to kicking back and enjoying this book. And I sincerely hope you do too.

Book Structure

I've structured the book into four major sections. The first, Concepts, introduces concurrency at a high level without going too deep into any one topic. The next section, Mechanisms, focuses squarely on the fundamental platform features, inner workings, and API details. After that, the Techniques
section describes common patterns, best practices, algorithms, and data structures that emerge while writing concurrent software. The fourth section, Systems, covers many of the system-wide architectural and process concerns that frequently arise. There is a progression here. Concepts is first because it develops a basic understanding of concurrency in general. Understanding the content in Techniques would be difficult without a solid understanding of the Mechanisms, and similarly, building real Systems would be impossible without understanding the rest. There are also two appendices at the end.

Code Requirements

To run code found in this book, you'll need to download some free pieces of software.
Microsoft Windows SDK. This includes the Microsoft C++ compiler and relevant platform headers and libraries. The latest versions as of this writing are the Windows Vista and Server 2008 SDKs.

Microsoft .NET Framework SDK. This includes the Microsoft C# and Visual Basic compilers, and relevant framework libraries. The latest version as of this writing is the .NET Framework 3.5 SDK.

Both can be found on MSDN: http://msdn.microsoft.com.
In addition, it's highly recommended that you consider using Visual Studio. This is not required-and in fact, much of the code in this book was written in emacs-but provides for a more seamless development and debugging experience. Visual Studio 2008 Express Edition can be downloaded for free, although it lacks many useful capabilities such as performance profiling.
Finally, the debugging tools for Windows package, which includes the popular WINDBG debugging utility, can also come in handy, particularly if you don't have Visual Studio. It is freely downloadable from http://www.microsoft.com. Similarly, the Sysinternals utilities available from http://technet.microsoft.com/sysinternals are quite useful for inspecting aspects of the Windows OS.
A companion website is available at: http://www.bluebytesoftware.com/books

Joe Duffy
joe@bluebytesoftware.com
http://www.bluebytesoftware.com
June 2008

Acknowledgments
MANY PEOPLE HAVE helped with the creation of this book, both directly and indirectly. First, I have to sincerely thank Chris Brumme and Jan Gray for inspiring me to get the concurrency bug several years ago. You've both been incredibly supportive and have helped me at every turn in the road. This has led to not only this book but a never-ending stream of career, technical, and personal growth opportunities. I'm still not sure how I'll ever repay you guys.
Also, thanks to Herb Sutter, who was instrumental in getting this book's contract in the first place. And also to Craig Mundie for writing a terrific Foreword and, of course, leading Microsoft and the industry as a whole into our manycore future.
Vance Morrison deserves special thanks for not only being a great mentor along the way, but also for being the toughest technical reviewer I've ever had. His feedback pushed me really hard to keep things concise and relevant. I haven't even come close to attaining his vision of what this book could have been, but I hope I'm not too far afield from it.
Next, in alphabetical order, many people helped by reviewing the book, discussing ideas along the way, or answering questions about how things work (or were supposed to work): David Callahan, Neill Clift, Dave Detlefs, Yves Dolce, Patrick Dussud, Can Erten, Eric Eilebrecht, Ed Essey, Kang Su Gatlin, Goetz Graefe, Kim Greenlee, Vinod Grover, Brian Grunkemeyer, Niklas Gustafsson, Tim Harris, Anders Hejlsberg, Jim Larus, Eric Li, Weiwen Liu, Mike Magruder, Jim Miller, Igor Ostrovsky,
Joel Pobar, Jeff Richter, Paul Ringseth, Burton Smith, Stephen Toub, Roger Wolff, and Keith Yedlin. For those reviewers who were constantly promised drafts of chapters that never actually materialized on time, well, I sincerely appreciate the patience.
Infinite thanks also go out to the staff from Addison-Wesley. In particular, I'd like to give a big thanks to Joan Murray. You've been the only constant throughout the whole project and have to be the most patient person I've ever worked with. When I originally said the book would only take eight months, I wasn't lying intentionally. Hey, a 22-month underestimate isn't too bad, right? Only a true software developer would say that.
About the Author
Joe Duffy is the development lead, architect, and founder of the Parallel Extensions to the .NET Framework team at Microsoft, in the Visual Studio division. In addition to hacking code and managing a team of amazing developers, he defines the team's long-term vision and strategy. His current interests include functional programming, first-class concurrency safety in the type system, and creating programming models that will enable everyday people to exploit GPUs and SIMD-style processors. Joe held previous positions at Microsoft as the developer for Parallel LINQ (PLINQ) and the concurrency program manager for the Common Language Runtime (CLR).
Before joining Microsoft, he had seven years of professional programming experience, including four years at EMC. He was born in Massachusetts, and currently lives in Washington. While not indulging in technical excursions, Joe spends his time playing guitar, studying music theory, listening to and writing music, and feeding his wine obsession.
PART I Concepts
1 Introduction

No matter whether you're doing server-side programming for the web or cloud computing, building a responsive graphical user interface, or creating a new interactive client application that uses parallelism to attain better performance, concurrency is ever present. Learning how to deal with concurrency when it surfaces, and how to exploit it to deliver more capable and scalable software, is necessary for a large category of software developers and is the main focus of this book.
Before jumping straight into the technical details of how to use concurrency when developing software, we'll begin with a conceptual overview of concurrency, some of the reasons it can be important to particular kinds of software, the role it plays in software architecture, and how concurrency will fit progressively into layers of software in the future.
Everything in this chapter, and indeed most of the content in this book, applies equally to programs written in native C++ as it does to programs written in the .NET Framework.
Why Concurrency?

There are many reasons why concurrency may be interesting to you.
You are programming in an environment where concurrency
OS programming, and server-side programming. It is the reason, for example, that most database programmers must become deeply familiar with the notion of a transaction before they can truly be effective at their jobs.

You need to maintain a responsive user interface (UI) while
In summary, many problem domains are ripe with inherent concurrency. If you're building a server application, for example, many requests
may arrive concurrently via the network and must be dealt with simultaneously. If you're writing a Web request handler and need to access shared state, concurrency is suddenly thrust to the forefront.
While it's true that concurrency can sometimes help express problems more naturally, this is rare in practice. Human beings tend to have a difficult time reasoning about large amounts of asynchrony due to the combinatorial explosion of possible interactions. Nevertheless, it is becoming increasingly common to use concurrency in instances where it feels unnatural. The reason for this is that microprocessor architecture has fundamentally changed; parallel processors are now widespread on all sorts of mainstream computers. Multicore has already pervaded the PC and mobile markets, and highly parallel graphics processing units (GPUs) are everywhere and sometimes used for general purpose computing. In order to fully maximize use of these newer generation processors, programs must be written in a naturally scalable manner. That means applications must contain sufficient latent concurrency so that, as newer machines are adopted, program performance automatically improves alongside by realizing that latent concurrency as actual concurrency.
In fact, although many of us program in a mostly sequential manner, our code often has a lot of inherent latent concurrency already by virtue of the way operations have been described in our language of choice. Data and control dependence among loops, if-branches, and memory moves can constrain this, but, in a surprisingly large number of cases, these are artificial constraints that are placed on code out of stylistic habit common to C-style programming.
This shift is a change from the past, particularly for client-side programs. Parallelism is the use of concurrency to decompose an operation into finer grained constituent parts so that independent parts can run on separate processors on the target machine. This idea is not new. Parallelism has been used in scientific computing and supercomputing for decades as a way to scale across tens, hundreds, and, in some cases, thousands of processors. But mainstream commercial and Web software generally has been authored with sequential techniques, based on the assumption that clock speed would increase 40 to 50 percent year over year, indefinitely, and that corresponding improvements in performance would follow "for free."
Program Architecture and Concurrency
Concurrency begins with architecture. It is also possible to retrofit concurrency into an existing application, but the number of common pitfalls is vastly decreased with careful planning. The following taxonomy is a useful way to think about the structure of concurrent programs, which will help during the initial planning and architecture phases of your project:
Agents. Most programs are already coarsely decomposed into asynchronous, peer agents. There are many manifestations of agents in real-world systems, ranging from individual Web requests, a Windows Communication Foundation (WCF) service request, a COM component call, some asynchronous activity a program has farmed off onto another thread, and so forth. Moreover, some programs have just one agent: the program's entry point.

Tasks. Individual agents often need to perform a set of operations at

Data. Operations on data are often naturally parallel, so long as they