
  Foreword by Craig Mundie,

Chief Research and Strategy Officer, Microsoft


Concurrent Programming on Windows

  Microsoft .NET Development Series

  Joe Duffy

  Praise for Concurrent Programming on Windows

"I have been fascinated with concurrency ever since I added threading support

to the Common Language Runtime a decade ago. That's also where I met Joe,

who is a world expert on this topic. These days, concurrency is a first-order

concern for practically all developers. Thank goodness for Joe's book. It is a tour

de force and I shall rely on it for many years to come."

  • Chris Brumme, Distinguished Engineer, Microsoft

  

"I first met Joe when we were both working with the Microsoft CLR team. At that

time, we had several discussions about threading and it was apparent that

he was as passionate about this subject as I was. Later, Joe transitioned to

Microsoft's Parallel Computing Platform team where a lot of his good ideas

about threading could come to fruition. Most threading and concurrency books

that I have come across contain information that is incorrect and explain how

to solve contrived problems that good architecture would never get you into in

the first place. Joe's book is one of the very few books that I respect on the

matter, and this respect comes from knowing Joe's knowledge, experience, and

his ability to explain concepts."

  • Jeffrey Richter, Wintellect

  

"There are few areas in computing that are as important, or shrouded in mystery,

as concurrency. It's not simple, and Duffy doesn't claim to make it so, but armed

with the right information and excellent advice, creating correct and highly

scalable systems is at least possible. Every self-respecting Windows developer

should read this book."

  • Jonathan Skeet, Software Engineer, Clearswift

  

"What I love about this book is that it is both comprehensive in its coverage of

concurrency on the Windows platform and very practical in its presentation of

techniques immediately applicable to real-world software development. Joe's

book is a 'must have' resource for anyone building native or managed code

Windows applications that leverage concurrency!"

  • Steve Teixeira, Product Unit Manager, Parallel Computing Platform, Microsoft Corporation

  

"This book is a fabulous compendium of both theoretical knowledge and

practical guidance on writing effective concurrent applications. Joe Duffy is not

only a preeminent expert in the art of developing parallel applications for

Windows, he's also a true student of the art of writing. For this book, he has

combined those two skill sets to create what deserves and is destined to be a

long-standing classic in developers' hands everywhere."

  • Stephen Toub, Program Manager Lead, Parallel Computing Platform, Microsoft
"As chip designers run out of ways to make the individual chip faster, they have
moved towards adding parallel compute capacity instead. Consumer PCs with
multiple cores are now commonplace. We are at an inflection point where
improved performance will no longer come from faster chips but rather from
our ability as software developers to exploit concurrency. Understanding the
concepts of concurrent programming and how to write concurrent code has
therefore become a crucial part of writing successful software. With Concurrent
Programming on Windows, Joe Duffy has done a great job explaining concurrent
concepts from the fundamentals through advanced techniques. The detailed
descriptions of algorithms and their interaction with the underlying hardware
turn a complicated subject into something very approachable. This book is the
perfect companion to have at your side while writing concurrent software for
Windows."

  • Jason Zander, General Manager, Visual Studio, Microsoft

      

Microsoft .NET Development Series

      John Montgomery, Series Advisor
      Don Box, Series Advisor
      Brad Abrams, Series Advisor

    The award-winning Microsoft .NET Development Series was established in 2002 to provide professional

    developers with the most comprehensive and practical coverage of the latest .NET technologies. It is

    supported and developed by the leaders and experts of Microsoft development technologies, including

    Microsoft architects, MVPs, and leading industry luminaries. Books in this series provide a core resource of

    information and understanding every developer needs to write effective applications.

Titles in the Series

      Brad Abrams, .NET Framework Standard Library Annotated Reference, Volume 1: Base Class Library and Extended Numerics Library, 978-0-321-15489-7
      Brad Abrams and Tamara Abrams, .NET Framework Standard Library Annotated Reference, Volume 2: Networking Library, Reflection Library, and XML Library, 978-0-321-19445-9
      Chris Anderson, Essential Windows Presentation Foundation (WPF), 978-0-321-37447-9
      Bob Beauchemin and Dan Sullivan, A Developer's Guide to SQL Server 2005, 978-0-321-38218-4
      Adam Calderon and Joel Rumerman, Advanced ASP.NET AJAX Server Controls: For .NET Framework 3.5, 978-0-321-51444-8
      Eric Carter and Eric Lippert, Visual Studio Tools for Office: Using C# with Excel, Word, Outlook, and InfoPath, 978-0-321-33488-6
      Eric Carter and Eric Lippert, Visual Studio Tools for Office: Using Visual Basic 2005 with Excel, Word, Outlook, and InfoPath, 978-0-321-41175-4
      Steve Cook, Gareth Jones, Stuart Kent, and Alan Cameron Wills, Domain-Specific Development with Visual Studio DSL Tools, 978-0-321-39820-8
      Krzysztof Cwalina and Brad Abrams, Framework Design Guidelines: Conventions, Idioms, and Patterns for Reusable .NET Libraries, Second Edition, 978-0-321-54561-9
      Joe Duffy, Concurrent Programming on Windows, 978-0-321-43482-1
      Sam Guckenheimer and Juan J. Perez, Software Engineering with Microsoft Visual Studio Team System, 978-0-321-27872-2
      Anders Hejlsberg, Mads Torgersen, Scott Wiltamuth, and Peter Golde, The C# Programming Language, Third Edition, 978-0-321-56299-9
      Alex Homer and Dave Sussman, ASP.NET 2.0 Illustrated, 978-0-321-41834-0
      Joe Kaplan and Ryan Dunn, The .NET Developer's Guide to Directory Services Programming, 978-0-321-35017-6
      Mark Michaelis, Essential C# 3.0: For .NET Framework 3.5, 978-0-321-53392-0
      James S. Miller and Susann Ragsdale, The Common Language Infrastructure Annotated Standard, 978-0-321-15493-4
      Christian Nagel, Enterprise Services with the .NET Framework: Developing Distributed Business Solutions with .NET Enterprise Services, 978-0-321-24673-8
      Brian Noyes, Data Binding with Windows Forms 2.0: Programming Smart Client Data Applications with .NET, 978-0-321-26892-1
      Brian Noyes, Smart Client Deployment with ClickOnce: Deploying Windows Forms Applications with ClickOnce, 978-0-321-19769-6
      Fritz Onion with Keith Brown, Essential ASP.NET 2.0, 978-0-321-23770-5
      Steve Resnick, Richard Crane, and Chris Bowen, Essential Windows Communication Foundation: For .NET Framework 3.5, 978-0-321-44006-8
      Scott Roberts and Hagen Green, Designing Forms for Microsoft Office InfoPath and Forms Services 2007, 978-0-321-41059-7
      Neil Roodyn, eXtreme .NET: Introducing eXtreme Programming Techniques to .NET Developers, 978-0-321-30363-9
      Chris Sells and Michael Weinhardt, Windows Forms 2.0 Programming, 978-0-321-26796-2
      Dharma Shukla and Bob Schmidt, Essential Windows Workflow Foundation, 978-0-321-39983-0
      Guy Smith-Ferrier, .NET Internationalization: The Developer's Guide to Building Global Windows and Web Applications, 978-0-321-34138-9
      Will Stott and James Newkirk, Visual Studio Team System: Better Software Development for Agile Teams, 978-0-321-41850-0
      Paul Yao and David Durant, .NET Compact Framework Programming with C#, 978-0-321-17403-1
      Paul Yao and David Durant, .NET Compact Framework Programming with Visual Basic .NET, 978-0-321-17404-8

      For more information go to informit.com/msdotnetseries/

    Concurrent Programming on Windows


      Joe Duffy

Addison-Wesley

      Upper Saddle River, NJ • Boston • Indianapolis • San Francisco • New York • Toronto • Montreal • London • Munich • Paris • Madrid • Capetown • Sydney • Tokyo • Singapore • Mexico City

      

    Many of the designations used by manufacturers and sellers to distinguish their products are claimed as

    trademarks. Where those designations appear in this book, and the publisher was aware of a trademark claim,

    the designations have been printed with initial capital letters or in all capitals.

The .NET logo is either a registered trademark or trademark of Microsoft Corporation in the United States and/or

    other countries and is used under license from Microsoft.

      

The author and publisher have taken care in the preparation of this book, but make no expressed or implied warranty

    of any kind and assume no responsibility for errors or omissions. No liability is assumed for incidental or

    consequential damages in connection with or arising out of the use of the information or programs contained herein.

      

    The publisher offers excellent discounts on this book when ordered in quantity for bulk purchases or special sales,

    which may include electronic versions and/or custom covers and content particular to your business, training

goals, marketing focus, and branding interests. For more information, please contact:

      U.S. Corporate and Government Sales
      (800) 382-3419
      corpsales@pearsontechgroup.com

      For sales outside the United States please contact:

      International Sales
      international@pearsoned.com

      Visit us on the Web: informit.com/aw

Library of Congress Cataloging-in-Publication Data

      Duffy, Joe, 1980-
        Concurrent programming on Windows / Joe Duffy.
          p. cm.
        Includes bibliographical references and index.
        ISBN 978-0-321-43482-1 (pbk. : alk. paper)
        1. Parallel programming (Computer science)
        QA76.642.D84 2008
        005.2'75-dc22
        2008033911

      Copyright © 2009 Pearson Education, Inc.

      

    All rights reserved. Printed in the United States of America. This publication is protected by copyright, and

    permission must be obtained from the publisher prior to any prohibited reproduction, storage in a retrieval

system, or transmission in any form or by any means, electronic, mechanical, photocopying, recording, or

    likewise. For information regarding permissions, write to:

Pearson Education, Inc.
      Rights and Contracts Department
      501 Boylston Street, Suite 900
      Boston, MA 02116
      Fax: (617) 671-3447

      ISBN-13: 978-0-321-43482-1

ISBN-10: 0-321-43482-X

      Text printed in the United States on recycled paper at Edwards Brothers in Ann Arbor, Michigan.

      First printing, October 2008

    For Mom

      & Dad

Contents at a Glance

      Contents xi
      Foreword xix
      Preface xxiii
      Acknowledgments xxvii
      About the Author xxix

      PART I Concepts 1
       1 Introduction 3
       2 Synchronization and Time 13

      PART II Mechanisms 77
       3 Threads 79
       4 Advanced Threads 127
       5 Windows Kernel Synchronization 183
       6 Data and Control Synchronization 253
       7 Thread Pools 315
       8 Asynchronous Programming Models 399
       9 Fibers 429

      PART III Techniques 475
      10 Memory Models and Lock Freedom 477
      11 Concurrency Hazards 545
      12 Parallel Containers 613
      13 Data and Task Parallelism 657
      14 Performance and Scalability 735

      PART IV Systems 783
      15 Input and Output 785
      16 Graphical User Interfaces 829

      PART V Appendices 863
       A Designing Reusable Libraries for Concurrent .NET Programs 865
       B Parallel Extensions to .NET 887

      Index 931

      Contents

      Foreword xix
      Preface xxiii
      Acknowledgments xxvii
      About the Author xxix

PART I Concepts 1

      1 Introduction 3
         Why Concurrency? 3
         Program Architecture and Concurrency 6
         Layers of Parallelism
         Where Are We? 11

      2 Synchronization and Time 13
         Managing Program State 14
            Identifying Shared vs. Private State 15
            State Machines and Time 19
            Isolation 31
            Immutability 34
         Synchronization: Kinds and Techniques 38
            Data Synchronization 60
         Where Are We? 73

PART II Mechanisms 77

      3 Threads 79
         Threading from 10,001 Feet 80
            What Is a Windows Thread? 81
            What Is a CLR Thread? 85
         Explicit Threading and Alternatives 87
         The Life and Death of Threads 89
            Thread Creation 89
            Thread Termination 101
            DllMain 115
            Thread Local Storage 117
         Where Are We? 124

4 Advanced Threads 127
         Thread State 127
            User-Mode Thread Stacks 127
            Internal Data Structures (KTHREAD, ETHREAD, TEB) 145
            Contexts 151
         Inside Thread Creation and Termination 152
            Thread Creation Details 152
            Thread Termination Details 153
         Thread Scheduling 154
            Thread States 155
            Priorities 159
            Quantums 163
            Priority and Quantum Adjustments 164
            Sleeping and Yielding 167
            Suspension 168
            Affinity: Preference for Running on a Particular CPU 170
         Where Are We? 180

5 Windows Kernel Synchronization 183
         The Basics: Signaling and Waiting 184
            Why Use Kernel Objects? 186
            Waiting in Native Code 189
            Managed Code 204
            Asynchronous Procedure Calls (APCs) 208
         Using the Kernel Objects 211
            Mutex 211
            Semaphore 219
            A Mutex/Semaphore Example: Blocking/Bounded Queue 224
            Auto- and Manual-Reset Events 226
            Waitable Timers 234
            Signaling an Object and Waiting Atomically 241
            Debugging Kernel Objects 250
         Where Are We? 251

6 Data and Control Synchronization 253
         Mutual Exclusion 255
            Win32 Critical Sections 256
            CLR Locks 272
         Reader/Writer Locks (RWLs) 287
            Windows Vista Slim Reader/Writer Lock 289
            .NET Framework Slim Reader/Writer Lock (3.5) 293
            .NET Framework Legacy Reader/Writer Lock 300
         Condition Variables 304
            Windows Vista Condition Variables 304
            .NET Framework Monitors 309
            Guarded Regions 311
         Where Are We? 312

7 Thread Pools 315
         Thread Pools 101 316
         Three Ways: Windows Vista, Windows Legacy, and CLR 317
         Common Features 319
         Windows Thread Pools 323
            Windows Vista Thread Pool 323
            Legacy Win32 Thread Pool 353
         CLR Thread Pool 364
            Work Items 364
            I/O Completion Ports 368
            Timers 371
            Registered Waits 374
            Remember (Again): You Don't Own the Threads 377
            Thread Pool Thread Management 377
            Debugging 386
            A Case Study: Layering Priorities and Isolation on Top of the Thread Pool 387
         Performance When Using the Thread Pools 391
         Where Are We? 398

8 Asynchronous Programming Models 399
         Asynchronous Programming Model (APM) 400
            Rendezvousing: Four Ways 403
            Implementing IAsyncResult 413
            Where the APM Is Used in the .NET Framework 418
            ASP.NET Asynchronous Pages 420
         Event-Based Asynchronous Pattern 421
            The Basics 421
            Supporting Cancellation 425
            Supporting Progress Reporting and Incremental Results 425
            Where the EAP Is Used in the .NET Framework 426
         Where Are We? 427

9 Fibers 429
         An Overview of Fibers 430
            Upsides and Downsides 431
         Using Fibers 435
            Creating New Fibers 435
            Converting a Thread into a Fiber 438
            Determining Whether a Thread Is a Fiber 439
            Switching Between Fibers 440
            Deleting Fibers 441
            An Example of Switching the Current Thread 442
         Additional Fiber-Related Topics 445
            Fiber Local Storage (FLS) 445
            Thread Affinity 447
            A Case Study: Fibers and the CLR 449
         Building a User-Mode Scheduler 453
            The Implementation 455
            A Word on Stack vs. Stackless Blocking 472
         Where Are We? 473

PART III Techniques 475

      10 Memory Models and Lock Freedom 477
         Memory Load and Store Reordering 478
            What Runs Isn't Always What You Wrote 481
            Critical Regions as Fences 484
            Data Dependence and Its Impact on Reordering 485
         Hardware Atomicity 486
            The Atomicity of Ordinary Loads and Stores 487
            Interlocked Operations 492
         Memory Consistency Models 506
            Hardware Memory Models 509
            Memory Fences 511
            .NET Memory Models 516
         Lock Free Programming 518
         Examples of Low-Lock Code 520
            Lazy Initialization and Double-Checked Locking 520
            A Nonblocking Stack and the ABA Problem 534
            Dekker's Algorithm Revisited 540
         Where Are We? 541

11 Concurrency Hazards 545
         Correctness Hazards 546
            Data Races 546
            Recursion and Reentrancy 555
            Locks and Process Shutdown 561
         Liveness Hazards 572
            Deadlock 572
            Missed Wake-Ups (a.k.a. Missed Pulses) 597
            Livelocks 601
            Lock Convoys 603
            Stampeding 605
            Two-Step Dance 606
            Priority Inversion and Starvation 608
         Where Are We? 609

12 Parallel Containers 613
         Fine-Grained Locking 616
            Arrays 616
            FIFO Queue 617
            Linked Lists 621
            Dictionary (Hashtable) 626
         Lock Free 632
            General-Purpose Lock Free FIFO Queue 632
            Work Stealing Queue 636
         Coordination Containers 640
            Producer/Consumer Data Structures 641
            Phased Computations with Barriers 650
         Where Are We? 654

13 Data and Task Parallelism 657
         Data Parallelism 659
            Loops and Iteration 660
         Task Parallelism 684
            Fork/Join Parallelism 685
            Dataflow Parallelism (Futures and Promises) 689
            Recursion 702
            Pipelines 709
            Search 718
         Message-Based Parallelism 719
         Cross-Cutting Concerns 720
            Concurrent Exceptions 721
            Cancellation 729
         Where Are We? 732

14 Performance and Scalability 735
         Parallel Hardware Architecture 736
            SMP, CMP, and HT 736
            Superscalar Execution 738
            The Memory Hierarchy 739
            A Brief Word on Profiling in Visual Studio 754
         Speedup: Parallel vs. Sequential Code 756
            Deciding to "Go Parallel" 756
            Measuring Improvements Due to Parallelism 758
            Amdahl's Law 762
            Critical Paths and Load Imbalance 764
            Garbage Collection and Scalability 766
         Spin Waiting 767
            How to Properly Spin on Windows 769
            A Spin-Only Lock 772
            Mellor-Crummey-Scott (MCS) Locks 778
         Where Are We? 781

PART IV Systems 783

      15 Input and Output 785
         Overlapped I/O 786
            Overlapped Objects 788
            Win32 Asynchronous I/O 792
            .NET Framework Asynchronous I/O 817
         I/O Cancellation 822
            Asynchronous I/O Cancellation for the Current Thread 823
            Synchronous I/O Cancellation for Another Thread 824
            Asynchronous I/O Cancellation for Any Thread 825
         Where Are We? 826

16 Graphical User Interfaces 829
         GUI Threading Models 830
            Single Threaded Apartments (STAs) 833
         Responsiveness: What Is It, Anyway? 836
         .NET Asynchronous GUI Features 837
            .NET GUI Frameworks 837
            Synchronization Contexts 847
            Asynchronous Operations 855
            A Convenient Package: BackgroundWorker 856
         Where Are We? 860

PART V Appendices 863

      A Designing Reusable Libraries for Concurrent .NET Programs 865
         The 20,000-Foot View 866
         The Details 867
            Locking Models 867
            Using Locks 870
            Reliability 875
            Scheduling and Threads 879
            Scalability and Performance 881
            Blocking 884

      B Parallel Extensions to .NET 887
         Task Parallel Library 888
            Unhandled Exceptions 893
            Parents and Children 895
            Cancellation 897
            Futures 898
            Continuations 900
            Task Managers 902
            Putting it All Together: A Helpful Parallel Class 904
            Self-Replicating Tasks 909
         Parallel LINQ 910
            Buffering and Merging 912
            Order Preservation 914
         Synchronization Primitives 915
            ISupportsCancelation 915
            CountdownEvent 915
            LazyInit<T> 917
            ManualResetEventSlim 919
            SemaphoreSlim 920
            SpinLock 921
            SpinWait 923
         Concurrent Collections 924
            BlockingCollection<T> 925
            ConcurrentQueue<T> 928
            ConcurrentStack<T> 929

      Index 931

      Foreword

Computing is once again at a crossroads. Hardware concurrency, in the form of new manycore processors, together with growing software complexity, will require that the technology industry fundamentally rethink both the architecture of modern computers and the resulting software development paradigms.

      For the past few decades, the computer has progressed comfortably along the path of exponential performance and capacity growth without any fundamental changes in the underlying computation model. Hardware followed Moore's Law, clock rates increased, and software was written to exploit this relentless growth in performance, often ahead of the hardware curve. That symbiotic hardware-software relationship continued unabated until very recently. Moore's Law is still in effect, but gone is the unnamed law that said clock rates would continue to increase commensurately.

The reasons for this change in hardware direction can be summarized by a simple equation, formulated by David Patterson of the University of California at Berkeley:

      Power Wall + Memory Wall + ILP Wall = A Brick Wall for Serial Performance

Power dissipation in the CPU increases proportionally with clock frequency, imposing a practical limit on clock rates. Today, the ability to dissipate heat has reached a practical physical limit. As a result, a significant increase in clock speed without heroic (and expensive) cooling (or materials technology breakthroughs) is not possible. This is the "Power Wall" part of the equation. Improvements in memory performance increasingly lag behind gains in processor performance, causing the number of CPU cycles required to access main memory to grow continuously. This is the "Memory Wall." Finally, hardware engineers have improved the performance of sequential software by speculatively executing instructions before the results of current instructions are known, a technique called instruction level parallelism (ILP). ILP improvements are difficult to forecast, and their complexity raises power consumption. As a result, ILP improvements have also stalled, resulting in the "ILP Wall."

We have, therefore, arrived at an inflection point. The software ecosystem must evolve to better support manycore systems, and this evolution will take time. To benefit from rapidly improving computer performance and to retain the "write once, run faster on new hardware" paradigm, the programming community must learn to construct concurrent applications. Broader adoption of concurrency will also enable Software + Services through asynchrony and loose-coupling, client-side parallelism, and server-side cloud computing.

The Windows and .NET Framework platforms offer rich support for concurrency. This support has evolved over more than a decade, since the introduction of multiprocessor support in Windows NT. Continued improvements in thread scheduling performance, synchronization APIs, and memory hierarchy awareness, particularly those added in Windows Vista, make Windows the operating system of choice for maximizing the use of hardware concurrency. This book covers all of these areas. When you begin using multithreading throughout an application, the importance of clean architecture and design is critical to reducing software complexity and improving maintainability. This places an emphasis on understanding not only the platform's capabilities but also emerging best practices. Joe does a great job interspersing best practice alongside mechanism throughout this book.

Manycore provides improved performance for the kinds of applications we already create. But it also offers an opportunity to think completely differently about what computers should be able to do for people. The continued increase in compute power will qualitatively change the applications that we can create in ways that make them a lot more interesting and helpful to people, and able to do new things that have never been possible in the past. Through this evolution, software will enable more personalized and humanistic ways for us to interact with computers. So enjoy this book. It offers a lot of great information that will guide you as you take your first steps toward writing concurrent, manycore-aware software on the Windows platform.

Craig Mundie
      Chief Research and Strategy Officer, Microsoft Corporation
      June 2008

      Preface

I BEGAN WRITING this book toward the end of 2005. At the time, dual-core processors were becoming standard on the mainstream PCs that ordinary (nonprogrammer) consumers were buying, and a small number of people in industry had begun to make noise about the impending concurrency problem. (Herb Sutter's paper, "The Free Lunch Is Over," immediately comes to mind.) The problem people were worried about, of course, was that the software of the past was not written in a way that would allow it to naturally exploit that additional compute power. Contrast that with the never-ending increase in clock speeds. No more free lunch, indeed.

It seemed to me that concurrency was going to eventually be an important part of every software developer's job and that a book such as this one would be important and useful. Just two years later, the impact is beginning to ripple up from the OS, through the libraries, and on up to applications themselves.

This was about the same time I had just wrapped up prototyping a small side project I had been working on for six months, called Parallel Language Integrated Query (PLINQ). The PLINQ project was a conduit for me to explore the intricacies of concurrency, multicore, and specifically how parallelism might be used in real-world, everyday programs. I used it as a tool to figure out where the platform was lacking. This was in addition to spending my day job at Microsoft focused on software transactional memory (STM), a technology that in the intervening two years has become somewhat of an industry buzzword. Needless to say, I had become pretty

entrenched in all topics concurrency. What better way to get entrenched even further than to write a book on the subject? As I worked on all of these projects, and eventually PLINQ grew into the Parallel Extensions to the .NET Framework technology, I was amazed at how few good books on Windows concurrency were available. I remember time and time again being astonished or amazed at some intricate and esoteric bit of concurrency-related information, jotting it down, and earmarking it for inclusion in this book. I only wished somebody had written it down before me, so that I didn't need to scour it from numerous sources: hallway conversations, long nights of poring over Windows and CLR source code, and reading and rereading countless Microsoft employee blogs. But the best books on the topic dated back to the early '90s and, while still really good, focused too much on the mechanics and not enough on how to structure parallel programs, implement parallel algorithms, deal with concurrency hazards, and all those important concepts. Everything else targeted academics and researchers, rather than application, system, and library developers.

I set out to write a book that I'd have found fascinating and a useful way to shortcut all of the random bits of information I had to learn throughout. Although it took me a surprisingly long two-and-a-half years to finish this book, the state of the art has evolved slowly, and the state of good books on the topic hasn't changed much either. The result of my efforts, I hope, is a new book that is down to earth and useful, but still full of very deep technical information. It is for any Windows or .NET developer who believes that concurrency is going to be a fundamental requirement of all software somewhere down the road, as all industry trends seem to imply.

I look forward to kicking back and enjoying this book. And I sincerely hope you do too.

      Book Structure

      I've structured the book into four major sections. The first, Concepts, introduces concurrency at a high level without going too deep into any one topic. The next section, Mechanisms, focuses squarely on the fundamental platform features, inner workings, and API details. After that, the Techniques

section describes common patterns, best practices, algorithms, and data structures that emerge while writing concurrent software. The fourth section, Systems, covers many of the system-wide architectural and process concerns that frequently arise. There is a progression here. Concepts is first because it develops a basic understanding of concurrency in general. Understanding the content in Techniques would be difficult without a solid understanding of the Mechanisms, and similarly, building real Systems would be impossible without understanding the rest. There are also two appendices at the end.

      Code Requirements

      To run code found in this book, you'll need to download some free pieces of software.

• Microsoft Windows SDK. This includes the Microsoft C++ compiler and relevant platform headers and libraries. The latest versions as of this writing are the Windows Vista and Server 2008 SDKs.
    • Microsoft .NET Framework SDK. This includes the Microsoft C# and Visual Basic compilers, and relevant framework libraries. The latest version as of this writing is the .NET Framework 3.5 SDK.
      Both can be found on MSDN: http: / /msdn.microsoft.com.

      In addition, it's highly recommended that you consider using Visual Studio. This is not required-and in fact, much of the code in this book was written in emacs-but provides for a more seamless development and debugging experience. Visual Studio 2008 Express Edition can be down­ loaded for free, although it lacks many useful capabilities such as perform­ ance profiling.

      Finally, the debugging tools for Windows package, which includes the popular WINDBG debugging utility-can also come in handy, partic­ ularly if you don't have Visual Studio. It is freely downloadable from http: / /www.microsoft.com. Similarly, the Sysinternals utilities available from http: / / technet.microsoft.com / sysinternals are quite useful for inspecting aspects of the Windows OS.

      Preface xxvi

      A companion website is available at:

      

http://www.bluebytesoftware.com/books

    Joe Duffy
    June 2008
    joe@bluebytesoftware.com
    http://www.bluebytesoftware.com

    Acknowledgments

    MANY PEOPLE HAVE

helped with the creation of this book, both directly and indirectly. First, I have to sincerely thank Chris Brumme and Jan Gray for inspiring me to get the concurrency bug several years ago. You've both been incredibly supportive and have helped me at every turn in the road. This has led to not only this book but a never-ending stream of career, technical, and personal growth opportunities. I'm still not sure how I'll ever repay you guys.

      Also, thanks to Herb Sutter, who was instrumental in getting this book's contract in the first place. And also to Craig Mundie for writing a terrific Foreword and, of course, leading Microsoft and the industry as a whole into our manycore future.

Vance Morrison deserves special thanks for not only being a great mentor along the way, but also for being the toughest technical reviewer I've ever had. His feedback pushed me really hard to keep things concise and relevant. I haven't even come close to attaining his vision of what this book could have been, but I hope I'm not too far afield from it.

      Next, in alphabetical order, many people helped by reviewing the book, discussing ideas along the way, or answering questions about how things work (or were supposed to work): David Callahan, Neill Clift, Dave Detlefs, Yves Dolce, Patrick Dussud, Can Erten, Eric Eilebrecht, Ed Essey, Kang Su Gatlin, Goetz Graefe, Kim Greenlee, Vinod Grover, Brian Grunkemeyer, Niklas Gustafsson, Tim Harris, Anders Hejlsberg, Jim Larus, Eric Li, Weiwen Liu, Mike Magruder, Jim Miller, Igor Ostrovsky,


      Joel Pobar, Jeff Richter, Paul Ringseth, Burton Smith, Stephen Toub, Roger Wolff, and Keith Yedlin. For those reviewers who were constantly promised drafts of chapters that never actually materialized on time, well, I sincerely appreciate the patience.

      Infinite thanks also go out to the staff from Addison-Wesley. In particular, I'd like to give a big thanks to Joan Murray. You've been the only constant throughout the whole project and have to be the most patient person I've ever worked with. When I originally said the book would only take eight months, I wasn't lying intentionally. Hey, a 22-month underestimate isn't too bad, right? Only a true software developer would say that.

    About the Author

      Joe Duffy is the development lead, architect, and founder of the Parallel Extensions to the .NET Framework team at Microsoft, in the Visual Studio division. In addition to hacking code and managing a team of amazing developers, he defines the team's long-term vision and strategy. His current interests include functional programming, first-class concurrency safety in the type system, and creating programming models that will enable everyday people to exploit GPUs and SIMD style processors. Joe had previous positions at Microsoft as the developer for Parallel LINQ (PLINQ) and the concurrency program manager for the Common Language Runtime (CLR).

      Before joining Microsoft, he had seven years of professional programming experience, including four years at EMC. He was born in Massachusetts, and currently lives in Washington. While not indulging in technical excursions, Joe spends his time playing guitar, studying music theory, listening to and writing music, and feeding his wine obsession.


      PART I Concepts

      1 Introduction

      No matter whether you're doing server-side programming for the web or cloud computing, building a responsive graphical user interface, or creating a new interactive client application that uses parallelism to attain better performance, concurrency is ever present. Learning how to deal with concurrency when it surfaces and how to exploit it to deliver more capable and scalable software is necessary for a large category of software developers and is the main focus of this book.

      Before jumping straight into the technical details of how to use concurrency when developing software, we'll begin with a conceptual overview of concurrency, some of the reasons it can be important to particular kinds of software, the role it plays in software architecture, and how concurrency will fit progressively into layers of software in the future.

      Everything in this chapter, and indeed most of the content in this book, applies equally to programs written in native C++ as it does to programs written in the .NET Framework.

      Why Concurrency?

      There are many reasons why concurrency may be interesting to you.

    • You are programming in an environment where concurrency is already pervasive. This is common in real-time systems, OS programming, and server-side programming. It is the reason, for example, that most database programmers must become deeply familiar with the notion of a transaction before they can truly be effective at their jobs.
    • You need to maintain a responsive user interface (UI) while performing some compute- or I/O-intensive activity in response to some user input. In such cases, running this work on the UI thread will lead to poor responsiveness and frustrated end users. Instead, concurrency can be used to move work elsewhere, dramatically improving the responsiveness and user experience.
    • You'd like to exploit the asynchrony that already exists in the relationship between the CPU running your program and other hardware devices. (They are, after all, separately operating and independent pieces of hardware.) Windows and many device drivers cooperate to ensure that large I/O latencies do not severely impact program performance. Using these capabilities requires that you rewrite code to deal with concurrent orchestration of events.
    • Some problems are more naturally modeled using concurrency. Games, AI, and scientific simulations often need to model interactions among many agents that operate mostly independently of one another, much like objects in the real world. These interactions are inherently concurrent. Stream processing of real-time data feeds, where the data is being generated in the physical world, typically requires the use of concurrency. Telephony switches are inherently massively concurrent, leading to special purpose languages, such as Erlang, that deal specifically with concurrency as a first class concept.
    • You'd like to utilize the processing power made available by multiprocessor architectures, such as multicore, which requires a form of concurrency called parallelism to be used. This requires individual operations to be decomposed into independent parts that can run on separate processors.
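The last point, decomposing an operation into independent parts that run on separate processors, can be sketched in a few lines. This is a hypothetical illustration, not code from this book; portable std::thread (from later C++ standards) stands in here for the Windows and .NET threading facilities covered in later chapters, and ParallelSum is an invented helper name:

```cpp
#include <cassert>
#include <numeric>
#include <thread>
#include <vector>

// Hypothetical example: decompose a summation into two independent
// halves so that each part can run on a separate processor.
long long ParallelSum(const std::vector<int>& data)
{
    auto mid = data.begin() + data.size() / 2;
    long long first = 0;

    // One half of the work runs on a newly created thread...
    std::thread worker([&] { first = std::accumulate(data.begin(), mid, 0LL); });

    // ...while the other half runs concurrently on the calling thread.
    long long second = std::accumulate(mid, data.end(), 0LL);

    worker.join(); // wait for the worker before combining partial results
    return first + second;
}
```

The two halves share no intermediate state, which is exactly what makes the decomposition safe; combining the partial results happens only after the join.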

      In summary, many problem domains are ripe with inherent concurrency. If you're building a server application, for example, many requests may arrive concurrently via the network and must be dealt with simultaneously. If you're writing a Web request handler and need to access shared state, concurrency is suddenly thrust to the forefront.
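To make the shared state point concrete, here is a small hypothetical sketch (not from this book): two threads play the role of concurrent request handlers updating a shared counter, and a std::mutex stands in for the Windows synchronization primitives discussed in later chapters. The type and function names are invented for illustration:

```cpp
#include <cassert>
#include <mutex>
#include <thread>

// Hypothetical shared state touched by two concurrent "request handlers."
// Without the lock, the ++hits read-modify-write could interleave across
// threads and lose updates.
struct HitCounter
{
    int hits = 0;
    std::mutex lock;

    void Record()
    {
        std::lock_guard<std::mutex> guard(lock); // serialize access to hits
        ++hits;
    }
};

// Runs two threads that each record perThread hits against shared state.
int CountHits(int perThread)
{
    HitCounter counter;
    std::thread a([&] { for (int i = 0; i < perThread; ++i) counter.Record(); });
    std::thread b([&] { for (int i = 0; i < perThread; ++i) counter.Record(); });
    a.join();
    b.join();
    return counter.hits;
}
```

With the lock in place the count is always exact; removing it would make the result depend on how the two threads happen to interleave.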

      While it's true that concurrency can sometimes help express problems more naturally, this is rare in practice. Human beings tend to have a difficult time reasoning about large amounts of asynchrony due to the combinatorial explosion of possible interactions. Nevertheless, it is becoming increasingly common to use concurrency in instances where it feels unnatural. The reason for this is that microprocessor architecture has fundamentally changed; parallel processors are now widespread on all sorts of mainstream computers. Multicore has already pervaded the PC and mobile markets, and highly parallel graphics processing units (GPUs) are everywhere and sometimes used for general purpose computing. In order to fully maximize use of these newer generation processors, programs must be written in a naturally scalable manner. That means applications must contain sufficient latent concurrency so that, as newer machines are adopted, program performance automatically improves alongside, by realizing that latent concurrency as actual concurrency.

      In fact, although many of us program in a mostly sequential manner, our code often has a lot of inherent latent concurrency already by virtue of the way operations have been described in our language of choice. Data and control dependence among loops, if-branches, and memory moves can constrain this, but, in a surprisingly large number of cases, these are artificial constraints that are placed on code out of stylistic habit common to C-style programming.
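As a hypothetical illustration of this point (the function names are invented, and the code is portable C++ rather than anything Windows-specific), compare a loop whose iterations are fully independent, and therefore latently concurrent, with one constrained by a genuine loop-carried data dependence:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Each iteration reads only in[i] and writes only out[i]: no iteration
// depends on another, so the loop carries latent concurrency that a
// parallel runtime could realize by running iterations on separate
// processors.
std::vector<int> Squares(const std::vector<int>& in)
{
    std::vector<int> out(in.size());
    for (std::size_t i = 0; i < in.size(); ++i)
        out[i] = in[i] * in[i];
    return out;
}

// Here iteration i reads the accumulator produced by iteration i - 1: a
// real data dependence that constrains parallel execution of the loop.
std::vector<int> RunningSums(const std::vector<int>& in)
{
    std::vector<int> out(in.size());
    int acc = 0;
    for (std::size_t i = 0; i < in.size(); ++i)
        out[i] = (acc += in[i]); // depends on the previous iteration
    return out;
}
```

The first loop's independence is a property of how the operation was described, not of the problem itself; the second loop's dependence is intrinsic, though techniques such as parallel prefix sums can sometimes restructure even these.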

      This shift is a change from the past, particularly for client-side programs. Parallelism is the use of concurrency to decompose an operation into finer grained constituent parts so that independent parts can run on separate processors on the target machine. This idea is not new. Parallelism has been used in scientific computing and supercomputing for decades as a way to scale across tens, hundreds, and, in some cases, thousands of processors. But mainstream commercial and Web software generally has been authored with sequential techniques based on the assumption that clock speed will increase 40 to 50 percent year over year, indefinitely, and that corresponding improvements in performance would follow "for free."


    Program Architecture and Concurrency

      Concurrency begins with architecture. It is also possible to retrofit concurrency into an existing application, but the number of common pitfalls is vastly decreased with careful planning. The following taxonomy is a useful way to think about the structure of concurrent programs, which will help during the initial planning and architecture phases of your project:

    • Agents. Most programs are already coarsely decomposed into independent agents. An agent in this context is a very abstract term, but the key attributes are: (1) state is mostly isolated within it from the outset, (2) its interactions with the world around it are asynchronous, and (3) it is generally loosely coupled with respect to peer agents. There are many manifestations of agents in real-world systems, ranging from individual Web requests, a Windows Communication Foundation (WCF) service request, COM component call, some asynchronous activity a program has farmed off onto another thread, and so forth. Moreover, some programs have just one agent: the program's entry point.
    • Tasks. Individual agents often need to perform a set of operations at once. We'll call these tasks. Although a task shares many ideas with agents, such as being asynchronous and somewhat independent, tasks are unique in that they typically share state intimately. Many sequential client-side programs fail to recognize tasks as first class concepts, but doing so will become increasingly important as fine-grained parallelism is necessary for multicore. Many server-side programs also do not have a concept of tasks, because they already use large numbers of agents in order to expose enough latent concurrency to utilize the hardware. This is OK so long as the number of active agents exceeds the number of available processors; as processor counts and the workloads a single agent is responsible for grow, this can become increasingly difficult to ensure.

    • Data. Operations on data are often naturally parallel, so long as they are programmed such that the latent concurrency is made available to the system. This is called data parallelism. Such operations might
