
  

Big Data

Computing

  

A Guide for Business

and Technology Managers

  

Chapman & Hall/CRC

Big Data Series

SERIES EDITOR

Sanjay Ranka

AIMS AND SCOPE

This series aims to present new research and applications in Big Data, along with the computational tools and techniques currently in development. The inclusion of concrete examples and applications is highly encouraged. The scope of the series includes, but is not limited to, titles in the areas of social networks, sensor networks, data-centric computing, astronomy, genomics, medical data analytics, large-scale e-commerce, and other relevant topics that may be proposed by potential contributors.

PUBLISHED TITLES

BIG DATA COMPUTING: A GUIDE FOR BUSINESS AND TECHNOLOGY MANAGERS
Vivek Kale

BIG DATA OF COMPLEX NETWORKS
Matthias Dehmer, Frank Emmert-Streib, Stefan Pickl, and Andreas Holzinger

BIG DATA: ALGORITHMS, ANALYTICS, AND APPLICATIONS
Kuan-Ching Li, Hai Jiang, Laurence T. Yang, and Alfredo Cuzzocrea

NETWORKING FOR BIG DATA
Shui Yu, Xiaodong Lin, Jelena Mišić, and Xuemin (Sherman) Shen

  

Big Data

Computing

  

A Guide for Business

and Technology Managers

Vivek Kale

CRC Press
Taylor & Francis Group
6000 Broken Sound Parkway NW, Suite 300
Boca Raton, FL 33487-2742

© 2017 by Vivek Kale
CRC Press is an imprint of Taylor & Francis Group, an Informa business

No claim to original U.S. Government works
Printed on acid-free paper
Version Date: 20160426

International Standard Book Number-13: 978-1-4987-1533-1 (Hardback)

This book contains information obtained from authentic and highly regarded sources. Reasonable efforts have been made to publish reliable data and information, but the author and publisher cannot assume responsibility for the validity of all materials or the consequences of their use. The authors and publishers have attempted to trace the copyright holders of all material reproduced in this publication and apologize to copyright holders if permission to publish in this form has not been obtained. If any copyright material has not been acknowledged please write and let us know so we may rectify in any future reprint.

  

Except as permitted under U.S. Copyright Law, no part of this book may be reprinted, reproduced, transmitted, or utilized in any form by any electronic, mechanical, or other means, now known or hereafter invented, including photocopying, microfilming, and recording, or in any information storage or retrieval system, without written permission from the publishers.

  

For permission to photocopy or use material electronically from this work, please contact the Copyright Clearance Center, Inc. (CCC), 222 Rosewood Drive, Danvers, MA 01923, 978-750-8400. CCC is a not-for-profit organization that provides licenses and registration for a variety of users. For organizations that have been granted a photocopy license by the CCC, a separate system of payment has been arranged.

  Trademark Notice: Product or corporate names may be trademarks or registered trademarks, and are used only for identification and explanation without intent to infringe.

  

Library of Congress Cataloging-in-Publication Data

Names: Kale, Vivek, author.

Title: Big data computing : a guide for business and technology managers / author, Vivek Kale.
Description: Boca Raton : Taylor & Francis, CRC Press, 2016. | Series: Chapman & Hall/CRC big data series | Includes bibliographical references and index.
Identifiers: LCCN 2016005989 | ISBN 9781498715331
Subjects: LCSH: Big data.
Classification: LCC QA76.9.B45 K35 2016 | DDC 005.7--dc23


  

To

Nilesh Acharya and family

for unstinted support on

references and research

for my numerous book projects.


List of Figures

Figure 2.1 A hierarchical organization
Figure 2.2 The three-schema architecture
Figure 5.1 Schematic of CRISP-DM methodology
Figure 5.2 Architecture of a machine-learning system
Figure 14.1 Big data systems architecture
Figure 14.2 Big data systems lifecycle (BDSL)
Figure 14.3 Lambda architecture
Figure 15.1 Enterprise application in J2EE

List of Tables

Table 2.1 Characteristics of the Four Database Models
Table 2.2 Levels of Data Abstraction
Table 3.1 Intelligence Maturity Model (IMM)
Table 4.1 Comparison between OLTP and OLAP Systems
Table 4.2 Comparison between Operational Databases and Data Warehouses
Table 4.3 The DSS 2.0 Spectrum

  

Preface

The rapid growth of the Internet and World Wide Web has led to vast amounts of information available online. In addition, business and government organizations create large amounts of both structured and unstructured information that need to be processed, analyzed, and linked. It is estimated that the amount of information stored in digital form in 2007 was 281 exabytes, and the overall compound growth rate has been 57%, with information in organizations growing at an even faster rate. It is also estimated that 95% of all current information exists in unstructured form, with increased data processing requirements compared to structured information. Storing, managing, accessing, and processing this vast amount of data represents a fundamental need and an immense challenge in order to satisfy the need to search, analyze, mine, and visualize these data as information. This deluge of data, along with the emerging techniques and technologies used to handle it, is commonly referred to today as big data computing.

Big data can be defined as volumes of data available in varying degrees of complexity, generated at different velocities and with varying degrees of ambiguity, that cannot be processed using traditional technologies, processing methods, algorithms, or any commercial off-the-shelf solutions. Such data include weather, geospatial, and GIS data; consumer-driven data from social media; enterprise-generated data from legal, sales, marketing, procurement, finance, and human-resources departments; and device-generated data from sensor networks, nuclear plants, X-ray and scanning devices, and airplane engines. This book describes the characteristics, challenges, and solutions for enabling such big data computing.

The fundamental challenges of big data computing are managing and processing exponentially growing data volumes, significantly reducing the associated data analysis cycles to support practical, timely applications, and developing new algorithms that can scale to search and process massive amounts of data. The answer to these challenges is a scalable, integrated hardware and software architecture designed for the parallel processing of big data computing applications. Cloud computing is a prerequisite to big data computing: it gives organizations with limited internal resources the opportunity to implement large-scale big data computing applications in a cost-effective manner.

Relational databases are based on the relational model and provide online transaction processing (OLTP), schema-on-write, and SQL. Data warehouses are also based on the relational model and support online analytical processing (OLAP). Data warehouses are designed to optimize data analysis, reporting, and data mining; data are extracted, transformed, and loaded (ETL) from other data sources into the data warehouse. However, today's data environment demands innovations that are faster, extremely scalable, scale cost-effectively, and work easily with structured, semistructured, and unstructured data. Hadoop and NoSQL databases are designed to work easily with structured, unstructured, and semistructured data. Hadoop, alongside relational databases and data warehouses, increases the strategies and capabilities for leveraging data to improve the accuracy and speed of business decisions. Hadoop plays an important role in the modern data architecture; organizations that can leverage the capabilities of relational databases, data warehouses, Hadoop, and all the available data sources will have a competitive advantage.
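The contrast between the schema-on-write processing of relational systems and the schema-on-read style favored by Hadoop and NoSQL stores can be sketched in a few lines. This is a minimal illustration, not code from this book; the table, field names, and sample records are hypothetical, and Python's standard sqlite3 and json modules stand in for a full RDBMS and a distributed store.

```python
import sqlite3
import json

# Schema-on-write (relational/OLTP): the table structure must be declared
# before any data can be stored; rows that do not fit the schema are rejected.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER, customer TEXT, amount REAL)")
conn.execute("INSERT INTO orders VALUES (?, ?, ?)", (1, "Acme Corp", 250.0))
total = conn.execute("SELECT SUM(amount) FROM orders").fetchone()[0]
print("Schema-on-write total:", total)

# Schema-on-read (Hadoop/NoSQL style): raw, semistructured records are stored
# as-is; structure is imposed only when the data are read and analyzed.
raw_records = [
    '{"order_id": 2, "customer": "Globex", "amount": 120.5}',
    '{"order_id": 3, "customer": "Initech", "amount": 75.0, "coupon": "SPRING"}',
]
parsed = [json.loads(r) for r in raw_records]
print("Schema-on-read total:", sum(r.get("amount", 0.0) for r in parsed))
```

The second half tolerates records with extra or missing fields, which is the flexibility the paragraph above attributes to Hadoop and NoSQL stores.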

What Makes This Book Different?

This book interprets the 2010s big data computing phenomenon from the point of view of business as well as technology. It unravels the mystery of big data computing environments and applications and their power and potential to transform the operating contexts of business enterprises. It addresses the key differentiator of big data computing environments, namely, that big data computing systems combine the power of elastic infrastructure (via cloud computing) and information management with the ability to analyze and discern recurring patterns in the colossal pools of operational and transactions data (via big data computing), and to leverage and transform them into success patterns for an enterprise's business. These extremes of requirements for storage and processing arose primarily from developments of the past decade in the areas of social networks, mobile computing, location-based systems, and so on.

This book highlights the fact that handling gargantuan amounts of data became possible only because big data computing makes feasible computing beyond the practical limits imposed by Moore's law. Big data achieves this by employing non–von Neumann architectures that enable parallel processing on a network of nodes equipped with extremely cost-effective commoditized hardware, while simultaneously being more tolerant of the faults and failures that are unavoidable in any realistic computing environment.
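As a rough illustration of this divide-and-conquer style of parallel processing (a minimal sketch, not the book's code or Hadoop's API; the text chunks, worker count, and function names are arbitrary, and Python's multiprocessing pool stands in for a cluster of commodity nodes), a word count can be split into per-chunk "map" tasks whose partial results are merged in a "reduce" step:

```python
from collections import Counter
from multiprocessing import Pool

def map_count(chunk):
    # "Map" step: each worker independently counts words in its own chunk.
    return Counter(chunk.lower().split())

def reduce_counts(partials):
    # "Reduce" step: merge the partial counts produced by all workers.
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total

if __name__ == "__main__":
    chunks = [
        "big data computing on commodity nodes",
        "commodity nodes tolerate faults and failures",
        "parallel processing of big data",
    ]
    with Pool(processes=3) as pool:   # stands in for a network of nodes
        partial_counts = pool.map(map_count, chunks)
    print(reduce_counts(partial_counts).most_common(3))
```

In a real cluster, a failed map task would simply be rerun on another node, which is how such systems stay tolerant of faults and failures.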

On April 19, 1965, Gordon Moore, the cofounder of Intel Corporation, published an article in Electronics Magazine titled "Cramming More Components onto Integrated Circuits," in which he identified and conjectured a trend that computing power would double every 2 years (this was termed Moore's law in 1970 by Caltech professor and VLSI pioneer Carver Mead). This law has reliably predicted both the reduction in costs and the improvements in the computing capability of microchips, and those predictions have held true since then. The law effectively measures, and sets practical limits on, the increase in computing power that can realistically be achieved every couple of years. The requirements of modern applications such as social networks and mobile applications far outstrip what can be delivered by the conventional von Neumann architectures employed since the 1950s.

The phenomenon of big data computing has attained prominence in the context of the heightened interest in business analytics. An inevitable consequence of organizations using the pyramid-shaped hierarchy is a decision-making bottleneck at the top of the organization. The people at the top are overwhelmed by the sheer volume of decisions they have to make; they are too far away from the scene of the action to really understand what is happening; and by the time decisions are made, the actions are usually too little and too late. The need to be responsive to evolving customer needs and desires creates operational structures and systems where business analysis and decision making are pushed out to operating units that are closest to the scene of the action; these units, however, lack the expertise and resources to access, process, evaluate, and decide on a course of action. This engenders the significance of analysis systems that are essential for enabling decisive action as close to the customer as possible.


  The characteristic features of this book are as follows:

  1. It enables IT managers and business decision-makers to get a clear understanding of what big data computing really means, what it might do for them, and when it is practical to use it.

  2. It gives an introduction to the database solutions that were a first step toward enabling data-based enterprises. It also gives a detailed description of data warehousing- and data mining-related solutions that paved the road to big data computing solutions.

  3. It describes the basics of distributed systems, service-oriented architecture (SOA), web services, and cloud computing that were essential prerequisites to the emer- gence of big data computing solutions.

4. It provides a very wide treatment of big data computing that covers the functionalities (and features) of NoSQL databases, MapReduce programming models, the Hadoop development ecosystem, and analysis tool environments.

  5. It covers most major application areas of interest in big data computing: web, social networks, mobile, and location-based systems (LBS) applications.

  6. It is not focused on any particular vendor or service offering. Although there is a good description of the open source Hadoop development ecosystem and related tools and technologies, the text also introduces NoSQL and analytics solutions from commercial vendors.

In the final analysis, big data computing is a realization of the vision of intelligent infrastructure that reforms and reconfigures itself automatically to store and/or process incoming data based on the predefined requirements and predetermined context of those requirements. The intelligent infrastructure will automatically capture, store, manage, and analyze incoming data, take decisions, and undertake prescribed actions for standard scenarios, whereas nonstandard scenarios will be routed to DevOperators. An extension of this vision also underlies the future potential of the Internet of Things (IoT) discussed in the Epilogue. IoT is the next step in the journey that commenced with computing beyond the limits of Moore's law made possible by cloud computing, followed by the advent of big data computing technologies (such as MapReduce and NoSQL). I wanted to write a book presenting big data computing from this novel perspective of computing beyond the practical limits imposed by Moore's law; the outcome is the book that you are reading now. Thank you!

  How This Book Is Organized

  This book traces the road to big data computing, the detailed features and characteristics of big data computing solutions and environments, and, in the last section, high-potential application areas of big data.

The opening chapters provide an overview of traditional database environments, present the standard characteristics of data mining, and wrap up this part with a detailed discussion on the nature and characteristics of service-oriented architecture (SOA), web services, and cloud computing.

The middle part of the book presents a detailed discussion on various aspects of a big data computing solution. It introduces the basics of big data computing and the tools and technologies that are essential for big data computing; details the offerings of various big data NoSQL database vendors; presents details on analysis languages, tools, and development environments; and describes big data-related management and operation issues that become critical as big data computing environments grow more complex.

The closing chapters address popular social network applications such as Facebook and Twitter and describe lesser known but more promising areas of location-based applications. Context-aware applications can significantly enhance the efficiency and effectiveness of even routinely occurring transactions. A concluding chapter introduces the concept of context as constituted by an ensemble of function-specific decision patterns. It highlights the fact that any end-user application's effectiveness and performance can be enhanced by transforming it from a bare transaction to a transaction clothed by a surrounding context formed as an aggregate of all relevant decision patterns from the past. The generation of this important component of the generalized concept of context is critically dependent on employing big data computing techniques and technologies deployed via cloud computing.

Who Should Read This Book?

All stakeholders of a big data project can read this book.

  This book presents a detailed discussion on various aspects of a big data computing solution. The approach adopted in this book will be useful to any professional who must present a case for realizing big data computing solutions or to those who could be involved in a big data computing project. It provides a framework to enable business and technical managers to make the optimal decisions necessary for the successful migration to big data computing environments and applications within their organizations.


  All readers who are involved with any aspect of a big data computing project will profit by using this book as a road map toward a more meaningful contribution to the success of their big data computing initiative(s).

Minimal recommended tracks of chapters are suggested for the following categories of stakeholders:

• Executives and business managers
• Operational managers
• Project managers and module leaders
• Technology managers
• Professionals interested in big data computing
• Students of computer courses
• Students of management courses
• General readers interested in the phenomenon of big data computing

Vivek Kale
Mumbai, India

You may want to get a copy of Guide to Cloud Computing for Business and Technology Managers as a companion for this book. The guide is designed to help you better understand the background, characteristics, and applications of cloud computing. Together, these two books form a complete road map for building successful cloud-based analytical applications.


  

  I would like to thank all those who have helped me with their clarifications, criticism, and valuable information during the writing of this book.

  Thanks again to Aastha Sharma for making this book happen and Ed Curtis for guiding its production to completion. I would like to thank my family, especially, my wife, Girija, who has suffered the long hours I have dedicated to the book instead of the family and whose love and support is what keeps me going.

  Vivek Kale


  

Vivek Kale has more than two decades of professional IT experience, during which he has handled and consulted on various aspects of enterprise-wide information modeling, enterprise architectures, business process redesign, and e-business architectures. He has been Group CIO of Essar Group, the steel/oil and gas major of India, as well as of Raymond Ltd., the textile and apparel major of India. He is a seasoned practitioner in transforming the business of IT, facilitating business agility, and enhancing IT-enabled enterprise intelligence. He is the author of Implementing SAP R/3: The Guide for Business and Technology Managers (Sams, 2000) and Guide to Cloud Computing for Business and Technology Managers: From Distributed Computing to Cloudware Applications (CRC Press, 2015).


  

Computing Beyond the Moore's Law Barrier While Being More Tolerant of Faults and Failures

Since the advent of the computer in the 1950s, weather forecasting has been a hugely challenging computational problem. Right from the beginning, weather models ran on a single supercomputer that could fill a gymnasium and contained a couple of fast (for the 1970s) CPUs with very expensive memory. Software in the 1970s was primitive, so most of the performance at that time came from clever hardware engineering. By the 1990s, software had improved to the point where a large program running on a monolithic supercomputer could be broken into a hundred smaller programs working simultaneously on a hundred workstations. When all the programs finished running, their results were stitched together to form a weeklong weather simulation. What used to take fifteen days of computation to simulate seven days of weather even in the 1990s can today, with parallel simulation, be accomplished in a matter of hours.

There are lots of data involved in weather simulation and prediction, but weather simulation is not considered representative of "big data" problems because it is computationally intensive rather than data intensive. Computing problems in science (including meteorology and engineering) are also known as high-performance computing (HPC) or scientific supercomputing because they entail solving millions of equations.

Big data is the commercial equivalent of HPC, which could also be called high-performance commercial computing or commercial supercomputing. Big data can also solve large computing problems, but it is less about equations and more about discovering patterns. Today companies such as Amazon, eBay, and Facebook use commercial supercomputing to solve their Internet-scale business problems. Big data is a type of supercomputing for commercial enterprises and governments that will make it possible to monitor a pandemic as it happens, anticipate where the next bank robbery will occur, optimize fast-food supply chains, predict voter behavior on election day, and forecast the volatility of political uprisings while they are happening.

  Big data can be defined as data sets whose size is beyond the ability of typical database software tools to capture, store, manage, and analyze. Big data is different from the traditional concept of data in terms of the following:

• Bigger volume: More than half a trillion pieces of content (photos, notes, blogs, web links, and news stories) are shared on Facebook every month, and 4 billion hours of video are watched on YouTube every month. It is believed that there will be more than 50 billion connected devices in the world by 2020.
  • Higher velocity: At 140 characters per tweet, Twitter-generated data volume is larger than 10 terabytes per day. It is believed that more data were created between 2008 and 2011 than in all history before 2008.


• More data variety: It is estimated that 95% of the world's data are unstructured, which makes big data extremely challenging. Big data can exist in various formats, namely, video, image, audio, text/numbers, and so on.
  • Different degree of veracity: The degree of authenticity or accuracy of data ranges from objective observations of physical phenomenon to subjective observations or opinions expressed on social media.

Storing, managing, accessing, and processing of this vast amount of data represent a fundamental need and an immense challenge in order to satisfy the need to search, analyze, mine, and visualize these data as information.

On April 19, 1965, Gordon Moore, the cofounder of Intel Corporation, published an article in Electronics Magazine titled "Cramming More Components onto Integrated Circuits," in which he identified and conjectured a trend that computing power would double every 2 years (this was termed Moore's law in 1970 by the Caltech professor and VLSI pioneer Carver Mead). This law has been able to predict reliably both the reduction in costs and the improvements in the computing capability of microchips, and those predictions have held true.

[Figure: Moore's law illustrated by the growth in transistor counts of Intel processors, from the 4004 through the dual-core Itanium 2, over the years 1970–2010.]

  

[Figure: Relative improvement in disk capacity, disk throughput, network bandwidth, and CPU speed, 1990–2010.]

In 1965, the number of transistors that fit on an integrated circuit could be counted in the tens. In 1971, Intel introduced the 4004 microprocessor with 2,300 transistors. In 1978, when Intel introduced the 8086 microprocessor, the IBM PC was effectively born (the first IBM PC used the 8088 chip); this chip had 29,000 transistors. In 2006, Intel's Itanium 2 processor carried 1.7 billion transistors. In the next couple of years, we will have chips with over 10 billion transistors. While all this was happening, the cost of these transistors was also falling exponentially, as per Moore's prediction.
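A back-of-the-envelope check of the doubling trend against the two transistor counts quoted above (only the 4004 and Itanium 2 figures from the text are used; the calculation itself is an added illustration, not from the book):

```python
import math

# Transistor counts quoted in the text.
t_1971 = 2_300           # Intel 4004, 1971
t_2006 = 1_700_000_000   # Intel Itanium 2, 2006

doublings = math.log2(t_2006 / t_1971)   # how many times the count doubled
years = 2006 - 1971
print(f"{doublings:.1f} doublings in {years} years "
      f"=> one doubling roughly every {years / doublings:.1f} years")
# ~19.5 doublings in 35 years => a doubling about every 1.8 years,
# consistent with Moore's projection of roughly two years.
```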

In real terms, this means that a mainframe computer of the 1970s that cost over $1 million had less computing power than the iPhone has today. The next generation of smartphones in the next few years will have gigahertz processor chips, which will be roughly one million times faster than the Apollo Guidance Computer that put "man on the moon." Theoretically, Moore's law will run out of steam somewhere in the not-too-distant future. There are a number of possible reasons for this.

First, the ability of a microprocessor's silicon-etched track or circuit to carry an electrical charge has a theoretical limit. At some point, when these circuits get physically too small and can no longer carry a charge, or the electrical charge bleeds, we will have a design limitation problem. Second, as successive generations of chip technology are developed, manufacturing costs increase. In fact, Gordon Moore himself conjectured that as tolerances become tighter, each new generation of chips would require a doubling in the cost of the manufacturing facility. At some point, it will theoretically become too costly to develop the manufacturing plants that produce these chips.

The usable limit for semiconductor process technology will be reached when chip process geometries shrink to smaller than the 20 nanometer (nm) to 18 nm nodes. At those scales, the industry will start getting to the point where semiconductor manufacturing tools would be too expensive to depreciate with volume production.


Lastly, the power requirements of chips are also increasing. More power means more heat, which means bigger batteries; this implies that at some point it becomes increasingly difficult to power these chips while putting them on smaller platforms.

1.2 Types of Computer Systems

Today's computer systems come in a variety of sizes, shapes, and computing capabilities.

The Apollo 11 spacecraft that enabled landing men on the moon and returning them safely to the earth was equipped with a computer that assisted them in everything from navigation to systems monitoring, and it had a 2.048 MHz CPU built by MIT. Today's home PCs commonly run at 4 GHz (megahertz [MHz] is 1 million computing cycles per second, while gigahertz [GHz] is 1 billion computing cycles per second). Further, the Apollo 11 computer weighed 70 pounds, versus today's powerful laptops weighing as little as 1 pound; we have come a long way. Rapid hardware and software developments and changing end-user needs continue to drive the emergence of new models of computers, from the smallest handheld personal digital assistant/cell phone combinations to the largest multiple-CPU mainframes for enterprises. Categories such as microcomputer, midrange, mainframe, and supercomputer systems are still used to help us express the relative processing power and number of end users that can be supported by different types of computers. These are not precise classifications, and they do overlap each other.
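A quick calculation makes the clock-rate comparison concrete (an added illustration using only the figures quoted above; raw clock rate alone understates the true performance difference, which also reflects architectural improvements):

```python
# Clock-speed comparison from the text: the Apollo Guidance Computer's
# 2.048 MHz CPU versus a 4 GHz processor in a present-day home PC.
apollo_hz = 2.048e6   # 2.048 MHz = 2.048 million cycles per second
home_pc_hz = 4e9      # 4 GHz = 4 billion cycles per second

print(f"Clock-speed ratio: {home_pc_hz / apollo_hz:,.0f}x")
# => roughly 1,953 times more cycles per second on raw clock rate alone.
```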

  

Microcomputers are the most important category of computer systems for both business and household consumers. Although usually called a personal computer, or PC, a microcomputer is much more than a small computer for use by an individual as a communication device. The computing power of microcomputers now exceeds that of the mainframes of previous computer generations, at a fraction of their cost. Thus, they have become powerful networked professional workstations for business professionals.

  

Midrange computers are primarily high-end network servers and other types of servers that can handle the large-scale processing of many business applications. Although not as powerful as mainframe computers, they are less costly to buy, operate, and maintain than mainframe systems and thus meet the computing needs of many organizations. Midrange systems first became popular as minicomputers in scientific research, instrumentation systems, engineering analysis, and industrial process monitoring and control. Minicomputers were able to easily handle such functions because these applications are narrow in scope and do not demand the processing versatility of mainframe systems. Today, midrange systems include servers used in industrial process control and manufacturing plants and play major roles in computer-aided manufacturing (CAM). They can also take the form of powerful technical workstations for computer-aided design (CAD) and other computation- and graphics-intensive applications.


Midrange systems are also used as front-end servers to assist mainframe computers in telecommunications processing and network management.

  Midrange systems have become popular as powerful network servers (computers used to coordinate communications and manage resource sharing in network settings) to help manage large Internet websites, corporate intranets and extranets, and other networks. Internet functions and other applications are popular high-end server applications, as are integrated enterprise-wide manufacturing, distribution, and financial applications. Other applications, such as data warehouse management, data mining, and online analytical processing, are contributing to the demand for high-end server systems.

  

Mainframe computers are large, fast, and powerful computer systems; they can process thousands of millions of instructions per second (MIPS). They can also have large primary storage capacities, with main memory capacity ranging from hundreds of gigabytes to many terabytes. Mainframes have downsized drastically in the last few years, dramatically reducing their air-conditioning needs, electrical power consumption, and floor space requirements, and thus their acquisition, operating, and ownership costs. Most of these improvements are the result of a move from the cumbersome water-cooled mainframes to a newer air-cooled technology for mainframe systems.

Mainframe computers continue to handle the information processing needs of major corporations and government agencies with high transaction processing volumes or complex computational problems. For example, major international banks, airlines, oil companies, and other large corporations process millions of sales transactions and customer inquiries every day with the help of large mainframe systems. Mainframes are still used for computation-intensive applications, such as analyzing seismic data from oil field explorations or simulating flight conditions in designing aircraft.