Data Analytics

Practical Guide to Leveraging the Power of Algorithms, Data Science, Data Mining, Statistics, Big Data, and Predictive Analysis to Improve Business, Work, and Life

  By: Arthur Zhang

  Legal notice

  This book is copyright (c) 2017 by Arthur Zhang. All rights are reserved. This book may not be duplicated or copied, either in whole or in part, via any means including any electronic form of duplication such as recording or transcription. The contents of this book may not be transmitted, stored in any retrieval system, or copied in any other manner regardless of whether use is public or private without express prior permission of the publisher.

  This book provides information only. The author does not offer any specific advice, including medical advice, nor does the author suggest the reader or any other person engage in any particular course of conduct in any specific situation. This book is not intended to be used as a substitute for any professional advice, medical or of any other variety. The reader accepts sole responsibility for how he or she uses the information contained in this book. Under no circumstances will the publisher or the author be held liable for damages of any kind arising either directly or indirectly from any information contained in this book.

  How do you define the success of a company? It could be by the number of employees or level of employee satisfaction. Perhaps the size of the customer base is a measure of success or the annual sales numbers. How does management play a role in the operational success of the business? How critical is it to have a data scientist to help determine what’s important? Is fiscal responsibility a factor of success? To determine what makes a business successful, it is important to have the necessary data about these various factors.

  If you want to find out how employees contribute to your success, you will need a headcount of all the staff members to determine the value they contribute to business growth. On the other hand, you will need a bank of information about customers and their transactions to understand how they contribute to your success.

  Data is important because you need information about certain aspects of your business to determine the state of that aspect and how it affects overall business operations. For example, if you don’t keep track of how many units you sell per month, there is no way to determine how well your business is doing. There are many other kinds of data that are important in determining business success that will be discussed throughout this book.

  Collecting the data isn’t enough, though. The data needs to be analyzed and applied to be useful. If losing a customer isn’t important to you, or you feel it isn’t critical to your business, then there’s no need to analyze data. However, a continual lack of appreciation for customer numbers can impact the ability of your business to grow because the number of competitors who do focus on customer satisfaction is growing. This is where predictive analytics becomes important, and how you employ this data will distinguish your business from competitors. Predictive analytics can create strategic opportunities for you in the business market, giving you an edge over the competition.

  The first chapter will discuss how data is important in business and how it can increase efficiency in business operations. The subsequent chapters will outline the steps and methods involved in analyzing business data. You will gain a perspective on techniques for predictive analytics and how they can be applied to various fields, from medicine to marketing and operations to finance.

  You will also be presented with ways that big data analysis can be applied to the gaming and retail industries as well as the public sector. Big data analysis can benefit private businesses and public institutions such as hospitals and law enforcement. It can also increase revenue for companies and help create a healthier climate within cities.

  One section will focus on descriptive analysis as the most basic form of data analysis and how it is necessary to all other forms of analysis, such as predictive analysis, because without examining available data you can’t make predictions. Descriptive analysis will provide the basis for predictive and inferential analysis. The fields of data analysis and predictive analytics are vast and complex, with many sub-branches that add to the complexity of understanding business success. One branch, prescriptive analysis, will be covered briefly within the pages of this book.

  The bare necessities of the fields of analytics will be covered as you read on, including how to prevent or encourage certain events or activities. The information contained in this book will help you to manage data and apply predictive analytics to your business to maximize your success.

Chapter 1: Why Data is Important to Your Business

  Have you ever been fascinated with ancient languages, perhaps those now known as “dead” languages? The complexity of these languages can be mesmerizing, and the best part about them is the extent to which ancient peoples went to preserve them. They used very monotonous methods to preserve texts that are anywhere from a few hundred to several thousand years old. Scribes would copy these texts several times to ensure they were preserved, a process that could take years.

  Using ink made from burned wood, water, and oil, they copied the text to papyrus paper. Some used tools to chisel the text into pottery or stone. While these processes were tedious and probably mind-numbing, the people of the time determined this information was so valuable and worth preserving that certain members of a society dedicated their entire lives to copying the information.

  What is the commonality between dead languages and business analytics? The answer is data. Data is everywhere and flows through every channel of our lives. Think about social media platforms and how they help shape the marketing landscape for companies. Social media can provide companies with analytics that help them measure how successful, or unsuccessful, company content may be. Many platforms provide this data for free, yet there are other platforms that charge high prices to provide a company with high-quality data about what does or doesn’t work on their website. When it comes to business, product and market data can provide an edge over the competition. That makes this data worth its weight in gold. Important data can include weather, trends, customer tendencies, historical events, outliers, products, and anything else relevant to an aspect of business.

  What is different about today is how data can be stored. It no longer has to be hand-copied to papyrus or chiseled into stone. It is an automatic process that requires very little human involvement and can be done on a massive scale. Connected sensors are today’s modern scribes. This is the Internet of Things. Most of today’s devices are connected, constantly collecting, recording, and transmitting usage and performance data. Sensors collect environmental data. Cities are connected to record traffic and infrastructure data to ensure they are operating efficiently. Delivery vehicles are connected to monitor their location and functionality, and if mechanical problems arise they can usually be addressed early. Buildings and homes are connected to monitor energy usage and costs. Manufacturing facilities are connected in ways that allow automatic communication of critical data sets. This is the present, and the future, state of “things.”

  The fact that data is important isn’t a new concept, but the way in which we collect the data is. We no longer need scribes; they have been replaced with microprocessors. The ways to collect data, as well as the types of data to be collected, are ever-changing. To be ahead of the game when it comes to business, you’ve got to be up-to-date about how you collect and use data. The product or service provided can establish a company in the market, but data will play the critical role in sustaining the success of the business.

  The technology-driven world in which we live can make or break a business. There are large companies that have disappeared in a short amount of time because they failed to monitor their customer base or progress. In contrast, there are smaller startup businesses that have flourished because of the importance they’ve placed on customer expectations and their numbers.

Data Sources

  Sources of data for a business can range from customer feedback to sales figures to product or service demands. Here are a few sources of data a business may utilize:

  Social media: LinkedIn, Twitter, and Facebook can provide insight into the kind of customer traffic your web page receives. These platforms also provide cost-effective ways to conduct surveys about customer satisfaction with products or services and customer preferences.

  Online Engagement Reporting: Using tools such as Google Analytics or Crazy Egg can provide you with data about how customers interact with your website.

  Transactional Data: This kind of data will include information collected from sales reports, ledgers, and web payment transactions. With a customer relationship management system, you will also be able to collect data about how customers spend their money on your products.

How Data Can Improve Your Business

  By now you’ve realized that proper and efficient use of data can improve your business in many ways. Here are just a few examples of data playing an important role in business success.

  Improving Marketing Strategies: Based on the types of data collected, it can be easier to find attractive and innovative marketing strategies. If a company knows how customers are reacting to current marketing techniques, it can make changes that fall in line with the trends and expectations of its customers.

  Identifying Pain Points: If a business is driven by predetermined processes or patterns, data can help identify points of deviation. Small deviations from the norm can be the reason behind increased customer complaints, decreased sales, or a decrease in productivity. By collecting and analyzing data regularly, you will be able to catch a mishap early enough to prevent irreversible damage.
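  One simple way to spot a deviation from the norm is to compare recent figures against a baseline period. The sketch below is only an illustration: the function name, the weekly complaint counts, and the threshold of two standard deviations are all assumptions, not something prescribed in this book.

```python
from statistics import mean, stdev

def flag_deviations(baseline, recent, threshold=2.0):
    """Return (index, value, z_score) for recent values far from the baseline norm."""
    mu = mean(baseline)
    sigma = stdev(baseline)  # sample standard deviation of the baseline period
    flagged = []
    for i, value in enumerate(recent):
        if sigma == 0:
            # A perfectly flat baseline: any change at all counts as a deviation.
            if value != mu:
                flagged.append((i, value, None))
            continue
        z = (value - mu) / sigma
        if abs(z) > threshold:
            flagged.append((i, value, round(z, 2)))
    return flagged

# Hypothetical weekly complaint counts: a stable baseline, then four recent weeks.
baseline_complaints = [12, 15, 11, 14, 13, 12, 14, 13]
recent_complaints = [13, 14, 22, 12]

flagged_weeks = flag_deviations(baseline_complaints, recent_complaints)
print(flagged_weeks)  # only the week with 22 complaints is flagged
```

  A lower threshold catches problems earlier but raises more false alarms, so the right value depends on how costly a missed deviation is for the business.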

  Detecting Fraud: In the absence of proper data management, fraud can run rampant and seriously affect business success. With sales numbers in hand, it is easier to detect when and where fraud may be occurring. For instance, if you have a purchase invoice for 100 units, but your sales reports show that only 90 units have been sold, ten units should still be in inventory. If they are missing, you will know where to look. Many companies are silent victims of fraud because they fail to use their data to realize that fraud is even occurring.
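  The reconciliation in this example can be sketched in a few lines. The sketch assumes that, besides the purchase and sales figures, a physical stock count is available; the function name, the product names, and the stock counts are hypothetical.

```python
def find_missing_units(stock_records):
    """Return {product: missing_units} for products whose counts do not reconcile.

    Each record is (purchased, sold, on_hand), where on_hand is a physical count.
    """
    discrepancies = {}
    for product, (purchased, sold, on_hand) in stock_records.items():
        expected_on_hand = purchased - sold
        missing = expected_on_hand - on_hand
        if missing != 0:
            discrepancies[product] = missing
    return discrepancies

stock_records = {
    "widget": (100, 90, 0),  # purchase and sales figures from the text; count of 0 is assumed
    "gadget": (50, 45, 5),   # hypothetical product whose numbers reconcile
}

discrepancies = find_missing_units(stock_records)
print(discrepancies)  # {'widget': 10}
```

  A discrepancy is only a prompt to investigate: miscounts, returns, and damaged goods can produce the same gap as fraud.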


  Identifying Data Breaches: The ever-increasing availability of data streams creates another problem when it comes to fraudulent practices. Although often subtle, the impacts of data breaches can be far-reaching, negatively affecting accounting, payroll, retail, and other company systems. Data hackers are becoming more devious in their attacks on data systems. Data analytics will allow a company to see a possible data breach and prevent further data compromises which might completely cripple the business. Tools for data analytics can help a company to develop and implement data tests that will detect early signs of fraudulent activity. Sometimes standard fraud testing is not possible in certain circumstances, and tailored tests may be necessary for detecting fraud in specific systems.

  In the past, it was common for companies to wait to investigate possible fraudulent activity and implement breach safeguards until the financial impacts became too large to ignore. With the amount of data available today this is no longer a wise, or necessary, approach. The speed at which data is dispersed throughout the world means a breach could spread from one point to the next, crippling a company from the inside out on a worldwide scale. Data analytics testing can prevent data destruction by revealing certain characteristics or parameters that may indicate fraud has entered the system. Regular testing can give companies the insight they need to protect the data they are entrusted to keep secure.


  Improving Customer Experience: Data can also be gathered from customers in the form of feedback about certain business aspects. This information will allow a company to alter business practices, services, or products to better satisfy the customer. By maintaining a bank of customer feedback and continually asking for feedback, you are better able to customize your product or service as the customers’ needs change. Some companies send customized emails to their customers, something that depends on effective data management.

  Making Decisions: Many important decisions about a business require data about market trends, customer bases, and prices offered by competitors for the same or similar products or services. If data does not influence the decision-making process, it could cost the company immensely. For example, launching a new product in the market without considering the price of a competitor’s product might cause your product to be overpriced, creating problems when trying to increase sales. Data should not only apply to decisions about products or services, but also to other areas of business management. Certain datasets will provide information on how many employees it will take to foster the efficient functioning of a department. This will allow you to determine where you are understaffed or overstaffed.


  Hiring Process: Using data to select the right personnel seems to be neglected by many corporations. For effective business operation, it is crucial to put the right candidate in the right position. Using data to hire the most qualified person for a position helps the business remain successful. Large companies with even larger budgets use big data to seek out and choose skilled people for their open positions. Smaller companies would benefit from using big data from the beginning to staff appropriately and further the successes of a startup or small business. Using gathered data during hiring has proven to be a lucrative practice for organizations of various sizes. Data scientists can extract and interpret the specific data needed from the human resources department for hiring the right person.


  Job Previews: An accurate description of an open position leaves a job seeker better prepared for what to expect should they be hired. Pre-planning the hiring process using data about the open position is critical in appealing to the right candidate. Trial and error is no doubt a part of learning a new job, but it slows down the learning process. It will take the new employee longer to catch up to acceptable business standards, which also slows their ability to become a valuable company resource. By incorporating job preview data into the hiring process, the learning curve is reduced, and the employee will become efficient faster.

  Innovative Methods for Gathering Data for Hiring: Using new methods of data collection in the hiring process can prove to be beneficial in hiring the right professional. Social sites that collect data, such as Google+, Twitter, Facebook, and LinkedIn can give you additional resources for recruiting potential candidates. A company can search these sites for relevant data from posts made by the users to connect to qualified applicants. Keywords are the driving force for online searches. Using the most visible keywords in a job description will increase the number of views your job posting will receive.

  Traditionally, software and computers have been used to determine whether an employee would be better suited for another position within the company or whether to terminate employment. However, this type of resource can also help to find the right candidate for a job from outside the company. Basic standards such as IQ or skills tests can be limiting, but focusing on personality traits may open the field of potential candidates. Identifying personality characteristics helps to filter out candidates with traits that will not be beneficial to the company. If a person is argumentative or prefers to be isolated, they probably wouldn’t thrive in a team-oriented environment. Eliminating such mismatches early saves resources. This type of data collection finds candidates not only with the right skills but also with the right personalities to align with current company culture. Being sociable and engaging will help the new employee as they learn their new role. It’s important that new candidates fit well with seasoned employees to reinforce working relationships. The health of the working environment greatly influences how productive the company is overall.

  Using Social Media to Recruit: Social media platforms are chock full of data sources for finding highly qualified individuals to fill positions within a company. On Twitter, recruiters can follow people who tweet about a certain industry. A company can then find and recruit ideal candidates based on their interest and knowledge of an industry or a specific position within that industry. If someone is constantly tweeting about new ideas or innovations in an industry, they could make a valuable contribution to your company. Facebook is also valuable for this kind of public recruitment. It’s a cost-effective way to collect social networking data for companies who are seeking to expand their employee base or fill a position. By “liking” and following certain groups or individuals a company can establish an online presence. Then when the company posts a job ad, it is likely to be seen by many people. It is also possible to promote ads for a small fee on Facebook. This means your ad will be placed more often in more places, increasing your reach among potential candidates. The effect compounds: furthering your reach with highly effective job posts increases the number of skilled job seekers who will see your ad, resulting in higher engagement from people who will be a great fit for your company.

  Niche Social Groups: By joining certain groups on social media platforms, recruiters will have access to a pool of candidates who most likely already possess certain specific skills. For instance, if you need to hire a human resources manager, joining a group composed of human resource professionals can potentially connect you with your next hire. Within this group, you can post engaging and descriptive notices of your company’s job openings. Even if your potential candidate isn’t in the group, other members will most likely have referrals. Engaging in these kinds of groups is a very cost-effective method to advertise open positions.

  Gamification: This is an underused data tool but can be effective if the hiring process requires multiple steps or processes. Rewarding candidates with virtual badges or other goods motivates them to put forth effort during the selection process. This allows their relevant skills in performing the job to be highlighted and makes applying for a job, typically a rather boring process, a fun experience.

  These are only a few of the ways in which data can help companies and human resource departments streamline the hiring process and save resources. As you can see, data can be very important for effective business functioning, and you’ve also seen the multitude of uses it has for just the hiring process. This is why proper data utilization is critical in business decision making for all other aspects of your business.

Chapter 2: Big Data

  Across the globe, data and technology are interwoven into society and the things we do. Like other production factors, such as human capital and hard assets, data is now something without which many parts of modern economic activity couldn’t happen. Big data is, in short, the large amounts of data that are gathered in order to be analyzed. From this data, we can find patterns that will better inform future decisions. This data and what can be learned from it will become how companies compete and grow in the near future. Productivity will be greatly improved as well. Significant value will be created in the world economy through increases in the quality of services and products and reductions in waste.

  While this data has been around for some time, it has only really excited people who were already interested in data. As times have changed, more and more people are excited by the amount of data that we’re generating, mining, and storing. This data is now one of the most important economic factors for many different people.

  In the present, we can look back at trends in IT innovation and investment. We can also see the impact on productivity and competitiveness that has resulted from those trends, and how big data can make large changes in our modern lives. Like previous IT-enabled innovations, big data has the same requirements to move productivity further. For example, innovations in technology need to be closely followed by complementary management innovations. Big data technology suppliers and analytic capabilities are now so advanced that they will have just as much of an impact on productivity as suppliers of other technologies. Businesses around the world will need to start taking big data seriously because of the potential it has to create real value. There are already retail companies putting big data to work because of its potential to increase operating margins.

Big Data – A New Advantage

  Since it has come to light, big data is becoming an incredibly important way that companies are outperforming each other. Even new entrants into the market are going to be able to leverage strategies that data has found in order to compete, innovate, and attain real value. This will be the way that all the different companies, new and established, will compete on the same level.

  There are already examples of this competition everywhere. In the healthcare industry, data pioneers are looking at the outcomes of some pharmaceuticals that are widely prescribed. From the analysis of the results, they learned that there were risks and benefits that had not been seen in the limited trials that companies had run with the pharmaceuticals.

  There are other industries that are using the sensors in their products to gain data that they can use. This can be seen in children’s toys, large-scale industrial goods, and so many others. The data that they gather show how the products are used in real life. With this data, companies can make improvements on the products based on how people are really using them. This will make these products so much better for the future users.

  Big data is going to help create new growth opportunities and create new companies that specialize in aggregating and analyzing data. There’s a good proportion of companies that will sit right in the middle of flowing information. They’ll be receiving information and data that comes from many sources just to analyze it. Managers and company leaders that are thinking ahead need to start creating and finding new ways to make their companies capable of dealing with big data. People that do so will need to be especially aggressive about it.

  It’s important to recognize not only the amount of big data but also the high frequency and real-time nature of the data. There’s the idea of “nowcasting” around right now. This is the process of estimating metrics, such as consumer confidence, right away. Knowing that information so soon used to be impossible; it could only be known after the fact. “Nowcasting” is being used more and more, adding a lot of potential to the ways that companies predict things.

  The high frequency of the data will allow users to try to test theories and analyze the results in ways that they were incapable of before. There have been studies of major industries that have found ways that big data can be used:

  1. Big data can unlock serious value for industries because it makes information transparent. There is a lot of data that isn’t being recorded and stored, and a lot of information that cannot be found. Some people spend a quarter of their time looking for extremely specific data and then storing it, sometimes in a digital space. There’s a lot of inefficiency in this work right now.

  2. As more and more companies store data from transactions online, they are able to collect accurate and detailed information about everything from inventory to the number of sick days that people are taking. Some companies are already using this data collection and analysis to run experiments and see how they can make better-informed management decisions.

  3. Big data allows companies to put their customers into smaller groups. This will allow them to tailor the services and products that they are offering.

  4. More sophisticated analytics also allow for better decision making. They reduce risk and bring to light information and insights that might otherwise never have been seen.

  5. Big data can be used to create a brand new generation of services and products that wouldn’t have been otherwise possible. Some manufacturers are already using the data that has been collected from their sensors to figure out more efficient and useful after-sales services.
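  Putting customers into smaller groups can start with something as simple as spending tiers. The following is a minimal sketch, not a method described in this book: the function name, the tier names and thresholds, and the customer figures are all hypothetical.

```python
from collections import defaultdict

def segment_customers(annual_spend, tiers):
    """Place each customer in the first tier whose minimum spend they meet.

    tiers is a list of (name, minimum_spend), ordered from highest minimum to lowest.
    """
    segments = defaultdict(list)
    for customer, spend in annual_spend.items():
        for name, minimum in tiers:
            if spend >= minimum:
                segments[name].append(customer)
                break
    return dict(segments)

# Hypothetical tiers and annual spend per customer, in dollars.
tiers = [("high", 1000), ("medium", 250), ("low", 0)]
annual_spend = {"Ana": 1800, "Ben": 320, "Cho": 40, "Dev": 1000}

segments = segment_customers(annual_spend, tiers)
print(segments)
```

  Each group can then receive its own offers or services. Real segmentation usually combines several attributes rather than a single spend figure.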

Big Data Creates Value

  Using the US healthcare system as an example, we can look at ways that big data can create real value. If the healthcare system used big data to improve the efficiency and quality of its services, it could create $300 billion of value every year. About 70% of that value would come from a cut in expenditures, a cut that amounts to only 8% of current expenditures.
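  The arithmetic behind these figures can be worked through directly. The sketch below uses only the numbers quoted above; reading the 8% as a share of total current US healthcare expenditure is an assumption, and the variable names are illustrative.

```python
# Figures quoted in the text, in billions of US dollars and percent.
total_value_bn = 300        # annual value attributed to big data
share_from_cuts_pct = 70    # share of that value coming from lower spending
cut_pct_of_spending = 8     # the cut as a percentage of current spending

# Value that would come from reduced expenditure.
savings_from_cuts_bn = total_value_bn * share_from_cuts_pct / 100

# Assumption: the 8% refers to total current healthcare expenditure, so
# dividing the savings by 8% gives the spending level these figures imply.
implied_spending_bn = savings_from_cuts_bn * 100 / cut_pct_of_spending

print(f"Savings from cuts: ${savings_from_cuts_bn:.0f} billion")
print(f"Implied current spending: ${implied_spending_bn:.0f} billion")
```

  On that reading, roughly $210 billion of the value is reduced spending, which implies current expenditure of a little over $2.6 trillion a year.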

  If you look at European developed economies instead, you can see a different way that big data creates value. Government administrations could use big data in the right way to improve operational efficiency. That would result in about €100 billion worth of value every year. This is just one area. If governments used advanced analytics to boost tax revenue collection, they would create even more value just from cutting down on errors and fraud in the system.

  Even though we’ve been looking at companies and governments so far, they aren’t the only ones that are going to benefit from using big data. Consumers will benefit as well. Using location data in specific services, people could capture a consumer surplus of up to $600 billion. This can be seen especially in systems and apps that use real-time traffic information for smart routing. These systems are some of the most used on the market, and they use location data. More and more people are using smartphones, and those that have smartphones are taking advantage of the free map apps that are available. With an increase in demand, it’s likely that the number of apps that use smart routing is going to increase.

  By the year 2020, more than 70% of mobile phones are going to have GPS capabilities built into them. In 2010, this number was only 20%. Because of the increase in GPS-capable devices, we can expect that smart routing will have the potential to create savings of around $500 billion in fuel and time that people would otherwise spend on the road. That is equivalent to around 20 billion driving hours, or about 15 hours a year saved for each driver, along with $150 billion in fuel.
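  These figures can be cross-checked against each other. The sketch below reads them as follows, which is an interpretation rather than something the text spells out: $500 billion is the total saving, $150 billion of it is fuel, 20 billion hours are saved overall, and each driver saves 15 hours a year. The variable names are illustrative.

```python
# Figures quoted in the text.
total_savings_bn = 500   # fuel and time savings, billions of dollars
fuel_savings_bn = 150    # fuel portion, billions of dollars
hours_saved_bn = 20      # driving hours saved, billions
hours_per_driver = 15    # hours saved per driver per year

# Whatever is not fuel is the value placed on drivers' time.
time_savings_bn = total_savings_bn - fuel_savings_bn

# Dollar value implied for one saved hour of driving.
implied_value_per_hour = time_savings_bn / hours_saved_bn

# Number of drivers implied by the total and per-driver hours, in billions.
implied_drivers_bn = hours_saved_bn / hours_per_driver

print(f"Time savings: ${time_savings_bn} billion")
print(f"Implied value of an hour: ${implied_value_per_hour:.2f}")
print(f"Implied number of drivers: {implied_drivers_bn:.2f} billion")
```

  On this reading, the time saved is valued at $350 billion, or $17.50 per hour, and the totals imply about 1.3 billion drivers.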

  While we have seen specific pools of data in the examples listed above, big data has huge potential in combined pools of data. The US healthcare system is a great way to look at the potential future of big data. The healthcare system has four distinct data pools: clinical, medical, pharmaceutical products; research and development; activity and cost; and patient data. Each data pool is captured and managed by a different portion of the healthcare system. If big data were used to its full potential, the annual productivity of the healthcare system could be improved by around 0.7%. But it would take the combination of data from all these different sources to create that improved efficiency. The unfortunate part is that some of the data would need to come from places that do not share their data at scale right now. Data like clinical claims and patient records would need to somehow be integrated into the system. The patient, in turn, would have better access to more of their healthcare information and would be able to compare physicians, treatments, and drugs. This would allow patients to pick out their medications and treatments based on the statistics that are available to them. However, in order to get these kinds of benefits, patients would have to accept trading away some of their privacy.

  Data security and privacy are two of the biggest roadblocks in the way of this. We must find a way around them if we ever want to see the true benefits of using big data. The most prevalent challenge right now is the shortage of people who are skilled in analyzing big data properly. By 2018, the US will be facing a shortage of 140,000 to 190,000 people with training in deep analysis. It will also be facing a shortage of roughly 1.5 million people who have the quantitative skills and managerial experience needed to interpret the analyses correctly and base their decisions on the data.

  There are many technological issues in the way as well that will need to be resolved before big data can be used effectively by more companies. There are so many incompatible formats and standards that are floating around as well as legacy systems that are stopping people from integrating data and from using sophisticated analytical tools to really look at the data sets.

  Ultimately, technology will be needed across the whole stack, from computing and storage through to visualization and analytical software. All this technology will have to be available as a stack so that it is more effective. In order to take true advantage of big data, there has to be better access to data, and that means all of it. Many organizations will need access to data stores maintained by third parties so they can add that data to their own. These third parties could be customers or business partners.

  This need for data will mean that companies that really need data will have to come up with interesting proposals for suppliers, consumers, and possibly even competitors in order to get their hands on that data. As long as big data is understood by governments and companies, its potential to deliver better productivity will give companies an incentive to take the actions needed to get over the barriers that are standing in the way. In getting around these barriers, companies will find new ways to be competitive in their industries and against individual companies. There will be greater productivity and efficiency all around, which will result in better services, even when money is tight.

Big Data Brings Value to Businesses Worldwide

  Big data has been bringing value to businesses internationally for a while, and the amount of value that it will continue to bring is almost immeasurable. Big data has already impacted the world in several ways. It has created a brand new career field in data science. Data interpretation has changed drastically because of big data. The healthcare industry has improved quickly and considerably since it made predictive analytics part of its business. Laser scanning technology has changed the way that law enforcement officers reconstruct crime scenes. Predictive analytics is changing how caregivers and patients interact. Data models are now being built to look at business problems and help find solutions. Predictive analytics has also had an impact on the way that the real estate industry conducts business.

Big Data is a Big Deal

  Besides the fact that data is bringing so much value to so many different companies and industries, it is also opening up a whole new path of management principles that companies can use. Early on in professional management, corporate leaders discovered that one of the key factors for competitive success was a minimum scale of efficiency.

  By comparison, one of the modern factors for competitive success is going to be capturing higher-quality data and using that data more efficiently at scale. For company executives who doubt how much big data is going to help them, the following five questions will help them figure out how big data can benefit them and their organizations.

  What can we expect to happen in a world that is “transparent,” meaning that data is readily available? Over time, information is becoming more accessible in all sectors. As data comes out of the shadows, organizations that have relied heavily on proprietary data as a competitive asset are likely to feel threatened. This can be seen especially in the real-estate industry. Real-estate agents have typically provided a gateway to transaction data and a knowledge of bids and buyer behaviors that hasn’t been available elsewhere. Gaining access to all of that has required quite a bit of money and even more effort. In recent years, online specialists have been bypassing the agents to create a parallel resource for real-estate data. This data is obtained directly from buyers and sellers and is available to those same groups. Pricing and cost data has also become much more available in several industries. Some companies are even using readily available satellite imagery, together with processing and analysis, to look at the physical facilities of their competitors. That information can provide insights into the expansion plans or physical constraints that their competitors are facing.

  But with all that data comes a challenge: the data is often kept within departments. Engineering, R&D, service operations, and manufacturing each have their own information, and it is stored in different ways depending on the department. Because all this information is kept in little pockets, the data cannot be used and analyzed in a timely manner. This can cause all sorts of problems for companies. For example, financial institutions often don’t share data across departments like money management, financial markets, or lending. This segmentation means that the customers have been compartmentalized. The institution doesn’t see the customer across all of these different areas, but only as separate images.

  Some companies in the manufacturing business are trying to stop this separation of data. They’re integrating data from their different systems and asking their smaller units to collaborate in order to help their data flow. They’re even looking for data and information outside of their groups to see if there’s anything else out there that might help them develop better products and services.
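
  The idea of joining departmental “pockets” of data into a single view of each customer can be illustrated with a small sketch. The Python code below is only an illustration under stated assumptions: the department names follow the financial example above, but the record fields, the customer IDs, and the build_customer_view helper are all hypothetical and do not describe any real system.

```python
from collections import defaultdict

# Hypothetical records, each held separately by one department.
lending = [
    {"customer_id": 1, "loan_balance": 12000.0},
    {"customer_id": 2, "loan_balance": 0.0},
]
money_management = [{"customer_id": 1, "portfolio_value": 50000.0}]
financial_markets = [{"customer_id": 2, "trades_last_year": 14}]


def build_customer_view(**departments):
    """Combine per-department records into one profile per customer.

    Each field is prefixed with its department name so that the origin
    of every value stays visible after the merge.
    """
    view = defaultdict(dict)
    for dept_name, records in departments.items():
        for record in records:
            customer_id = record["customer_id"]
            for field, value in record.items():
                if field != "customer_id":
                    view[customer_id][f"{dept_name}.{field}"] = value
    return dict(view)


customer_view = build_customer_view(
    lending=lending,
    money_management=money_management,
    financial_markets=financial_markets,
)

for customer_id, profile in sorted(customer_view.items()):
    print(customer_id, profile)
```

  In practice the hard part is usually agreeing on a shared customer identifier across systems; the sketch simply assumes one already exists.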

  The automotive industry has suppliers all around the world making components that are then used in the cars that they’re making. Integrating data across all of these would allow the companies and their supply chain partners to work together at the design stage instead of later on.

  Can testing decisions change the way that companies compete? Gaining the ability to really test decisions would cut down on costs and improve a company’s competitiveness. These automotive companies would be able to test and experiment with the different components. By going through this process, they’ll be able to gain results and data that will guide their decisions about operational changes and investments. Experimentation will allow companies and their managers to see the difference between correlation and causation while also boosting financial performance and producing more effective products.

  The experiments that companies use to collect data can take several forms. Some online companies are always testing and running experiments. In particular, they set aside a portion of their web page views to test the factors that drive sales and higher usage. Companies with physical products also use tests to help make decisions, but big data can take these experiments even further. McDonald’s put devices in some of their stores that track customer interaction, traffic, and ordering patterns. The data gained through these devices can help them make decisions about their menus, the design of their restaurants, and many other things. Companies that can’t use controlled experiments may turn to natural experiments to figure out which variables are in play. One government organization collected data on different groups of employees who were working in various places but doing similar jobs. This data was made available, and the workers who were lagging were pushed to improve their performance.
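
  The online page-view tests described above are often analyzed by comparing the conversion rates of two page versions. The Python sketch below uses a standard two-proportion z-test for that comparison. It is a minimal illustration, not the method of any particular company: the function name and the visitor and purchase counts are hypothetical.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Compare conversion rates of page A and page B.

    Returns the z statistic and the two-sided p-value under the
    null hypothesis that both pages convert at the same rate.
    """
    rate_a = conv_a / n_a
    rate_b = conv_b / n_b
    # Pooled rate: the best estimate if there is truly no difference.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_b - rate_a) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical experiment: 10,000 views of each page version.
z, p_value = two_proportion_z_test(conv_a=200, n_a=10_000,
                                   conv_b=260, n_b=10_000)
print(f"z = {z:.2f}, p = {p_value:.4f}")
```

  A small p-value suggests that the difference in sales between the two versions is unlikely to be due to chance alone, which is what lets a controlled experiment separate causation from mere correlation.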

  What effect will big data have on business if it is used for real-time customization? Companies that deal with the public have been dividing customers into segments and targeting them for quite a while now. Big data is taking that further than ever by making real-time personalization possible. Retailers may become able to track individual customers and their behaviors by monitoring their internet click streams. Knowing this, they will be able to make small changes on websites that help move the customer toward a purchase. They will be able to see when a customer is deciding whether to buy something, and from there they will be able to “nudge” the customer towards buying. They could offer bundled products, benefits, and reward programs. This is real-time targeting.

  Real-time targeting also brings in data from loyalty groups. This can help increase higher-end purchases made by the most valuable customers. The retail industry is likely to be the most driven by data. Because retailers are keeping track of internet purchases, conversations taking place on social media, and location data pulled from smartphones, they have tons of data at their fingertips. Besides the data, they now have better analytical tools that can divide customers into smaller segments for even better targeting.

  Will big data just help management or will it eventually replace it? Big data opens up new ways for algorithms and analysis, mediated by machines, to be used. Manufacturers are using algorithms to analyze the data that’s being collected from sensors on the production line. This data and analysis help the manufacturers regulate their processes, reduce waste, increase output, and even cut down on potentially expensive and dangerous human intervention. In the oil industry, for example, data is collected from production systems all the time. The data is fed into computers, where it is turned into results that are given to the operation centers, and the oil flows are adjusted to boost production and reduce the amount of downtime for the whole process. One of the largest oil companies has managed to increase oil production by five percent while also reducing staff and operating costs by ten to twenty-five percent.

  Products ranging from photocopiers to jet engines are now tracking data that helps people understand their usage. Manufacturers are able to analyze the data and fix problems, whether that means simply fixing glitches in software or sending out a repair representative. The data is even being used to predict when products will fail and to schedule repairs before they do. It’s obvious that big data can create huge improvements in performance and help make risk management easier. The data could even be used to find errors that would otherwise go unseen. Because of the increasing demand for analytics software, communication devices, and sensors, prices for these things are falling fast. More and more companies will be able to find the time and money to get involved in collecting data.

  Will big data be used for the creation of brand new business models? Big data has already been responsible for the creation of new industries surrounding the analysis and use of the information it holds. But big data is also producing new categories of companies whose business models are driven entirely by data. Many of these companies are intermediaries in a value chain. They are generating valuable “exhaust” data from transactions.

  A major transport company was keeping data about their own business, but they were also collecting vast amounts of data about what products were being shipped where. They took the opportunity and began selling the data that they were collecting to supplement economic and business forecasts. Another global company was learning a lot by looking at their own data. After doing the analysis for themselves, they eventually decided to branch out and create a business that analyzes data for other organizations. The business aggregates supply chain and shop floor data for manufacturers. It also sells the software tools that a company needs to improve their own performance. This side business is outperforming the company’s manufacturing business, and that is because of the value of big data.

  Big data is creating a whole new support model for the markets that already exist today. Companies have all sorts of new data needs, and they need qualified people to support that data. As a result, if you own a business, you may need an outside firm to analyze and interpret the data you’re producing.

  These specialized firms can take large amounts of data in various forms and break it down for you. These firms exist because there is a need for support for larger companies in many different industries. The employees they hire are trained to locate and capture data in systems and processes.

  They allow larger companies to focus on their own work by doing the data aggregation for them. They assimilate, analyze, and interpret trends in the data, and then they report anything notable to the company.

  Alternatively, a company could create a data support department within their own company. This would be more cost-effective than hiring an entire outside firm, but it does require very specific and specialized skills within the company. The department would focus on taking the data flow and analyzing, interpreting, and finding new ways to use the data. These new applications and the new data department would monitor existing data for fraud, triggers, or issues.

  Big data has created a whole new field of study in colleges and other institutions of higher learning. People are training in the latest methods of gathering, analyzing, and interpreting big data. This path will lead people to critical positions in the newly trending data support companies.

  Big data has created all sorts of changes, and it will continue to make even more. In education, big data will influence and change the way that teachers are hired. The data will be used to examine recruiting processes, and predictive analytics will identify the traits that the most effective teachers need in order to maximize the learning experience.

Chapter 3: Development of Big Data

  While most of our data collection and analysis has only happened in the last couple of years, the term “big data” has been in our vocabulary since 2005. Analysis of data has been around for as long as we have been able to count. Accounting in ancient Mesopotamia tracked the increases and decreases of herds and crops, and even then we were trying to find patterns in that data. In the 17th century, John Graunt published a book, “Natural and Political Observations Made upon the Bills of Mortality,” that was the first large-scale example of data analysis. It provided insight into the causes of death at the time, and the book was meant to help stop the bubonic plague.

  Graunt’s book and the way he approached the data were a revolution. Statistics as we know it was invented at that time, even though we couldn’t use it fully before the invention of computers. Data analysis came into its own in the 20th century, when the information age really began. There were many examples of early data analysis and collection even in the beginning. In 1887, Herman Hollerith invented a machine that could analyze data; it was used to organize census data. Roosevelt’s administration used big data for the first time to keep track of the social security contributions of millions of Americans.

  The first real data processing machine came during World War II. British intelligence wanted to decipher Nazi codes. The machine, Colossus, processed 5,000 characters per second to find the patterns in coded messages. The task of deciphering went from weeks to just hours. This was a huge victory for technology and a massive improvement for statistical analysis.

  In 1965, the electronic storage of information started, as another idea of the American government. The system was put in place to store tax return claims and fingerprints. However, the project went unfinished because of the worries of the American people. They thought of that as something similar to “Big Brother,” but the electronic storage of information was already starting. It would be impossible to stop the flow of information.

  The invention of the World Wide Web was really what sparked the true revolution in data storage and analysis. Tim Berners-Lee couldn’t have known what he had really started in the world. However, it was in the 90s that his system was turned into the monster that it is today. In 1995, a supercomputer was built that was capable of doing in a single second what a human with a calculator could do in 30,000 years. This was the next great stride in data analysis.