Plugged In: Cybersecurity in the Modern Age

  

GATORBYTES

PLUGGED IN

CYBERSECURITY IN THE MODERN AGE

  

Jon Silman


1

WHAT IS CYBERSECURITY?

  Imagine all of your information—medical, financial, and personal—available for anyone in the world to see. That medical procedure? The password to your bank account? All of it, readily available for anyone with a little bit of Internet know-how. Does that sound scary? Unlikely? Unfortunately, it’s not only real; it’s more common than you might think.

  Files used to be stored in warehouses, in physical folders and lockers, behind doors and guards. Physical keys were needed to access rooms. Physical presence was required to be inside a house. Now, you can check in with a camera connected to a network. You can download addresses, social security numbers, and credit cards with a few clicks.

  More and more personal information is now being stored on the cloud. Type in a name, at a doctor’s office, for example, and information comes right up. Need information on a particular person? Type in a driver’s license number. Want to check your bank account balance? Insert username and password. So yes, you have information at your fingertips, but at what cost?

  The ease of access belies the real trouble, the one most people would rather not think about, and that’s security. Ease of access comes at the cost of security. So does convenience. Security protocols, though annoying, at least secure information. The truth is there are many ways to get at personal information—not just through something as simple as a stolen password but also through backdoors and hacks of the software and even the hardware itself.

  When was the last time you updated all of your apps on your smartphone? Or your operating system? Have you ever thought about the reason why updates are necessary? It’s not just because things need to be up to date. Mostly it’s because adversaries (i.e., bad guys) have found ways to infiltrate the software. That’s why it’s so important.

  Cybersecurity is not a buzzword. It’s not something to be ignored. It will only become more and more important as more and more of our lives become interconnected and networked. Almost everything we do, every single day, will have some sort of online component. It’s important to know the impact of cybersecurity and the importance of staying on top of not only knowledge but the latest practices as well.

  So why is information so hard to secure? Why is cybersecurity such an important topic? According to the Department of Homeland Security, “there is increased risk for wide scale or high-consequence events that could cause harm or disrupt services upon which our economy and the daily lives of millions of Americans depend. In light of the risk and potential consequences of cyber events, strengthening the security and resilience of cyberspace has become an important homeland security mission.”

  This is where the University of Florida comes in. Through its preeminence initiative, which aims to make it a top-ten university, UF has made the field of cybersecurity research one of its main priorities. The university has hired some of the best cybersecurity minds in the world to attack the problem from all sides and avenues. It has created the Florida Institute for Cybersecurity Research to explore the best ways to stay safe online.

  They’re tackling the hard questions and working to stay ahead of the trends. They’re figuring out how to keep smartphones safe, how to track data, how to find hardware Trojans on chips—the list goes on.

  In a recent interview, UF College of Engineering Dean Cammy Abernathy spoke about the importance of the research, and what it means. “Cybersecurity is one of the most important issues facing our country,” she said. “It’s our responsibility to work on issues of importance, and we’ve made it one of our top priorities.”

  Abernathy said the topic comes up more and more in her daily conversations, especially as more computer systems become embedded in our everyday lives. It’s an issue from banking to health care to the automotive industry, and it’s only going to become more prevalent, especially as more and more devices become connected through the Internet.

  “Confidence in these systems is critical to sustain our way of life,” she said. “If we lose confidence in these things, it’s going to have a tremendous negative impact.” In addition, she said, it’s also a national security issue, as there are people out there who want to exploit our systems and degrade not only our way of life but our society as a whole. As for the birth of the Florida Institute for Cybersecurity Research, she said there were previously members of the college working on this subject, but not to the level that there is today. “We found some of the best minds of cybersecurity in the country,” she said, adding that they picked from the brightest young minds they could get and also some of the leading veterans of the field who bring valuable institutional knowledge to the table.

  “We brought them all together, colocated them in one space, formed the institute, and already we’re off to a great start,” she said. “We want to build on that.” She relishes the fact that there are such varied kinds of research going on at the institute. For example, she said the work being done on hardware assurance has gotten national attention. She sees the institute and the cybersecurity initiative as a way to breed a new type of engineer, one who can lead in the twenty-first century, not only with technical know-how, but with the communication and social skills necessary to teach and inform a whole new generation of learners.

  Abernathy said this is only the beginning, as she plans on hiring more people, especially someone who specializes in biomedical research. She knows that cybersecurity is a growing field, and she wants to be able to meet the need with the best around.

  “Let’s give them the ability to not only solve the problems of today,” she said, “but also prepare them to tackle the problems fifteen to twenty years from now.” Cybersecurity is a massive topic, and it touches upon all aspects of technology. This breadth is why the Florida Institute for Cybersecurity Research was established. It’s meant to be the foremost and premier multidisciplinary research institute in the country. It’s focused on advancing the field of cybersecurity to stay ahead of the hackers and cybercriminals and foreign agencies who try to compromise and steal not only our data but our cutting-edge technology. It’s a partnership between all types of industry and government. It’s also a chance for undergraduate and graduate students to learn from elite cybersecurity experts.

  So what challenges does it bring? Who is the team at the Florida Institute for Cybersecurity Research? What are they doing so the bad guys don’t win?

  

2

THE INSTITUTE AND THE PREEMINENCE INITIATIVE

  The initiative really got started in November 2013. That’s when the Florida Board of Governors gave the signal to the University of Florida to go on what would become one of the most prolific and consequential hiring sprees in the state. They gave the approval to go after the stars in their respective fields, with one goal in mind: make the University of Florida a top-ranked public research university.

  And they ponied up, with a plan to spend $15 million in funding from the legislature for different areas of focus, such as biodiversity and food security and informatics and creative writing—and, of course, cybersecurity. The idea is to snag the top minds in the country and bring them all to the same place—Gainesville, Florida.

  In addition to the state money, there’s also a strong private fund-raising contingent to get the campus, the initiatives, the faculty and students, and the systems in place to take on such a transformative endeavor.

  University of Florida President Bernie Machen said, “It’s not the rankings themselves that matter. But a rising reputation builds momentum that allows us to make ever greater contributions. Preeminence means UF can make more life-changing discoveries, create jobs through our startup companies and technology licensing, compete to get Florida its share of federal research dollars, and ensure that Floridians do not have to leave the state to get a world-class education.”

  The $15 million, five-year annual payout is matched by the university, with the plan to make more

  

address all of its initiatives. It was the second year in a row where donations topped $210 million.

  The goal was always to raise $800 million in three years. There’s also an additional $71 million in “deferred gifts and pledges,” bringing the total up to $285.9 million.

  All this money, of course, is spread around, and some of it went into hiring the world’s foremost cybersecurity experts. The truth is cybercriminals are relentless, and attacks against banks and personal information, military installations, and ATMs are going to continue and get more sophisticated.

  I spent a lot of time conversing with and learning from faculty at the University of Florida, the real experts in cybersecurity. They were generous with their time and spoke with gratitude about landing at a place that values and pushes them to greater heights. Some of them didn’t have access to computers until they were in college. Others learned computer theory through books. Some of them came from villages in countries far away. Others grew up in the United States and made names for themselves, became nationally regarded, and were plucked to be a part of what they consider the best cybersecurity team in the country.

  All of the hiring and the movement toward national education in the pressing issues of cybersecurity highlight the importance of the topic. It’s important to have a good understanding of the way the Internet and interconnectivity are changing the way we use everyday electronic devices.

  

3

THE INTERNET OF THINGS

  You can’t have a discussion about cybersecurity without a mention of the Internet. Cybersecurity wouldn’t be as much of an issue if not for the proliferation of network capabilities on all the network-capable devices, ones that are such a big part of our everyday lives—all the “smart” devices, including homes, power grids, thermostats, TVs, and wearable devices like the Apple Watch or the Fitbit. Practically everything’s connected to a network, and that connectivity means there are points of access for these devices to communicate. And with those access points come ways for them to be compromised.

  This phenomenon, the one where everything is network connected, is known as the “Internet of Things,” or IoT.

   , there will be fifty billion devices connected to the Internet by 2050.

  Joseph Wilson, an assistant professor at the University of Florida’s Computer and Information Science and Engineering Department, put it this way: “The real issue is not so much device security; it’s network security. A single IoT device, like a lightbulb or a thermostat or a garage-door opener, might not seem like something dangerous by itself, but that one device might provide a foothold for someone to access the rest of your devices. They’re not going to get your credit card number from your garage-door opener, but if they can listen to your wifi and use it to get into your Amazon account, that’s a different story.”

  Currently, IoT is mostly a loose collection of specific, purpose-built networks, such as heating systems, telephone security, and lighting. However, these networks are going to coalesce and thus become more powerful—and therein lies the danger.

  At the 2016 cybersecurity conference put on by the Florida Institute for Cybersecurity Research, Yier Jin, a professor at the University of Central Florida, gave a presentation called “IoT Security: From Hacking to Defense.”

  In his speech, he outlined the ways hackers access everyday devices and explained how vulnerable some of these devices really are. Devices that seem innocuous might not be, he said. He’s been investigating the problem of security with some of these items, like the Roku, which is a network-connected streaming peripheral, or the Fitbit, which tracks fitness activity. What he found was troubling.

  He also investigated the smart car, he said, and with its series of network-connected systems comes the potential danger of hacking. What about these so-called fitness-tracking smartbands? They don’t have a keyboard, so they should be difficult to compromise, right? Not really, as he successfully hacked them. The ease of it surprised and alarmed him, he said, so he called the company and told them about some security issues. They conceded that they hadn’t put security on the devices because it would make them more difficult to use.

  For this reason, the U.S. government has been working diligently with researchers to combat the problem. As more and more devices become interconnected, cyberattacks get more involved and complicated, and the real-world consequences become more severe.

  about cybersecurity outlined the realities of the danger.

  Whereas previous attacks were directed mostly to steal confidential information and create havoc online, new attacks will affect the physical world, like a recent hacking of the Associated Press Twitter account by the Syrian Electronic Army, which sent a tweet about an explosion at the White House. The tweet caused the stock market to decline almost one percent before it was revealed as a hoax.

  In 2012, a hacker built a device that could open electronic locks in hotel rooms without a key.

   , criminals continued to use the exploit for months.

  In 2014, the Sony movie studio hack revealed thousands of documents of personal correspondence between studio executives and disrupted the movie industry. Perhaps particularly troubling is the issue of driverless cars. Engines, locks, hood and trunk releases, heat, dashboard functionalities, and brakes have all been shown to be vulnerable to attack.

  

  , a man wrote an article for Wired describing the vulnerability. He was driving a Jeep, and the cold air suddenly blasted at maximum, without any input from him. The radio went full volume, despite his trying to turn it off. Then the windshield wipers started going off, and suddenly he started decelerating. All without his control. It all comes from one small vulnerability: “Uconnect, an Internet-connected computer feature in hundreds of thousands of Fiat Chrysler cars, SUVs, and trucks, controls the vehicle’s entertainment and navigation, enables phone calls, and even offers a Wi-Fi hot spot,” he described. “And thanks to one vulnerable element…, Uconnect’s cellular connection also lets anyone who knows the car’s IP address gain access from anywhere in the country.”

  The problem involves much more than smart cars or wearables. Network-connected biometric devices like pacemakers are also at risk, as are insulin pumps and implantable heart regulators. These are not inconsequential problems, and it’s why the U.S. government is keenly interested in the issue. It’s also why the University of Florida has devoted so much attention to cybersecurity. It’s why they’ve hired people from as far away as Iran and India to tackle this complicated problem. They are seeking to find innovative new ways to use IoT devices safely and to find new ways to get by security protocols, so as not to sacrifice usability for security.

  IoT, however, is just one aspect of cybersecurity, and it’s more of an explanation than a method. The different methods for hacking devices usually start from one place—the user.

  This idea of the user is interesting because the human component is the key to any device. Cybersecurity measures are generally made with the user in mind. Perhaps no one is better suited to tackle these issues than Dr. Juan E. Gilbert, a human-centered computing expert.

  

4

JUAN GILBERT AND HIS VOTING MACHINE

  Juan E. Gilbert, PhD, the Andrew Banks Family Preeminence Endowed Professor and Chair and a member of the Florida Institute for Cybersecurity Research, is an expert in human-centered computing. He is a professor in the Computer and Information Science and Engineering Department, and he leads the Human-Experience Research Lab. Also, Gilbert was recently named one of the fifty most important African Americans in technology.

  His work involves how we interact with computers and how they can be used to improve the lives of those around us. His research includes data mining, culturally relevant computing and databases, spoken language systems, usability and accessibility, and advanced learning techniques.

  One of the biggest and most visible projects he’s involved with is an open-source voting machine specifically made for everyone—including the blind, deaf, or disabled. The machine works by allowing voters to vote with a touchscreen or speak into a microphone. It will either display letters for you to read or speak to you. Those who can’t speak can blow into the microphone to make their choice.

  Gilbert developed the Prime III with help from research assistants, a National Science Foundation grant, and a $4.5 million grant from the U.S. Election Assistance Commission. The open-source release was funded by the Knight Foundation.

  Voting, by design, is meant to be a private endeavor. For years, disabled persons have not been able to truly vote by themselves because of their limitations. Even in the precinct, they might have someone else telling them what to do, and it could influence them, perhaps because of uncomfortable or uneasy feelings. Many might choose to stay away altogether.

  Ensuring private ballots, Gilbert said, was a challenging proposition. “You have to make sure they can’t be tied back to an individual,” he said. “They have to be secure, and they can’t be modified in the system.”

  How does he ensure that happens? The key is to keep it off the network and use an “old-school paper ballot.” In fact, the Prime III prints out a ballot for the person, which he or she can then use to vote.

  Security concerns are always important and need to be considered heavily. Gilbert said he’s not aware of any elections being stolen by hacking, but he does have evidence of an outcome being changed because of poor interface design. In Sarasota, there was a voting machine with a suspicious number of under votes, and the team thought there was a possibility it might have been hacked. Upon closer inspection, however, it turned out to be an issue with how the ballot was displayed on the screen. Because the line that was highlighted, and drew the eye, was the second contest, many people didn’t notice there was another contest above it.

  It was an issue of design. There has to be a balance between usability, security, and accessibility. When it comes to cybersecurity, Gilbert stresses diligence on the part of the user. He recommends virus protection and being cognizant of phishing attacks. The proliferation of the Internet of Things, he

  Gilbert enjoys being at UF and being part of such a prolific team working to address these types of issues.

  “We put together an outstanding team that can address hardware, software, and usable security,” he said. “We have a full house in the sense that we cover all aspects of cybersecurity, and we’re very, very good at it now.”

  Gilbert’s words echo throughout most of the conversations I had with members of the cybersecurity team. Another common topic of conversation? The importance of network security.

  

5

HOW DOES A NETWORK WORK?

  Security threats spread through networked computers much more easily than they reach, say, a lone desktop with no Internet connection. In fact, before the Internet, computer viruses and damaging programs had to be installed through disks. However, this didn’t necessarily mean they were less likely to be distributed.

  A disgruntled IT employee could easily move from computer to computer in a big company and infect them one at a time.

  Nowadays, most computers are connected to the Internet through networks. In order to understand exactly how and why cybersecurity is such a looming threat, it’s good to get a grasp of the basics, such as how computers connect to each other and share information.

  When we think of computer networks, what comes to mind? Any workplace will generally have a large network of computers connected to each other. That’s usually called a LAN, or a local area network. Generally an office or a school or a library uses a LAN. The computers are linked and typically have peripherals, such as scanners or printers and modems.

  A personal network can be referred to as a PAN, or personal area network. This is the setup that a person would use in a home. There are also MANs, which are metropolitan area networks, and WANs, or wide area networks. The Internet itself can be considered a wide area network.

  Computers need nodes and connections to link up to each other. A node is any device on the network, and the connections are the links between nodes. Wireless connections are the most common, but offices are sometimes linked through cables, which can run through the ceilings or walls. Hardwired connections are generally faster than wireless ones. But think about a router, which serves as a node. A router is typically hardwired to the service provider’s line, and the service provider allows connection to the Internet.

  That covers, in simple terms, how computers network to each other and the Internet, but what’s the method for how they connect? To answer that, we need to talk about TCP/IP (transmission control protocol/Internet protocol).

  TCP/IP was the result of many years of research and development by the Defense Advanced Research Projects Agency (DARPA). Research began in the late 1960s. The idea was to find a way to communicate over satellite and radio waves. The first iteration was called ARPANET, an early packet switching network and the first to use the protocol. Packet switching is a way of transmitting digital data in appropriate-sized blocks, called packets. The packets are then transmitted and shared.

  Packets have a header and a payload. The header is data at the beginning of the packet. The network uses the information in the header to direct the packet to where it’s going. The payload is the actual content of the message, or transmission. Because of packet switching, data can travel along many different routes and be reassembled when it arrives.
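  The header-and-payload idea can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the function names, the header fields (`dest`, `seq`, `total`), and the address `192.0.2.10` (a range reserved for documentation) are hypothetical, and real TCP/IP headers carry far more information.

```python
import random

def split_into_packets(message, destination, size=8):
    """Break a message into packets, each with a header and a payload."""
    chunks = [message[i:i + size] for i in range(0, len(message), size)]
    return [
        {
            # Header: where the packet is going and where it fits in the message
            "header": {"dest": destination, "seq": seq, "total": len(chunks)},
            # Payload: the piece of the message this packet carries
            "payload": chunk,
        }
        for seq, chunk in enumerate(chunks)
    ]

def reassemble(packets):
    """Put packets back in order using the sequence number in each header."""
    ordered = sorted(packets, key=lambda p: p["header"]["seq"])
    return "".join(p["payload"] for p in ordered)

packets = split_into_packets("Hello from Gainesville, Florida!", "192.0.2.10")
random.shuffle(packets)  # simulate packets arriving out of order by different routes
print(reassemble(packets))
```

  However the packets are shuffled, sorting by the sequence number in the header restores the original message, which is roughly the job the receiving computer performs.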

  The Internet is based on this TCP/IP system. Computers hook up over networks using TCP, and

  

  identified by an IP address. The address is a series of digits separated by dots or colons. That way, with catchy names like “ instead of actual IP addresses. Underneath the domain name, there are IP addresses, and machines communicating with each other.

  DNS, or domain name system, enables a machine to look up the IP address for a website. Original IP addresses were limited to four numbers, each between 0 and 255, divided by periods, but eventually that system needed to be expanded to longer addresses that include more symbols to accommodate the large number of devices and websites all around the world.
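  As a rough illustration, Python’s standard `ipaddress` module can tell the older, dotted style of address from the newer, colon-separated one, and `socket.getaddrinfo` asks the system’s DNS resolver for the addresses behind a name. The helper names `describe` and `look_up` are made up for this sketch, the two sample addresses come from ranges reserved for documentation, and `look_up` is defined but not called here because it needs a working network connection.

```python
import ipaddress
import socket

def describe(address_text):
    """Report which version of IP an address uses and how many bits it has."""
    addr = ipaddress.ip_address(address_text)
    return f"IPv{addr.version}, {addr.max_prefixlen} bits"

def look_up(hostname):
    """Ask the system's DNS resolver for the IP addresses behind a name.

    Requires network access, so it is not called in this sketch.
    """
    results = socket.getaddrinfo(hostname, None)
    return sorted({item[4][0] for item in results})

print(describe("192.0.2.1"))    # older style: four numbers divided by periods
print(describe("2001:db8::1"))  # newer style: groups separated by colons
```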

  TCP, for its part, sorts out the way the packets move back and forth between sender and recipient (the IP addresses). It gets the data and arranges the packets, sort of like a mailman does. With these frameworks, hackers and adversaries have figured out ways to manipulate the way information is sent, to hide malicious data in the payload, or to censor information based on what’s in the header.

  So what about security of transmitted data? That’s where SSL, or secure sockets layer, comes in. SSL, whose modern successor is known as TLS (transport layer security), is the standard security technology for establishing an encrypted link between a web browser and a server.
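  To make the idea of an encrypted link a little more concrete, here is a minimal sketch using Python’s standard `ssl` module, which today negotiates TLS, the successor to SSL. The function name `negotiated_protocol` is hypothetical, and it is defined but not called because it needs an Internet connection and a real hostname.

```python
import socket
import ssl

# The default context checks the server's certificate and hostname,
# which is what protects against a fake site posing as the real one.
context = ssl.create_default_context()

def negotiated_protocol(hostname, port=443):
    """Open an encrypted connection and report the protocol version agreed on.

    Requires network access, so it is not called in this sketch.
    """
    with socket.create_connection((hostname, port), timeout=5) as raw:
        with context.wrap_socket(raw, server_hostname=hostname) as secure:
            return secure.version()

print(context.verify_mode == ssl.CERT_REQUIRED)  # certificates must be valid
print(context.check_hostname)                    # the name must match the certificate
```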

  All this information ties into why cybersecurity is such an important topic. Kevin Butler, associate professor in the Computer and Information Science and Engineering Department, and a key member of UF’s cybersecurity team, explained that when you’re connected to a public network, like one in a grocery store or a coffee shop, it’s important to remain vigilant.

  “You have to be very careful about what’s going on when you’re on an open network,” he said. (An open network is one that doesn’t require a password to join.) “Anytime you send information, and it’s not being sent over SSL—anytime you’re communicating without those protections—it’s possible someone can be ‘sniffing’ the network data.”

  The network, he said, radiates in all directions, and if some of that information goes out without encryption on it, adversaries can potentially capture and steal personal data. A home network is encrypted. We’ll have more on encryption later, but it’s basically a way to protect your data so that no one but your intended recipient can read it. This way, someone walking by can’t jump on your network and spy on you—at least, not without a password.

  Open networks, like the ones at coffee shops or supermarkets, are generally unencrypted—at least, the ones that don’t ask for passwords. When you log onto these networks, your unencrypted network traffic is visible to everyone who knows how to look for it. And if you’re visiting nonencrypted websites without the green lock in the browser bar, which shows that SSL is being used, adversaries can see what you’re doing. If you get an email asking for personal information, or go to a non-SSL dummy site, like a fake bank site meant to look like a real one, personal information can be stolen. On open networks, adversaries can even see what encrypted websites you’re visiting and use that information to mount an attack.
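  One simple habit described above, checking that a site uses an encrypted connection before typing in anything personal, can be expressed as a tiny check. This is only a sketch: the function name `uses_encryption` and the example addresses are hypothetical, and an `https` address alone does not prove a site is trustworthy, only that the connection to it is encrypted.

```python
from urllib.parse import urlsplit

def uses_encryption(url):
    """Return True when the address asks for an encrypted (https) connection."""
    return urlsplit(url).scheme.lower() == "https"

print(uses_encryption("https://bank.example/login"))
print(uses_encryption("http://bank.example/login"))
```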

  Now that we have a little better understanding of how computers communicate using networks, let’s look at storage issues in terms of all the data being sent back and forth.

  

6

SERVERS AND THEIR RELATIONSHIP TO CYBERSECURITY

  Data doesn’t just exist in the ether. Every image or word on a computer screen, every video streamed or file downloaded, has to live somewhere. Every time you log onto a website, be it the University of Florida website or the site for National Geographic, the images you see, the words you read, the information for it, it’s all stored somewhere. The same goes for government websites, merchant sites, and everything else, all around the world. All of these things live on servers, and this also makes them vulnerable to cyberattacks. But what exactly do servers do?

  When we think about servers, maybe we picture a cold room with stacks and stacks of hardware in neat rows, like a library. Maybe we think about Hillary Clinton and her personal email server. The truth is the word server is fairly general, and there are many different types of them, for many different purposes, including web hosting, file hosting, faxing, computing, email, and so forth.

  The word itself refers to the act of hosting something that makes things happen—for example, executable programs. A server has the ability to store all of the files of different users on a specific network, so that everyone can have access to the same files.

  For the purposes of cybersecurity, I’m going to focus on the applications of servers in regard to the web. In these terms, a server refers to a specific type of computer, but not the type that goes on a desk with a mouse and a monitor. These are computers with extra powerful processors and super high-speed RAM (random access memory), and multiple hard drives. The combination of all these means the machine runs faster than regular desktops and allows for significantly higher performance capabilities.

  So what can they do? Well, they can store files and manage peripherals. Many focus on only one task, like storage or email. Some only organize specific sets of data. When we talk about Internet use, we’re talking about what is called a client-server model. The person clicking through a website makes a request, and the server provides the results. When someone goes to any large website, there are a myriad of machines that process the request.
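  The client-server model can be sketched with a toy example. Everything here is an assumption made for illustration: the `PAGES` dictionary stands in for a website’s stored content, the function names are made up, and a pair of connected sockets on one machine stands in for the Internet. Real web servers speak HTTP and handle many requests at once.

```python
import socket
import threading

# Hypothetical content the server stores on behalf of the site
PAGES = {"/": "Welcome", "/about": "About us"}

def serve_one_request(conn):
    """Server side: read the requested path and send back the matching page."""
    with conn:
        path = conn.recv(1024).decode("utf-8").strip()
        body = PAGES.get(path, "404 Not Found")
        conn.sendall(body.encode("utf-8"))

def request(path):
    """Client side: send a request and wait for the server's reply."""
    client, server = socket.socketpair()  # two connected endpoints
    worker = threading.Thread(target=serve_one_request, args=(server,))
    worker.start()
    with client:
        client.sendall((path + "\n").encode("utf-8"))
        reply = client.recv(1024).decode("utf-8")
    worker.join()
    return reply

print(request("/about"))
print(request("/missing"))
```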

  Now, let’s talk about vulnerability. There are quite a few recent examples of compromised servers. According to the Washington Free Beacon, hackers linked to the Chinese government are using American servers to conduct cyberattacks against private companies.

  A large investigation by a U.S. security firm has shed some light on the cyberattacks. The firm said the Chinese groups used seven computer-hosting companies to target a European energy firm, a European telecommunications company, and a U.S. air carrier.

  The Chinese groups conduct the attacks while also trying to stay out of sight of the U.S. intelligence and law enforcement agencies that are tracking them.

  “It’s like playing whack-a-mole,” said an executive, in the article, at one of the companies who voiced frustration at the difficulties of blocking IP addresses used covertly by Chinese hackers on U.S.-hosted domains. Perhaps one of the most eye-opening revelations in the article is that for many it has been common knowledge that the problems have been going on for years, and that the Chinese are Office of Personnel Management, the article said. , involved the theft of records of millions of former, current, and potential federal employees and contractors.

  The attack started in early 2014. By mid-June, the agency released information about a second, bigger attack that targeted information for millions more Americans, ones who simply applied for security clearances, according to NBC News. The ensuing fallout from the attacks and subsequent disclosures led to the resignation of Katherine Archuleta as director of the OPM. The smoking gun suspected of linking the attack to China is the use of Sakula malware, a program designed to steal large amounts of data of companies located in Los Angeles and one in Nevada.

  Cyberattacks are a cat-and-mouse game. Adversaries are always looking for new ways to steal data or compromise sites and are trying to stay one step ahead of authorities. This ongoing fight is why the gathering of the world’s best researchers at the University of Florida is so important. Attacks happen every day, and it takes teams of talented people to fight back. It takes time and resources, something the University of Florida has shown it’s committed to providing. Many of the researchers and professors I spoke to talked about the caliber of the team that has been put together to tackle this important issue.

  Now that we have a better sense of how servers work, what about the cloud? Is it really a different, new type of storage?

  

7

THE CLOUD AND THE MYTH OF STORAGE IN THE SKY

  The idea of the cloud is fairly new, in terms of Internet and security and storage. When we used to think about our data, or our documents or programs or a hard drive, it would be in terms of physical storage. Everything had to be stored somewhere. Systems needed to be backed up so data wouldn’t be lost. We could keep our family photos and important programs on external hard drives or disks.

  The cloud came along and changed all that. Apple can store all your photos for you. Google can save all your documents for you. In fact, you can back up everything on the cloud. No need for any storage for yourself anymore. But what exactly is the cloud? Does everything just exist in the heavens? In the radio waves? How does it work?

  Let’s start with the terminology. The computing term cloud actually predates the Internet itself, though the exact origin of it is hard to pinpoint. One possible explanation for the term comes from a book called How Google Works by Eric Schmidt and Jonathan Rosenberg, with Alan Eagle. It involves the way old programs function, and how network schematics were drawn by surrounding the icons for servers with a circle, and how a cluster of servers in the diagram usually had lots of overlapping circles that looked like—you guessed it—a cloud.

  It’s also a metaphor, though, for the Internet and how it connects everything together, how the endpoints of the network are numerous and irrelevant, hard to pin down, like a cloud.

  Let’s talk about storage. The first thing to understand is that data does not just exist in the air. Every bit, every word, every Excel sheet: everything has to be stored somewhere. As I said before, the only options used to be physical storage that a person would keep nearby. The word physical is important because data always has to be stored somewhere.

  So, no, cloud storage doesn’t have anything to do with weather. What it really refers to is an off-site storage center maintained by whichever company you pay to do so, be it Google or Apple or anyone else. The key to it all, of course, is the Internet, and with the Internet come vulnerabilities. Storing data at some off-site facility somewhere arguably isn’t as safe as keeping an external hard drive under your bed. As with anything, there are tradeoffs.

  So what are the advantages? Well, there’s the accessibility, for starters. You can get to your data anywhere that you have Internet access. You don’t have to carry around anything. Also, cloud storage makes it easier to collaborate with other people. If everyone has easy access to storage, it’s much easier to share work.

  Remember those cold rooms filled with rows and rows of servers? Well, that’s basically what cloud storage is: data stored on servers in data centers in a remote location. There are many different types of cloud storage systems. Some have pinpoint focus on certain specific things, like email or pictures or music. Other cloud storage systems are a catchall and take all of your data. Some are smaller, and some, like Amazon, can fill up entire football fields.

  At its simplest, a cloud storage system is just one data server. The user sends files over the Internet to the server, and the server saves the information. Then there are usually two files on the actual server itself. Most remote storage systems rely on a system of redundancy, meaning they store the data in more than one location. This redundancy is important because computers can break down or fail, and then data is lost. Some users rely on cloud storage only for backups; others use it to store extra files. It can be free up to a point. For example, Google allows a certain amount of storage for each user account before it asks you to pay. The free allowance caps out at fifteen gigabytes (GB).
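  The redundancy idea can be sketched in a few lines of Python. This is only an illustration, not how any real provider works: the ReplicatedStore class, its in-memory "servers," and the use of a SHA-256 checksum to spot damaged copies are all assumptions made for the example.

```python
import hashlib


class ReplicatedStore:
    """Toy cloud store that keeps a copy of every file on several 'servers'.

    Each server is just a dictionary here (an assumption for the sketch);
    a real system would use machines in different locations.
    """

    def __init__(self, replicas=2):
        self.servers = [{} for _ in range(replicas)]
        self.checksums = {}

    def save(self, name, data):
        # Record a fingerprint so damaged copies can be detected later.
        self.checksums[name] = hashlib.sha256(data).hexdigest()
        for server in self.servers:
            server[name] = data

    def load(self, name):
        expected = self.checksums[name]
        # Try each copy in turn and return the first one that is intact.
        for server in self.servers:
            data = server.get(name)
            if data is not None and hashlib.sha256(data).hexdigest() == expected:
                return data
        raise IOError(f"all copies of {name!r} are missing or damaged")


store = ReplicatedStore(replicas=2)
store.save("photo.jpg", b"family picture")
del store.servers[0]["photo.jpg"]    # simulate one server failing
recovered = store.load("photo.jpg")  # still served from the second copy
```

  With two copies, losing one server does not lose the file. Only when every copy is gone does the load fail.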

  To put it in perspective, the smallest unit for storage is a bit. Next is a byte, which is eight bits. After that comes the kilobyte (KB), about one thousand bytes. Next unit up is MB, or megabytes. That would be about one million bytes. Next is GB, or gigabytes, and after that is TB, or terabyte, which used to be unheard of but is now more common.
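  A little arithmetic makes the scale concrete. The sketch below uses the decimal convention, where each unit is one thousand times the one before it. The three-megabyte photo size is an assumption chosen only for illustration.

```python
# Decimal storage units, each one thousand times the previous one.
BITS_PER_BYTE = 8
KB = 1_000
MB = 1_000 * KB
GB = 1_000 * MB
TB = 1_000 * GB

free_tier_bytes = 15 * GB  # the fifteen-GB free allowance mentioned above
photo_bytes = 3 * MB       # assumed size of one photo

photos_that_fit = free_tier_bytes // photo_bytes
print(f"{free_tier_bytes:,} bytes holds about {photos_that_fit:,} photos")
```

  Under these assumptions, fifteen gigabytes is fifteen billion bytes, or room for about 5,000 photos.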

  Many companies provide many different services, but the truth is that the more services there are, the more opportunities for security breaches and stolen data. Who are the biggest companies? Social media sites aren’t typically thought of as storage centers, but let’s take a moment to think about it. All of your pictures posted and statuses and updates have to be saved somewhere, right?

  Facebook is actually a really good example of what it takes to store huge amounts of data. It draws more than one trillion page views each month, and the site is responsible for 9 percent of all Internet traffic, slightly edging out Google.

  This gargantuan size requires equally goliath-like data centers as the site continues to grow, with facilities in places including Texas. It’s a huge undertaking. The data centers are typically the size of two Walmarts.

  What about Google? It’s the same thing, in terms of size. Emails, pictures sent in emails, documents, and spreadsheets tied to a person’s Google account are all saved somewhere. And of course, Google handles millions of search queries. The company has reported that it had a total of 900 million users using its Gmail service, up from 425 million in 2012. Also, 75 percent of those users access accounts on mobile devices.

  Google also has huge data centers all over the world. Other smaller sites, specialized ones like Flickr or grandparents.com, all have to live somewhere, and that somewhere is in machines in data centers: that is, the cloud.

  Security is obviously the biggest concern when storing data off-site. A couple of security methods are in place. Encryption is a big one. It basically involves using a very complex algorithm to encode information. It requires ridiculous amounts of processing power to crack encryption, so hackers and adversaries generally don’t try to crack it as much as they try to steal the key, be it a password or something similar. With the key, you don’t need to break the code. Requiring a username and password is another form of security, and an efficient one, because it also helps organize your own data. That process is called authentication. There’s also authorization, where access to files on a server can differ based on, for example, employee rank. There is always a risk of theft and disruption, just like in a real-life storage center that houses paper files.
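  The difference between authentication and authorization is easier to see in a small example. The Python sketch below is a simplified illustration under stated assumptions: the user names, ranks, file names, and the PERMISSIONS table are all hypothetical, and the salted PBKDF2 password hash is one common choice rather than what any particular provider uses.

```python
import hashlib
import hmac
import os

# Hypothetical permission table: which ranks may read which files.
PERMISSIONS = {
    "payroll.xlsx": {"manager"},
    "handbook.pdf": {"manager", "staff"},
}


def hash_password(password, salt):
    # Slow, salted hash so a stolen database does not reveal passwords.
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)


def register(users, name, password, rank):
    salt = os.urandom(16)
    users[name] = {"salt": salt, "hash": hash_password(password, salt), "rank": rank}


def authenticate(users, name, password):
    """Authentication: is this person who they claim to be?"""
    user = users.get(name)
    if user is None:
        return False
    attempt = hash_password(password, user["salt"])
    # Constant-time comparison avoids leaking how many bytes matched.
    return hmac.compare_digest(attempt, user["hash"])


def authorize(users, name, filename):
    """Authorization: is this already-identified person allowed this file?"""
    return users[name]["rank"] in PERMISSIONS.get(filename, set())


users = {}
register(users, "alice", "correct horse", "manager")
register(users, "bob", "battery staple", "staff")
```

  Here both users can log in with the right password, but only the manager can open the payroll file. Notice that the server never stores the password itself, only a scrambled version of it, which is why stealing the key is a different problem from cracking the code.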

  Because of these inherent issues, the University of Florida has well-qualified researchers working on the problems associated with cloud security. For example, Kevin Butler, the associate professor who warned us about the dangers of open networks, has made cloud security one of the main thrusts of his many-pronged research avenues.

  “We’ve been looking at a cloud provenance solution that can make cloud computing safer and more trustworthy,” he said. “Once you’re putting data in someone else’s system, how do we know what’s happening with the data? Provenance in cloud environments ensures trustworthiness.”

  He’s working on a project that involves putting a virtual machine on the cloud. His research also aims to make cloud computing a little safer by verifying where the data actually is and making sure it is where the provider says it is.

  “Everything is in a computer somewhere,” he said. “Ultimately, the cloud is about getting your info to some remote location where it can be backed up and expanded.”

  

8

KEVIN BUTLER AND THE IMPORTANCE OF DATA PROVENANCE

  Kevin Butler is an associate professor in the Department of Computer and Information Science and Engineering, a Preeminence hire, and the recipient of a National Science Foundation CAREER award in 2013. His work in cybersecurity is varied and includes storage systems, large-scale systems architectures, and networks.

  Like most of the other members of the team at the Florida Institute for Cybersecurity Research, Butler is straightforward and eager to explain the type of work he does. Many of the members of the team, though they’re well known and respected in their field, are like this—happy to talk and discuss matters of cybersecurity and grateful to get to work in the environment they do.

  “In general, my research is about the trustworthiness of data and devices,” he said. “How do we ensure we’re talking to the right computer, and how do we know it’s trustworthy? How can we assume it’s from the right place?”

  This idea of provenance shows the depth of the cybersecurity research being done at the university. They really are attacking the issue from all angles within the discipline. So why is data provenance so important? Butler likens it to painting. “If you look at a Rembrandt, you’ll be able to trace the history and who was in charge of it and who it was sold to for fifty, one hundred, four hundred years. That’s how you know you’ve got the real deal. In the art world, that’s how you would show the article is genuine.”

  Trustworthiness is of utmost importance because we need to make sure adversaries aren’t tampering with data and compromising security. This matters particularly for military and government organizations in the United States, since the majority of companies that manufacture chips are not located in the United States.

  The applications of this, Butler said, are many, including using provenance to prevent data loss from companies. For example, if an employee or someone else steals a piece of data, the company has a record of where the data came from as well as a way of tracking where it went and who accessed it.

  This capability could be used, for example, in hospitals, to make sure no private and protected patient data is leaving the network. Traceability, he said, would be the ultimate goal, although the process itself is fairly new.

  “We’ve created a software that runs in the operating system below the layer of what’s running,” he said. “It’s collecting info every time you open or log in or perform an operation, silently at a layer below the standard system, at the heart of the OS.”

  The software, which runs in the Linux operating system, acts as a sort of log and monitors the data. Butler has gotten lots of attention for his work, including from the MIT Lincoln Laboratory, which has used his software in a prototype system. One of his main goals is to make cloud computing more secure as well. This is a difficult problem.

  “Part of the problem we have with computers is that we don’t know what’s going on inside them,” he said. “They’re so complicated and have millions of lines of source code and dozens of apps. The web browser alone is so much source code. They’re sort of opaque to us.”

  Because our machines are so powerful and the things we’re able to do with them are so complex, being able to trace data is a “shockingly hard problem.” The sheer amount of data is overwhelming, so there needs to be a way to keep track of it and catalogue it.

  Butler said he and his team have a good system for single computers, but once the data is there, what comes next? They’re also working on ways to reduce the amount of data they collect. In an early prototype, a computer running for only ten minutes created four GB of provenance data. Since then, Butler’s team has found ways to cut this number by up to 90 percent.

  Even with these issues to iron out, he’s looking to what comes next. “The challenge is how do we expand this from a single computer,” he said. For example, how can it be expanded to more computer-enabled environments, like a power grid for an electric company?

  Energy, he said, is a field where cybersecurity is becoming more and more important, because a lot of it is on a network and therefore vulnerable.

  “How can we use these provenance techniques to find out what is going on in the power system?” What needs to be done to prevent security lapses? He hopes to use his techniques to generate information and start to answer some of these questions.

  He’s working on techniques to represent the data in ways that are easy to access. Think of provenance, he said, as a mathematical graph. Events are linked to others in connected pathways, but the problem is making sure the representations are efficient and compact enough to be doable. Data that has four million graphs obviously takes up too much memory.
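  To make the graph idea concrete, here is a minimal Python sketch. It is not Butler’s software, which runs inside the operating system. The ProvenanceGraph class, the file names, and the user names are all hypothetical, chosen only to show how linked events let you trace where a piece of data came from.

```python
from collections import defaultdict


class ProvenanceGraph:
    """Directed graph: an edge points from an item to what it was derived from."""

    def __init__(self):
        self.parents = defaultdict(set)

    def record(self, item, derived_from, actor):
        # Store who performed the step alongside the link itself.
        self.parents[item].add((derived_from, actor))

    def history(self, item):
        """Return every (source, actor) pair reachable from `item`."""
        seen = set()
        stack = [item]
        while stack:
            current = stack.pop()
            for source, actor in self.parents.get(current, ()):
                if (source, actor) not in seen:
                    seen.add((source, actor))
                    stack.append(source)
        return seen


graph = ProvenanceGraph()
graph.record("report.pdf", "patients.csv", "alice")
graph.record("email_attachment", "report.pdf", "bob")
trail = graph.history("email_attachment")
```

  Following the links backward from the email attachment reveals both the report it came from and the patient file behind the report, along with who handled each step. Every recorded event adds to the graph, which hints at why keeping the representation compact is such a concern.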