Essential Discrete Mathematics for Computer Science [2019]

  ESSENTIAL DISCRETE MATHEMATICS FOR COMPUTER SCIENCE

  ESSENTIAL DISCRETE MATHEMATICS FOR COMPUTER SCIENCE Harry Lewis and Rachel Zax

  P R I N C E T O N U N I V E R S I T Y P R E S S P R I N C E T O N A N D O X F O R D Copyright c 2019 by Harry Lewis and Rachel Zax Requests for permission to reproduce material from this work should be sent to Published by Princeton University Press

41 William Street, Princeton, New Jersey 08540

  6 Oxford Street, Woodstock, Oxfordshire OX20 1TR All Rights Reserved LCCN

  ISBN 978-0-691-17929-2 British Library Cataloging-in-Publication Data is available Editorial: Vickie Kearn and Arthur Werneck Production Editorial: Kathleen Cioffi Jacket Design: Lorraine Doneker Jacket/Cover Credit: Production: Erin Suydam Publicity: Alyssa Sanford Copyeditor: Alison S. Britton This book has been composed in MinionPro Printed on acid-free paper. ∞ Printed in the United States of America

  To Alexandra, Stella, Elizabeth, and Annie and to David, Marcia, Ben, and Aryeh

  An engineer is said to be a man who knows a great deal about a very little, and who goes around knowing more and more, about less and less, until finally, he practically knows everything about nothing; whereas, a Salesman, on the other hand, is a man who knows a very little about a great deal, and keeps on knowing less and less about more and more until finally he knows practically nothing, about everything.

  Van Nuys, California, News, June 26, 1933

  CONTENTS xi

  

   233

   211

   201

   187

   179

   173

   161

   151

   141

   133

   119

   111

   101

  89

  79

  

  

  69

  

  59

  

  49

  

  39

  

  25

  

  11

  

  1

   243

  x contents

   261

   277

   297

   311

   323

   335

   359

   371

  381 PREFACE Τοῦ δὲ ποσοῦ τὸ μέν ἐστι διωρισμένον, τὸ δὲ συνεχες.

  As to quantity, it can be either discrete or continuous.

  —Aristotle, Categories (ca. 350 BCE)

  This introductory text treats the discrete mathematics that computer scien- tists should know but generally do not learn in calculus and linear algebra courses. It aims to achieve breadth rather than depth and to teach reasoning as well as concepts and skills.

  We stress the art of proof in the hope that computer scientists will learn to think formally and precisely. Almost every formula and theorem is proved in full. The text teaches the cumulative nature of mathematics; in spite of the breadth of topics covered, seemingly unrelated results in later chapters rest on concepts derived early on.

  The text requires precalculus and occasionally uses a little bit of calculus.

  Chapter 21, on order notation, uses limits, but includes a quick summary of the needed basic facts. Proofs and exercises that use basic facts about deriva- tives and integrals, including l’Hôpital’s rule, can be skipped without loss of continuity.

  A fast-paced one-semester course at Harvard covers most of the material in this book. That course is typically taken by freshmen and sophomores as a prerequisite for courses on theory of computation (automata, computabil- ity, and algorithm analysis). The text is also suitable for use in secondary schools, for students of mathematics or computer science interested in topics that are mathematically accessible but off the beaten track of the standard curriculum.

  The book is organized as a series of short chapters, each of which might be the subject of one or two class sessions. Each chapter ends with a brief summary and about ten problems, which can be used either as homework or as in-class exercises to be solved collaboratively in small groups.

  Instructors who choose not to cover all topics can abridge the book in several ways. The spine of the book includes Chapters 1–8 on foundational concepts, Chapters 13–18 on digraphs and graphs, and Chapters 21–25 on order notation and counting. Four blocks of chapters are optional and can be included or omitted at the instructor’s discretion and independently of each other:

  Chapters 9–12 on logic; • Chapters 19–20 on automata and formal languages; •

  xii preface

  Chapters 26–29 on discrete probability; and • Chapters 30–31 on modular arithmetic and cryptography. •

  None of these blocks, if included at all, need be treated in full, since only later chapters in the same block rely on the content of chapters earlier in the block.

  It has been our goal to provide a treatment that is generic in its tastes and therefore suitable for wide use, without the heft of an encyclopedic textbook. We have tried throughout to respect our students’ eagerness to learn and also their limited budgets of time, attention, and money.

  .

  With thanks to the CS20 team: Deborah Abel, Ben Adlam, Paul Bamberg, Hannah Blumberg, Crystal Chang, Corinne Curcie, Michelle Danoff, Jack Dent, Ruth Fong, Michael Gelbart, Kirk Goff, Gabriel Goldberg, Paul Handorff, Roger Huang, Steve Komarov, Abiola Laniyonu, Nicholas Longenbaugh, Erin Masatsugu, Keenan Monks, Anupa Murali, Eela Nagaraj, Rebecca Nesson, Jenny Nitishinskaya, Sparsh Sah, Maria Stoica, Tom Silver, Francisco Trujillo, Nathaniel Ver Steeg, Helen Wu, Yifan Wu, Charles Zhang, and Ben Zheng; to Albert Meyer for his generous help at the start of CS20; and to Michael Sobin, Scott Joseph, Alex Silverstein, and Noam Wolf for their critiques and support during the writing.

  Harry Lewis and Rachel Zax, June 2018 ESSENTIAL DISCRETE MATHEMATICS FOR COMPUTER SCIENCE

Chapter 1 The Pigeonhole Principle How do we know that a computer program produces the right results? How

  do we know that a program will run to completion? If we know it will stop eventually, can we predict whether that will happen in a second, in an hour, or in a day? Intuition, testing, and “it has worked OK every time we tried it” should not be accepted as proof of a claim. Proving something requires formal reasoning, starting with things known to be true and con- necting them together by incontestable logical inferences. This is a book about the mathematics that is used to reason about the behavior of computer programs.

  The mathematics of computer science is not some special field. Com- puter scientists use almost every branch of mathematics, including some that were never thought to be useful until developments in computer science created applications for them. So this book includes sections on mathemat- ical logic, graph theory, counting, number theory, and discrete probability theory, among other things. From the standpoint of a traditional mathemat- ics curriculum, this list includes apples and oranges. One common feature of these topics is that all prove useful in computer science. Moreover, they are all discrete mathematics, which is to say that they involve quantities that change in steps, not continuously, or are expressed in symbols and structures rather than numbers. Of course, calculus is also important in computer sci- ence, because it assists in reasoning about continuous quantities. But in this book we will rarely use integrals and derivatives.

  .

  One of the most important skills of mathematical thinking is the art of generalization. For example, the proposition

  ?

  There is no triangle with sides of lengths 1, 2, and 6

  2

  1

  is true, but very specific (see Figure The sides of lengths 1 and 2 would

  6

  have to join the side of length 6 at its two ends, but the two short sides

Figure 1.1. Can there be a triangle

  2 essential discrete mathematics for computer science

  A more general statement might be (Figure

  ? b

  There is no triangle with sides of lengths a, b, and c if a, b, c are

  a any numbers such that a + b ≤ c. c

Figure 1.2. There is no triangle with

  The second form is more general because we can infer the first from the sides of lengths a, b and c if a + b ≤ c. second by letting a = 1, b = 2, and c = 6. It also covers a case that the pic- ture doesn’t show—when a + b = c, so the three “corners” fall on a straight line. Finally, the general rule has the advantage of not just stating what is impossible, but explaining it. There is no 1 − 2 − 6 triangle because 1 + 2 ≤ 6.

  So we state propositions in general form for two reasons. First, a propo- sition becomes more useful if it is more general; it can be applied with confidence in a greater variety of circumstances. Second, a general propo- sition makes it easier to grasp what is really going on, because it leaves out irrelevant, distracting detail.

  .

  As another example, let’s consider a simple scenario.

  Annie, Batul, Charlie, Deja, Evelyn, Fawwaz, Gregoire, and Hoon talk to each other and discover that Deja and Gregoire were both born on Tuesdays. (1.1)

  Well, so what? Put two people together and they may or may not have been born on the same day of the week. Yet there is something going on here that can be generalized. As long as there are at least eight people, some two of them must have been born on the same day of the week, since a week has only seven days. Some statement like must be true, perhaps with a different pair of names and a different day of the week. So here is a more general proposition.

  In any group of eight people, some two of them were born on the same day of the week. But even that isn’t really general. The duplication has nothing to do with properties of people or days of the week, except how many there are of each. For the same reason, if we put eight cups on seven saucers, some saucer would have two cups on it. In fact there is nothing magic about “eight” and “seven,” except that the one is larger than the other. If a hotel has 1000 rooms and 1001 guests, some room must contain at least two guests. How can we state a general principle that covers all these cases, without mentioning the irrelevant specifics of any of them?

  First, we need a new concept. A set is a collection of things, or elements. The elements that belong to the set are called its members. The members of the pigeonhole principle

  3

  from each other. So the people mentioned in form a set, and the days of the week form another set. Sometimes we write out the members of a set explicitly, as a list within curly braces {}:

  P = {Annie, Batul, Charlie, Deja, Evelyn, Fawwaz, Gregoire, Hoon} D = {Sunday, Monday, Tuesday, Wednesday, Thursday, Friday, Saturday}.

  When we write out the elements of a set, their order does not matter—in any order it is still the same set. We write x ∈ X to indicate that the element x is a member of the set X. For example, Charlie ∈ P and Thursday ∈ D.

  We need some basic terminology about numbers in order to talk about sets. An integer is one of the numbers 0, 1, 2, . . . , or −1, −2, . . . . The real numbers are all the numbers on the number line, including all the integers

  √

  1

  and also all the numbers in between integers, such as 2, and π . A num- , −

  2

  ber is positive if it is greater than 0, negative if it is less than 0, and nonnegative if it is greater than or equal to 0.

  For the time being, we will be discussing finite sets. A finite set is a set that can (at least in principle) be listed in full. A finite set has a size or cardinality, which is a nonnegative integer. The cardinality of a set X is denoted |X|. For example, in the example of people and the days of the week on which they were born, |P| = 8 and |D| = 7, since eight people are listed and there are seven days in a week. A set that is not finite—the set of integers, for example—is said to be infinite. Infinite sets have sizes too—an interesting subject to which we will return in our discussion of infinite sets in Chapter 7.

  Now, a function from one set to another is a rule that associates each member of the first set with exactly one member of the second set. If f is a function from X to Y and x ∈ X, then f (x) is the member of Y that the function f associates with x. We refer to x as the argument of f and f (x) as the value of f on that argument. We write f : X → Y to indicate that f is a function from set X to set Y. For example, we could write b : P → D to denote the function that associates each of the eight friends with the day of the week on which he or she was born; if Charlie was born on a Thursday, then b(Charlie) = Thursday.

  A function f : X → Y is sometimes called a mapping from X to Y, and f is said to map an element x ∈ X to the element f (x) ∈ Y. (In the same way, a real map associates a point on the surface of the earth with a point on a sheet of paper.)

  Finally, we have a way to state the general principle that underlies the example of If f : X → Y and |X| > |Y|, then there are elements x , x and f (x ) ) . (1.2)

  ∈ X such that x = x = f (x

  1

  2

  1

  2

  1

  2 The statement is known as the Pigeonhole Principle, as it captures in

  4 essential discrete mathematics for computer science

  pigeonholes and every pigeon goes into a pigeonhole, then some pigeonhole

  ?

  must have more than one pigeon in it. The pigeons are the members of X and the pigeonholes are the members of Y (Figure

  We will provide a formal proof of the Pigeonhole Principle on page once we have developed some of the basic machinery for doing proofs. For now, let’s scrutinize the statement of the Pigeonhole Principle with an eye

  X Y

  toward understanding mathematical language. Here are some questions we

Figure 1.3. The Pigeonhole

  might ask:

  Principle. If |X| > |Y| and f is any function from X to Y, then the

  1. What are X and Y? values of f must be the same for some two distinct members of X. They are finite sets. To be absolutely clear, we might have begun the

  statement with the phrase, “For any finite sets X and Y,” but the assertion that f is a function from X to Y makes sense only if X and Y are sets, and it is understood from context that the sets under discussion are finite—and we therefore know how to compare their sizes.

  2. Why did we choose “x ” and “x ” for the names of elements of X?

  1

  2 We could in principle have chosen any variables, “x” and “y” for

  example. But using variations on “X” to name elements of the set X suggests that x and x are members of the set X rather than the set Y.

  1

  2 So using “x ” and “x ” just makes our statement easier to read.

  1

  

2

  3. Was the phrase “such that x ” really necessary? The sentence is 1 = x

  2 simpler without it, and seems to say the same thing.

  Yes, the “x ” is necessary, and no, the sentence doesn’t say the

  1 = x

  2

  same thing without it! If we didn’t say “x ,” then “x ” and “x ”

  1 = x

  2

  1

  2

  could have been two names for the same element. If we did not stipulate that x and x had to be different, the proposition would not

  1

  2

  ) ) have been false—only trivial! Obviously if x , then f (x .

  1 = x

  2 1 = f (x

  2 That is like saying that the mass of Earth is equal to the mass of the

  third planet from the sun. Another way to state the Pigeonhole Principle would be to say, “there are distinct elements x , x

  1 2 ∈ X such

  ) ) that f (x .”

  1 = f (x

2 One more thing is worth emphasizing here. A statement like “there are

  elements x , x

  1 2 ∈ X with property blah” does not mean that there are exactly

  two elements with that property. It just means that at least two such elements exist for sure—maybe more, but definitely not less.

  .

  Mathematicians always search for the most general form of any principle, because it can then be used to explain more things. For example, it is equally obvious that we can’t put 15 pigeons in 7 pigeonholes without putting at least 3 pigeons in some pigeonhole—but there is no way to derive that from the Pigeonhole Principle as we stated it. Here is a more general version: the pigeonhole principle

  5 Theorem 1.3. Extended Pigeonhole Principle. For any finite sets X and Y

  and any positive integer k such that |X| > k · |Y|, if f : X → Y, then there are at , . . . , x ) ) . least k + 1 distinct members x

  1 ∈ X such that f (x 1 = . . . = f (x k+1 k+1

  The Pigeonhole Principle is the k = 1 case of the Extended Pigeonhole Principle. We have used sequence notation here for the first time, using the same variable with numerical subscripts in a range. In this case the x , where

  i

  1 ≤ i ≤ k + 1, form a sequence of length k + 1. This notation is very conve- nient since it makes it possible to use an algebraic expression such as k + 1

  th

  in a subscript. Similarly, we could refer to the 2i member of a sequence y , y , . . . as y .

  1 2 2i

  The minimum value of the parameter k in the Extended Pigeonhole Prin- ciple, as applied to particular sets X and Y, can be derived once the sizes of X and Y are known. It is helpful to introduce some notation to make this calculation precise.

  An integer p divides another integer q, symbolically written as p | q, if the

  q

  quotient is an integer—that is, dividing q by p leaves no remainder. We

  p

  write p ∤ q if p does not divide q—for example, 3 ∤ 7. If x is any real number, we write ⌊x⌋ for the greatest integer less than or equal to x (called the floor of x).

  17

  6

  ⌋ = 5, and ⌊ ⌋ = 3. We will also need the ceiling notation: For example, ⌊

  3

  2 ⌈x⌉ is the smallest integer greater than or equal to x, so for example ⌈3.7⌉ = 4.

  With the aid of these notations, we can restate the Extended Pigeonhole Principle in a way that determines the minimum size of the most heavily occupied pigeonhole for given numbers of pigeons and pigeonholes:

  Theorem 1.4. Extended Pigeonhole Principle, Alternate Version. Let X and

  Y be any finite sets and let f : X → Y. Then there is some y ∈ Y such that f (x) = y for at least |X|

  |Y| values of x. Proof. Let m = |X| and n = |Y|. If n | m, then this is the Extended Pigeonhole

  m m

  − 1 = − 1. If n ∤ m, then again this is the Extended Principle with k =

  n n m

  Pigeonhole Principle with k = ⌈ ⌉ − 1, since that is the largest integer less

  n |X|

  than .

  ■ |Y| .

  Once stated in their general form, these versions of the Pigeonhole Prin- ciple seem to be fancy ways of saying something obvious. In spite of that, we can use them to explain a variety of different phenomena—once we

  6 essential discrete mathematics for computer science

  application to number theory—the study of the properties of the integers. A few basics first.

  If p | q, then p is said to be a factor or divisor of q. A prime number is an integer greater than 1 that is divisible only by itself and 1. For example, 7 is prime, because it is divisible only by 7 and 1, but 6 is not prime, because 6 = 2 · 3. Note that 1 itself is not prime.

  Theorem 1.5. The Fundamental Theorem of Arithmetic. There is one and

  only one way to express a positive integer as a product of distinct prime numbers in increasing order and with positive integer exponents.

  We’ll prove this theorem in Chapter 4, but make some use of it right now. The prime decomposition of a number n is that unique product

  e 1 e k

  , (1.6) n = p · . . . · p

  1 k

  where the p are primes in increasing order and the e are positive integers.

  i i e e 1 k

  2

  

2

  1

  , and there is no other product p · 3 · 5 · . . . · p

  For example, 180 = 2

  1 k

  < < . . . < equal to 180, where p p p , all the p are prime, and the e are

  1 2 i i k

  integer exponents.

  The prime decomposition of the product of two integers m and n com- bines the prime decompositions of m and of n—every prime factor of m · n is a prime factor of one or the other.

  Theorem 1.7. If m, n, and p are integers greater than 1, p is prime, and p | m · n, then either p | m or p | n.

  Proof. By the Fundamental Theorem of Arithmetic (Theorem there is one and only one way to write

  e e 1 k

  , · . . . · p m · n = p

  1 k

  where the p are prime. But then p must be one of the p , and each p must

  i i i

  appear in the unique prime decomposition of either m or n. ■ The exponent of a prime p in the prime decomposition of m · n is the sum of its exponents in the prime decompositions of m and n (counting the exponent as 0 if p does not appear in the decomposition). For example, consider the product 18 · 10 = 180. We have

  1

  2

  (exponents of 2, 3, 5 are

  1 , 2 , )

  · 3 18 = 2

  1

  1

  ) (exponents of 2, 3, 5 are

  1 , ,

  1

  · 5 10 = 2

  2

  2

  1

  · 3 · 5 180 = 2

  1

  1

  2

  1

  • .

  = 2 · 3 · 5 the pigeonhole principle

  7 We have color-coded the exponents to show how the exponents of 2, 3, and 5

  in the product 180 are the sums of the exponents of those primes in the decompositions of the two factors 18 and 10.

  Another important fact about prime numbers is that there are infinitely many of them.

  Theorem 1.8. There are arbitrarily large prime numbers.

  “Arbitrarily large” means that for every n > 0, there is a prime number greater than n. Proof. Pick some value of k for which we know there are at least k primes, and let p , . . . , p be the first k primes in increasing order. (Since p

  1 k 1 = 2,

  p

  2 = 3, p 3 = 5, we could certainly take k = 3.) We’ll show how to find a prime

  number greater than p . Since this process could be repeated indefinitely,

  k there must be infinitely many primes.

  Consider the number N that is one more than the product of the first k primes: ) (1.9)

  N = (p

  1 · p 2 · . . . · p k + 1.

  Dividing N by any of p , . . . , p would leave a remainder of 1. So N has no

  1 k

  prime divisors less than or equal to p . Therefore, either N is not prime but

  k

  has a prime factor greater than p , or else N is prime itself. ■

  k

  In the k = 3 case, for example, N = 2 · 3 · 5 + 1 = 31. Here N itself is prime; Problem asks you to find an example of the case in which N is not prime.

  A common divisor of two numbers is a number that divides both of them. For example, 21 and 36 have the common divisors 1 and 3, but 16 and 21 have no common divisor greater than 1.

  With this by way of background, let’s work a number theory example that uses the Pigeonhole Principle.

  Example 1.10. Choose m distinct numbers between 2 and 40 inclusive, where

  m ≥ 13. Then at least two of the numbers have some common divisor greater than 1.

  “Between a and b inclusive” means including all numbers that are ≥ a and also ≤ b—so including both 2 and 40 in this case. Solution to example. Observe first that there are 12 prime numbers less than or equal to 40: 2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, no two of which share a factor greater than 1. Let’s define P to be this set of 12 prime numbers.

  8 essential discrete mathematics for computer science

  ■ A sequence of terms can be denoted by a repeated variable with different numerical subscripts, such as x 1 , . . . , x n . The subscript of a term may be an algebraic expression.

  (d) The set of divisors of 100. (e) The set of prime divisors of 100.

  5 111 .

  (c)

  5 .

  (b) 111

  Problems

  ■ The Fundamental Theorem of Arithmetic states that every positive integer has exactly one prime decomposition.

  ■ The Extended Pigeonhole Principle states that if X is a set of pigeons and Y a set of pigeonholes, and |X| > k|Y|, then any function mapping pigeons to pigeonholes assigns more than k pigeons to some pigeonhole.

  m = 12 instead: the set P would be a counterexample.) Now consider a set X of 13 numbers in the range from 2 to 40 inclusive. We can think of the members of X as pigeons and the members of P as pigeonholes. To place pigeons in pigeonholes, use the function f : X → P, where f (x) is the smallest prime that divides x. For example, f (16) = 2, f (17) = 17, and f (21) = 3. By the Pigeonhole Principle, since 13 > 12, the values of f must be equal for two distinct members of X, and therefore at least two members of X have a common prime divisor.

  ■ The Pigeonhole Principle states that if X is a set of pigeons and Y a set of pigeonholes, and |X| > |Y|, then any function mapping pigeons to pigeonholes assigns more than one pigeon to some pigeonhole.

  ■ A function or mapping between two sets is a rule associating each member of the first set with a unique member of the second.

  A set’s size is always a nonnegative integer.

  ■ A set is finite if its members can be listed in full one by one. The number of members of a finite set X is called its cardinality or size and is denoted |X|.

  ■ A set is an unordered collection of distinct things, or elements. The elements of a set are its members.

  ■ Mathematical thinking focuses on general principles, abstracted from the details of specific examples.

  ■ Chapter Summary

1.1. What are each of the following? (a) |{0, 1, 2, 3, 4, 5, 6}|.

  the pigeonhole principle

  9

  1.2. Let f (n) be the largest prime divisor of n. Can it happen that x < y but f (x) > f (y)? Give an example or explain why it is impossible.

  1.3. Under what circumstances is ⌊x⌋ = ⌈x⌉ − 1?

  1.4. Imagine a 9 × 9 square array of pigeonholes, with one pigeon in each pigeonhole. (So 81 pigeons in 81 pigeonholes—see Figure ) Suppose that all at once, all the pigeons move up, down, left, or right by one hole. (The pigeons on the edges are not allowed to move out of the array.) Show that some pigeon- hole winds up with two pigeons in it. Hint: The number 9 is a distraction. Try

some smaller numbers to see what is going on.

  1.5. Show that in any group of people, two of them have the same number of friends in the group. (Some important assumptions here: no one is a friend of him- or herself, and friendship is symmetrical—if x is a friend of y then y is a friend of x.)

  1.6. Given any five points on a sphere, show that four of them must lie within a

Figure 1.4. Each pigeonhole in a

  closed hemisphere, where “closed” means that the hemisphere includes the circle 9 × 9 array has one pigeon. All that divides it from the other half of the sphere. Hint: Given any two points on a simultaneously move to another sphere, one can always draw a “great circle” between them, which has the same pigeonhole that is immediately circumference as the equator of the sphere. above, below, to the left, or to the right of its current hole. Must some

  1.7. Show that in any group of 25 people, some three of them must have pigeonhole wind up with two birthdays in the same month. pigeons?

  1.8. A collection of coins contains six different denominations: pennies, nick- els, dimes, quarters, half-dollars, and dollars. How many coins must the collection contain to guarantee that at least 100 of the coins are of the same denomination?

  1.9. Twenty-five people go to daily yoga classes at the same gym, which offers eight classes every day. Each attendee wears either a blue, red, or green shirt to class. Show that on a given day, there is at least one class in which two people are wearing the same color shirt.

  1.10. Show that if four distinct integers are chosen between 1 and 60 inclusive,

some two of them must differ by at most 19.

  1.11. Find a k such that the product of the first k primes, plus 1, is not prime, but has a prime factor larger than any of the first k primes. (There is no trick for solving this. You just have to try various possibilities!)

  1.12. Show that in any set of 9 positive integers, some two of them share all of their prime factors that are less than or equal to 5.

  1.13. A hash function from strings to numbers derives a numerical hash value h(s) from a text string s; for example, by adding up the numerical codes for the characters in s, dividing by a prime number p, and keeping just the remainder.

  The point of a hash function is to yield a reproducible result (calculating h(s) twice for the same string s yields the same numerical value) and to make it likely that the hash values for different strings will be spread out evenly across the possible hash values (from 0 to p − 1). If the hash function has identical hash values for two different strings, then these two strings are said to collide on that

  10 essential discrete mathematics for computer science hash value. We count the number of collisions on a hash value as 1 less than the number of strings that have that hash value, so if 2 strings have the same hash value there is 1 collision on that hash value. If there are m strings and p possible hash values, what is the minimum number of collisions that must occur on the hash value with the most collisions? The maximum number of collisions that might occur on some hash value?

Chapter 2 Basic Proof Techniques Here is an English-language restatement of the Pigeonhole Principle (page 3): If there are more pigeons than pigeonholes and every pigeon goes

  into a pigeonhole, then some pigeonhole must contain more than one pigeon. But suppose your friend did not believe this statement. How could you convincingly argue that it was true?

  You might try to persuade your friend that there is no way the opposite could be true. You could say, let’s imagine that each pigeonhole has no more than one pigeon. Then we can count the number of pigeonholes, and since each pigeonhole contains zero or one pigeons, the number of pigeons can be at most equal to the number of pigeonholes. But we started with the assump- tion that there were more pigeons than pigeonholes, so this is impossible! Since there is no way that every pigeonhole can have at most one pigeon, some pigeonhole must contain more than one pigeon, and that is what we were trying to prove.

  In this chapter, we’ll discuss how to take informal, specific arguments like this and translate them into formal, general, mathematical proofs. A proof is an argument that begins with a proposition (“there are more pigeons than pigeonholes”) and proceeds using logical rules to establish a conclusion (“some pigeonhole has more than one pigeon”). Although it may seem easier to write (and understand!) an argument in plain English, ordinary language can be imprecise or overly specific. So it is clearer, as well as more general, to describe a mathematical situation in more formal terms.

  For example, what does the statement Everybody loves somebody (2.1) mean? It might mean that for every person in the world, there is someone whom that person loves—so different lovers might have different beloveds. In semi-mathematical language, we would state that interpretation as

  12 essential discrete mathematics for computer science

  For every person A, there is a person B such that A loves B. (2.2) But there is another interpretation of namely that there is some special person whom everybody loves, or in other words,

  There is a person B such that for every person A, A loves B. (2.3) There is a big difference between these interpretations, and one of the pur- poses of mathematical language is to resolve such ambiguities of natural language.

  The phrases “for all,” “for any,” “for every,” “for some,” and “there exists” are called quantifiers, and their careful use is an important part of mathe- matical discourse. The symbol ∀ stands for “for all,” “for any,” or “for every,” and the symbol ∃ stands for “there exists” or “for some.” Using these symbols saves time, but in writing mathematical prose they can also make statements more confusing. So we will avoid them until we discuss the formalization of quantificational logic, in Chapter 12.

  Quantifiers modify predicates, such as “A loves B.” A predicate is a tem- plate for a proposition, taking one or more arguments, in this case A and B. On its own, a predicate has no truth value: without knowing the values of A and B, “A loves B” cannot be said to be either true or false. It takes on a truth value only when quantified (as in or when applied to specific arguments (for example, “Romeo loves Juliet”), and may be true for some arguments but false for others.

  Let’s continue with a simple example of a mathematical statement and its proof.

  Theorem 2.4. Odd Integers. Every odd integer is equal to the difference between the squares of two integers.

  First, let’s make sure we understand the statement. An odd integer is any integer that can be written as 2k + 1, where k is also an integer. The square of

  2

  = an integer n is n n · n. For every value of k, Theorem says that there are two integers—call them m and n—such that if we square them and subtract one result from the other, the resulting number is equal to 2k + 1. (Note the quantifiers: for every k, there exist m and n, such that . . . .)

  An integer m is said to be a perfect square if it is the square of some integer, so a compact way to state the theorem is to say that every odd integer is the difference of two perfect squares. It is typical in mathematics, as in this case, that defining just the right concepts can result in very simple phrasing of general truths.

  The next step is to convince ourselves of why the statement is true. If the reason is not obvious, it may help to work out some examples. So let’s start by listing out the first few squares: basic proof techniques

  13

  2

  = 0

  2

  1 = 1

  2

  2 = 4

  2

  3 = 9

  2

  4 = 16. We can confirm that the statement is true for a few specific odd integers, say 1, 3, 5, and 7:

  2

  2

  − 0 1 = 1 − 0 = 1

  2

  2

  − 1 3 = 4 − 1 = 2

  2

  2

  − 2 5 = 9 − 4 = 3

  2

  2 .

  − 3 7 = 16 − 9 = 4 After these examples we might notice a pattern: so far, all of the odd integers are the difference between the squares of two consecutive integers: 0 and 1, then 1 and 2, 2 and 3, and 3 and 4. Another observation—those consecutive integers add up to the target odd integer: 0 + 1 = 1, 1 + 2 = 3, 2 + 3 = 5, and 3 + 4 = 7. So we might conjecture that for the odd integer 2k + 1, the integers that should be squared and subtracted are k + 1 and k. Let’s try that:

  2

  2

  

2

  2

  ( , − k = k + 2k + 1 − k k + 1) which simplifies to 2k + 1.

  So our guess was right! And by writing it out using the definition of an odd integer (2k + 1) rather than looking at any specific odd integer, we’ve confirmed that it works for all odd integers. It even works for negative odd integers (since those are equal to 2k + 1 for negative values of k), although the idea was inspired by trying examples of positive odd integers.

  This chain of thought shows how we might arrive at the idea, but it’s too meandering for a formal proof. The actual proof should include only the details that turned out to be relevant. For instance, the worked-out examples don’t add anything to the argument—we need to show that the statement is true all the time, not just for the examples we tried—so they should be left out. Here is a formal proof of Theorem Proof. Any odd integer can be written as 2k + 1 for some integer k. We can rewrite that expression:

  2

  2

  2

  (adding and subtracting k )

  • 2k + 1) − k 2k + 1 = (k

  2

  2 (writing the first term as a square).

  = (k + 1) − k

  14 essential discrete mathematics for computer science

  2

  2

  , so we have identified Now let m = k + 1 and n = k. Then 2k + 1 = m − n integers m and n with the properties that we claimed. ■

  Before examining the substance of this proof, a few notes about its style. First, it is written in full sentences. Mathematical expressions are used for precision and clarity, but the argument itself is written in prose. Second, its structure is clear: it starts with the given assumption, that the integer is odd, and clearly identifies when we have reached the end, by noting that m and n are the two integers that we sought. Third, it is rigorous: it gives a mathematical definition for the relevant term (“odd integer”), which forces us to be precise and explicit; and each step of the proof follows logically and clearly from the previous steps.

  Finally, it is convincing. The proof gives an appropriate amount of detail, enough that the reader can easily understand why each step is correct, but not so much that it distracts from the overall argument. For example, we might have skipped writing out some of the arithmetic, and stated just that

  2

  2 .

  2k + 1 = (k + 1) − k But this equality is not obvious, and a careful reader would be tempted to double-check it. When we include the intermediate step, the arithmetic is clearly correct. On the other hand, some assumptions don’t need to be proven—for example, that it is valid to write

  2

  2 .

  • 2k + 1) − k 2k + 1 = (k

  This relies on the fact that if we move the terms around and group them in different ways, they still add up to the same value. In this context, these rules seem rather basic and can be assumed; proving them would distract the reader from the main argument. But in a text on formal arithmetic, these properties might themselves be the subject of a proof. The amount of detail to include depends on the context and the proof ’s intended audience; as a rule of thumb, write as if you are trying to convince a peer.

  Let’s return now to the substance of the above proof. First, it is construc- tive. The statement to be proved merely asserts the existence of something: it says that for any odd integer, there exist two integers with the prop- erty that the difference of their squares is equal to the number we started with. A constructive proof not only demonstrates that the thing exists, but shows us exactly how to find it. Given a particular odd integer 2k + 1, the proof of Theorem shows us how to find the two integers that have the property asserted in the statement of the theorem—one is k + 1 and the other is k. For example, if the odd integer we wanted to express as the differ- ence of two squares was 341, the proof shows that we can subtract 1 from 341 and divide by 2, and that integer and the next larger integer are the desired pair—170 and 171. This is easy to check:

1 A stronger meaning of the term “con-

  One final note about Theorem The proof not only is constructive, but proves more than was asked for. It shows not just that every odd integer is the difference of two squares, but that every odd integer is the difference of the squares of two consecutive integers, as we noted while working through the examples. After finishing a proof, it is often worth looking back at it to see if it yields any interesting information beyond the statement it set out to prove.

  same name can be used in different parts of the statement to refer to

  This is a fairly typical mathematical statement. Several things about it are worth noting.

  2 is odd if and only if n is odd.

  Theorem 2.5. For any integer n, n

  The square of an integer is odd if and only if the integer itself is odd. Or, to write the same thing in a more conventionally mathematical style:

  A common goal in mathematical proof is to establish that two state- ments are equivalent—that one statement is true in all the circumstances in which the second statement is true, and vice versa. For example, consider the statement

  .

  structive” is used in the school of con- structive mathematics, which disallows any mathematical argument that does not lead to the construction of the thing that the argument says exists. In con- structive mathematics it is impermissi- ble to infer from the fact that a statement is demonstrably false that its negation is necessarily true; the truth of the nega- tion has to be demonstrated directly. For example, constructive mathemat- ics disallows proofs by contradiction (explained below) and arguments like that of Problem Computer scien- tists prefer arguments that yield algo- rithms, but generally don’t insist on “constructive proofs” in the strict sense used by constructive mathematicians. For us, to show something is true it suf- fices to show that its negation is not true.

  basic proof techniques

  1

  = 2k + 1. Not every proof is constructive—sometimes it is possible to show that something exists without showing how to find it. Such a proof is called nonconstructive. We will see some interesting examples of nonconstructive proofs—in fact, the proof of the Pigeonhole Principle will be nonconstruc- tive, since it cannot identify which pigeonhole has more than one pigeon. But computer scientists love constructive arguments, because a construc- tive proof that something exists yields an algorithm to find it—one that a computer could be programmed to carry out.

  2

  − n

  2

  In general, a procedure for answering a question or solving a problem is said to be an algorithm if its description is sufficiently detailed and precise that it could, in principle, be carried out mechanically—by a machine, or by a human being mindlessly following instructions. A constructive proof implicitly describes an algorithm for finding the thing that the proof says exists. In the case of Theorem the proof describes an algorithm that, given an odd integer 2k + 1, finds integers m and n such that m

  15 There was no real need to check this particular case, but doing so makes us more confident that we didn’t make an algebraic mistake somewhere.

  • It uses a variable n to refer to the thing that is under discussion, so the

  16 essential discrete mathematics for computer science