cs2102: Discrete Math

Class 8: Relations (Thu, 15 Sep 2016)

### Schedule

**Problem Set 3** is due **Friday at 6:29pm**.

**Challenge Problem Opportunity.**Apparently, people were not entirely convinced by my proof sketch in class that the egg came before the chicken. So, we’ll make it a challenge problem to write the best proof that answers the question of “which came first, the chicken or the egg?” (That is, you can either prove the egg came first, or prove the chicken came first). A convincing proof would need to include clear definitions of “chicken” and “egg”, and use inference rules, as well as axioms based on biological assumptions (clearly stated), to make an irrefutable argument supporting your position.

## Links

Dijkstra’s explanation of what the natural numbers should start with 0

# Functions

A **function** is a mathematical datatype that associates elements from
one set, called the *domain*, with elements from another set, called a
*codomain*.

$$ f: \textrm{\em domain} \rightarrow \textrm{\em codomain} $$

If the function is *total*, every element of the domain has one
associated codomain element; if the function is *partial*, there may be
elements of the domain that do not have an associated codomain element.

**Defining a function.** To define a function, we need to describe how
elements in the domain are associated with elements in the codomain.

**Set Products.** A *Cartesian product* of sets $S_1, S_2, \cdots, S_n$
is a set consisting of all possible sequences of $n$ elements where the
$i$\textsuperscript{th} element is chosen from $S_i$.

$$ S_1 \times S_2 \times \cdots \times S_n = { (s_1, s_2, \cdots, s_n) | s_i \in S_i } $$

How many elements are in $A \times B$?

#

What are the (sensible) domains and codomains of each function below:

$$ f(n) ::= |n| \qquad \qquad f(x) ::= x^2 \qquad\qquad f(n) ::= n + 1 \qquad\qquad f(a, b) ::= a / b $$

$$ f(x) ::= \sqrt{x} \qquad \qquad \qquad f(S) ::= \textrm{minimum}_{<}(S) $$

##

For which of them is the function *total*?

How are functions in your favorite programming language like and unlike mathematical functions?

#

## Relations

A **binary relation**, $R$, consists of a domain set, $A$, and a
codomain set, $B$, and a subset of $A \times B$ called the *graph* of
$R$.

For each statement below, give the name and at least one example.

- A binary relation where no element of $A$ has more than one outgoing edge:

##

- A binary relation where every element of $A$ has exactly one outgoing edge:

##

- A binary relation where every element of $B$ has exactly one incoming edge:

##

- A binary relation where every element of $A$ has exactly one outgoing edge and every element of $B$ has exactly one incoming edge:

##

If there exists a *bijective* relation between $S$ and $T$ defined by the graph $G$ which of these *must* be true:

a. there exists some *injective* relation between $S$ and $T$.
b. there exists some *bijective* relation between $T$ and $S$.
c. there exists a *total* function, $f: S \rightarrow T$.
d. $S - T = \emptyset$.
e. the number of elements of $S$ is equal to the number of elements of $T$.
f. $G - (S \times T) = \emptyset$.
g. $(S \times T) - G = \emptyset$.
h. $(S \times T) - G \neq \emptyset$.

Generating PDFs (Wed, 14 Sep 2016)

We are finding many of the assignment submissions for PS1 and PS2 are barely readable. When it is hard for the graders to read your answers, you will get less useful feedback on your submission. We are not intentionally down-grading work that is hard to read, but you will lose the benefit-of-the-doubt for answers that are too hard to read, and it also makes a bad impression when your work is presented in a way that looks careless.

We provide some advice below on how to generate a readable PDF to submit. Please review what you are submitting before submitting it to make sure that it will not cause the graders headaches attempting to discern your answers from it.

We’ve been forgiving on PS1 and PS2, but from PS3 on will be expecting everyone to follow the directions well and generate readable submissions, and will refuse to grade unreadable submissions.

#### One Submission per Team, with Everyone’s Name and ID

Each team that works together should submit *exactly one* assignment,
and it should have each team member’s name and email ID clearly at the
top of it. When you neglect to do this, it causes a lot of extra
trouble for us.

#### PDF files only

Your submission must be a single, PDF file. We will not grade Word documents, images, or other formats. We do make two exceptions: entertaining videos and cakes. (If your answers are in the form of a cake, it make not be possible to submit it through collab, but you can drop it off in person.)

## Ways to Generate Good PDFs

There are three ways to generate reasonable PDFs to submit: (1) use a mathematical typesetting tool, (2) use a scanner that scans to PDF files, or (3) use a camera. If you use a scanner or a camera, the page you are scanning has to be readable to begin with, and you have to scan it well. It is best to write on only one side of the paper, since often the writing from the back side bleeds through in the scan or image.

The most difficult to read submissions are when people use their mobile phone camera to take a picture of handwritten paper, and then put the image into a Word document without any filtering or processing. This is not a way to make a readable PDF, and its not fair to expect graders to deal with that type of document.

There are ways to generate readable PDFs using your mobile phone camera.
One tool that works for this is
CamScanner (available as a
free app for limited use, for Android, iPhone, and Windows Phone).
There are lots of other scanning apps that should work - see *The best
scanning apps for Android and
iPhone*.
Please look at the file you have generated before submitting it. If you
are not able to generate a readable PDF using your phone, you should use
a scanner. There are public scanner to use in some of the UVA libraries
including in Clemons Library.

## Mathematical Typesetting

More ambitious, forward-thinking, aesthetically-minded, or handwriting-averse students will prefer to use a system that generates PDFs directly. The system used for nearly all technical typesetting is LaTeX (which was started by Leslie Lamport who won the Turing Award in 2013, and builds on the TeX typesetting program created by Donald Knuth, who won the Turing Award in 1974; neither of the awards were for work on typesetting). (Word and Google Docs do include equation editors, and you are welcome to use these if you want for cs2102. But, anyone doing serious mathematics quickly finds that these gui-type equation editors are both painfully slow and not able to generate good output. So, I don’t recommend using them unless you already know how and are happy with this.)

LaTeX has a fairly steep learning curve, but there are lots of tools available now to make it easier to get started. ShareLaTeX is a web-based, multi-user LaTeX editor; TeXnicCenter is a LaTeX editor for Windows; and texpad is available for Mac and iOS.

If you are expecting to take further math and CS courses, it is worth learning to use this to produce documents efficiently. Some later CS courses required submissions to be typeset using LaTeX (depending on whom is teaching them, this is sometimes the case for CS3102 and CS4102), and nearly all research papers are written this way. You are not required to produce beautifully-typeset problem sets for CS2102 and shouldn’t spend inordinate extra time just trying to make your answers look great, but we will definitely appreciate your efforts to make submissions easier to read, and after a small effort in getting started with LaTeX, you may find that it takes less time than hand-writing (especially since it is easy to edit, instead of needing to rewrite).

Anonymous Feedback Comments (Tue, 13 Sep 2016)

I wanted to discuss two anonymous comments that were submitted through Collab (there’s no way to respond directly to submitters, so I’ll post my response here).

Comment 1:

The homework is significantly longer and harder than Edwards’. Also, it extends beyond lecture content too much :(

Comment 2:

Professor,

I believe that we are moving in too fast a pace at the current moment. I am truly enjoying the class material however before I am fully able to grasp the concepts we quickly move on to the next. I understand that this class moves fast however I have attended every single lecture and still did not understand how to solve quite a few of the questions in problem set 2. In fact in more than one instance I had no idea what some of the questions were asking. I would appreciate it if we could have a brief overview of the problem set before they are due. I appreciate you taking the time to address the questions in the later part of the week but we had about 15 minutes to ask pertinent questions for an assignment due less than 48 hours.

I am truly enjoying this class but I will fall very far behind in this class if we continue to go at the pace we are. If we could solve more sample questions related to the problem sets I feel it would go a long way in understanding how to utilize the tools we are being taught.

I hope I am not out of line. I also didn’t feel comfortable bringing this up personally because I am not sure if I am in the majority, minority, or the rarity. I have never had problems with a math course before and have been comfortable doing proofs prior to this. I suppose I am not as fast a learner as the class demands.

First, I do appreciate getting comments like this, and although I would hope most students feel that they could safely send comments like this without needing anonymity, I understand why you prefer to send them anonymously.

As for the content of Comment #1, what you should be worried about is if the workload for the course seems unreasonably high for a 3-unit class. As the university defines things, a 3-unit course is supposed to involve 9 hours of academic work per week (assuming you are coming to class for 2.5 hours per week, this leaves 6.5 hours for outside work). It is sometimes hard to estimate how long it will take students to do an assignment, but I don’t think the readings and problem sets assigned so far in this course should take more than this for most students.

If you find you are spending unreasonably more than this, then it is either because you are not using the time effectively (this is especially a challenge when students work in teams to keep everyone focused and on task); you are trying to do more than is intended (which is great if you have time for it); you are solving problems using a tedious, painstaking approach rather than looking for a clever, fast solution (e.g., making an exhaustive truth table for a 7-input function); you are lacking in some of the necessary background or experience so need more time; or (perhaps most likely) because I have grossly underestimated how long things take when you are learning them for the first time (and me teaching them for the first time).

I do want to understand if things are taking too long and why, so if it seems like the workload for the course is unreasonable I encourage you to come to office hours and discuss what you are doing and whether their might be more effective ways to use your time.

As for the problems “extending beyond lecture content”, that is absolutely what you should expect (indeed, ambitious students should be insulted by assignments that don’t go beyond what was covered in lecture!). The problems are not designed to just test if you have been paying attenting in class and can repeat what is done in class in a straightforward way. Sometimes there will be “warm-up” problems intended to check basic understanding, but the goal of most of the problem set problems is to deepen your understanding of the concepts introduced in class and in the book and improve your abilities as a mathematical thinker, by having you apply them yourself in ways you haven’t seen before.

For Comment #2, I agree that we have covered a lot quite quickly at the beginning of the course. In general, there is always a tension between wanting to include and get to the most interesting material and making sure the foundations are strong.

The best way to make sure I spend more class time on topics is to ask
questions about them. My default assumption (which I know is
naïve, but I also think is fair) is that if something is explained
well in the assigned readings and no one asks questions about it, then I
shouldn’t waste class time going over it (unless I have something I
think is worthwhile beyond or different from what was in the reading).
So far (scanning over `#general`

now), very few students have asked
questions like this; nearly all the questions are about clarifying
specific problems from the problem sets. I am happy to spend more class
time on topics covered in the book if people ask questions about them.
It doesn’t even have to be a precise question - its fine to ask things
like, “I didn’t understand the set builder notation, can you go over an
example in class?” (otherwise, I think its quite boring to talk about
set builder notation, and that what is written in the book is sufficient
for everyone to understand it without me spending time on it in class).

The other thing I want to say on this is that we’ve introduced a lot of new things fast, but all of the main themes of the course will keep recurring throughout the semester. In particular, towards the two main goals of getting good at making and using definitions, and understanding and constructing proofs, most of the concepts we have introduced already will be seen over and over again throughout the course. The new topics will build on the knowledge and understanding you have, but also (I hope) deepen your understanding of the early topics.

(As one concrete example, following the design of the book, we introduced well-ordering proofs early. We will soon get to proof by induction, which is dependent on well-ordering, and similar to well-ordering proofs in many ways. Proof by induction is probably the most powerful proof technique you’ll learn in this class, and one used all the time in computer science, and we’ll see many different examples of proof by induction throughout the semester since its something that most people need to see many times in different ways and contexts before really getting it.)

Finally, in terms of the overall pacing, my plan for the course (which is very much flexible) is to cover about half the material that is covered in the comparable MIT course. This seems reasonable to me since I expect UVA students to be as good as MIT students, but the way MIT classes work students take fewer courses than UVA students do (it would be more like a 4-unit class here), and there is a lot more class time than we have (they have 3 meetings a week, plus recitation, and four evening “midterm” exams), so the actual class time is about 2x what we have. As another example, Berkeley’s CS70 class covers things with a quite different approach, but also covers fair bit more material than is planned for cs2102 (and with resources more like we have as a public school, they expect students to do their own homework grading). If you think I shouldn’t expect you to be competitive with Berkeley or MIT students, you can also compare to the UNC course, which covers a similar amount of material to this course but with 8am Monday lectures. (I do, of course, have much higher expectations of UVA students than what Virginia Tech does, but I hope you would not use that as a sign that our expectations are too high!)

Class 7: Sets (Tue, 13 Sep 2016)

### Schedule

Everyone should have received their comments and grade on PS1 yesterday. Please make sure to read the comments posted on the website about how PS1 was graded.

**Problem Set 3** is due **Friday at 6:29pm**.

I will have “make-up” office hours tomorrow (Wednesday) 3:30-4:30pm (in addition to my usual Wednesday 1-2pm office hours). There have been some adjustments to the office hours schedule, please check the calendar.

## Links

Tom Leighton’s Remarks on Danny Lewin

*A Lost Spirit Still Inspires*, Boston Globe, 4 September 2011.

*No Better Time: The Brief, Remarkable Life of Danny Lewin, the Genius Who Transformed the Internet* by Molly Knight Raskin.

*Akamai’s
Algorithms*
Technology Review, Interview with Tom Leighton, 1 September 2000.

David Karger, Eric Lehman, Tom Leighton, Matthew Levine, Daniel Lewin,
Rina Panigrah. *Consistent Hashing and Random Trees: Distributed
Caching Protocols for Relieving Hot Spots on the World Wide
Web*. (Note:
tradition in theory papers is to list all authors alphabetically). 29th
ACM Symposium on Theory of Computing, 1997.

The SAT solver I demo’d in class today was from the Class 6
notes: Sahand Saba’s *Understanding SAT by Implementing a Simple SAT
Solver in
Python*
[Code with my modifications:
https://github.com/evansuva/simple-sat]. You should, of course, feel free to use this (or any other SAT solver you want) to help with solving and checking your answers for Problem Set 3 (and any other assignment in this class).

# Notes and Questions

What is a *data type*? What are the differences between a *mathematical
data type* and a data type in your favorite programming language?

#

A **set** is an unordered colection of objects. A set is defined by its
membership operation: $x \in S$ is true if $x$ is in the set $S$.

## Set Operations

Subset: $\subseteq$ (note that this does not mean *strict subset*)
$$A \subseteq B \iff \forall x \in A. \fillin \in \fillin.$$

Set Equality: $=$ $$A = B \iff A \fillin B \wedge B \fillin A.$$

Set Union: $\cup$ $$\forall x. x \in A \cup B \iff x \in A \fillin x \in B.$$

Set Intersection: $\cap$ $$\forall x. x \in A \cap B \iff x \in A \fillin x \in B.$$

Set Difference: $-$ $$\forall x. x \in A - B \iff x \in A \wedge x \notin B.$$

Set Complement: $\overline{S}$ $$ \forall x. x \in D. x \in \overline{A} \iff x \notin A.$$

($D$ is the ``domain of discourse”, the universe of all objects under discussion.)

### Russell’s Paradox

$$ S_{R} = \textrm{ the set of all sets that are not members of themselves} $$

Is $S*{R} \in S*{R}$?

#

### Set Practice

Here are some practice problems involving sets. We won’t go through these in class, but you should ask questions about any are unclear. (At least a few of these will be on Exam 1.)

- Define $A \subset B$ (strict subset).

#

- Prove $A \cup B \equiv B \cup A$.

#

- Prove $A - B = \emptyset \iff A \subseteq B$.

#

- Prove $A = B \iff (\forall a \in A \ldotp a \in B) \wedge (\forall b \in B \ldotp b \in A).$

#

## Danny Lewin

See the web version for links to articles on Danny Lewin.

\begin{quote} {\em In true Danny form, he fought back against the terrorists in an effort to defend the stewardesses and the cockpit. To this day, those of us who knew him well can’t figure out how only five terrorists managed to overpower him. During his short life, Danny made extraordinary contributions to the internet and to computer science through his work in algorithms and complexity theory. The impact of his work will be felt throughout the hi-tech industry for many years to come.} (from Tom Leigthon’s \hyperlink{http://www.egr.unlv.edu/~bein/SIGACT/lewin.html}{remarks to commemorate naming of the STOC Best Student Paper Award in honor of Daniel Lewin}, 19 May 2002. \end{quote}

Problem Set 1 Comments (Sat, 10 Sep 2016)

The solutions and comments for PS1 are now posted in collab: PDF

The grades will be released soon. Sorry it took so long to get the grading and solutions done for PS1. We are sorting out how to manage grading using collab and figuring a better way to manage this, and getting tablets to make it easier to write on your submitted PDFs (instead of typing notes, as was done for PS1). I hope this will go more smoothly for PS2, and be fully worked out for PS3 and later. Our goal (which I think is reasonable for you to expect) is to always return assignments within a week of when they are due.

The way problem sets are graded in this class is probably different from
what most of you are accustomed to. The purpose of the problem sets is
primarily for *learning* and not for *grading*, so the grading system I
use is designed with the goals of providing useful feedback to you, as
well as giving us a way to see how the class is going and if students
are on track. This means instead of giving detailed point scores for
every problem and grading in a way that emphasizes losing points by
making mistakes, the problem sets are graded in a way meant to capture
how well students are understanding the material.

Collab requires specifying a *maximum* score for each assignment, but
there is no real maximum score, and you should not view the scores as
being “*n* out of *m*” where there is some maximum score and you are losing
points if you end up below that score.

The meaning of the grades for PS1 is:

Real problems - only attempted a few problems, and nothing that shows any understanding.

Substantial problems - attempted a few of the problems, but didn’t show any real understanding of many concepts.

Fundamental problems understanding key concepts; not able to complete reasonable answers for most questions.

Several fundamentally incorrect answers, or fundamental problems understanding important concepts for this assignment.

Nearly all answers correct, but justifications not as strong as desired.

All answers acceptable. (This is the “expected” score for problem sets.)

All answers essentially correct, with at least one especially good answer.

All answers well done, and more than one especially good answer.

Your answers broke new ground in computer science. (Not used for PS1)

You solved an important open problem! (Not expected to be used in cs2102)

You deserve a Turing award! (in theory, there is no upper bound on the grading, but I do not expect to have occassion to use grades higher than 11)

Getting the “expected” code (6 for PS1) on assignments means you are well on track to getting an A in the class (this is the “expected” grade for everyone). You’ll need to do well on the exams, also, of course.

If your score on PS1 is below 6, you should have a good idea from the comments and from reading the solutions, what you need to improve on to do better on future assignments.

If you notice a significant error in any of the provided solutions (where “significant” is at my discretion, but generally means something more than an obvious typo), you can earn bonus points by reporting it.

Survey Questions (Part 2) (Sat, 10 Sep 2016)

Here are my (in progress) responses to more questions submitted in the registration surveys (see Part 1 for the first set of responses). Sorry its taking a long time to answer these questions, but there are lots of great questions submitted, and I will eventually get to them all.

I’m willing to consider anyone who expresses interest for a summer
position in my research group, but want to try and have a good
expectation that it will be a positive and worthwhile experience for
both of us before offering a position. The main things I want to see
are *curiosity* and *initiative*. Beyond that, it does help to show
that you can do well on the content in the class, but it is not
necessary to be the best student or to do consistently well. Much more
valuable to do one impressive thing (which could be something in the
class like solving a challenge problem, or something you’ve done on your
own, like building an interesting mobile app or writing a blog, etc.).

*small*programs that do a lot, than for building

*big*programs.

The main challenge in building big programs is managing complexity and the best tool for managing complexity is to define clean and limited interfaces between components. This class might help some for that, at least in terms of the underlying skills, becuase it should improve your ability to think clearly about precise definitions and to strive for minimal requirements.

If you want to learn how to build big programs successfully the best thing to do is to actually try building progressively bigger programs, and to join and contribute to open source projects where you’ll get mentoring from people with experience building (and maintaining, which is often the hardest part!) big programs.

In college, grades have much less significance. There is, unfortunately, one exception to that which is getting into the CS majors. I hope this will change, and also encourage students to not accept being turned away from their desired major without a fight, but for now the unfortunate reality is that slots in the CS majors have been capped, and admissions are decided nearly entirely based on arbitrary GPA cut-offs.

Otherwise, there is very little that depends on your college grades (as long as they are respectable enough that you don’t run into trouble with academic probation or anything like that).

If you want to go to graduate school (in computing, at least) after college, it is a little better to have great grades than to have mediocre grades (although perfect or near-perfect grades will probably count against you), but far more important to have an impressive accomplishment and reference letters. Being a significant author on a research paper is the most important determiner in getting into the best PhD programs, and having letters from known researchers who can provide a lot of detail about interesting and impressive things you have done is the next most important thing. Most reviewers of PhD applications won’t even look at grades, or GPAs, except to possible check grades in a few key courses, and if the reason for getting poor grades in some less important classes is that you were more focused on doing some exciting research, so much the better!

If you want a software development job in industry, there are some (mostly big, old) companies that will have GPA cut-offs in their hiring searches, but generally the more interesting and worthwhile the company you are seeking to work for, the less they care about grades and the more they want to see evidence of what you can accomplish on your own initiative. Contributing significantly to an open source project, building your own interesting web/mobile app, initiating and leading a student club, providing a technical and leadership contribution to a service organization, etc., are all much more impressive and meaningful evidence that you would be a valuable employee for most companies and any number of As in your classes.

As for what we can do in cs2102, I do try to make things less stressful for students, but sometimes things intended to reduce stress are misinterpreted and actually increase student stress. I believe being somewhat vague about what is expected and how grading is done is highly beneficial for learning. My favorite quote about this is from John Stigloe (who was a professor at Harvard, but it applies equally to UVA students):

This generation of students got into Harvard by doing exactly and precisely what teacher wants. If teacher is vague about what he wants, they work a lot harder to figure out what they want and whether or not it is good. The vaguer the directions, the more likely the opportunity for serendipity to happen. It drives them nuts!(Harvard Professor John Stilgoe on “60 Minutes”, 4 January 2004)

That said, I understand that uncertainty is one of the main things that causes stress for students (and I don’t really want to drive students nuts!). I hope students will view the vagueness in my courses as meaning there are many and varried opportunities to show that you deserve an A in the class, and that I can be trusted to treat everyone fairly.

The other main cause of grading stress, I believe, is students fearing that even though they understand the material well and are putting in a solid effort, that they are at risk of getting less-than-desired grade because of making little mistakes. Grading done where each question is “out of” some number of points, and there are various ways to lose specific numbers of points for making mistakes, and the grade is computed as the total of all the points over all the questions, exaccerbates this stress in two ways.

First, it means students are mostly focused on getting to the threshold “full credit” answer and fearful of anything that might cause point deductions. This stifles creativity and ambition.

Second, over the course of a set of assignments, it means it is nearly impossible to recover from one misstep. If 90% is needed for an “A” grade, it takes 3 grades of 95 to make up for one 75, and 5 grades of 95 to make up for one 65. With the grading system I use, one great assignment (or one successful challenge solution), is enough to make up for several poor assignments. I’m looking for evidence that students have understood important concepts and shown that they can use them effectively, not for signs of occasional slip-ups.

I hope students find the grading systems I use to be less stressful, and more conducive to ambition and creativity.

So, the main thing I would say is don’t be distressed if you see other people getting things more quickly than you do. Its probably not because they have some magic discrete math gene that enables them to understand things that you are lacking. I’m pretty sure no such gene exists. Instead, when someone learns something quickly it is probably because they had a lot of previous experience that gave them the context necessary for the new concept to make sense.

When things seem overly hard to learn, instead of getting frustrated, your goal should be to figure out what context you are missing that is making it hard to learn. Rather than getting stuck on the new concept, go back to the previous ones that it is building on and make sure you really understand them (a great way to do this is to try and explain them to someone else who might be confused on them).

Don’t just assume you understand things because you can read solutions to problems and they seem to make sense, make sure you do by working out the problems yourself and then comparing them to others’ solutions. The collaboration policy for the assignments in this class is intended to encourage this, and I hope students will take advantage of it, and use opportunities to work with others as ways to improve each other’s understanding.

The only official prerequisite is cs11xx or equivalent, so if it seems like anything is ever unfairly assuming more CS background than you should expect to have by satisfying that prerequisite, you should make sure to let me know.

It will definitely be possible to earn an A in the class without doing any programming.

As for my life, it is mostly focused around my children (a four-year old daughter, and 17-month old son). Since having children, I view any professional accomplishment as inconsequential compared to my hopes of being a good parent and raising decent and fulfilled children.

As for my achievements and experiences in computing and teaching, the ones I think are most significant are:

When I was an undergraduate at MIT, I was an undergraduate researcher in Marc Raibert’s robotics lab where I worked on making physically-realistic animations of kangaroo-like creatures. One day Prof. Raibert came into the lab and told us we should make the creatures look like dinosaurs instead, but he couldn’t explain why. A few weeks later he came back from a meeting with people in Hollywood, saying he thought he had convinced them it would be possible to make a sufficiently realistic animated dinosaur for a movie (which turned out to be Jurassic Park). As an undergraduate, I also had an opportunity to work as an intern in John Backus’ group at IBM Almaden, on a functional programming language. John was a UVA dropout, who had gone on to lead the Fortan project at IBM, develop what is now known as Backus-Naur Form, and then lead a programming language research group at IBM Almaden.

The main research I did as a graduate student at MIT (which didn’t end up actually as part of my PhD dissertation) was to develop a tool for lightweight static checking, that would take advantage of annotations added to C programs to be able to warn programmers about likely programming errors. The tool I mostly built ended up being quite widely used, and included in many Linux distributions. (The work was mostly done in 1993-1996, with some more doing after I came to UVA in 1999-2001, but I still get bug reports form people using the tool. Some of the ideas we had for checking memory use in C programs are now being used, in farily similar form, in the Rust programming language, although I have no direct involvement in this.)

One of my first goals after starting at UVA (where I joined as a professor in 1999) was to develop a new introductory computing course targeted for College of Arts and Sciences students and intended for people who were only expected to take one computing course. This started as a University Teaching Fellowship Proposal, and the first course was offered as cs200:

*Computer Science from Ada and Euclid to Quantum Computing and the World Wide Web*in 2002. Initially, there were about 20 students in the class, but more than half of them dropped out after the first problem set which was not well designed. But, the remaining 9 students stuck with the class and over the years it developed. Since some students who took it wanted to continue into other computing courses but were not allowed to in the curriculum we had then, I created a follow-on course (initially offered as cs201j: Engineering Software, and with Snakes!), which could lead into our other CS courses (it counted as what is now cs2110, and satisfied the prerequisite for cs2150).Based on the success of these courses and the quality of students we were finding in the College, I pushed for creating a CS major that was open to CLAS students. In fall 2005, I was given support from my Department Chair to lead an effort to create a CS degree for the College, and after putting together a committee to help with this and designing a curriculum, the Interdisciplinary Major in Comptuer Science (BACS) degree was approved in 2006, and we graduated our first three students in 2007. By the time I stepped down as BACS director, the BACS had grown to be larger than any other major offered by the Engineering school.

I wrote an open textbook based around the cs200/cs150/cs1120 course,

*Introduction to Computing: Explorations in Language, Logic, and Machines*. Although it was not widely adopted, it does have some eager readers and I get occasional emails from people who have used it as a self-study book. It also was the main thing that led to me being recruited by an open education startup to teach their first course. (You can read about my experiences as the first VP of Education for Udacity here: CS101: One Year Later, and my favorite article about this is*Professors without borders: Will online learning spell the end of universities?*, not just because it somehow claims I have “Monty Python humour”.)I’ve had a chance to teach many remarkable students who have gone on to do amazing and wonderful things. Although its always hard to know how much impact one actually has, one of the joys and privledges of being a professor is to follow the careers of your students and to feel like you can claim some credit for them. Some of the students who have worked in my research group include Adrienne Porter Felt (UVAToday article) (who works on social network privacy and web browser security when she was an undergraduate in my research group, and now leads the Chrome security team at Google), Yan Huang (who worked on secure computation for his PhD in my group, and is now a professor at Indiana University), Karsten Nohl (who worked on RFID security for his PhD in my group, and went on to found Security Research Labs and expose high profile security vulnerabilities in phone systems, payment systems, USB, and GSM), Nate Paul (who work on malware security for his PhD in my group, and has gone on to do work on medical devices and embedded security at Oak Ridge and University of Tennessee), and many others.

I’ve led or been involved in research projects that have come up with some cool ideas, a few of which have led to lots of other research as well as use in industry. The most significant are our work on making secure multi-party computation practical, improving software security (e.g., ScriptInspector, Perracotta, PHPrevent, and Split), and mobile networking.

I’ve been involved in a few start-up companies, one as a co-founder (when I was a grad student), one as an early employee (with Udacity, growing from 5 to ~50 people), and many as an advisor. I’ve learned from these experiences (mostly that academia is much more pleasant than industry!), but also have lots of opportunites to work with great people and see different aspects of companies at various stages.

I’ve written a baby book on theoretical computer science, make some fun videos, and received an award from Tim Kaine (back when he was a lowly governor).

Hope that helps! Thanks for asking.

Grace Hopper (really enjoyed Kurt Beyer’s biography. I would also mention Claude Shannon, Alan Turing, and Richard Feynman as runner-ups.

Problem Set 3 (Fri, 9 Sep 2016)

**Problem Set 3** is now posted, and is due on **Friday, 16
September at 6:29pm**.

Class 6: SAT Solving (Thu, 8 Sep 2016)

### Schedule

**Problem Set 2** is due **Friday at 6:29pm**. A few of the
problems on PS2 are quite challenging! The problems are organized by
topic, not by difficulty (the hardest problems are #4 and #7). I hope
everyone makes a good attempt to solve all of the problems, but don’t be
overly stressed if you cannot solve these ones and move on to the other
problems if you are stuck on them (to hopefully return to these problems
after you’ve had more time to think about them). Good solutions to
these problems will impress us, but are not necessary to get a good
score on PS2.

I will be out-of-town Monday, and not able to hold my usual office hours on Monday. I will have “make-up” office hours on Wednesday, 3:30-4:30pm (in addition to my usual Wednesday 1-2pm office hours).

## Links

Sahand Saba’s *Understanding SAT by Implementing a Simple SAT Solver in
Python* [Code with my modifications: https://github.com/evansuva/simple-sat]

XKCDs on NP-Complete problems: Subset Sum, Travelling Seller

*Pancakes, Puzzles, and Polynomials: Cracking the Cracker
Barrel* (reducing
pegboard puzzle to 3SAT)

# Notes and Questions

**Definition.** A formula is in *3SAT* if it is in 3CNF form and it is
satisfiable.

$$ (x_1 \vee x_2 \vee x_3) \wedge (x_1 \vee \overline{x_2} \vee x_3) \wedge (\overline{x_1} \vee x_2 \vee \overline{x_3}) $$

\begin{center}
\tiny
\begin{math}
(x*{48} \vee x*{4} \vee \overline{x*{9}}) \wedge (\overline{x*{44}} \vee x*{50} \vee \overline{x*{37}}) \wedge (\overline{x*{8}} \vee \overline{x*{1}} \vee x*{28}) \wedge (x*{21} \vee x*{27} \vee \overline{x*{32}}) \wedge (x*{17} \vee x*{29} \vee \overline{x*{30}}) \wedge (x*{30} \vee x*{24} \vee x*{37}) \wedge (\overline{x*{22}} \vee \overline{x*{27}} \vee \overline{x*{44}}) \wedge (x*{8} \vee \overline{x*{25}} \vee \overline{x*{24}}) \wedge (\overline{x*{44}} \vee x*{50} \vee x*{14}) \wedge (x*{45} \vee x*{15} \vee x*{37}) \wedge (\overline{x*{16}} \vee x*{14} \vee \overline{x*{36}}) \wedge (\overline{x*{33}} \vee x*{5} \vee x*{26}) \wedge (x*{18} \vee \overline{x*{7}} \vee \overline{x*{24}}) \wedge (x*{31} \vee x*{38} \vee x*{28}) \wedge (x*{31} \vee \overline{x*{33}} \vee \overline{x*{8}}) \wedge (x*{49} \vee x*{7} \vee \overline{x*{6}}) \wedge (x*{34} \vee \overline{x*{8}} \vee x*{46}) \wedge (x*{4} \vee \overline{x*{5}} \vee \overline{x*{35}}) \wedge (x*{43} \vee x*{27} \vee x*{39}) \wedge (\overline{x*{46}} \vee \overline{x*{40}} \vee \overline{x*{27}}) \wedge (\overline{x*{25}} \vee x*{14} \vee \overline{x*{49}}) \wedge (x*{38} \vee x*{5} \vee x*{15}) \wedge (x*{9} \vee x*{14} \vee \overline{x*{19}}) \wedge (x*{45} \vee \overline{x*{42}} \vee \overline{x*{39}}) \wedge (x*{34} \vee \overline{x*{22}} \vee \overline{x*{28}}) \wedge (\overline{x*{20}} \vee x*{15} \vee \overline{x*{8}}) \wedge (\overline{x*{44}} \vee \overline{x*{10}} \vee \overline{x*{9}}) \wedge (x*{22} \vee \overline{x*{31}} \vee x*{14}) \wedge (\overline{x*{9}} \vee \overline{x*{42}} \vee \overline{x*{15}}) \wedge (\overline{x*{40}} \vee x*{12} \vee \overline{x*{32}}) \wedge (\overline{x*{20}} \vee \overline{x*{6}} \vee \overline{x*{15}}) \wedge (\overline{x*{37}} \vee x*{39} \vee \overline{x*{23}}) \wedge (\overline{x*{3}} \vee \overline{x*{40}} \vee \overline{x*{32}}) \wedge (\overline{x*{4}} \vee \overline{x*{25}} \vee x*{7}) \wedge (\overline{x*{20}} \vee \overline{x*{36}} \vee \overline{x*{37}}) \wedge (\overline{x*{40}} \vee \overline{x*{35}} \vee x*{39}) \wedge (\overline{x*{43}} \vee \overline{x*{40}} \vee \overline{x*{7}}) \wedge (x*{34} \vee x*{44} \vee x*{26}) \wedge (x*{13} \vee x*{27} \vee x*{28}) \wedge (x*{12} \vee \overline{x*{36}} \vee x*{7}) \wedge (\overline{x*{16}} \vee x*{9} \vee \overline{x*{24}}) \wedge (\overline{x*{48}} \vee x*{14} \vee x*{28}) \wedge (x*{16} \vee x*{4} \vee x*{40}) \wedge (\overline{x*{25}} \vee x*{15} \vee x*{37}) \wedge (x*{47} \vee \overline{x*{26}} \vee \overline{x*{23}}) \wedge (x*{4} \vee \overline{x*{13}} \vee x*{36}) \wedge (x*{48} \vee \overline{x*{13}} \vee \overline{x*{37}}) \wedge (x*{4} \vee x*{35} \vee \overline{x*{27}}) \wedge (\overline{x*{22}} \vee x*{47} \vee x*{26}) \wedge (\overline{x*{22}} \vee \overline{x*{46}} \vee x*{27}) \wedge (\overline{x*{20}} \vee x*{49} \vee x*{11}) \wedge (x*{42} \vee \overline{x*{10}} \vee x*{28}) \wedge (\overline{x*{45}} \vee x*{28} \vee \overline{x*{37}}) \wedge (x*{14} \vee \overline{x*{32}} \vee \overline{x*{23}}) \wedge (x*{22} \vee x*{14} \vee x*{23}) \wedge (\overline{x*{17}} \vee \overline{x*{46}} \vee \overline{x*{7}}) \wedge (\overline{x*{31}} \vee x*{46} \vee \overline{x*{50}}) \wedge (x*{34} \vee \overline{x*{41}} \vee x*{43}) \wedge (x*{17} \vee \overline{x*{9}} \vee x*{15}) \wedge (x*{46} \vee x*{14} \vee \overline{x*{12}}) \wedge (\overline{x*{20}} \vee x*{12} \vee x*{14}) \wedge (x*{41} \vee x*{42} \vee \overline{x*{15}}) \wedge (x*{48} \vee x*{46} \vee \overline{x*{36}}) \wedge (\overline{x*{22}} \vee \overline{x*{4}} \vee \overline{x*{49}}) \wedge (x*{22} \vee x*{12} \vee \overline{x*{42}}) \wedge (x*{13} \vee \overline{x*{38}} \vee x*{39}) \wedge (x*{48} \vee \overline{x*{16}} \vee \overline{x*{27}}) \wedge (x*{17} \vee \overline{x*{18}} \vee \overline{x*{26}}) \wedge (x*{48} \vee \overline{x*{40}} \vee \overline{x*{35}}) \wedge (\overline{x*{43}} \vee \overline{x*{40}} \vee \overline{x*{49}}) \wedge (x*{29} \vee x*{11} \vee \overline{x*{32}}) \wedge (x*{33} \vee \overline{x*{17}} \vee x*{39}) \wedge (\overline{x*{25}} \vee \overline{x*{9}} \vee \overline{x*{6}}) \wedge (x*{40} \vee \overline{x*{50}} \vee x*{19}) \wedge (x*{8} \vee x*{10} \vee \overline{x*{27}}) \wedge (x*{5} \vee x*{9} \vee \overline{x*{26}}) \wedge (x*{45} \vee \overline{x*{38}} \vee \overline{x*{27}}) \wedge (\overline{x*{4}} \vee \overline{x*{40}} \vee \overline{x*{42}}) \wedge (x*{21} \vee x*{50} \vee x*{12}) \wedge (\overline{x*{8}} \vee \overline{x*{14}} \vee \overline{x*{42}}) \wedge (\overline{x*{17}} \vee x*{47} \vee \overline{x*{27}}) \wedge (x*{49} \vee \overline{x*{12}} \vee \overline{x*{6}}) \wedge (x*{27} \vee x*{49} \vee \overline{x*{32}}) \wedge (\overline{x*{29}} \vee \overline{x*{12}} \vee \overline{x*{26}}) \wedge (x*{48} \vee \overline{x*{2}} \vee x*{6}) \wedge (x*{16} \vee x*{36} \vee x*{49}) \wedge (x*{33} \vee \overline{x*{12}} \vee \overline{x*{26}}) \wedge (\overline{x*{33}} \vee x*{29} \vee x*{49}) \wedge (\overline{x*{48}} \vee x*{2} \vee x*{19}) \wedge (x*{25} \vee x*{36} \vee x*{49}) \wedge (x*{21} \vee x*{40} \vee \overline{x*{14}}) \wedge (\overline{x*{34}} \vee \overline{x*{44}} \vee \overline{x*{6}}) \wedge (x*{48} \vee \overline{x*{50}} \vee \overline{x*{1}}) \wedge (x*{5} \vee \overline{x*{12}} \vee x*{7}) \wedge (x*{21} \vee \overline{x*{35}} \vee \overline{x*{27}}) \wedge (\overline{x*{22}} \vee \overline{x*{16}} \vee \overline{x*{14}}) \wedge (\overline{x*{13}} \vee \overline{x*{35}} \vee \overline{x*{12}}) \wedge (\overline{x*{4}} \vee \overline{x*{35}} \vee \overline{x*{42}}) \wedge (\overline{x*{50}} \vee \overline{x*{40}} \vee x*{7}) \wedge (x*{25} \vee x*{47} \vee \overline{x*{12}})
\end{math}
\end{center}

## Converting Truth Tables to CNF

\begin{tabular}{cc|cc}
$P$ & $Q$ & $P \implies Q$ & $P \oplus Q$ \ \hline
\T & \T & \T & \F

\T & \F & \F & \T

\F & \T & \T & \T

\F & \F & \T & \F

\end{tabular}

The output of the operator is \T\ if and only if the inputs do not match
*any row* where the output is \F. We can ensure the inputs do not match
a row, but OR-ing the negation of each input: in the disjunction, at
least one must be \T\ to satisfy the clause.

## Problems

**Definition.** A *problem* is a precise description of set of possible
inputs and desired property of an output corresponding to each input.

Define the *ADDITION* problem (adding two integers):

#

**Definition.** A *decision problem* is a problem where the output is
either *T* or *F*. Equivalently, we can view a decision problem as
testing set membership: $x \in S$.

The *SUM* problem:
\begin{quote}
{\bf Input.} Three integers, $x$, $y$, and $z$.

{\bf Output.} \T\ iff $z = x + y$.
\end{quote}

How could we solve *ADDITION* using *SUM*?

##

**Definition.** A *procedure* is a precise description of an information
process.

**Definition.** An *algorithm* for a particular problem is a procedure
that *solves* that problem. To *solve* a problem, an algorithm must
always (eventually) produce the correct output for any problem input.

**Definition.** A *program* is adescription of a procedure that can be
executed by a computer.

**Definition.** The *3SAT decision problem* takes as input a logical
formula written in CNF, and outputs *T* if the input formula is
satisfiable and outputs *F* otherwise.

How many uses of a solver for the *3SAT* decision problem are sufficient
to always find a satisfying assignment for a satisfiable formula?

##

## SAT Solving (will be covered Tuesday)

Sahand Saba’s *Understanding SAT by Implementing a Simple SAT Solver in
Python* (Code with a few modifications: *https://github.com/evansuva/simple-sat*)

What is the *worst-case* number of guesses needed to solve a SAT
instance with $v$ variables and $c$ clauses using the ``brute-force
guessing” strategy?

###

What is the *worst-case* number of guesses needed to solve a SAT
instance with $v$ variables and $c$ clauses using the ``watchlists”
strategy?

###

Why is it possible to solve *most* SAT instances reasonably quickly?

Class 5: CNF, Quantifiers, and Proofs (Tue, 6 Sep 2016)

### Schedule

**Problem Set 2** is due **Friday at 6:29pm**.

## Links

If you believe real computing systems have the property that the values you read from memory will match what you wrote there, see:

Sudhakar Govindavajhala and Andrew W. Appel. *Using Memory Errors to Attack a Virtual Machine*, IEEE Symposium on Security and Privacy 2003.

Bianca Schroeder, Eduardo Pinheiro, Wolf-Dietrich Weber. *DRAM Errors
in the Wild: A Large-Scale Field
Study*. SIGMETRICS 2009.

Kaveh Razavi, Ben Gras, Erik Bosman, Bart Preneel, Cristiano Giuffrida
and Herbert Bos. *Flip Feng Shui: Hammering a Needle in the Software
Stack*. USENIX
Security 2016.

Yuan Xiao, Xiaokuan Zhang, Yinqian Zhang, and Radu Teodorescu. *One Bit Flips, One Cloud Flops: Cross-VM Row Hammer Attacks and Privilege Escalation*, USENIX Security 2016.

# Notes and Questions

**Definition: satisfiable.** A logical formula is *satisfiable* if there
is *some* way to make it **true**. That is, there is at least one
assignment of truth value to its variables that makes the forumla
true.

**Definition: conjunctive normal form (CNF).** A logical formula that is
written as a conjunction of *clauses*, where each clause is a
disjunction of *literals*, and each literal is either a variable or a
negation of a variable, is in *conjunctive normal form*. If each
clause has excatly three literals, it is called *three conjunctive
normal form* (3CNF).

$$ (a_1 \vee a_2 \vee \neg a_3) \wedge (a_1 \vee \neg a_2 \vee a_3) \wedge (\neg a_1 \vee a_2 \vee \neg a_3) \wedge (\neg a_1 \vee a_2 \vee a_3) $$

##

Show that every logical formula can be written in 3-conjunctive normal form.

What is the maximum number of (different) clauses in a 3CNF formula involving 5 variables?

##

What is the maximum number of (different) clauses in a *satisfiable*
3CNF formula involving 5 variables?

##

What is the maximum number of (different) clauses in a *valid* 3CNF formula involving 5 variables?

##

## Logical Quantifiers

$\forall x \in S. P(x)$ means $P$ holds for *every* element of $S$.

$\exists x \in S. P(x)$ means $P$ holds for *at least one* element of $S$.

Define *valid* and *satisfiable* using logical quantifiers:

#

$\forall x \in S. P(x)$ is equivalent to $\neg(\exists x \in S. \qquad \qquad)$

###

Notation: $\textrm{pow}(S)$ denotes the *powerset* of $S$. The powerset
of a set is the set of all possible subsets of that S. So,
$\textrm{pow}(\mathbb{N})$ denotes all subsets of the natural numbers.

Notation: $A - B$ denotes the *difference* between two sets. It is the
elements of $A$, with every element of $B$ removed.

Notation: $\varnothing$ is the *empty set*. It is the set with no elements: ${ }$.

\begin{center} \Large

$$ \fillin S \in \textrm{pow}(\mathbb{N}) - { \varnothing } \ldotp \fillin m \in S\ldotp \fillin x \in S - {m}\ldotp m < x $$

\begin{comment} $$ \forall S \in \textrm{pow}(\mathbb{N}) - { \varnothing } \ldotp \exists m \in S\ldotp \forall x \in S - {m}\ldotp m < x $$ \end{comment} \end{center}

Write a logical expression that captures the meaning of: *Proofs can
certify that a computing system will always behave correctly, something
that no amount of testing can do.*

What does it mean to *test* a computing system? What does it mean for a computing system to *always behave correctly*?

#

Challenge Solved! (Fri, 2 Sep 2016)

The first challenge problem has been solved!

**Challenge 1.**From Class 2: I wasn’t very satisfied with my proof of the “Odd-Even Lemma” in class. I hope a student will come up with a better proof. Come up with a proof that passes the “skeptical reader” test and would convince someone with no prior knowledge of odd and even numbers.

There are three winning solutions (that all take quite different approaches):

- John Fry’s solution using the well-ordering principle
- Paul Sander’s solution using a game proof (although most game-style proofs are not so entertaingly written as Pauls, using a game to prove something is a very common strategy, especially in security and cryptography proofs.)
- Avi Parikh’s solution using proof by contradiction

Congratulations to the challenge winners!

(Challenge 2 is still open, and no viable solutions have yet been submitted.)