Writing Programs for "The Book" > Three in a Row

33.3. Three in a Row

If you were working out a collinearity problem with pencil and paper, how would you go about it? One natural approach is to plot the positions of the three points on graph paper, and then, if the answer isn't obvious by inspection, draw a line through two of the points and see whether the line passes through the third point (see Figure 33-2). If it's a close call, accuracy in placing the points and drawing the line becomes critical.

Figure 33-2. Three noncollinear points


A computer program can do the same thing, although for the computer nothing is ever "obvious by inspection." To draw a line through two points, the program derives the equation of that line. To see whether the third point lies on the line, the program tests whether or not the coordinates of the point satisfy the equation. (Exercise: For any set of three given points, there are three pairs of points you could choose to connect, in each case leaving a different third point to be tested for collinearity. Some choices may make the task easier than others, in the sense that less precision is needed. Is there some simple criterion for making this decision?)

The equation of a line takes the form y=mx+b, where m is the slope and b is the y-intercept, the point (if there is one) where the line crosses the y-axis. So, given three points p, q, and r, you want to find the values of m and b for the line that passes through two of them, and then test the x-and y-coordinates of the third point to see if the same equation holds.

Here's the code:

	(defun naive-collinear (px py qx qy rx ry)
	  (let ((m (slope px py qx qy))
	        (b (y-intercept px py qx qy)))
	    (= ry (+ (* m rx) b))))

The procedure is a predicate: it returns a Boolean value of true or false (in Lisp argot, t or nil). The six arguments are the x-and y-coordinates of the points p, q, and r. The let form introduces local variables named m and b, binding them to values returned by the procedures slope and y-intercept. I'll return shortly to the definitions of those procedures, but their functions should be apparent from their names. Finally, the last line of the procedure does all the work, posing the question: is the y-coordinate of point r equal to m times the x-coordinate of r, plus b? The answer is returned as the value of the naive-collinear function.

Could it be simpler? Well, we'll see. Does it work? Often. If you were to set the procedure loose on a large collection of points generated at random, it would probably run without error for a very long time. Nevertheless, it's easy to break it. Just try applying it to points with (x y) coordinates (0 0), (0 1), and (0 2). These points are surely collinear—they all lie on the y-axis—and yet the naive-collinear procedure can't be expected to return a sensible value when given them as arguments.

The root cause of this failure is lurking inside the definition of slope. Mathematically, the slope m is Δyx, which the program calculates as follows:

	(defun slope (px py qx qy)
	  (/ (- qy py) (- qx px))))

If p and q happen to have the same x-coordinate, then Δx is zero, and Δyx is undefined. If you insist on trying to calculate the slope, you'll get no further than a divide-by-zero error. There are lots of ways of coping with this annoyance. The method I chose as I first assembled the pieces of this little program was to have slope return a special signal value if px is equal to qx. The Lisp custom is to use the value nil for this purpose:

	(defun slope (px py qx qy)
	  (if (= px qx)
	      nil
	      (/ (- qy py) (- qx px))))

Like the slope, the y-intercept of a vertical line is also undefined because the line crosses the y-axis either nowhere or (if x=0) everywhere. The same nil trick applies:

	(defun y-intercept (px py qx qy)
	  (let ((m (slope px py qx qy)))
	    (if (not m)
	        nil
	        (- py (* m px)))))

Now, I also had to re-rig the calling procedure to handle the possibility that the slope m is not a number but a bogus value:

	(defun less-naive-collinearp (px py qx qy rx ry)
	  (let ((m (slope px py qx qy))
	        (b (y-intercept px py qx qy)))
	    (if (numberp m)
	        (= ry (+ (* m rx) b))
	        (= px rx))))

If m is numeric—if the predicate (numberp m) returns t—then I proceed as before. Otherwise, I know that p and q share the same x-coordinate. It follows that the three points are collinear if r also has this same x value.

As the program evolved, the need to make special provisions for vertical lines was a continual irritation. It began to look like every procedure I wrote would have some ugly patch bolted on to deal with the possibility that a line is parallel to the y-axis. Admittedly, the patch was just an if expression, an extra line or two of code, not a major issue of software engineering. Conceptually, though, it seemed a needless complication, and perhaps a sign that I was doing something wrong or making life harder than it had to be. Vertical lines are not fundamentally any different from horizontal ones, or from lines that wander across the plane at any other angle. It's an arbitrary convention to measure slope with respect to the y-axis; the universe would look no different if we all adopted a different reference direction.

This observation suggests a way around the problem: rotate the whole coordinate frame. If a set of points are collinear in one frame, they must be collinear in all other frames as well. Tilt the axes by a few degrees one way or the other, and the divide-by-zero impasse disappears. The rotation is not difficult or computationally expensive; it's just a matrix multiplication. On the other hand, taking this approach means I still have to write that if expression somewhere, testing to see whether px is equal to qx. What I'd really prefer is to streamline the logic and get rid of the branch point altogether. Shouldn't it be possible to test for collinearity by means of some simple calculation on the coordinates of the points, without any kind of case analysis?

Here's a solution recommended (in a slightly different context) by one web site, which I shall allow to remain anonymous: when Δx is 0, just set Δyx to 1010, a value "close enough to infinity." As a practical matter, I suspect that this policy might actually work quite well, most of the time. After all, if the input to the program derives in any way from measurements made in the real world, there will be errors far larger than 1 part in 1010. All the same, this is a strategy I did not consider seriously. I may not know what the Book version of collinear looks like, but I refuse to believe it has a constant defined as "close enough to infinity."