Stopping To Observe the Code

Posted on June 30, 2011

After reading the Little Schemer a while ago I became convinced that it was the best introduction to programming I have ever read. That it teaches the concepts of programming using Scheme is incicdental; they don’t spend much time teaching the language at all. Instead the book focuses on concepts and builds up from a small set of primitives all the way to the Y-Combinator (and a little bit beyond as well). The whole time I never felt I had to implicitly trust what the author was saying because the book would walk through the proof of each concept step by step. I am not aware of any other introductory title that takes this fundamental approach.

So suitably impressed was I that I have begun reading The Seasoned Schemer. It continues right where the previous book left off; picking up the chapter numbers as if it were literally a continuation of the first book (which it is in many regards). Within the first chapter I have been re-introduced to some familiar problems to get started that I wanted to share because I think there is some beauty here we programmers should all be aware of.

In order to ensure that I understood the problem and its solution, I decided to challenge myself to express the solution in other languages that I am familiar with. I also wanted to use the natural idioms of the language as much as I understood them. I immediately chose Python and Common Lisp. Feel free to contribute and discuss others. The goal isn’t to benchmark the run time performance of the solution but to discover a correlation between the expression of the solution, its implementation, and interpretation by humans.

First the problem: write a function that takes a tuple (a list of numbers) and returns a list summing the numbers that preceed each element to that element in the original tuple. For example:

sum_of_prefixes( (1, 1, 1) ) => (1, 2, 3)

sum_of_prefixes( (1, 2, 3, 4, 5) ) => (1, 3, 6, 10, 15)

The solution given in The Seasoned Schemer is:

(define sum-of-prefixes
  (lambda (tup)
    (sum-of-prefixes-b 0 tup)))

(define sum-of-prefixes-b
  (lambda (sonssf tup)
    (cond ((null? tup) '())
      (else (cons (+ sonssf (car tup))
              (sum-of-prefixes-b
               (+ sonssf (car tup))
               (cdr tup)))))))

This is by far my favorite solution. It states the problem and its solution explicitly. The structure of the function’s value is structure of the function itself. There is no implicit knowledge anywhere in this function. If you know Scheme, you can read this and understand what it is doing right away. No guessing or interpreting required.

Contrast that with Python:

def sum_of_prefixes(tup):
   return [x for x in _sum_of_prefixes_b(0, tup)]

def _sum_of_prefixes_b(sonssf, tup):
   for n in tup:
       sonssf = sonssf + n
       yield sonssf

The apparent shape of the problem is much different. Python doesn’t have a cons primitive and recursion, while supported, isn’t a paradigm that the language supports idiomatically or technically very well. Instead it prefers iteration which really is just a special case of recursion anyway. Naturally I reached for a generator when I needed a stateful iterator and the resulting solution was pretty succinct when compared to the Scheme version.

However, try to observe the shape of this problem. In sum_of_prefixes we see that the return value will be a list constructed from a list comprehension that iterates over our generator. Underneath this code we know that at some point Python has to allocate an array in memory to hold these values since it doesn’t have a linked list from which it can generatively construct the result. We can test and verify that this function is indeed correct and be happy with it. However, if for some large tup we find that it begins to slow down, we will soon realize that we have to figure out some of Python’s internals.

Unfortunately I feel that this is where the Python solution falls short — the true nature of this solution is hidden in a black-box. If we want to be able to reason about the time and space complexity of this solution we have to know how Python constructs and allocates the array that the list comprehension creates. Does it allocate an empty dynamic array and append elements to it as it iterates over the generator? Does it implement the “array” as a heap of some sort? Does it use some other trick? How can we reason about that which we cannot see?

Of course this is an extremely trivial problem. We could write a function like this and be be perfectly happy with it. Even if it wasn’t the most performant. Stretching the example to large tup would be an artificial excercise and not prove anything. The real cost here is that a Python programmer can write something so succinct and not worry about its implementation until it matters. That’s when things can get a little hairy.

(defun sum-of-prefixes (tup)
 (let ((sonssf 0))
   (loop for n in tup
         do (setf sonssf (+ sonssf n))
         collect sonssf)))

Finally I went for a Common Lisp solution. If brevity was our goal, this example would win by leaps and bounds. However, even more so than Python this solution is hiding a score of information under its skirts. That’s because the loop function there is actually a Lisp macro. At compile time it will generate a bunch more code to do what we actually want. So while Python hides some information from the programmer simply by nature of its implementation, we can willfully hide details from the programmer in Common Lisp.

How does the shape of this solution work out? Well we have to know what the code that loop will generate looks like. Fortunately Common Lisp makes this quite easy and if we’re interested, but I’m not going to go into expanding macros and all that jazz. You should take a moment to look it up sometime.

So there you go. I still prefer the Scheme version the most for being so correct and explicit. However, I also appreciate the brevity of the Common Lisp version. It’s quite practical to brush the details under the carpet once the parameters are well understood. I think this is what ultimately makes Common Lisp quite powerful. It has also affected the way I write Python quite a bit lately and I see it as a good thing. I think ESR was quite right in saying that learning Lisp might make you a better programmer.