Run length encoding in Lisp is kind of a trivial problem. Not one that would typically come up in most applications. I came across a treatment while reading the third chapter of ANSI Common Lisp by Paul Graham. I’ve read The Little Schemer and a friend just handed me a copy of ANSI Common Lisp. My style has been greatly influenced by The Little Schemer so when I encountered Paul Graham’s example, my nose crinkled a little and I wondered if I could write a more concise solution. I thought it would be a simple kata.
The “problem” goes like this. You have a list like:
(1 1 1 0 1 0 0 0)
And you want a more concise representation. Paul Graham suggests:
((3 1) 0 1 (3 0))
For example. Here’s Paul Graham’s solution. I hope I understand fair-use correctly:
(defun ansi-compress (lst)
(if (consp lst)
(compr (car lst) 1 (cdr lst))
lst))
(defun compr (elt n lst)
(if (null lst)
(list (n-elts elt n))
(let ((next (car lst)))
(if (eql next elt)
(compr elt (1+ n) (cdr lst))
(cons (n-elts elt n)
(compr next 1 (cdr lst)))))))
(defun n-elts (elt n)
(if (> n 1)
(list n elt)
elt))
And here is my attempt at a more “Little Schemer” style solution:
(defun compress-b (p c lat)
(cond ((null lat) (cons (cons c p) '()))
((equal (first lat) p)
(compress-b (first lat) (1+ c) (rest lat)))
(t (cons (cons c p) (compress-b (first lat) 1 (rest lat))))))
(defun compress (lat)
(cond ((null lat) '())
(t (compress-b (car lat) 1 (cdr lat)))))
As you can see, it’s a few lines shorter. It’s also in many ways, functionally identical. However, my return data structure is a little different. Even if there is a single element, I return a cons. With a little extra code I could probably return an identical data structure. But it will require some way of emulating “n-elts.” However I think I like the uniformity, even if it trades off a tiny, tiny but of space. My function will always return a list of cons cells unless the input list is empty (in which case it will return an empty list).
Of course I’m still learning. I did a trace of the “compress-b” and “compr” functions to test my assumptions. When I first expanded the calls to “compress-b” I was slightly alarmed by the number of returns on the tail end of the function. However, I see that Paul Graham’s solution is the same. The question I have left is whether this is normal and desirable behavior. It seems to me that a single return of the final “chunk” would be better. If there are any Lisp experts in the crowd, would you care to enlighten me?