You will often come across problems in life that require you to sort stuff. Sometimes it’s your unwieldy comic book collection. Other times it’s arrays haphazardly filled with data. Fortunately, Python beats real life once again and helps us sort things really easily.

A toy problem to illustrate our example: write a function that returns the N longest lines from a given input file.

We will need to know how to sort lists in order to solve this problem. Your clue should have been the word, “longest,” in the problem description. In order to determine which is the longest line, we have to know how long all the other lines in the file are. Since our solution must return as many of the longest lines as we want, the simplest way to do so is to sort the lines into order from longest to shortest (or vice-versa) and return as many lines from either end as needed.

Now, first things first: go read about the various sorting algorithms on Wikipedia. Then go look for implementations of those algorithms in Python. Then store that information away somewhere you can get at it when someone quizzes you on them later. It won’t be me, but it is guaranteed to come up at some point.

Fortunately Python has a sort method available to list instances. If you’re curious, it’s an implementation of a supposedly complicated algorithm known as timsort. It takes a few keyword arguments, but we’re only really interested in one: key. You may be tempted to use cmp, but you will find that your sorts slow down quite a bit. This was the old way of sorting and it’s no good anymore, because cmp gets called many times over on pairs of values in the list, while a key function is called only once per value. It’s gone in Python 3.x. One last thing of note on sort: it mutates the list in place, which can have consequences that you should read about in the Python documentation later on (the alternative is to use sorted, which returns a new list).
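To make the in-place versus new-list distinction concrete, here is a minimal sketch. The word list is made up for illustration; only list.sort, sorted, and the built-in len are real.

```python
words = ["banana", "fig", "cherry"]

# sorted() leaves the original alone and hands back a new list
by_length = sorted(words, key=len)
print(by_length)   # shortest first; ties keep their original order
print(words)       # still in the order we wrote it

# list.sort() reorders the list itself and returns None
result = words.sort(key=len)
print(result)
print(words)
```

A common slip is writing `words = words.sort()`, which throws the list away and leaves you holding None.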

So enough talk; let’s see one possible solution:

Code:
# This example assumes Python >= 2.6

def n_longest_lines(infile, n):
    """Return `n` longest lines from `infile`"""
    lines = []

    with open(infile) as f:
        for line in f:
            lines.append((line, len(line)))

    lines.sort(key=lambda x: x[1])  # sort by line length, shortest first
    # lines[-0:] would return the whole list, so guard against n <= 0
    return lines[-n:] if n > 0 else []

Pretty handy, eh? The key trick here is the data structure we sort on. Take a look at the lines in the with block. Can you figure out what the resulting list looks like?

Code:
lines = [("The quick brown", 15), ("fox", 3)]

It’s a list of tuples. The first field of the tuple is a string read as a single line from the file and the second field is the string’s length. This structure then allows us to use the sort method to sort our list. All we have to tell sort is which field to look at.

The sort method takes a parameter, key, which accepts a function that will be called once on every value in the list. In this example I’ve passed it an anonymous function (hint: lambda) which simply returns the second element of the tuple. Its job is to return a value that the sort method can work with.
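If you want to see the "called once on every value" part for yourself, a small sketch like this will do. The length function and the calls list are my own illustration, not part of the solution above.

```python
calls = []

def length(line):
    """Key function that also records every value it is asked about."""
    calls.append(line)
    return len(line)

lines = ["fox", "The quick brown", "jumps over", "a"]
ordered = sorted(lines, key=length)

print(ordered)
print(len(calls))  # one call per item, no matter how many comparisons happen
```

That single pass is why key tends to beat the old cmp argument, which was invoked for every pair the sort had to compare.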

Side note: You could improve the readability of this line by looking at the itemgetter, attrgetter, and methodcaller functions from the operator module. Read up on them. Handy stuff.
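As a rough sketch, this is how the lambda from the solution could be swapped for itemgetter. The sample tuples are invented for illustration.

```python
from operator import itemgetter

lines = [("The quick brown", 15), ("fox", 3), ("jumps over", 10)]

# itemgetter(1) builds a function equivalent to lambda x: x[1]
lines.sort(key=itemgetter(1))
print(lines)
```

Whether this reads better than the lambda is a matter of taste, but it does put a name on what the key is doing.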

Finally, to round out our function we use Python’s list-slicing awesomeness to return the requested number of longest lines. If you don’t know how to slice lists in Python, be sure to look it up. Here we specify a negative number as the start index in order to return the last n elements of the list.
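Here is a quick sketch of how a negative start index behaves, using a made-up list of lengths that is already sorted from shortest to longest.

```python
lengths = [3, 10, 15, 22, 41]

last_two = lengths[-2:]     # start two from the end, run to the end
too_many = lengths[-10:]    # asking for more than exist is not an error
zero = lengths[-0:]         # gotcha: -0 is just 0, so this is the whole list

print(last_two)
print(too_many)
print(zero)
```

The last case is worth remembering: a slice with a start of -n only means "the last n items" when n is greater than zero.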

That’s all there is to it! Sorting in Python is pretty easy and, as of Python 2.3, guaranteed to be stable. There are only a few gotchas to remember: sort sorts the list in place (meaning that it changes the original list and you lose the original ordering) while sorted returns a new list (leaving the original intact); try to ignore cmp as it’s not used anymore. Also, don’t think you can completely forget all your sorting algorithms! Try writing a few implementations on your own, test them out with large samples of test data, time them, and see if you can make them faster (or compare the different approaches). Sometimes Python makes things too easy and sorting is one of those things. It’s a double-edged sword, but it’s a nice one to have!
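In case "stable" is a new term: it means items that compare equal stay in the order they started in. A minimal sketch, with invented records:

```python
records = [("pear", 2), ("apple", 1), ("fig", 2), ("kiwi", 1)]

# Sort on the count only; names with the same count keep their relative order
by_count = sorted(records, key=lambda r: r[1])
print(by_count)
```

Here "apple" still comes before "kiwi" and "pear" before "fig", exactly as they appeared in the input. This is what lets you sort by one field, then by another, and get a sensible combined ordering.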

If you want to know more about Python sorting, check out this mini how-to.



G. H. Reynolds at the Washington Examiner writes:

… a college degree is an expensive way to get an entry-level credential. New approaches to credentialing, approaches that inform employers more reliably, while costing less than a college degree, are likely to become increasingly appealing over the coming decade.

As I’ve been recently looking for work, I couldn’t help but be struck by the number of positions, even mundane ones, that claim to accept no less than a Computer Science degree. Many of the people interviewing for these positions erect great barriers to keep the masses out: tests, multi-stage interview processes — filters to keep candidates out of their doors. Competition must be absolutely fierce.

When competition is this fierce it can only mean that the market is flooded with talent. While I had my head down for the past five years doing real work an entire generation was toiling away in the ivory tower. Now they are loose and sucking up all the available jobs no matter how over-qualified they are. That’s why they invest in higher-education even if they cannot afford it. If they didn’t, they might not have any job at all.

Unfortunately, I think Mr. Reynolds is right. Tuition is becoming (and has been for some time in my opinion) prohibitively expensive. The returns are not coming around either. These graduates are faced with a landscape of scarce job opportunities and diminishing salaries. It’s a wonder that they still show up every fall in droves, dying to get into classrooms. Even if they get accepted and graduate they’ll spend another eight, ten, or more years in crushing debt unless they’re really smart and lucky and get snapped up by a bank or win the startup lottery.

Meanwhile, the auto-didacts and hackers who’ve been doing this work in the field for years are losing credibility to these people with each graduating class that hits the market. The flood of bright-eyed, bushy-tailed graduates looking for any job they can get their hands on to pay their tuition debts has left many businesses basking in the luxury of an abundance of highly-skilled cheap labour. It makes us highly-skilled, expensive labour-types less appealing. Seems experience doesn’t count for much in this kind of market.

Call me what you will, but I think something needs to be done. These highly-skilled (or not, who knows) graduates invested in an education that promised them a career. I, and others like me, worked our behinds off to climb to where we are on our own (bypassing the debt, stress, and anxiety of higher education). Yet neither party is entirely happy in this marketplace. I think new authorities need to rise to the occasion, as Mr. Reynolds suggests, and offer a new form of credential system. Universities should stop being clearing houses for careers and develop life-long researchers and scholars (or at least stop kidding everyone into thinking that to build the next twitter, amazon, or ebay you’ll need a degree).

The urgent problem is to win the trust of industry and provide an alternative way to certify the abilities of job seekers regardless of their background.

Or maybe there is no problem and we need to bypass any system altogether and organize ourselves instead of trusting institutions to certify our knowledge, understanding, and experience.



Since I moved away from rolling everything by hand and hopped on the web framework bandwagon, I haven’t looked back. Having the ORM tied into the URL dispatcher and templating engine in one unified API is really nice. Everything I’ve needed to build in a website is handled.

However, around the same time I started using frameworks, AJAX started making a comeback. At first most of the clients I worked with didn’t take notice of it. As it gained traction, however, the whole thing started to change. Now, more than a few years on, most clients practically expect it. It’s almost taken for granted.

The thing is, very few of the web sites and applications I build today are page-based. I think this trend is only going to continue. These javascript frameworks are becoming more like thin clients requesting data from web services. This has forced me to contort my web applications to bend around this idea. The frameworks still think in terms of dispatching a URL and sending out a rendered web page. Now more often than not they just need to spit out some JSON data. Most web sites and applications these days are rather sophisticated client UIs and they don’t make calls to a server for pages. They want data.

While I’m not the biggest fan of AJAX-based clients, I don’t think I’m going to win this one. So I decided to join in and I wrote a JSON-RPC controller class for Pylons. Many javascript frameworks support JSON-RPC so I figured my web framework ought to as well. If you can’t beat them, join them.

I think more frameworks should evolve into JSON-RPC frameworks. Or perhaps entirely new frameworks should spring up that are based on simply providing data services instead of web pages. Let the javascript clients worry about the UI entirely.
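To give a feel for what "data instead of pages" looks like on the wire, here is a rough sketch of a JSON-RPC 2.0 exchange built with the standard json module. The method name list_photos, its params, and the result are all hypothetical; only the envelope members (jsonrpc, method, params, result, id) come from the spec.

```python
import json

# What a javascript client might POST to the server
request = {"jsonrpc": "2.0", "method": "list_photos",
           "params": {"album": 7}, "id": 1}

# What the server sends back: plain data, no markup, same id as the request
response = {"jsonrpc": "2.0",
            "result": [{"id": 1, "title": "Sunset"}],
            "id": request["id"]}

wire = json.dumps(response)   # the string that actually travels
decoded = json.loads(wire)    # what the client ends up working with
print(decoded["result"])
```

The client is left to decide how that list becomes a UI, which is exactly the division of labour I’m arguing for.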



Well, it looks like I’m once again on the market, having finished my tenure at Digisphere. It’s been a great experience and I got to work with some very talented people. They’re about to launch the newest version of fotoglif, so keep an eye out for that. It’s been nearly three years since I started working with them. I wish them all the best going forward.

New horizons await. Though I’m not entirely sure where I’m heading yet. I’ve got about five years of experience under my belt now building web sites and applications big and small. It’s been a while since I’ve found the job challenging to be honest. I’m eager to stir things up and try a move to the next level. Maybe I need to work on some big iron enterprise applications again? Maybe I need to find a new startup to work with? Maybe I need to weasel my way into research and development? Or perhaps take a one-eighty and get into fixing cars? So many choices to be made.

Well, either way I’m keeping an open mind. If anyone out there in Internet-land has suggestions or needs a competent senior programmer, drop me a line: james [-A T ] agentultra dot com. Check out my colophon page to see what I’m all about if you don’t already know. You can check out some of my open source projects or ask me to see some source code if you are looking for specific talents. Looking forward to hearing from you.


Limbo. It opens with no explanation. No story.

A beautiful and haunting experience

The art in this game is fantastic. And perhaps it is the feature of this game that makes it stand alone. No text, context sensitive cues, or in-game tutorials. It’s blurry, monotone, and as foreboding as it sounds. I love the art in this game. It reminds me a lot of whoever did the flash animations and album covers for Toronto indie-band, “The Birthday Massacre.” It’s great stuff and well worth trying the demo just to see for yourself.

However, I’m afraid I didn’t find much of a game. This is a work of art sure, but Limbo left a certain part of my brain alone in the corner of the room. Maybe it’s because I’ve had a few and most of my brain is sitting alone in the corner of the room. But I don’t think even sober I would be very impressed. The tutorial was basically a gauntlet that you cannot win without dying first. The game is a series of clever traps and beautiful expositions. I spent most of the time dying. I’d walk the little boy-shadow to the right until he was killed by another innocuous looking shadow, respawn, and try to figure out how to move past the deadly shadow… until I was killed by another one further along.

Although maybe that’s the real purpose of the game and my life suddenly feels shallow and empty.

It was a little disappointing. I must admit that I had been salivating to play this game since I saw the first early preview videos. I love games that try to be something more than the status quo for the medium. Unfortunately, I don’t think this game lived up to that expectation. The art is excellent in every regard, but the gameplay just isn’t there. I’d say that this “game” is more of an “interactive experience” like an installation piece or improvised theatre, except that you have to suffer its repetitive nature until you reach the end. One can only hope that there is some sort of mind-altering revelation after it is all said and done, because frankly, after playing the demo, I can hardly see how one could possibly be motivated to get there.

YMMV



Why is it that mature students only get access to humanities and social science tracks? I’ve been looking around at various academic bridging programs and admissions requirements at universities in my area. Not one of them will admit mature students into science, engineering, or even mathematics. At best you can discuss your options with an “admissions officer.”

If someone could enlighten me, I would greatly appreciate it. I’ve probably read a lot of the course material for computer science and electrical engineering already. I’d just like to know if it’s possible to get into one of these tracks somehow. Programming in the real world just isn’t challenging anymore and I’m not getting a lot of personal satisfaction from it. I’m much more interested in research and applications that can have more than a capital influence on the world.

I’m just getting the sense that the ivory tower sees me as a plebeian and is keeping the doors tightly shut.



I just uploaded some updates to my pylons fork this afternoon. The update is a little radical (as I suspect most projects are in their early stages). Among the changes are:

  • JSONRPCController will target the 2.0 spec and bypass all the 1.x nonsense
  • JSONRPCError now accepts code and message parameters
  • Pre-defined reserved errors exported
  • More unit tests

The error handling is coming along, but leaves something to be desired in my opinion. For one, some of the reserved errors aren’t necessary to export to the controller implementor (“Parse error” comes to mind). There are also a few instances where I’ll need to return better errors such as catching parse errors when decoding the request body. As always, if anyone has any suggestions I am all ears.
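For anyone following along, here is a rough sketch of the shape I mean. The reserved codes and messages are the ones pre-defined by the JSON-RPC 2.0 spec; the helper names error_response and decode_request are hypothetical and are not the API in my fork.

```python
import json

# Error codes pre-defined by the JSON-RPC 2.0 specification
RESERVED_ERRORS = {
    -32700: "Parse error",
    -32600: "Invalid Request",
    -32601: "Method not found",
    -32602: "Invalid params",
    -32603: "Internal error",
}

def error_response(code, message=None, request_id=None):
    """Build an error response dict (hypothetical helper)."""
    if message is None:
        message = RESERVED_ERRORS.get(code, "Server error")
    return {"jsonrpc": "2.0",
            "error": {"code": code, "message": message},
            "id": request_id}

def decode_request(body):
    """Return (request, None) on success or (None, error) on bad JSON."""
    try:
        return json.loads(body), None
    except ValueError:
        # The id cannot be known if the body will not parse, so it stays null
        return None, error_response(-32700)
```

Something like decode_request is where a "Parse error" belongs, which is why I doubt controller authors ever need to raise it themselves.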

Once error handling is pretty and clean I think the next steps will be to look at batch and possibly notifications. I’m not even sure these can be supported on Pylons. Ideas welcome.
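For reference, in the 2.0 spec a notification is a request with no id member (so no reply is sent), and a batch is simply an array of requests. A minimal sketch of how the two might interact follows. The function names and the handle_one callback are assumptions for illustration, and nothing here is Pylons-specific.

```python
def is_notification(request):
    """A request without an "id" member expects no response."""
    return "id" not in request

def handle_batch(requests, handle_one):
    """Run every request through handle_one; reply only to non-notifications."""
    responses = []
    for request in requests:
        result = handle_one(request)
        if not is_notification(request):
            responses.append({"jsonrpc": "2.0",
                              "result": result,
                              "id": request["id"]})
    return responses

batch = [
    {"jsonrpc": "2.0", "method": "ping", "id": 1},
    {"jsonrpc": "2.0", "method": "log"},   # notification: no id
]
replies = handle_batch(batch, lambda r: r["method"].upper())
print(replies)
```

The open question for me is less this logic and more how the controller should behave when a batch contains nothing but notifications and there is nothing to send back.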



You do not have to attend a university to learn about math. The same is true of physics, writing, engineering, and a myriad of other subjects. If you are motivated and curious you can do a few searches online, check out some books from your library, and receive help and guidance online in chatrooms, forums, and the like. Education has been free and available to inquiring minds for a very long time. I don’t think I need to go through the laundry list of famous individuals and free-thinkers who taught themselves everything they needed to know and left an indelible mark on human history. Access to fine published work and research material has never been cheaper or more accessible than it is today. If you want to learn math, you can learn everything you need without setting foot in a classroom.

The body of knowledge available online is continually growing. It grows not only in volume, but in quality as well. Fine institutions such as MIT give away lecture videos for free. Others give away course material as well. Experts in industry have been sharing knowledge on the Internet for over a decade. And there is no evidence to suggest that this deluge of freely available knowledge will ever stop.

Why then is the University of California making such a big deal about offering degrees and courses online? In an article on SFGate, the Berkeley Law School dean, Christopher Edley, is quoted as saying:

“We want to do a highly selective, fully online, credit-bearing program on a large scale – and that has not been done.”

Well, I suppose it has been done by Stanford, but perhaps he has a little more ambition. I applaud that ambition. I think it’s a fantastic idea and that UC Berkeley is lucky to have someone who doesn’t have their head in the sand.

“We find Dean Edley’s cyber campus to be just the beginning of a frightening trajectory that will undoubtedly end in the complete implosion of public higher education” in California, Berkeley doctoral student Shane Boyle testified.

Perhaps his quest for tenure has forced Shane Boyle to accept certain delusions. I don’t know and cannot say for sure. But it is obvious he’s not the only one that feels the way he does. I just want to point out that this is a form of wishful thinking. Many people believe that universities are an unshakeable pillar of civilisation. Perhaps it’s because these institutions have been around for nearly a millennium now. However, in that time the concept and model of the university has undergone many changes. It’s just that people often resist change and fear it. Even when presented with all the evidence that change is happening, they will find any excuse not to believe it. Some will simply argue that it isn’t true; others will irrationally defend the status quo with hyperbole, convinced that change will bring the entire Ivory Tower crashing down. They wish it wasn’t true and act as if it weren’t. Sadly, no matter how much they deny it or convince others not to believe it, their delusions will reveal themselves and life will go on without them.

Such notions smell of elitism. The trouble I have with universities is that all they have to sell is experience and identity. They operate on one single asset: prestige. The knowledge they used to have authoritarian control over is practically free. They cannot keep it trapped in their monastic institutions anymore. All they can rely on now for their income (which is becoming an epidemic concern) is the allure of the prestige that their brand-name grants their students. You attend these universities because it says something about you and grants you a feeling of entitlement.

Education isn’t about which school you attended or who you sat next to in lecture. The truly curious have always sought knowledge. The Internet does nothing to hamper these efforts as we’ve seen. It fosters communication and community. Universities should be (and some have been) embracing the Internet and exploiting it to their advantage.

So I hope that the dean will have success in his lofty pursuit. If his school wants to stay relevant and connected to the modern world, they will not tarry to follow his lead. If the day comes that they do offer full-degree courses online I might even be one of their first students. I love learning and education. I just want there to be a better way to learn. The body of knowledge we have now and the tools available would be greatly enriched by the success of this dean’s initiative. I wish him the best of luck.



As defined by Wikipedia:

In computer science, a remote procedure call (RPC) is an Inter-process communication that allows a computer program to cause a subroutine or procedure to execute in another address space (commonly on another computer on a shared network) without the programmer explicitly coding the details for this remote interaction. That is, the programmer writes essentially the same code whether the subroutine is local to the executing program, or remote. When the software in question uses object-oriented principles, RPC is called remote invocation or remote method invocation.

I’ve recently worked on creating a controller class for Pylons that implements a form of RPC called JSON-RPC. It uses JSON (as you may have already determined) as the serialization format for initiating RPCs. And it turns out, this is really useful for web applications.

The beauty of it is that RPC allows us to translate from one language to another. As long as our source and target languages can both interpret the serialization format we are using to pass our method calls, we can expect that what we write in the source language will behave the same when we translate to the target language. For web applications this means that I can write my user interface code in Python and translate it later to Javascript. This is a really elegant way of building sophisticated user interfaces because it allows me to express my ideas using the semantics and idioms of user interfaces using the source language I am used to.

On the server-side I can write my controllers as services providing methods to a client application. This frees up my code from managing routing, headers, and serialization. All of those low-level details are handled by the JSONRPCController. I simply write normal-looking Python methods that are scarcely aware that they are serving a web-application at all.
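To illustrate the idea (and only the idea), here is a toy dispatcher. It is not the JSONRPCController code: the Calculator service and the dispatch function are hypothetical stand-ins showing how a plain Python method can be called from a JSON-RPC 2.0 request body without knowing anything about the web.

```python
import json

class Calculator(object):
    """Hypothetical service: ordinary methods, no HTTP in sight."""
    def add(self, a, b):
        return a + b

def dispatch(service, body):
    """Decode a request, call the named method, and encode the reply."""
    request = json.loads(body)
    name = request["method"]
    if name.startswith("_"):
        raise ValueError("private methods are not callable remotely")
    method = getattr(service, name)
    params = request.get("params", [])
    # The spec allows params by position (list) or by name (object)
    if isinstance(params, dict):
        result = method(**params)
    else:
        result = method(*params)
    return json.dumps({"jsonrpc": "2.0", "result": result,
                       "id": request.get("id")})

reply = dispatch(Calculator(),
                 '{"jsonrpc": "2.0", "method": "add", "params": [2, 3], "id": 1}')
print(reply)
```

A real controller also has to deal with errors, unknown methods, and the HTTP layer, but the service class itself stays this plain.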

So if you look at the entire stack from the top-down: Pyjamas, Pylons + JSONRPCController; the code doesn’t appear to be a web-application in the typical sense. The controller classes look more like DAOs. The interface code looks like typical desktop-interface code calling the DAOs to populate widgets and forms with data. This level of integration is possible because of JSON-RPC: when such an application is deployed, the Python interface code is compiled to Javascript and it works just as advertised.

As we port more and more of our desktop experience to the web, this will be the way to go for developers.



Just a quick update here to let everyone know about my pylons fork. I’ve added my JSONRPCController and a small suite of unit tests. Now you can clone the repository and run the tests, play with the code, and if you’re up for it post some feedback. Oh the excitement is simply delicious.

Of course, there’s a bigger plan here. I hope JSONRPCController will be integrated into pylons, but it is just the first step. As you may be aware, I have been excited by the recent activity around pyjamas. Once JSONRPCController finds a home, the next step will be to create a setuptools plugin and pylons project template for building pyjamas-based projects on top of pylons.

However, it might be that this later stage of my plan will not be so easily accepted into the pylons core. I may have to spin it off into its own project. What would you call such a beast? A web application framework where you write the entire application in a single language. No HTML, templates, javascript, SQL, etc. Just straight Python. Nirvana? PyNirvana? I’m not good at the marketing stuff. Help me out here.