Static Type Checkers for Dynamic Programmers

Posted on October 29, 2018

There was a time in my career as a programmer when I balked at the utility of static type checking and type systems in general. For me, at the time, duck-typing was good enough. I believed unit tests were a good enough specification language. I imagined Haskell programmers spending more time noodling around with satisfying their type checkers than writing useful programs. After all, how would learning Haskell help me become a better programmer? I was already a good programmer and worked on important systems powering important businesses and I’d never needed a type checker to help me.

If you’ve been involved in Javascript or Python in the last few years you have probably encountered the trend towards adopting type systems. If you haven’t then check out Flow or TypeScript for Javascript. In Python you’ll want to search for MyPy. These are all tools that add static type analysis to otherwise dynamic programming languages. By annotating your code a program called a type checker will verify your specifications for you and help you write and maintain your code.

The Difference Between Static and Dynamic

I’ve come to understand that regardless of whether your language uses a static or dynamic type system the types are still there. Type checkers come from type theory. Type theory is a class of formal mathematics which happens to have a computational interpretation. That means that regardless of whether you use Javascript, Python, Ruby, C, or Rust the types are still there.

The difference between statically-typed and dynamically-typed language is how these languages make the types visible to the programmer. In a statically-typed language the types are encoded in the source code of the program. In a dynamically-typed language the types are interpreted. In the former you can see the types in the source code and use a program, called a type checker, to verify your program for you. In the latter you can only check your types by running the program with a program called an interpreter.

The user experience is quite different. As a programmer in a dynamic language I am used to writing a program and learning to predict how the interpreter will evaluate my program. I may write something that doesn’t make sense on purpose in order to assert my understanding of my program’s evaluation. The key difference from a static language is that I have to run my program to validate an assertion.

In a static language with a sound type system like Haskell’s I don’t have to execute my program in order to reason about its evaluation. The type checker, with a little bit of help from me, will validate my assertions while I am working on my program. In a language like Haskell I can’t even compile my program until those assertions are proven correct!

What Is It Good For?

You might be asking yourself this when you read through Flow’s documentation. If you’ve been shipping production Javascript code for years without it what are all of these extra annotations going to get you? Why does all of this sound so complicated?

The types are there whether your language and all of the literature you’ve come to read about it say. It’s a matter of visibility. When you start working with type systems there are going to be a number of new things to learn. The first thing to keep in mind is that you don’t have to learn everything all at once. It’s a journey and not a destination.

Just think of how hard it was for you to wrap your mind around variables and loops when you were first learning programming. Learning how to use type systems is much like learning how to program for the first time.

The benefits are:

  • Documentation
  • Maintainability
  • Reason-ability

Documentation

How do you document your code? You document your code, right? Many languages have documentation tools that allow you to annotate your functions with specially-formatted comments. Maybe you use one of those. Maybe you use conventions you’ve carefully curated into a style guide that allows you to write self-documenting code.

The problem with these solutions is that they require discipline from your team to maintain. All too often requirements change over time and with enough cooks in the kitchen someone is going to merge code that is out of sync with its documentation and… well there goes your trust in that documentation.

However if you document your code with type annotations you get a form of self-documenting code. However unlike conventions this kind of code is self-documenting and cannot drift out of date: it has to be valid or the type checker would complain and show you where to fix your code.

Take this trivial example:

/**
 * Sends a message
 * @param {User} from - The user sending the message
 * @param {User} to - The user receiving the message
 * @param {string} message
 */
const sendMessage = (from, to, message) => {
  ...
}

What is a User? What fields does it have? How do I make sure that from contains a valid value of that type? If I change the definition of a User how will I know that I’ve handled all of the new cases I need to handle?

Contrast that example with:

const sendMessage = (from: User, to: User, message: string): void => {
  ...
}

This might look a little more sparse. Try to remember the difference between dynamically typed and statically typed languages: the type checker can use the annotations in our source code in the latter example to verify our code for us. For documentation purposes it also means that our definition of User can change and if the program still passes the type checker then it must be type-correct as well.

Another useful property of many type checkers is that they allow our tools to query our source code. If you need to figure out what a User is you can ask the type checker and it will tell you (this kind of interaction will be a common theme…).

Maintainability

Not only does a rich type system ensure that people reading your code can trust that it is type-correct but it will also help them with extending your code.

One common pattern we use in many programming languages is defining a bunch of structures that are all related and we write code to discriminate them by some key field. Imagine a system that accepts commands from a user using the program. We often need a data structure to represent any one of the commands and we’ll have specific fields for different types of commands:

const registerUserCommand = {
  type: 'registerUser',
  username: 'foobar',
  email: 'foo@bar.baz'
}

const addToCartCommand = {
  type: 'addToCart',
  productId: 'abc-123',
  cartId: 'foobar'
}

const dispatchCommands = commands => {
  for (let command of commands) {
    if (command.type === 'registerUser') {
      registerUser(command.username, command.email)
    } else if (command.type === 'addToCart') {
      addToCart(command.productId, command.cartId)
    } else {
      throw new ShoppingError(`Unrecognized command: ${command}`)
    }
  }
}

Keep in mind this is a trivial example but when someone needs to add a new command how will they know that the entire program handles the new case properly? They may start by writing some unit tests and devising some way to make assertions that the case is handled. Can you trust that they’ve hit all of the places in your code where it needs to be handled? Do you know what all those cases are?

What if we could query the type checker and ask? Using Flow to annotate our example:

type Command
  = { type: 'registerUser', username: string, email: string}
  | { type: 'addToCart', productId: string, cartId: string}

const dispatchCommands = (commands: Array<Command>): void => {
  for (let command of commands) {
    if (command.type === 'registerUser') {
      registerUser(command.username, command.email)
    } else if (command.type === 'addToCart') {
      addToCart(command.productId, command.cartId)
    } else {
      (command: empty)
      throw new ShoppingError(`Unrecognized command: ${command}`)
    }
  }
}

So long as our team remembers to use that convention, (command: empty) in our else branch of this conditional statement when a programmer adds a new command type the type checker will tell them all of the places in the codebase where they have to add a new branch to handle the new command!

It also prevents those same programmers from merging a typo when comparing the type field of the command objects when adding the new case in.

Thanks to I’Boss for the tip!

Reason-ability

These type checkers for dynamic languages are not as powerful as those exposed by languages like Haskell that are statically typed since our type systems in Javascript and Python are much less strict and able to cast objects of different types to other types at will.

However by struggling to add and utilize those type annotations we gain the ability to expose those underlying types which helps us discover the hidden structures in our programs. Understanding what those structures are and how to manipulate them is called, reasoning.

When you begin to annotate your code with types you can start seeing common patterns. It may become necessary to encapsulate some details about these patterns into a module. This is what we call creating an abstraction: we hide unnecessary details to allow us to reason about bigger problems. Types allow you to see those patterns clearly and you can use a type checker to verify your abstractions hold. If someone tries to break the rules they’ll get a warning from the type checker before they have to write all of their unit tests.

Conclusion

Type theory is awesome and statically typed program analysis is the long-awaited future. These tools have been around a long time and the theory has evolved. Type theory has opened up whole new branches of formal mathematics that have allowed us to model and develop proofs of programs we were never able to before. It is becoming so good that proudly dynamic languages like Javascript can no longer afford to ignore them and are starting to adopt type checking tools to help programmers document, maintain, and reason about large programs.

The purpose of these tools isn’t to merely ensure that another programmer doesn’t give your function a string when you need a number (although it does do that). They are to help you work with your code and manage complexity. They do that by ensuring your code is documented and that documentation is correct and never out of date. They ensure that your code will be maintainable by indicating unhandled cases of new types that may be added over time and that programmers can’t introduce type errors by accident. And they allow you to reason about your code using familiar tools like algebra to substitute patterns in your code with strong, verified abstractions.

As Cal Newport once proclaimed: be so good they can’t ignore you. I think type theory has done that for programming. It is becoming difficult to justify not using a static type checking tool when one is available for your language.

It may take extra effort to learn all of the new terminology and concepts if you’ve been drinking the dynamic programming kool-aid for a decade or more as I had. However it’s no harder than learning your first programming language and with perseverance and tenacity you can master a new tool that will help you in the years to come with type checkers become the norm.