Nathaniel Brown's

Python Lessons Learned From Go

Written April 06, 2018

tl;dr Go is cool. Python type annotations with mypy are cool. Writing Go has led me to be more disciplined with my Python development.


Introduction

Go is one of the hottest new programming languages, so a few months ago I spent some time checking out the hype and writing some Go. I had a great time and learned a few tricks that have influenced the way I do Python development. In general I found that Go felt "safer" than Python, which, depending on what you're doing, can be a very nice feeling.

I found the following advantages when programming in Go as compared to Python:

  • Static typing eliminated some runtime errors
  • My text editor gave me auto-completion and documentation I could rely on
  • Code formatting was defined by the language, so I never had to think about it
  • Parsing/Marshaling JSON validated format and types out of the box
  • The language had a limited number of constructs and was easy to learn

Now let me say right now that I know that for each one of these "advantages" there will be a Python fanboy who does not see it as an advantage. So let me try to defuse any tension by saying that this post is about how Python can give us all these things, if we want or need them. If you don't want to declare types for all your variables, you don't have to!

Motivation (skip if you don't care, this post is long)

Lately I've been working on a Python project that connects two other projects, both under active development, where the interfaces between projects are very complex CSV files and a JSON API. In this scenario data integrity is very important, and our code was not doing a great job keeping things straight. We had NoneType and KeyErrors cropping up like pigweed. From this perspective, Go looked very attractive and I thought about switching to Go but decided not to because:

  • I work with other people and I would prefer they not hate me for forcing them to learn yet another language
  • Joel Spolsky told me never to rewrite code in this blog post: Things You Should Never Do, Part I
  • Go is a new language and the third party libraries are nowhere near as mature as Python's
  • I don't know enough about Go yet to be confident that this would be a good idea

Instead I decided to try to change my development style and Python toolkit to mimic some of the things I liked about Go. With that idea in mind, I set out to make my development more Go-like, starting with...

Static typing

Go is statically typed and Python is dynamically typed. Static typing can be really nice. So can dynamic typing. Luckily Python now supports the best of both worlds with type annotations. Type annotations allow you to add "hints" about what type a variable might be, without any runtime type checking. They support duck typing and in general are quite flexible (much more flexible than Go's types, I might add). You can do things like declare Union[int, str]. These type hints were extended in Python 3.6, which added syntax for annotating variables as well as function signatures. As a first step towards Go-ifying my Python code, I went around annotating all my function and method definitions.

def goify(code: PythonCode) -> BetterPythonCode:
    # I guess it's not *quite* this easy
    return simplify(annotate(pep8(code)))

That had great implications for my code's readability, but to get the full benefit of all those annotations I really needed to enforce that they were correct. Otherwise they would be little better than docstrings. The solution was mypy, a static analysis type checker for Python. You can run mypy <my package/module> and if you've annotated your code it spits out some type related errors and warnings. If you've annotated most of your code or at least a section of your code, these messages can be pretty useful.
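To make that concrete, here is a small sketch of the kind of mistake a type checker can catch before the code runs. The functions find_port and describe are made up for illustration. The point is that a value annotated as Optional[int] may be None, and mypy with strict optional checking should complain if it is passed where None is not allowed.

```python
from typing import Dict, Optional, Union


def find_port(config: Dict[str, str], key: str) -> Optional[int]:
    """Return the configured port, or None if the key is missing."""
    value = config.get(key)
    if value is None:
        return None
    return int(value)


def describe(port: Union[int, str]) -> str:
    """Accept either an int or a str, but not None."""
    return "port " + str(port)


port = find_port({"http": "8080"}, "http")

# Calling describe(port) directly would run fine here, but a type checker
# should reject it: port is Optional[int], and describe() does not accept None.
# Narrowing with an explicit None check satisfies the checker.
if port is not None:
    print(describe(port))
```

Nothing in the annotations changes what happens at runtime; the check only happens when you run mypy over the file.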

However, this does take some configuration and experience to get things running smoothly. I run mypy with --ignore-missing-imports and --strict-optional, and with a stubs folder on mypy's search path (I actually set all of this in a mypy.ini file, where the search path is the mypy_path setting; outside the config file it is the MYPYPATH environment variable). In that stubs folder I annotate a few key third-party library APIs. You can read more about stub files here.
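As a rough sketch, a mypy.ini carrying the settings described above could look like the following. It assumes the stubs folder sits next to the config file; adjust the path to your own layout.

```ini
[mypy]
# Don't report errors for imported libraries that have no type information
ignore_missing_imports = True
# Treat None as its own type rather than compatible with everything
strict_optional = True
# Extra directory to search for stub (.pyi) files
mypy_path = stubs
```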

Lately my team is experimenting with turning on the --disallow-untyped-defs option, which forces you to annotate every function. (The similarly named --check-untyped-defs only type checks the bodies of functions that have no annotations; it does not require you to add any.) This feels pretty hard-core to me, but is still fairly flexible as you can declare a variable to have type Any, which allows it to hold any type. Now our entire codebase is annotated and so far no one has complained...

In comparison to Go, this all takes some effort and it can be easy to backslide. With Go, the language enforces all of these type constraints. If you violate them your program doesn't compile. With Python you need to make sure you are annotating everything properly and then checking the annotations properly, which you can somewhat easily mess up. To ensure that the type annotations were kept up-to-date and checked in a consistent manner, I added mypy type checking both as a pre-commit git hook and as part of our continuous integration process.
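A pre-commit hook of the kind described above might look something like this. It is only a sketch: my_package is a placeholder name, run_checks is a made-up helper, and it assumes mypy and flake8 are installed and on the PATH.

```python
import subprocess
import sys
from typing import List, Sequence

# Hypothetical check commands; replace my_package with your own package.
CHECKS: List[List[str]] = [
    ["mypy", "my_package"],
    ["flake8", "my_package"],
]


def run_checks(checks: Sequence[Sequence[str]]) -> int:
    """Run each command in turn and return how many of them failed."""
    failures = 0
    for command in checks:
        try:
            returncode = subprocess.run(list(command)).returncode
        except FileNotFoundError:
            # The tool is not installed; count that as a failed check.
            returncode = 1
        if returncode != 0:
            print("check failed: " + " ".join(command), file=sys.stderr)
            failures += 1
    return failures
```

Saved as an executable .git/hooks/pre-commit script with a final line such as sys.exit(1 if run_checks(CHECKS) else 0), this would stop the commit whenever a check fails, because git aborts a commit when that hook exits with a non-zero status. The same function can be called from a CI job so both places run identical checks.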

Text editor auto-completion and documentation

Once my Python code was annotated, my text editor (Visual Studio Code) started to give me much better auto-completions and docstrings. The auto-completions were so good that I could pretty reliably assume that if I started typing and no auto-completion came up, I'd made an error. This was on par with the development environment I was using with Go, and really improved my productivity.

Code style enforcement and auto-formatting

Go comes with the gofmt tool, which can automatically format your Go code. This means there is one, and only one, way a Go program should be formatted. This has great benefits for readability, particularly when switching projects, as all Go projects use the same formatting. It is also very easy to have your editor apply gofmt on every save so you don't have to bother fixing formatting manually.

Python pioneered strict formatting by making indentation part of the language syntax. I think this was a great move, but Go has definitely taken this to the next level. How do we bring Python back up to snuff? Well, we can conform to PEP 8 and use a linter to enforce this. I chose flake8 as my linter and installed it in my pre-commit hook and CI process, alongside mypy. I also used autopep8 and yapf to help bring the existing code base up to PEP 8 standards. These saved a bunch of time, but I still had to do a fair amount of manual work and some things were formatted in weird ways I didn't like (but I guess were PEP 8 compliant?), so I'm not about to set my editor up to run them automatically.

How does this compare to Go? Once again Python requires more effort, but we can get close to Go by selecting the right tools and setting them up to run automatically with continuous integration and commit hooks.

Update: Since writing this post black has taken Python code formatting by storm. It is safe to run automatically and as part of your CI. It has essentially eliminated reformatting work from my team's commits. I highly recommend it.

Parsing and validating JSON

Type annotations and mypy are good at keeping track of your data's format inside your application, but what about data from outside, like JSON APIs? Typically in Python you convert JSON into lists and dicts, which mypy really can't help you much with. In Go you would typically define some structs, and then a marshaller would handle loading everything into objects and validating types for you. At first this feels like boilerplate, but when people keep changing an API on you, it's really nice to validate everything upfront.

In Python we have marshaling libraries like marshmallow that can help with the validation, but they require that you define the format of your data twice. Once in the marshaling code and once in the classes you are trying to marshal to. This is way more verbose than Go, and particularly dangerous if you are using type annotations as you might change a type in your marshaller but forget to update the annotation on the class you are marshaling to, or vice-versa.

I started wondering if I could use my type annotations to do marshaling, just like in Go. It turns out there is a great library called pydantic to do just that. With pydantic you define your model once using normal type annotations and then pydantic handles marshaling, un-marshaling and your __init__ by doing runtime introspection of the annotations. Fantastic! The project is still a little immature, but I expect it and other similar projects to catch on as type annotations become more popular. One huge gotcha: unless you define some type stubs for pydantic, mypy will not perform any type checking of your pydantic models! I've written some stubs, and you can get them here.

Here is an example of how you would typically parse JSON in Go:

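package main

import (
    "encoding/json"
    "log"
)

// ProgrammingLanguage mirrors the JSON document. encoding/json matches
// JSON keys to exported field names case-insensitively.
type ProgrammingLanguage struct {
    Name             string
    Statically_typed bool
    Compiled         bool
    Keywords         []string
}

func main() {
    json_input := `
    {
        "name": "Go",
        "statically_typed": true,
        "compiled": true,
        "keywords": ["small", "fast"]
    }`
    golang := ProgrammingLanguage{}
    // Unmarshal returns an error if the JSON is malformed or a value
    // has the wrong type for its field.
    if err := json.Unmarshal([]byte(json_input), &golang); err != nil {
        log.Fatal(err)
    }
    log.Printf("%+v", golang)
}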

...and how we can do essentially the same thing in Python using pydantic:

from typing import List
from pydantic import BaseModel

class ProgrammingLanguage(BaseModel):
    name: str
    statically_typed: bool
    compiled: bool
    keywords: List[str]

json_input = """
{
    "name": "Python",
    "statically_typed": false,
    "compiled": false,
    "keywords": ["flexible", "slow"]
}
"""

python = ProgrammingLanguage.parse_raw(json_input)

Python's new type annotations can be used to do fancy runtime tricks AND provide type checking, just like you would do in Go. This doesn't seem to have a lot of traction with the community yet, but given how beautiful the concept is, I think it might catch on. I'm also looking forward to seeing if similarly clever things can be done with ORMs.
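To give a feel for what "runtime tricks" with annotations means, here is a deliberately tiny sketch using only the standard library. It is not how pydantic is implemented and it handles far fewer cases; load and matches are made-up helpers that read a class's annotations with typing.get_type_hints and check incoming JSON against them.

```python
import json
from typing import Any, Dict, List, get_args, get_origin, get_type_hints


class ProgrammingLanguage:
    # The same annotations a static type checker would read.
    name: str
    statically_typed: bool
    compiled: bool
    keywords: List[str]


def matches(value: Any, expected: Any) -> bool:
    """Check a value against a plain class or a List[...] annotation."""
    # get_origin/get_args need Python 3.8 or newer.
    if get_origin(expected) is list:
        (item_type,) = get_args(expected)
        return isinstance(value, list) and all(matches(v, item_type) for v in value)
    return isinstance(value, expected)


def load(cls: type, raw: str) -> Any:
    """Build an instance of cls from JSON, validating every annotated field."""
    data: Dict[str, Any] = json.loads(raw)
    obj = cls()
    for field, expected in get_type_hints(cls).items():
        if field not in data:
            raise ValueError(f"missing field: {field}")
        if not matches(data[field], expected):
            raise TypeError(f"{field} should be {expected}, got {data[field]!r}")
        setattr(obj, field, data[field])
    return obj


python = load(
    ProgrammingLanguage,
    '{"name": "Python", "statically_typed": false,'
    ' "compiled": false, "keywords": ["flexible", "slow"]}',
)
print(python.name, python.keywords)
```

Passing a document where statically_typed is the string "no" instead of a boolean makes load raise a TypeError, which is the "validate everything upfront" behaviour discussed above, driven entirely by the annotations.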

Limited language features

Python prides itself on being readable and easy to use, but there are a lot of Python features and until you get to a certain level of mastery some features can be confusing. The Go team has decided that many common programming language constructs are harmful or just too confusing to be allowed in the Go language. These include:

  • Inheritance (use composition instead)
  • Ternary operators
  • Exceptions (although panics seem similar)
  • Optional function arguments
  • Technically, while loops

Sometimes these features are very nice, but often they can lead to confusing code. Exploring Go has pushed me to make my Python code simpler. It's made me consider whether complicated list comprehensions are really so slick, and how often I need inheritance.
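As an illustration of the comprehension point, here are two ways to answer the same question over some made-up data: which customers have at least one order over 100? Both are valid Python; the names and numbers are invented for the example.

```python
from typing import Dict, List

orders: Dict[str, List[int]] = {
    "alice": [5, 120, 40],
    "bob": [300],
    "carol": [],
}

# Dense version: correct, but the reader has to unpack the nested
# comprehension and notice that the set is there to remove duplicates.
big_spenders_dense = sorted(
    {name for name, totals in orders.items() for total in totals if total > 100}
)


def big_spenders(orders: Dict[str, List[int]]) -> List[str]:
    """Plain-loop version: one step per line, closer to what you'd write in Go."""
    names = []
    for name, totals in orders.items():
        if any(total > 100 for total in totals):
            names.append(name)
    return sorted(names)


print(big_spenders_dense)
print(big_spenders(orders))
```

The loop is longer, but each line does one thing, which tends to matter more to the next reader than saving a few lines.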

I think the moral here is to Keep It Simple, Stupid! Just because we have loads of amazing features in Python doesn't mean that we should use them everywhere. Clever tricks aren't really so clever if no one else understands them.

Summary

One of the big attractions of Go is the discipline it imposes on the programmer. The language has a small feature set and forces you to do things the "right way" (or at least the Go way!). I personally find these restrictions very attractive, particularly for the work I've been doing recently.

I've found I don't necessarily need to start programming in Go to benefit from some of the lessons Go has to teach. Python and the Python community have the tools I need to write code with the same discipline. It took a little effort to learn the tools and set up the necessary automation, but I feel it has really helped my project.

At some point I'll actually sit down to write a real project in Go, and I expect I may learn a few more things when I do.