October 2008: First draft of escaping self hell.
January 2009: More stateful functions.

A good friend of mine once described two complaints about Python, the more serious of the two being self hell. Self hell is something Pythonistas grumble about from time to time. Basically, you find that every line in your program begins with self, and every other word in the program is self. For a language without needless braces, begins, ends or other useless syntax, the overabundance of a single token really stands out. This is self hell.

There is an obvious trick to avoiding self hell: don't ever use self.

Chain of implications: No self means no classes. No classes means no object orientedness. If you were a Java or C++ programmer once, you probably think I am a crackpot.

There is more to life than OOP. Here is a simple little class whose only purpose is to test whether a value falls inside a window.

class Window(object):
    def __init__(self, minimum, maximum):
        self.minimum = minimum
        self.maximum = maximum
    def __call__(self, x):
        return self.minimum <= x <= self.maximum

Does not seem so bad.
Self requires on average five characters per line, nothing horrible.

And here it is in use:

>>> between = Window(1, 5)
>>> between(2.4)
True
>>> between(-7)
False
>>> between(12)
False

Now, let's say you need to add a feature: defensively handle the input so that minimum and maximum may arrive out of order.

class Window(object):
    def __init__(self, minimum, maximum):
        self.minimum = minimum
        self.maximum = maximum
        self.minimum, self.maximum = min(self.minimum, self.maximum), max(self.minimum, self.maximum)
    def __call__(self, x):
        return self.minimum <= x <= self.maximum

That newest line sums up what self hell is all about. It could be mitigated by swapping minimum and maximum before saving them to self.minimum and self.maximum, but eventually you'll need a line like this outside of __init__, and there will be no other course but the gratuitous use of self.
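The mitigation mentioned above might look like the following sketch. It only rearranges the class already shown, sorting the plain arguments before anything touches self; the out-of-order call at the end is an added demonstration.

```python
class Window(object):
    def __init__(self, minimum, maximum):
        # Swap the plain arguments first, so self appears only in the
        # two assignments below.
        minimum, maximum = min(minimum, maximum), max(minimum, maximum)
        self.minimum = minimum
        self.maximum = maximum

    def __call__(self, x):
        return self.minimum <= x <= self.maximum


between = Window(5, 1)  # arguments deliberately out of order
print(between(2.4), between(-7), between(12))
```

This helps inside __init__, but any later method that adjusts the bounds is back to spelling out self on every name.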

Let's remake it using closures. Closures, in a nutshell, allow functions to be defined on the fly while remembering the scope they were defined in. Note this code already includes the defensive variable swap.

def Window(minimum, maximum):
    def w(x):
        return minimum <= x <= maximum
    minimum, maximum = min(minimum, maximum), max(minimum, maximum)
    return w

Self does not appear even once. Usage is identical. It is smaller (quantitatively) and more elegant (qualitatively).
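To check the claim that usage is identical, here is the closure version again with the same calls as before, plus a second window built with the arguments reversed. The name swapped is only for this demonstration.

```python
def Window(minimum, maximum):
    def w(x):
        return minimum <= x <= maximum
    # The swap runs before w is ever called, so w sees the sorted values.
    minimum, maximum = min(minimum, maximum), max(minimum, maximum)
    return w


between = Window(1, 5)
swapped = Window(5, 1)  # same window, arguments out of order

for x in (2.4, -7, 12):
    # Both windows should agree on every input.
    print(x, between(x), swapped(x))
```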

But what the heck is up with the nested defs?
w() is called an inner function. If you've never heard of it before, don't feel too bad; lots of really smart people have not either. Why, until recently Vim did not know how to automatically indent such code.

The closure part is not having to explicitly pass minimum or maximum to w(). When Python tries to execute w(), it looks for minimum and maximum inside of w(). It does not see them, so it checks the next scope up, Window(). The double checking costs a few CPU cycles but saves a lot of keystrokes.
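If you want to see what the inner function is holding on to, CPython exposes it. This is a small sketch for poking around, not something you would ship; it relies on the __closure__ and __code__.co_freevars attributes of function objects, and the name captured is just for illustration.

```python
def Window(minimum, maximum):
    def w(x):
        return minimum <= x <= maximum
    minimum, maximum = min(minimum, maximum), max(minimum, maximum)
    return w


between = Window(5, 1)

# co_freevars names the variables borrowed from the enclosing scope;
# __closure__ holds one cell per name, in the same order.
names = between.__code__.co_freevars
values = [cell.cell_contents for cell in between.__closure__]
captured = dict(zip(names, values))
print(captured)
```

Note that the cells hold the swapped values, even though the swap happened after w was defined. The closure remembers the variables, not a snapshot of them.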

Some claim objects are needed for storing changing state between calls. Not so. For example, say you have a prime number generator. Every time you instantiate it, the generator starts over from 2, the first prime. Every time next() is called, the generator returns the next prime. Simple enough.

A smarter prime number generator would cache the previous results, so each new instance does not have to recompute every number from the very beginning. Normally this screams "Use an object!". It can all be done with a single function:

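from itertools import count, takewhile

def primes(p_cache=[2]):
    # The default list is created once and shared by every call.
    for i in p_cache:
        yield i
    for i in count(p_cache[-1] + 1):
        sqr = int(i**0.5) + 1
        # i is prime if no cached prime below sqr divides it.
        if all(i%p for p in takewhile(lambda x: x<sqr, p_cache)):
            p_cache.append(i)
            yield i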

Ever been bitten by Python's mutable default argument behavior? This hack uses that classic gotcha to update the p_cache list. It is not the most idiomatic code. While it proves you can do almost anything with just functions, it is also proof I should have my Python Licence revoked.
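The same cache could also live in an enclosing scope instead of a default argument. What follows is a minimal sketch under that assumption, not the code above: the names make_primes, extend and cache are hypothetical, and each generator tracks its own position so that several of them can be interleaved without adding a prime twice.

```python
from itertools import count, islice, takewhile


def make_primes():
    cache = [2]  # shared by every generator the returned function creates

    def extend():
        # Append the next prime after the largest one found so far.
        for i in count(cache[-1] + 1):
            limit = int(i ** 0.5) + 1
            if all(i % p for p in takewhile(lambda x: x < limit, cache)):
                cache.append(i)
                return

    def primes():
        n = 0  # this generator's own position in the shared cache
        while True:
            if n == len(cache):
                extend()
            yield cache[n]
            n += 1

    return primes


primes = make_primes()
first = list(islice(primes(), 8))   # computes and caches eight primes
again = list(islice(primes(), 5))   # served entirely from the cache
print(first, again)
```

Still no self anywhere, and the list only grows inside extend().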

There are two cases where defining a class is pretty much unavoidable: defining new data structures and changing Python's syntax. If you want to make a Tree type, you'll need to class it. If you want to do stuff like this:

>>> 'Do a barrel shift!' << 5
'barrel shift!Do a '

You'll have to make a new String class inheriting from the built-in str, extended with __lshift__ and __rshift__.
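A minimal sketch of such a subclass follows. The rotation semantics are inferred from the example output above, and treating >> as the opposite rotation is an assumption. A bare string literal cannot be given new operators, so the value has to be wrapped in String first.

```python
class String(str):
    def __lshift__(self, n):
        # Rotate left: the first n characters move to the end.
        if not self:
            return self
        n %= len(self)
        return String(self[n:] + self[:n])

    def __rshift__(self, n):
        # Assumed to be the inverse: rotate right by n.
        return self << -n


shifted = String('Do a barrel shift!') << 5
print(repr(shifted))
print(repr(shifted >> 5))
```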

I've found that a hybrid approach works best. All of your class methods that actually do computation should be moved into closures, and extensions to clean up syntax should be kept in classes. Stuff the closures into classes, treating the class much like a namespace. The closures are essentially static functions, and so do not require selfs*, even in the class.

  * Selfs, plural. Functions still need one self, in the function definition.
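One way the hybrid might look, as a rough sketch: the computation stays in a closure and the class exists only to supply syntax, here the in operator. The names make_window and inside are hypothetical, and hooking into __contains__ is just one illustration of "cleaning up syntax".

```python
def make_window(minimum, maximum):
    # All of the real work lives here, with no self in sight.
    minimum, maximum = min(minimum, maximum), max(minimum, maximum)

    def inside(x):
        return minimum <= x <= maximum

    return inside


class Window(object):
    # The class is a thin shell that maps syntax onto the closure.
    def __init__(self, minimum, maximum):
        self.inside = make_window(minimum, maximum)

    def __contains__(self, x):
        return self.inside(x)


window = Window(5, 1)
print(2.4 in window, -7 in window, 12 in window)
```

Each method still names self once in its signature, per the footnote, but the comparison logic itself never mentions it.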