January 2009: More stateful functions.
A good friend of mine once described two complaints about Python, the more serious of the two being self hell. Self hell is usually something Pythonistas grumble about from time to time. Basically, you find that every line in your program begins with
self. Every other word in the program is
self. For a language without needless braces, begins, ends or other useless syntax, the over abundance of a single token really stands out. This is self hell.
There is an obvious trick to avoiding self hell: don't ever use
Chain of implications: No
self means no classes. No classes means no object orientedness. If you were a Java or C++ programmer once, you probably think I am a crackpot.
There is more to life than OOP. Here is a simple little class, whose only purpose is window comparison.
class Window(object): def __init__(self, minimum, maximum): self.minimum = minimum self.maximum = maximum def __call__(self, x): return self.minimum <= x <= self.maximum
Does not seem so bad.
Self requires on average five characters per line, nothing horrible.
And here is it in use:
>>> between = Window(1,5) >>> between(2.4) True >>> between(-7) False >>> between(12) False
Now, let's say you need to add a feature. Defensively protect the input so that
maximum may arrive out of order.
class Window(object): def __init__(self, minimum, maximum): self.minimum = minimum self.maximum = maximum self.minimum, self.maximum = min(self.minimum, self.maximum), max(self.minimum, self.maximum) def __call__(self, x): return self.minimum <= x <= self.maximum
That newest line sums up what self hell is all about. It could be mitigated by swapping
max before saving them to
self.max, but eventually you'll need a line like this outside of
init, and there will be no other course but the gratuitous use of
Let's remake it using closures. Closures, in a nutshell, allow functions to be defined on the fly while remembering the function's scope. Note this code already includes the defensive variable swap.
def Window(minimum, maximum): def w(x): return minimum <= x <= maximum minimum, maximum = min(minimum, maximum), max(minimum, maximum) return w
Self never appears once. The use case is identical. It is smaller (quantitatively) and more elegant (qualitatively).
But what the heck is up with the nested
w() is called an inner function. If you've never heard of it before, don't feel too bad, lots of really smart people have not either. Why, until recently Vim did not know how to automatically indent such code.
The closure part is not having to explicitly pass
w(). When python tries to execute
w(), it looks for
max inside of
w(). It does not see them, so it checks the next scope up,
Window(). The double checking costs a few CPU cycles but saves a lot of keystrokes.
Some claim objects are needed for storing changing state between calls. Not so. For example, say you have a prime number generator. Every time you instance it, the generator starts over from 2, the first prime. Every time
next() is called, the generator returns the next prime. Simple enough.
A smarter prime number generator would cache the previous results, so each new instance does not have to recompute every number from the very beginning. Normally this screams "Use an object!". It can all be done with a single function:
def primes(p_cache=): for i in p_cache: yield i for i in count(p_cache[-1] + 1): sqr = int(i**0.5) + 1 if all(i%p for p in takewhile(lambda x: x<sqr, p_cache)): p_cache.append(i) yield i
Ever been bitten by a Python's default argument behavior? This hack uses that classic gotcha to update the
p_cache list. It is not the most idiomatic code. While it proves you can do almost anything with just functions, it is also proof I should have my Python Licence revoked.
There are two cases were defining a class is pretty much unavoidable: defining new data structures and changing Python's syntax. If you want to make a Tree type, you'll need to class it. If you want to do stuff like this:
>>> 'Do a barrel shift!' << 5 'barrel shift!Do a '
You'll have to make a new
String object inheriting from the old
String, extended with
I've found that a hybrid approach works best. All of your class methods that actually do computation should be moved into closures, and extensions to clean up syntax should be kept in classes. Stuff the closures into classes, treating the class much like a namespace. The closures are essentially static functions, and so do not require
selfs*, even in the class.
Selfs, plural. Functions still need one
self, in the function definition.