Black magic with metaclasses in Python

As you might have noticed, we rolled out a new notification system a couple of weeks ago. I’m not going to write about its interface, but rather about one of its cool backend features: auto-discovery of notification modules and self-registration of notification classes. What do I mean by that? Well, let’s take a practical example.

Internally we hold all the notification classes in a registry. This allows easy retrieval of all the notification types that are currently available to the system, without having to search for and import the classes from wherever they are scattered whenever we need them in different parts of the system.

Let’s assume that the registry is empty and we don’t have any notification classes defined.

Now, let’s define a new notification class for our comment application. This new notification type will be in charge of letting a designer know when a new comment is posted about one of their icon sets. The class has some predefined attributes and a method that handles the delivery of the notification. The details of the notification class implementation and the handling of the notification event don’t matter that much; what I want to emphasise is the automatic registration.

Let’s check the registry again.

Woooaa wooa, back up there! That’s some serious voodoo magic — how did our class end up in the registry?

We’re not instantiating it anywhere or calling any methods. And what is that mysterious __metaclass__ attribute, with its reference to the Notification class?

Enough suspense. As Linus says, “Talk is cheap. Show me the code,” so let’s just take a look at the Notification class:
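As a rough sketch (not the production code), a self-registering metaclass could look something like this. The `registry` dict, the `register` method name and the `name`/`deliver` members are assumptions made up for illustration. The class statement uses the Python 3 `metaclass=` keyword; in Python 2 you would set `__metaclass__ = Notification` in the class body instead.

```python
# Hypothetical stand-in for the registry: a plain dict keyed by class name.
registry = {}


class Notification(type):
    """Metaclass that registers every class it creates."""

    def __new__(mcs, name, bases, attrs):
        # Build the class object the normal way...
        cls = super(Notification, mcs).__new__(mcs, name, bases, attrs)
        # ...then let the freshly created class add itself to the registry.
        cls.register()
        return cls

    def register(cls):
        # Methods defined on a metaclass are available on the classes it creates.
        registry[cls.__name__] = cls


class CommentPostedNotification(metaclass=Notification):
    name = "comment_posted"  # hypothetical predefined attribute

    def deliver(self, recipient):  # hypothetical delivery method
        return "New comment for %s" % recipient


print(sorted(registry))  # ['CommentPostedNotification']
```

Nothing instantiates `CommentPostedNotification` here; the registry is filled in as a side effect of the class statement itself.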

So what is going on here? We can see a method to add the class to the registry, but how is it being called?

Before we continue, let’s talk a bit more about objects and classes in Python. Python has this peculiar idea that everything is an object and that every object has an identity, a type and a value, and this extends to classes too. This means that we can assign a class to a variable, dynamically add attributes to it and even pass it as a function parameter, just like any other object. Let me demonstrate.
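A small sketch of those three things, using a throwaway `Test` class and a hypothetical `describe` helper:

```python
class Test(object):
    pass


# A class can be bound to another name like any other value.
Alias = Test

# Attributes can be added to the class object after it has been defined.
Test.greeting = "hello"


def describe(cls):
    """Take a class as an argument and return its name."""
    return cls.__name__


print(Alias is Test)     # True
print(describe(Alias))   # Test
print(Alias().greeting)  # hello
```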

So what is a class definition, and what is the difference between a user-defined class and the built-in types? In day-to-day Python they are practically the same thing: we use the class keyword to define a new type that will be used as a template when creating new object instances, and the built-in types are classes themselves. Let’s inspect some of the built-in types and our own class using the type function.
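Something along these lines, again with a throwaway `Test` class. The checks use `is` rather than printing the types, because the printed form differs between Python 2 and 3:

```python
class Test(object):
    pass


# For an instance, type() returns the class it was created from,
# whether that class is built in or user defined.
print(type(1) is int)        # True
print(type("text") is str)   # True
print(type(Test()) is Test)  # True
```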

Since class definitions are objects, they also have a type, and that type is type (yes, it’s confusing at first). Let’s look at the built-in type function’s manual for a bit.
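A quick way to see both points from the interpreter. The exact wording of the docstring varies between Python versions, so no output is shown for the last line:

```python
class Test(object):
    pass


# The type of a class object is type, for our class and for built-ins alike.
print(type(Test) is type)  # True
print(type(int) is type)   # True

# The docstring describes two call forms: one that returns an object's type
# and one that builds a new type from a name, base classes and a dict.
print(type.__doc__)
```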

Aha! So since classes are objects themselves, they can be created dynamically. Let’s look at what a dynamic definition of our test class would look like:
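A minimal sketch using the three-argument form of `type` (name, base classes, attribute dict). The `greeting` attribute is just an example:

```python
# Roughly equivalent to:
#
#     class Test(object):
#         greeting = "hello"
#
Test = type("Test", (object,), {"greeting": "hello"})

print(Test.__name__)       # Test
print(Test().greeting)     # hello
print(type(Test) is type)  # True
```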

Whenever Python encounters the class keyword it will automatically create a new object for us behind the scenes just to make our life a tiny bit easier.

So what if we want to define a new class that dictates how new class objects are created — could we do that?

Sure we can. A class that creates new classes is called a metaclass, which is a fancier word for a ‘class factory’, if we were to make an analogy. Python uses the built-in type metaclass to create all class objects, including built-in types such as int and str.

Now let’s get back to our Notification metaclass and explain a bit better what it does. In Python, type is actually a class, just like str, int, bool, etc., so we can inherit from it.

This allows us to override the __new__ method (http://docs.python.org/2/reference/datamodel.html#object.__new__). The __new__ method is the first method to be called during instance creation; it is responsible for actually creating the new object and returning it. By contrast, __init__ doesn’t create anything; it is only responsible for initializing the instance after it has been created. For a metaclass, the instance being created is the class itself.
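To make the ordering concrete, here is a small sketch with an ordinary class (the `Point` class and the `calls` list are made up for the example). The same sequence applies when a metaclass creates a class.

```python
calls = []  # records the order in which the hooks run


class Point(object):
    def __new__(cls, x, y):
        calls.append("__new__")
        # Create and return the bare instance; no attributes are set yet.
        return super(Point, cls).__new__(cls)

    def __init__(self, x, y):
        calls.append("__init__")
        # The instance already exists; just fill it in.
        self.x = x
        self.y = y


p = Point(1, 2)
print(calls)     # ['__new__', '__init__']
print(p.x, p.y)  # 1 2
```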

The logic behind it is simple:
* Intercept the class object creation. Remember, Python creates a class object whenever it meets the class Foo(..) declaration.
* Modify the class so that it will add itself to the registry.
* Return the newly modified class.

Now, there’s only one piece of the puzzle left. How do we indicate to Python that we want to use this type (template) instead of the built-in type when creating new class objects? By using the __metaclass__ attribute.

What happens is that Python will look for __metaclass__ in the class definition. If it finds it, it will use the declared metaclass (in this case Notification) to create the class object for CommentPostedNotification. If it doesn’t, it will use the type metaclass to create the class object.
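One caveat if you try this today: the `__metaclass__` attribute is the Python 2 spelling. In Python 3 the metaclass is passed as a keyword in the class statement, and a `__metaclass__` attribute in the body is treated as an ordinary attribute with no effect. A small sketch, with made-up class names:

```python
class Meta(type):
    pass


class WithMeta(metaclass=Meta):  # Python 3 spelling
    pass


class Plain:
    pass


class Py2Style:
    __metaclass__ = Meta  # just a regular attribute in Python 3


print(type(WithMeta) is Meta)  # True
print(type(Plain) is type)     # True
print(type(Py2Style) is type)  # True
```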

Ending note

Yes, I do realise that metaclasses are a well-known feature among advanced Python programmers. But coming from a statically typed language like Java, this feels like magic. Also, this is an oversimplification of the overall design, and in 99% of cases you won’t need metaclasses, but when you do, they simplify a complex problem.

And to end with another weirdness, you might have wondered what the metaclass of type is. You’d be surprised to find out that type is actually its own metaclass. This is done by some even weirder voodoo at the implementation level, which is written in C and is not something that you could reproduce in pure Python.
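You can check the circularity from the interpreter:

```python
# type is an instance of itself...
print(type(type) is type)        # True

# ...and object and type each count as an instance of the other.
print(isinstance(type, object))  # True
print(isinstance(object, type))  # True
```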
