Python Features


I have to start with a note: if you read one of these and don’t immediately think “that solves my problem”, then don’t use it; and if you’re using more than one of them, you probably need to re-architect your solution.

  1. Operator Overloading
  2. Callables
  3. Classes and Types
  4. Dynamically defined classes
  5. Metaclasses
  6. Static object attributes

Open the full notebook in Google Colab.

Operator Overloading

Python has a number of “double underscore” or “dunder” methods. These methods are special as they usually aren’t called explicitly but are the implementations of operators. The most well known one is __init__, which is called when a new object is instantiated.

Although they aren’t called explicitly we can override them to change operator behaviour.

class Complex:

    # Override the initialisation of this object
    def __init__(self, r, i):
        self.r = r
        self.i = i

    # Override the "+" operator of this object
    def __add__(self, other):
        return Complex(self.r + other.r, self.i + other.i)

    # Override the "to_string" method of this object
    # This method is for user readable output.
    def __str__(self):
        return f'{self.r} {"+" if self.i >= 0 else "-"} {abs(self.i)}i'

    # Override the representation method of this object
    # This method is for debugging and should be unambiguous.
    # It should contain all the info needed to recreate this object.
    # a.__repr__() == b.__repr__() iff a == b
    def __repr__(self):
        return f'real: {self.r}, im: {self.i}'
c1 = Complex(1, 2)
c2 = Complex(5, -3.4)

z = c1 + c2

Note that print() uses __str__ while the notebook displays the __repr__ of the last expression in a cell.

print(z)
z
6 - 1.4i
real: 6, im: -1.4
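As a small sketch of which method gets picked when (Temperature is a made-up class for illustration, not part of the notebook): str() and plain f-string formatting use __str__, while repr(), the !r conversion and containers use __repr__.

```python
class Temperature:
    """Hypothetical example class, used only to show __str__ vs __repr__."""

    def __init__(self, degrees):
        self.degrees = degrees

    def __str__(self):
        return f"{self.degrees} degrees"

    def __repr__(self):
        return f"Temperature({self.degrees!r})"


t = Temperature(21.5)

print(str(t))     # 21.5 degrees
print(repr(t))    # Temperature(21.5)
print(f"{t}")     # 21.5 degrees
print(f"{t!r}")   # Temperature(21.5)
print([t])        # [Temperature(21.5)]
```

Here the __repr__ is written so that pasting it back into Python recreates the object, which is one common way of meeting the “unambiguous” guideline.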

This is nice but it will break if we try to add an int or a float to a complex number.

c1 + 1
---------------------------------------------------------------------------

AttributeError                            Traceback (most recent call last)

<ipython-input-6-df3ccba53f9f> in <module>
----> 1 c1 + 1


<ipython-input-3-7dd1fdb54608> in __add__(self, other)
      8     # Override the "+" operator of this object
      9     def __add__(self, other):
---> 10         return Complex(self.r + other.r, self.i + other.i)
     11 
     12     # Override the "to_string" method of this object


AttributeError: 'int' object has no attribute 'r'

So let’s fix that by checking the type of other when we add.

def complex_add(self, other):
    if isinstance(other, (int, float)):
        z = Complex(self.r + other, self.i)
    else:
        z = Complex(self.r + other.r, self.i + other.i)
    return z

As Python is interpreted we can modify classes on the fly. Objects store their data in themselves but, to save duplication, the methods are stored in the class; hence the need for an explicit self parameter to access the member variables, unlike in C++ where this is implicit.

Thus, changing the class’s methods now changes the behaviour of previously created objects.

Complex.__add__ = complex_add

print(c1 + 1.3)
2.3 + 2i

We still have a couple more methods to implement before we have finished with addition.

Firstly, we have only implemented int and float support in our Complex class. This means we can add ints and floats to a complex number, but we can’t add a complex number to an int or float.

The + operator calls the __add__ function on the left operand, so for c1 + 1.3 it will call Complex.__add__(c1, 1.3), which we have implemented. But for 1.3 + c1 it will call float.__add__(1.3, c1), which will fail as the float class has no concept of our Complex class.

print(1.3 + c1)

---------------------------------------------------------------------------

TypeError                                 Traceback (most recent call last)

<ipython-input-9-a762ebd331d6> in <module>
 ----> 1 print(1.3 + c1)


TypeError: unsupported operand type(s) for +: 'float' and 'Complex'

We might think of overriding the __add__ of both float and int, but CPython doesn’t allow setting attributes on built-in types, and even if it did this would break encapsulation and become instantly unmanageable when some other class needs addition with ints supported.

Helpfully, Python doesn’t stop as soon as __add__ fails. If float.__add__(1.3, c1) returns NotImplemented, which is how a dunder method signals that it can’t handle an operand, Python will call Complex.__radd__(c1, 1.3) (right-add). Having a separate method is useful when your data structures have non-commutative operations, for example matrix multiplication or rotations in 3D space.

def complex_radd(self, other):
    return Complex.__add__(self, other)

Complex.__radd__ = complex_radd
print(1.3 + c1)
2.3 + 2i
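The fallback only kicks in when the left operand’s method returns the special value NotImplemented, so it is worth returning it from your own dunder methods for types you don’t understand, rather than letting an AttributeError escape as our Complex class does. A minimal sketch, using a made-up Money class that is not part of the notebook:

```python
class Money:
    """Hypothetical example class: an amount stored as whole pence."""

    def __init__(self, pence):
        self.pence = pence

    def __add__(self, other):
        if isinstance(other, Money):
            return Money(self.pence + other.pence)
        if isinstance(other, int):
            return Money(self.pence + other)
        # Tell Python we can't handle this operand so it can try
        # the other operand's reflected method instead.
        return NotImplemented

    # Addition is commutative here, so the reflected version is the same.
    __radd__ = __add__


print((Money(100) + 50).pence)   # 150
print((50 + Money(100)).pence)   # 150

try:
    Money(100) + "fifty"
except TypeError as err:
    # Neither operand could handle the other, so Python raises TypeError.
    print(f"TypeError: {err}")
```

When both sides return NotImplemented, Python raises the familiar “unsupported operand type(s)” TypeError for us.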

Lastly, although the += operator appears to be working, it is not happening in place as you’d expect:

tmp = c1
print(c1)
c1 += 2
print(c1)
print(tmp == c1)
1 + 2i
3 + 2i
False

Python has seen that there is no in-place addition method defined and has expanded the line c1 += 2 to c1 = c1 + 2, so c1 now names a new object while tmp still names the old one. (Complex defines no __eq__, so == falls back to comparing identity.)

Python, unsurprisingly, has a dunder method to override the += operator: __iadd__. It should have the “same” implementation as __add__, but instead of returning a new instance it should modify and then return self.

def complex_iadd(self, other):
    if isinstance(other, (int, float)):
        self.r += other
    else:
        self.r += other.r
        self.i += other.i
    return self

Complex.__iadd__ = complex_iadd
tmp = c1
print(c1)
c1 += Complex(1, 1)
print(c1)
print(tmp == c1)
3 + 2i
4 + 3i
True
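The builtins show the same two behaviours, which may help to make the difference concrete. This is just a sketch with made-up variable names: lists implement __iadd__ and are modified in place, while tuples don’t and get rebound to a new object.

```python
# Lists define __iadd__, so += mutates the existing object.
nums = [1, 2]
nums_alias = nums
nums += [3]
print(nums_alias is nums)    # True
print(nums_alias)            # [1, 2, 3]

# Tuples are immutable and have no __iadd__, so += falls back to
# point = point + (3,) and rebinds the name to a new object.
point = (1, 2)
point_alias = point
point += (3,)
print(point_alias is point)  # False
print(point_alias)           # (1, 2)
```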

And now you have to do this all over again for subtraction __sub__, division __truediv__, xor __xor__ etc.
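Subtraction is where the reflected methods start to matter, because a - b and b - a differ and __rsub__ can’t simply forward to __sub__. A rough sketch with a made-up Metres class (hypothetical, for illustration only):

```python
class Metres:
    """Hypothetical example class wrapping a length in metres."""

    def __init__(self, value):
        self.value = value

    def __sub__(self, other):
        # Handles: self - other
        if isinstance(other, Metres):
            return Metres(self.value - other.value)
        if isinstance(other, (int, float)):
            return Metres(self.value - other)
        return NotImplemented

    def __rsub__(self, other):
        # Handles: other - self, tried after other's own __sub__ gives up.
        if isinstance(other, (int, float)):
            return Metres(other - self.value)
        return NotImplemented


print((Metres(10) - 3).value)   # 7
print((3 - Metres(10)).value)   # -7
```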

But don’t do this for complex numbers as they’re built in to Python!
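For comparison, here is roughly the same arithmetic as above using the built-in complex type, which already handles ints and floats on either side:

```python
c1 = complex(1, 2)
c2 = 5 - 3.4j          # complex literals use a j suffix

z = c1 + c2
print(z)               # (6-1.4j)
print(1.3 + c1)        # (2.3+2j)
print(z.real, z.imag)  # real and imaginary parts as floats
```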

Callables

Now you know a bit about operator overloading with dunder methods, let’s move on to the method __call__.

Let’s say we want to create a function that gets us the point on the line $y = mx + c$. The $m$ and $c$ are worked out at run time, say from some user input, so we can’t hard-code the values. We could implement it like this:

def linear(x, m, c):
    return x * m + c

linear(3, 2, 1)
7

But we would have to specify the gradient and intercept of the line every time, which is not ideal. We could add the equivalent of a static variable to the function:

def linear(x):
    return x * linear.m + linear.c

# Some code to work out m & c ...

linear.m = 2
linear.c = 1

linear(3)
7

But I really don’t think this is a good idea, as part of the function can now be defined in any part of the code, and may not even be defined the first time it’s called.

However, you can use these function variables to let the function keep track of its own state, e.g.:

(Ignore the __dict__ for now, we’ll look at it later)

def count():
    if "n" not in count.__dict__: count.n = 0
    count.n += 1
    print(f'count: {count.n}')

count()
count()
count()
count: 1
count: 2
count: 3

So how do we create a parameterised function where we can set the parameters and then just use it as a normal function? Here’s where the __call__ function comes in. It’s basically an overload for the (...) “operator”. If we define it for a class then we can “call” an object of that class and get the return value of the __call__ function.

class LinearFunc:

    def __init__(self, m, c):
        self.m = m
        self.c = c

    def __call__(self, x):
        return x * self.m + self.c
my_func = LinearFunc(2,1)

my_func(3)
7

In fact this is basically what a function is, a blank object with a __call__ function; hence, why we can set those “static” member variables!

Ever seen the error message TypeError: ‘xxx’ object is not callable? Well now you know this error is the result of using the () “operator” on an object with no __call__ method. Objects that do have this method are “callables”.

For those that have used PyTorch this is how the call to the forward pass on a model works.
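A toy sketch of that pattern is below. To be clear, Module and Linear here are made-up stand-ins and not PyTorch’s actual implementation (the real one does more work around the call, such as running hooks); the point is only that calling the object delegates to forward.

```python
class Module:
    """Toy stand-in for a model base class (hypothetical, not torch.nn.Module)."""

    def __call__(self, *args, **kwargs):
        # Calling the object simply delegates to forward().
        return self.forward(*args, **kwargs)

    def forward(self, *args, **kwargs):
        raise NotImplementedError("subclasses must implement forward()")


class Linear(Module):
    def __init__(self, m, c):
        self.m = m
        self.c = c

    def forward(self, x):
        return x * self.m + self.c


model = Linear(2, 1)
print(model(3))          # 7
print(callable(model))   # True
print(callable(3))       # False
```

The built-in callable() is a handy way to check for a __call__ method before trying to use the () “operator”.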

Classes & Types

What’s the difference between a class and a type in Python?

Absolutely everything inherits from object. Everything also has a type; all vanilla classes have the type type. (We can create classes that have a custom type, as we’ll see later.)

class A: pass

a = A()

print(type("hi"))
print(type(a))
<class 'str'>
<class '__main__.A'>

As we’d expect the type of a string literal is the class str and the type of an instance of A is the class A. But what is the type of these classes themselves?

print(type(type("hi")))
print(type(type(a))) # == type(A)
<class 'type'>
<class 'type'>

Class type? Types are classes whose instances are themselves classes; they are also called metaclasses. type is the default metaclass.

Interestingly, because type inherits from object and object has type type we have this little quirk:

print(isinstance(type, object))
print(isinstance(object, type))
True
True

They are both instances of each other, but don’t confuse this with thinking that they inherit from each other: type inherits from object, but object does not inherit from type.
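The inheritance and instance relationships can be checked separately with issubclass, __mro__ and type(); a quick sketch:

```python
# Inheritance only goes one way: type is a subclass of object.
print(issubclass(type, object))   # True
print(issubclass(object, type))   # False
print(type.__mro__)               # (<class 'type'>, <class 'object'>)
print(object.__mro__)             # (<class 'object'>,)

# Instance-of goes both ways, because both are classes and
# the type of every vanilla class (including type itself) is type.
print(type(object) is type)       # True
print(type(type) is type)         # True
```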

Dynamically-defined classes

We create an instance of a class by using the __call__ method.

class A:
    def __init__(self):
        self.is_initialised = True

    def foo(self):
        print('hello from foo')
a = A.__call__()

a.foo()
print(a.is_initialised)
hello from foo
True

A class object is a callable and when it’s called it returns a new initialised instance.

For a class (ClassX) the ClassX.__call__ method actually calls ClassX.__new__ and ClassX.__init__ passing on the respective parameters.

a = A.__new__(A)

# Here we have an 'A' object but it is uninitialised.
a.foo()
print(a.is_initialised)
hello from foo

---------------------------------------------------------------------------

AttributeError                            Traceback (most recent call last)

<ipython-input-25-f972ed196c71> in <module>
        3 # Here we have an 'A' object but it is uninitialised.
        4 a.foo()
----> 5 print(a.is_initialised)


AttributeError: 'A' object has no attribute 'is_initialised'
a.__init__()

a.foo()
print(a.is_initialised)
hello from foo
True
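Putting those two steps together, calling a class behaves roughly like the helper below. This is a simplified sketch, not the real implementation: construct and Point are made-up names, and the actual logic lives in type.__call__.

```python
def construct(cls, *args, **kwargs):
    """Rough, simplified equivalent of calling cls(*args, **kwargs)."""
    obj = cls.__new__(cls, *args, **kwargs)
    # __init__ is only run if __new__ returned an instance of cls.
    if isinstance(obj, cls):
        obj.__init__(*args, **kwargs)
    return obj


class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y


p = construct(Point, 1, y=2)
print(type(p).__name__, p.x, p.y)   # Point 1 2
```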

So if we can call a class to get an object, can we call type to get a class?

Yup! The __call__ function of type takes three arguments and returns a new class object. Its signature is type(name, bases, dict).

type’s name argument

The name argument is the internal name of the new class. When we define a class normally using the class keyword, its name is the same as its external symbol. We can change either after the fact, but when creating classes with type the name and symbol never have to match in the first place.

class A: pass

print(f'The class symbol is A and its name is {A.__name__}.')

B = A

print(f'We can create a new symbol B but it still refers to {B.__name__}.')
The class symbol is A and its name is A.
We can create a new symbol B but it still refers to A.
C = type('D',(),{})

print(f'Using type() we can have a symbol C with name {C.__name__}')
print('And no symbol D!')
print(D)
Using type() we can have a symbol C with name D
And no symbol D!

---------------------------------------------------------------------------

NameError                                 Traceback (most recent call last)

<ipython-input-27-eae03b0cd175> in <module>
        3 print(f'Using type() we can have a symbol C with name {C.__name__}')
        4 print('And no symbol D!')
----> 5 print(D)


NameError: name 'D' is not defined

type’s bases argument

The bases argument is a tuple of all the superclasses this class should inherit from.

A = type('A',(),{})
B = type('B',(A,),{})

print(A)
print(f'{B} with superclasses {B.__bases__}')

b = B()

isinstance(b, A)
<class '__main__.A'>
<class '__main__.B'> with superclasses (<class '__main__.A'>,)

True

type’s dict argument

The dict argument will be included into the __dict__ class property which holds its dynamic properties.

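import pprint

# Assumed setup: pp is not defined anywhere in this post; the output
# below looks like a PrettyPrinter with indent=4.
pp = pprint.PrettyPrinter(indent=4)

class A: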
    foo = 1

    def __init__(self):
        self.bar = 2

a = A()

pp.pprint(A.__dict__)
pp.pprint(a.__dict__)
mappingproxy({   '__dict__': <attribute '__dict__' of 'A' objects>,
                    '__doc__': None,
                    '__init__': <function A.__init__ at 0x7ff2fc260a60>,
                    '__module__': '__main__',
                    '__weakref__': <attribute '__weakref__' of 'A' objects>,
                    'foo': 1})
{'bar': 2}

Both classes and objects have a __dict__ object that holds the dynamic attributes.

x.name is roughly equivalent to trying x.__dict__['name'], then type(x).__dict__['name'], then the __dict__ of each remaining class in type(x).__mro__. In other words it looks for the variable in the current object, then the class variables, then the parent classes’ variables and so on. (For better detail look up __getattribute__, __getattr__ and the descriptor protocol.)
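Here is a deliberately simplified sketch of that search order for instances. simple_lookup, Base and Child are made-up names, and the function ignores descriptors, __slots__ and __getattr__, so treat it as an illustration rather than what Python really does.

```python
def simple_lookup(obj, name):
    """Simplified attribute lookup: instance dict first, then the class's MRO."""
    if name in vars(obj):
        return vars(obj)[name]
    for klass in type(obj).__mro__:
        if name in vars(klass):
            return vars(klass)[name]
    raise AttributeError(name)


class Base:
    colour = "red"
    size = 1


class Child(Base):
    size = 2


c = Child()
c.shape = "round"

print(simple_lookup(c, "shape"))    # round  (found in c.__dict__)
print(simple_lookup(c, "size"))     # 2      (found in Child.__dict__)
print(simple_lookup(c, "colour"))   # red    (found in Base.__dict__)
print([k.__name__ for k in Child.__mro__])   # ['Child', 'Base', 'object']
```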

It’s the same for classes, but their __dict__ is a read-only proxy object: it acts as a dictionary but doesn’t allow you to change it or replace it directly (you have to set attributes on the class instead).
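A quick sketch of what that proxy allows and refuses (the class A here is redefined just for this example):

```python
class A:
    foo = 1


# Writing through the proxy is refused...
try:
    A.__dict__["foo"] = 2
except TypeError as err:
    print(f"TypeError: {err}")

# ...but setting attributes on the class itself works,
# and the change shows up in the proxy.
A.foo = 2
setattr(A, "bar", 3)
print(A.__dict__["foo"], A.__dict__["bar"])   # 2 3
```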

Let’s see what happens when we add an item to the dict argument.

B = type('B', (), {"baz": 1})

b = B()

pp.pprint(B.__dict__)
pp.pprint(b.__dict__)
mappingproxy({   '__dict__': <attribute '__dict__' of 'B' objects>,
                    '__doc__': None,
                    '__module__': '__main__',
                    '__weakref__': <attribute '__weakref__' of 'B' objects>,
                    'baz': 1})
{}

See how baz here is equivalent to foo when we used the class keyword. We don’t get the equivalent of bar because bar isn’t defined on the class itself but by one of the class’s methods (__init__) on an instance of the class (self).

We can copy this behaviour by defining the same method (the __init__ method) on the class using the dict argument.

def B__init__(self):
    self.bar = 2

B = type('B',(),{
    "baz":1,
    "__init__": B__init__
})

b = B()

pp.pprint(B.__dict__)
pp.pprint(b.__dict__)
mappingproxy({   '__dict__': <attribute '__dict__' of 'B' objects>,
                    '__doc__': None,
                    '__init__': <function B__init__ at 0x7ff2fc260f28>,
                    '__module__': '__main__',
                    '__weakref__': <attribute '__weakref__' of 'B' objects>,
                    'baz': 1})
{'bar': 2}
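Where this becomes useful is when the class’s shape is only known at run time. As a rough sketch, here is a made-up helper (make_record is hypothetical, not a standard function) that builds a simple class from a list of field names using type:

```python
def make_record(name, fields):
    """Hypothetical helper: build a simple class with the given field names."""

    def __init__(self, **kwargs):
        for field in fields:
            setattr(self, field, kwargs[field])

    def __repr__(self):
        values = ", ".join(f"{f}={getattr(self, f)!r}" for f in fields)
        return f"{name}({values})"

    # name -> class name, () -> no explicit bases, dict -> class attributes
    return type(name, (), {"__init__": __init__, "__repr__": __repr__})


Point = make_record("Point", ["x", "y"])
p = Point(x=1, y=2)

print(p)                    # Point(x=1, y=2)
print(Point.__name__)       # Point
print(sorted(p.__dict__))   # ['x', 'y']
```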

Metaclasses

But hang on, we can make many classes to get different objects but what’s the point in every class having type type?

Well they don’t have to!

By subclassing type we can define our own metaclasses. Just as a class lets us regulate the behaviour of its objects, a metaclass lets us regulate the behaviour of its classes.

Let’s create a profiling metaclass that will modify any class so that all of its functions are timed.

import time
from functools import wraps


class Profiler(type):
    """ A Metaclass inherits from type."""

    @staticmethod
    def profile_call(func):
        """ Decorator for timing any function."""

        @wraps(func)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            val = func(*args, **kwargs)
            end = time.perf_counter()

            ms = 1000 * (end - start)

            # Online mean
            if "avg_time" not in wrapper.__dict__:
                wrapper.avg_time = ms
            if "calls" not in wrapper.__dict__:
                wrapper.calls = 0
            wrapper.calls += 1
            wrapper.avg_time += (ms - wrapper.avg_time) / wrapper.calls

            print(f'{func.__name__} took {ms:.3f}ms')
            return val


        return wrapper

    def __new__(cls, clsname, superclasses, attributedict):
        """ 
        The class will call this __new__ method instead of type's new method.

        We overwrite all the class's non-dunder callables
        with the profiling decorated versions.
        """
        for attr in attributedict:
            if callable(attributedict[attr]) and not attr.startswith("__"):
                attributedict[attr] = cls.profile_call(attributedict[attr])

        return type.__new__(cls, clsname, superclasses, attributedict)
class A(metaclass=Profiler):

    def foo(self):
        x = 0
        for i in range(10_000):
            x += i

    def bar(self):
        x = 0
        for i in range(10_000):
            x *= i

a = A()
a.foo()
a.foo()
a.bar()
a.bar()
a.bar()
a.bar()

print(f'foo was called {a.foo.calls} times with an average time of {a.foo.avg_time:.3f}ms')
print(f'bar was called {a.bar.calls} times with an average time of {a.bar.avg_time:.3f}ms')
foo took 0.344ms
foo took 0.342ms
bar took 0.294ms
bar took 0.290ms
bar took 0.291ms
bar took 0.309ms
foo was called 2 times with an average time of 0.343ms
bar was called 4 times with an average time of 0.296ms
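One consequence worth knowing is that a metaclass is inherited: subclasses of a class with a custom metaclass get that metaclass too, without repeating the metaclass keyword. A small sketch with a made-up Registry metaclass (hypothetical, unrelated to Profiler) that records every class it creates:

```python
class Registry(type):
    """Hypothetical metaclass that records every class it creates."""

    created = []

    def __new__(mcs, name, bases, namespace):
        cls = super().__new__(mcs, name, bases, namespace)
        mcs.created.append(name)
        return cls


class Plugin(metaclass=Registry):
    pass


class CsvPlugin(Plugin):   # no metaclass keyword needed here
    pass


print(type(Plugin) is Registry)      # True
print(type(CsvPlugin) is Registry)   # True
print(Registry.created)              # ['Plugin', 'CsvPlugin']
```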

A note on decorators:

See how it looks like the original functions are listed in A.__dict__? This is because of the @wraps(func) on line 12 of Profiler’s cell, which transfers the docstring, name etc. from the original function to the wrapper so that functions like help(A) are still useful.

Remove that line and see that foo and bar are replaced by a much less intelligible version.

pp.pprint(A.__dict__)
mappingproxy({   '__dict__': <attribute '__dict__' of 'A' objects>,
                    '__doc__': None,
                    '__module__': '__main__',
                    '__weakref__': <attribute '__weakref__' of 'A' objects>,
                    'bar': <function A.bar at 0x7ff2fc260ea0>,
                    'foo': <function A.foo at 0x7ff2fc260c80>})
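If you’d rather not edit the Profiler cell, the effect of @wraps can be seen in isolation. In this sketch plain and preserving are made-up decorator names; the only difference between them is the @wraps line.

```python
from functools import wraps


def plain(func):
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    return wrapper


def preserving(func):
    @wraps(func)   # copies __name__, __doc__ etc. onto the wrapper
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    return wrapper


@plain
def foo():
    """Docstring for foo."""


@preserving
def bar():
    """Docstring for bar."""


print(foo.__name__, foo.__doc__)   # wrapper None
print(bar.__name__, bar.__doc__)   # bar Docstring for bar.
print(bar.__wrapped__.__name__)    # bar (the original function is kept here)
```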

Static Object Attributes

We’ve seen in the section on __dict__ how Python stores its dynamic attributes. But what if, as in our Complex number example, we know at ‘write’ time exactly what attributes we want and we don’t want anyone adding any extras? Or we know that a lot of our objects will be created, so we want to save the overhead of storing the variables in a dynamic container?

Indeed, we get this behaviour with the inbuilt classes int, float etc.

x = 12

print(type(x))

x.a = 0
<class 'int'>

---------------------------------------------------------------------------

AttributeError                            Traceback (most recent call last)

<ipython-input-35-ae7d3ed93095> in <module>
        3 print(type(x))
        4
----> 5 x.a = 0


AttributeError: 'int' object has no attribute 'a'

The way we can accomplish this is by pre-declaring our variables in the class’s __slots__ attribute.

class Complex2:

    __slots__ = ['r', 'i']

    def __init__(self, r, i):
        self.r = r
        self.i = i

    def __add__(self, other):
        return Complex2(self.r + other.r, self.i + other.i)

    def __str__(self):
        return f'{self.r} {"+" if self.i >= 0 else "-"} {abs(self.i)}i'


    def __repr__(self):
        return f'real: {self.r}, im: {self.i}'
z = Complex2(1,2)

z.a = 12
---------------------------------------------------------------------------

AttributeError                            Traceback (most recent call last)

<ipython-input-37-ce97f7498cf3> in <module>
        1 z = Complex2(1,2)
        2
----> 3 z.a = 12


AttributeError: 'Complex2' object has no attribute 'a'

Note that we can still add attributes to the class itself, but they are read-only when accessed through the objects.

def complex2_add(self, other):
    if isinstance(other, (int, float)):
        z = Complex2(self.r + other, self.i)
    else:
        z = Complex2(self.r + other.r, self.i + other.i)
    return z
Complex2.__add__ = complex2_add

def complex2_radd(self, other):
    return Complex2.__add__(self, other)

Complex2.__radd__ = complex2_radd
Complex2.a = 12

print(z.a)

z.a = 12
12

---------------------------------------------------------------------------

AttributeError                            Traceback (most recent call last)

<ipython-input-39-f557b59c31b9> in <module>
        3 print(z.a)
        4
----> 5 z.a = 12


AttributeError: 'Complex2' object attribute 'a' is read-only

Note how, when using __slots__, the values are listed in the class’s __dict__ as “members of objects” instead of being stored in each object’s __dict__. Remember the lookup order of __getattribute__: as there is no obj.__dict__, the class’s __dict__ is the next place to look, and here the member descriptor “points” back to storage inside the object.

pp.pprint(Complex.__dict__)
print()
pp.pprint(Complex2.__dict__)
mappingproxy({   '__add__': <function complex_add at 0x7ff2fc2602f0>,
                    '__dict__': <attribute '__dict__' of 'Complex' objects>,
                    '__doc__': None,
                    '__iadd__': <function complex_iadd at 0x7ff2fc2608c8>,
                    '__init__': <function Complex.__init__ at 0x7ff2fc3b4bf8>,
                    '__module__': '__main__',
                    '__radd__': <function complex_radd at 0x7ff2fc260488>,
                    '__repr__': <function Complex.__repr__ at 0x7ff2fc2e7d08>,
                    '__str__': <function Complex.__str__ at 0x7ff2fc2e7c80>,
                    '__weakref__': <attribute '__weakref__' of 'Complex' objects>})

mappingproxy({   '__add__': <function complex_add at 0x7ff2fc260b70>,
                    '__doc__': None,
                    '__init__': <function Complex2.__init__ at 0x7ff2fc3b4c80>,
                    '__module__': '__main__',
                    '__radd__': <function complex_radd at 0x7ff2fc27e048>,
                    '__repr__': <function Complex2.__repr__ at 0x7ff2fc27e158>,
                    '__slots__': ['r', 'i'],
                    '__str__': <function Complex2.__str__ at 0x7ff2fc27e0d0>,
                    'a': 12,
                    'i': <member 'i' of 'Complex2' objects>,
                    'r': <member 'r' of 'Complex2' objects>})
c1 = Complex(1,2)
c2 = Complex2(1,2)

pp.pprint(c1.__dict__)
pp.pprint(c2.__dict__)
{'i': 2, 'r': 1}



---------------------------------------------------------------------------

AttributeError                            Traceback (most recent call last)

<ipython-input-41-d18fd2fed581> in <module>
        3 
        4 pp.pprint(c1.__dict__)
----> 5 pp.pprint(c2.__dict__)


AttributeError: 'Complex2' object has no attribute '__dict__'
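Those “member” entries are descriptors, and we can drive them by hand to see that they read and write storage inside the object. A small sketch using a made-up Point class (hypothetical, standing in for Complex2):

```python
import types


class Point:
    __slots__ = ("x", "y")

    def __init__(self, x, y):
        self.x = x
        self.y = y


p = Point(1, 2)

# The class __dict__ holds one descriptor per slot.
member = Point.__dict__["x"]
print(isinstance(member, types.MemberDescriptorType))   # True

# p.x is effectively member.__get__(p, Point) ...
print(member.__get__(p, Point))   # 1

# ... and p.x = 10 is effectively member.__set__(p, 10).
member.__set__(p, 10)
print(p.x)                        # 10

# There is still no per-object __dict__.
print(hasattr(p, "__dict__"))     # False
```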
— Mark Tuddenham, Southampton, May 2018