Generics are common across multiple programming languages, though sometimes they are known with different terms according to e.g. different programming languages: templates in C++, parametric polymorphism in Scala, and Haskell. They are mentioned as parameterised types in the Gang of Four’s Design Patterns book1.
However, they seem to be a bit overlooked in Python. One explanation could be that Python is a scripting language and less importance is given to concepts and ideas such as e.g. Don’t Repeat Yourself. Another explanation could be that Python is a dynamically typed programming language, therefore, less emphasis is put on types definition.
This is where generics can be useful in Python: generics can help us write reusable, stronger code and better communicate what our code is about.
What are generics?
First thing first, the term generics can sometimes be used instead of generic types whereas, other times, it can refer to both generic types and generic functions: that is, software entities. For the remainder of this post, I will use generics in the latter interpretation.
Generics are software entities that operate on a type variable, where the type variable represents an abstraction ranging over a set of concrete types. In other words, they can be types or functions that hold the same behaviour for all types represented by the parameterised type.
The interesting aspect is that the type variable is not instantiated: one does not assign a concrete type to it. Instead, the type variable represents a type-to-be.
It is the task of the type checker to find the right type needed for a specific execution.
A generic type can be thought of as a type constructor which takes a type and “returns” a type. For example, the Python built-in container list
behaves as a generic type constructor.
A generic function is one that makes use of generic types in the function declaration to make it behave in a generic-type manner. Again, type variables are not concrete types, but types-to-be.
from typing import List, TypeVar | |
T = TypeVar('T') | |
def return_first(l: List[T]) -> T: | |
return l[0] | |
int_list: List[int] = [0, 1, 2] | |
return_first(int_list) |
In the example above, list
becomes a concrete type once T
is evaluated to be an int
. At the same time, when return_first
is executed for the specific list, the type checker knows T
is of type int
.
Benefits of generics
What sort of benefits can we get from using generics?
Preventing code repetition: a solution implemented over a set of types can be reused without the need to duplicate the code for every type-specific situation.
Stronger type checks: a static type checker (in the case of Python e.g.
mypy
) can find errors quicker, thus avoiding potential unsafe code from raising runtime errors.Eliminating the requirement for type casting: if we write a piece of code that manipulates an object, we must cast the object to the type we expect to make it work. This is not necessary for a dynamically typed language such as Python, though its absence makes the code unsafe and can trigger a type check error.
Generics in Python
In Python, a generic type is a class or an interface that behaves as a generic type constructor. To build a generic type, we need to use Generic
and TypeVar
from the typing
module:
from typing import TypeVar, Generic | |
T = TypeVar("T") | |
class Height(Generic[T]): | |
def __init__(self, height: T): | |
self.height = height |
Two steps are required to define a user-defined class as generic:
We define a type variable, and
We define the implementation for the class using the special building block
Generic
.
As a rule of thumb, most generic classes use single uppercase letters to represent type variables, making it clear which variables are types and which ones are other variables. This is specified in the Notational convention for type hints in PEP 483.
Applications
Let’s take the example above and expand on it:
from typing import TypeVar, Generic, List, get_args | |
T = TypeVar("T") | |
cm = int | |
ft = float | |
class Height(Generic[T]): | |
def __init__(self, height: T): | |
self.height = height | |
class FamilyHeight(Generic[T]): | |
def __init__(self): | |
self.family_heights: List[Height[T]] = [] | |
def push(self, height: Height[T]): | |
self.family_heights.append(height) | |
def pop(self) -> Height[T]: | |
return self.family_heights.pop() | |
x = Height[cm](187) | |
y = Height[ft](6.0) | |
z: Height[cm] = Height[cm](6.0) # Type Error | |
family_heights_cm = FamilyHeight[cm]() | |
family_heights_cm.push(x) | |
family_heights_cm.push(y) # Type Error | |
my_height = family_heights_cm.pop() | |
get_args(my_height.__orig_class__) # Type is int |
There is some action going on:
we define a generic class for height measurement;
we define a generic class for a family of heights: in fact, this is a container;
we try to push and pop different heights in a concrete family of heights.
So, where are the benefits? 🤔
line 27 will trigger the static type checker:
T
is here of typeint
, but the input argument at class initialisation is of typefloat
. The construction of instances of generic types is type checked.line 31 will trigger the static type checker: it is not allowed to push
Height[ft]
into a container that acceptsHeight[cm]
;accessing an element from the family of heights will return the right type without the need to cast.
I hear the question: is this really necessary? 😣
Well, technically no.
But, take a look at the example below:
from typing import Union | |
class FamilyHeight(): | |
def __init__(self): | |
self.family_heights: List[Union[int, float]] = [] | |
def push(self, height: Union[int, float]): | |
self.family_heights.append(height) | |
def pop(self) -> Union[int, float]: | |
return self.family_heights.pop() | |
family_heights_cm = FamilyHeight() | |
family_heights_cm.push(187) | |
family_heights_cm.push(6.0) # No complain | |
def return_first(container: List[Union[int, float]) -> Union[int, float]: | |
return container[0] | |
my_height: int = return_first(family_heights_cm) # Type Error |
The code is not as safe as before: we can ask the type checker to check whether the input argument is numeric, but there will be no complaint if we push different types into self.family_heights
.
Furthermore, we now get a type check error on line 22. It does make sense: how do we make sure that it is always an int
? We cannot. The only solution is to extend to any possible type:
from typing import Any | |
class FamilyHeight(): | |
def __init__(self): | |
self.family_heights: List[Any] = [] | |
def push(self, height: Any): | |
self.family_heights.append(height) | |
def pop(self) -> Any: | |
return self.family_heights.pop() | |
family_heights_cm = FamilyHeight() | |
family_heights_cm.push(187) | |
def return_first(container: List[Any]) -> Any: | |
return container[0] | |
my_height: int = return_first(family_heights_cm) # OK |
However, we lose introspection power - and the code is probably less safe.
A second approach would be to create ad-hoc classes that accept and work on one type only, but then we would potentially have as many classes as the type we have to deal with: not great.
Conclusion
I hope I convinced you by now that there actually are benefits in using generics even in a dynamically typed language such as Python.
The main advantages come in combination with the use of a static type checker: defining the right types, the type checker is able to find potential pitfalls in the code before it gets executed - think about the production environment!
Finally, generics can reduce duplicate code and, at the same time, provide a good source of introspection to the code. This is appreciated especially if your code will be read by many different people at many different points in time.