Generics and Python

A story of forgetfulness and wishes.

Jan 17, 2023

Generics are common across multiple programming languages, though sometimes they are known with different terms according to e.g. different programming languages: templates in C++, parametric polymorphism in Scala, and Haskell. They are mentioned as parameterised types in the Gang of Four’s Design Patterns book1.

However, they seem to be a bit overlooked in Python. One explanation could be that Python is a scripting language and less importance is given to concepts and ideas such as e.g. Don’t Repeat Yourself. Another explanation could be that Python is a dynamically typed programming language, therefore, less emphasis is put on types definition.

This is where generics can be useful in Python: generics can help us write reusable, stronger code and better communicate what our code is about.

What are generics?

First thing first, the term generics can sometimes be used instead of generic types whereas, other times, it can refer to both generic types and generic functions: that is, software entities. For the remainder of this post, I will use generics in the latter interpretation.

Generics are software entities that operate on a type variable, where the type variable represents an abstraction ranging over a set of concrete types. In other words, they can be types or functions that hold the same behaviour for all types represented by the parameterised type.

The interesting aspect is that the type variable is not instantiated: one does not assign a concrete type to it. Instead, the type variable represents a type-to-be.
It is the task of the type checker to find the right type needed for a specific execution.

A generic type can be thought of as a type constructor which takes a type and “returns” a type. For example, the Python built-in container list behaves as a generic type constructor.
A generic function is one that makes use of generic types in the function declaration to make it behave in a generic-type manner. Again, type variables are not concrete types, but types-to-be.

This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters. Learn more about bidirectional Unicode characters

Show hidden characters

	from typing import List, TypeVar


	T = TypeVar('T')

	def return_first(l: List[T]) -> T:
	return l[0]


	int_list: List[int] = [0, 1, 2]
	return_first(int_list)

view raw generic_type.py hosted with ❤ by GitHub

In the example above, list becomes a concrete type once T is evaluated to be an int. At the same time, when return_first is executed for the specific list, the type checker knows T is of type int.

Benefits of generics

What sort of benefits can we get from using generics?

Preventing code repetition: a solution implemented over a set of types can be reused without the need to duplicate the code for every type-specific situation.
Stronger type checks: a static type checker (in the case of Python e.g. mypy) can find errors quicker, thus avoiding potential unsafe code from raising runtime errors.
Eliminating the requirement for type casting: if we write a piece of code that manipulates an object, we must cast the object to the type we expect to make it work. This is not necessary for a dynamically typed language such as Python, though its absence makes the code unsafe and can trigger a type check error.

Generics in Python

In Python, a generic type is a class or an interface that behaves as a generic type constructor. To build a generic type, we need to use Generic and TypeVar from the typing module:

Show hidden characters

	from typing import TypeVar, Generic

	T = TypeVar("T")

	class Height(Generic[T]):
	def __init__(self, height: T):
	self.height = height

view raw generic_heigth.py hosted with ❤ by GitHub

Two steps are required to define a user-defined class as generic:

We define a type variable, and
We define the implementation for the class using the special building block Generic.

As a rule of thumb, most generic classes use single uppercase letters to represent type variables, making it clear which variables are types and which ones are other variables. This is specified in the Notational convention for type hints in PEP 483.

Applications

Let’s take the example above and expand on it:

Show hidden characters

	from typing import TypeVar, Generic, List, get_args

	T = TypeVar("T")
	cm = int
	ft = float


	class Height(Generic[T]):
	def __init__(self, height: T):
	self.height = height


	class FamilyHeight(Generic[T]):
	def __init__(self):
	self.family_heights: List[Height[T]] = []

	def push(self, height: Height[T]):
	self.family_heights.append(height)

	def pop(self) -> Height[T]:
	return self.family_heights.pop()


	x = Height[cm](187)
	y = Height[ft](6.0)

	z: Height[cm] = Height[cm](6.0) # Type Error

	family_heights_cm = FamilyHeight[cm]()
	family_heights_cm.push(x)
	family_heights_cm.push(y) # Type Error

	my_height = family_heights_cm.pop()
	get_args(my_height.__orig_class__) # Type is int

view raw cm_vs_ft.py hosted with ❤ by GitHub

There is some action going on:

we define a generic class for height measurement;
we define a generic class for a family of heights: in fact, this is a container;
we try to push and pop different heights in a concrete family of heights.

So, where are the benefits? 🤔

line 27 will trigger the static type checker: T is here of type int, but the input argument at class initialisation is of type float. The construction of instances of generic types is type checked.
line 31 will trigger the static type checker: it is not allowed to push Height[ft] into a container that accepts Height[cm];
accessing an element from the family of heights will return the right type without the need to cast.

I hear the question: is this really necessary? 😣

Well, technically no.
But, take a look at the example below:

Show hidden characters

	from typing import Union

	class FamilyHeight():
	def __init__(self):
	self.family_heights: List[Union[int, float]] = []

	def push(self, height: Union[int, float]):
	self.family_heights.append(height)

	def pop(self) -> Union[int, float]:
	return self.family_heights.pop()


	family_heights_cm = FamilyHeight()
	family_heights_cm.push(187)
	family_heights_cm.push(6.0) # No complain


	def return_first(container: List[Union[int, float]) -> Union[int, float]:
	return container[0]

	my_height: int = return_first(family_heights_cm) # Type Error

view raw using_union.py hosted with ❤ by GitHub

The code is not as safe as before: we can ask the type checker to check whether the input argument is numeric, but there will be no complaint if we push different types into self.family_heights.
Furthermore, we now get a type check error on line 22. It does make sense: how do we make sure that it is always an int? We cannot. The only solution is to extend to any possible type:

Show hidden characters

	from typing import Any

	class FamilyHeight():
	def __init__(self):
	self.family_heights: List[Any] = []

	def push(self, height: Any):
	self.family_heights.append(height)

	def pop(self) -> Any:
	return self.family_heights.pop()


	family_heights_cm = FamilyHeight()
	family_heights_cm.push(187)


	def return_first(container: List[Any]) -> Any:
	return container[0]

	my_height: int = return_first(family_heights_cm) # OK

view raw using_any.py hosted with ❤ by GitHub

However, we lose introspection power - and the code is probably less safe.

A second approach would be to create ad-hoc classes that accept and work on one type only, but then we would potentially have as many classes as the type we have to deal with: not great.

Conclusion

I hope I convinced you by now that there actually are benefits in using generics even in a dynamically typed language such as Python.

The main advantages come in combination with the use of a static type checker: defining the right types, the type checker is able to find potential pitfalls in the code before it gets executed - think about the production environment!

Finally, generics can reduce duplicate code and, at the same time, provide a good source of introspection to the code. This is appreciated especially if your code will be read by many different people at many different points in time.

Wikipedia entry on Generic Programming

Divergent tinkering

Discussion about this post