Python containers explained

Every programming language provides types that group multiple values together; these are called containers. Each language uses its containers differently, depending on historical availability, design choices, simplicity of implementation, etc.

For example, in C the most used container is, for historical reasons, the C array, which in the end is handled as just a pointer to the memory allocated for a contiguous sequence of one particular type of data. This approach is closely tied to the fact that C was created as a low-level language (meaning very close to how the processor and memory actually work in a typical computer).

Not all languages are organised the same way. In this document we will look at the different containers available in Python and explain the idiomatic Python ways of using them.

Mutable vs Immutable containers

A very important distinction between the different containers is whether they are mutable or immutable. This concept is central to functional programming, which is one of the paradigms supported by the Python programming language.

  • The contents of a mutable container can be modified in place.
  • The contents of an immutable container can’t be modified in place. It is still possible to assign a new container holding the modified values to the same variable.

Example of mutable container: in a list you can modify any element:

>>> grades = ['a', 'b', 'c', 'd']
>>> print(grades)
['a', 'b', 'c', 'd']
>>> grades[1] = 'd'
>>> print(grades)
['a', 'd', 'c', 'd']

Example of immutable container: in a tuple you can’t modify an element:

>>> grades = ('a', 'b', 'c', 'd')
>>> print(grades)
('a', 'b', 'c', 'd')
>>> grades[1] = 'd'  # modification of the elements in the tuple is impossible
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'tuple' object does not support item assignment
>>> grades = ('a', 'd', 'c', 'd')  # reassignment is possible though
>>> print(grades)
('a', 'd', 'c', 'd')
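
The difference between modifying a container in place and assigning a new one becomes visible when two names refer to the same object. The following is a minimal sketch; the variable names are made up for the illustration.

```python
# Two names bound to the same list: a mutation is visible through both.
first = ['a', 'b', 'c']
second = first            # no copy, both names refer to one list object
second[1] = 'z'           # in-place modification
print(first)              # ['a', 'z', 'c']
print(first is second)    # True

# Reassignment binds the name to a new object and leaves the old one alone.
original = ('a', 'b', 'c')
alias = original
alias = ('a', 'z', 'c')   # new tuple; `original` is untouched
print(original)           # ('a', 'b', 'c')
print(original is alias)  # False
```

In other words, assigning to a variable never changes the object it used to refer to; only in-place operations on a mutable container do.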

Immutable is not the same as constant

Immutable containers are not constant: if an element of an immutable container is itself a mutable container, it is perfectly possible to change the content of that mutable container, since this does not reassign the element.

>>> mutable_or_not = ([1, 2, 3], [4, 5, 6])
>>> mutable_or_not[0] = [7, 8, 9]  # tuple is immutable but...
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'tuple' object does not support item assignment
>>> mutable_or_not[0][0] = 7  # ...the list in the tuple is mutable!
>>> print(mutable_or_not)
([7, 2, 3], [4, 5, 6])

Important

Most of the time immutable containers are more memory efficient, but every modification requires building a new container and assigning it again. So choose wisely.
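
As a rough illustration of this trade-off, the sketch below compares the footprint of a list and a tuple holding the same values. It assumes CPython, where sys.getsizeof() reports the size of the container itself (not of its elements); the exact numbers vary between versions and platforms, so none are shown.

```python
import sys

values_list = [1, 2, 3, 4, 5]
values_tuple = (1, 2, 3, 4, 5)

# getsizeof() measures the container only, not the objects it refers to.
list_size = sys.getsizeof(values_list)
tuple_size = sys.getsizeof(values_tuple)
print(f"list: {list_size} bytes, tuple: {tuple_size} bytes")

# "Modifying" the tuple means building a new one and assigning it again.
values_tuple = values_tuple + (6,)
print(values_tuple)  # (1, 2, 3, 4, 5, 6)
```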

Important

Only fully immutable containers (immutable, with all their elements themselves immutable, and so on recursively) are hashable, meaning that they can be used as a key in a dictionary (or as an element of a set).

That’s because the hash of a mutable container would vary with the value of its elements… so it would need to be re-indexed every time one of its elements is altered. That would be very complicated and computationally intensive.
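
The rule can be checked with the built-in hash() function. In the sketch below, is_hashable is a hypothetical helper written for this illustration, not a standard function.

```python
def is_hashable(value):
    """Return True if hash() accepts the value (hypothetical helper)."""
    try:
        hash(value)
    except TypeError:
        return False
    return True

print(is_hashable((1, 2)))             # True: tuple of ints
print(is_hashable(((1, 2), "a")))      # True: immutable all the way down
print(is_hashable([1, 2]))             # False: lists are mutable
print(is_hashable(([1, 2], 3)))        # False: the tuple contains a list

# A fully immutable tuple works as a dictionary key.
distances = {(0, 0): 0.0, (3, 4): 5.0}
print(distances[(3, 4)])               # 5.0
```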

Built-in containers

Built-in containers are the ones available in Python without any import.

List

This is the most used container in Python. Most beginners think of it as an array because it is easily indexable, and they are close to the truth: in CPython a list is a dynamic array of references to objects, not a linked list.

>>> fruits = ["apple", 24, 1.234, "bananana"]  # (1)
>>> fruits[3] = "banana"  # (2)
>>> fruits
['apple', 24, 1.234, 'banana']
>>> fruits[3]  # (3)
'banana'
>>> fruits.append("pineapple")  # (4)
>>> fruits.insert(4, "strawberry")
>>> fruits.remove(1.234)
>>> fruits.pop(1)  # remove an element and return it
24
>>> fruits
['apple', 'banana', 'strawberry', 'pineapple']
>>> fruits[-1]  # (5)
'pineapple'
>>> fruits[1:]
['banana', 'strawberry', 'pineapple']
>>> fruits[::2]
['apple', 'strawberry']
>>> breakfast, lunch, snack, dinner = fruits  # (6) unpacking
>>> for fruit in fruits:  # (7)
...     print(fruit)
...
apple
banana
strawberry
pineapple
>>> for index, fruit in enumerate(fruits):  # (8)
...     print(f"{fruit} is at index {index}")
...
apple is at index 0
banana is at index 1
strawberry is at index 2
pineapple is at index 3
>>> "strawberry" in fruits  # (9)
True
>>> "watermelon" in fruits
False
(1) Lists can have elements of different types in the same list.
(2) Lists are mutable, so you can modify elements directly.
(3) Lists are indexed with ints starting from 0.
(4) You can insert or delete elements anywhere (at the beginning, at the end or anywhere inside the list). Appending at the end is cheap (amortised constant time), but inserting or deleting elsewhere shifts all the following elements.
(5) Lists can be manipulated using the common sequence operations.
(6) You can unpack the values of a list into several variables at once; the number of variables must match the number of elements.
(7) You can loop through the elements of a list using a for ... in ... loop.
(8) You can loop through the elements and their indexes using the enumerate() function.
(9) To test if an element is present in a list, use the in operator.
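
In CPython a list keeps some spare capacity, which is why appending is usually cheap. A rough way to observe this, assuming CPython's sys.getsizeof(), is to record the size of a list after each append: the size grows in occasional jumps rather than at every step. Exact sizes depend on the interpreter version, so the sketch prints only counts.

```python
import sys

numbers = []
sizes = []
for value in range(20):
    numbers.append(value)
    sizes.append(sys.getsizeof(numbers))

# Fewer distinct sizes than appends: some appends reused spare capacity.
print(f"{len(sizes)} appends, {len(set(sizes))} distinct sizes")
```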

Lists are used everywhere because they are very versatile. But often other containers are better suited or optimised for a given task: tuple, set, array, numpy.ndarray, etc.

Dict

This is the second most used container in Python. Technically it’s a highly optimised general-purpose hash map (mapping): each element, called an item, is the association of a key and a value. You can use the key as an index to get the value, but not the other way around.

>>> grades = {"bob": 10, "sam": "no grade", "the dog": 10}  # (1)
>>> grades["sam"] = 6  # (2)
>>> grades["sam"]  # (3)
6
>>> grades["john"] = 7  # (4)
>>> del grades["the dog"]
>>> grades
{'bob': 10, 'sam': 6, 'john': 7}
>>> for name in grades.keys():  # (5)
...     print(name)
...
bob
sam
john
>>> for grade in grades.values():  # (6)
...     print(grade)
...
10
6
7
>>> for name, grade in grades.items():  # (7)
...     print(f"student {name} got {grade}/10")
...
student bob got 10/10
student sam got 6/10
student john got 7/10
>>> "bob" in grades  # (8)
True
>>> "elvis" in grades
False
(1) Dicts can have values of different types in the same dict. And dicts can have keys of different types in the same dict.
(2) Dicts are mutable, so you can modify elements directly.
(3) Dicts are indexed with their keys, and those keys are unique. Anything hashable can be used as a key, not only strings, and there is no limitation on what can be used as a value.
(4) You can insert or delete elements cheaply (constant time on average).
(5) You can access the keys through the keys() method.
(6) You can access the values through the values() method.
(7) You can access (key, value) tuples through the items() method.
(8) To test if a key is present in a dict, use the in operator.

Warning

Since Python 3.7, dicts are guaranteed by the language to keep their items in insertion order. In CPython 3.6 this was only an implementation detail, and earlier versions did not preserve it, so do not rely on that order if your code must run on older versions.

Dicts are used everywhere because they are very versatile. But sometimes other containers are better suited or optimised for a given task: sets, counters, multidict, bag, etc.
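
To illustrate that any hashable object can be a key and that lookups only go from key to value, here is a small sketch. The chess-board data is invented for the example.

```python
# Tuples of coordinates are hashable, so they can be used as keys.
board = {(0, 0): "rook", (0, 1): "knight"}
board[(7, 4)] = "king"
print(board[(0, 1)])               # knight

# Indexing a missing key raises KeyError; get() returns a default instead.
print(board.get((3, 3), "empty"))  # empty

# There is no value -> key index: a reverse lookup has to scan the items.
squares = [square for square, piece in board.items() if piece == "king"]
print(squares)                     # [(7, 4)]
```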

Tuple

This is the third most used container in Python. Tuples are very similar to lists except for their mutability. They are also very close to the concept of C structs, except they don’t have named attributes (for that see namedtuples). Tuples are more memory efficient than lists and should be used instead of them when mutability is not needed.

>>> animals = ("tiger", 1.234, "bear")  # (1)
>>> animals[1] = "dolphin"  # (2)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'tuple' object does not support item assignment
>>> animals = ("tiger", "dolphin", "bear")  # (3) modification by replacement
>>> animals
('tiger', 'dolphin', 'bear')
(1) Tuples can have values of different types in the same tuple.
(2) Tuples are immutable: you can’t replace their elements in place, and you can’t add or remove elements either.
(3) If you want to modify a tuple, you have to create a new one to replace the old one.

For all other purposes, tuples behave just like lists: indexing, loops, etc.
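
As a quick check, the read-only operations shown earlier for lists work unchanged on a tuple. A minimal sketch:

```python
animals = ("tiger", "dolphin", "bear")

print(animals[0])           # tiger
print(animals[-1])          # bear
print(animals[1:])          # ('dolphin', 'bear')
print("bear" in animals)    # True

for index, animal in enumerate(animals):
    print(f"{animal} is at index {index}")

# Concatenation does not modify `animals`, it builds a new tuple.
more_animals = animals + ("wolf",)
print(more_animals)         # ('tiger', 'dolphin', 'bear', 'wolf')
```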

The number one usage for tuples is in functions to return multiple values at once:

>>> def get_position() -> tuple[int, int]:
...     return 2, 3  # implicit tuple
...
>>> pos = get_position()  # Store result in a tuple
>>> x, y = get_position()  # Store result in 2 variables using unpacking

Because the only way to access the elements of a tuple is by integer index, it is sometimes preferable to use more advanced containers like namedtuple and dataclass, which serve a very similar purpose.
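
As a brief sketch of that alternative, the position example could return a namedtuple from the standard collections module. The Position type and its field names are assumptions made for this illustration.

```python
from collections import namedtuple

# Hypothetical record type: a tuple whose elements also have names.
Position = namedtuple("Position", ["x", "y"])

def get_position() -> Position:
    return Position(x=2, y=3)

pos = get_position()
print(pos.x, pos.y)   # 2 3  (access by name)
print(pos[0])         # 2    (int indexing still works)
x, y = pos            # unpacking still works
print(pos)            # Position(x=2, y=3)
```

A namedtuple is still a tuple, so it stays immutable and hashable while making the call site easier to read.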

Set

WIP

Array

WIP

Standard library containers

Named tuples

WIP

Dataclass

WIP

Frozenset

WIP

Counter

WIP

External library containers

Numpy.ndarray

WIP

Bag

WIP

Multidict

WIP