Published on

Sequences

Authors

I recently started to re-learn parts of Python, and wanted to write this shorts blog post on a few interesting things I have learnt.

Lists

Python lists and numpy arrays are both used to store collections of items, but they have some important differences. Every Python object in memory has metadata attached to it. Alongside the header (i.e. the location), Python stores the object's reference count, a pointer to the object's type and a C double holding the value.

Lists can perform mathematical operations, but they are not optimized for this task. Numpy arrays, on the other hand, are specifically designed for mathematical operations, and they provide a wide range of built-in functions and methods for performing mathematical operations on arrays.

Meanwhile, an array of floats (for example, numpy arrays), only stores the value of the contents, and not as a sequence of Python objects. This is why arrays are much more compact and utilises less memory space.

Performing a concatenation, such as a += b, and depending on the object type, would either perform a.extend(b) for mutable object, or a = a + b for immutable objects. This sort of augmented assignment (and others such as *= and -=) is only possible of the objects has an __iadd__ ("in-place addition") method otherwise Python falls back to __add__ method.

Tuples

Repeated augmented assignment for immutable sequences (tuples) is inefficient because the items are not appending to the end of the item but a copy of the target sequence is made with the new items concatenated.

Another type of sequence in Python is the tuple. Tuples are similar to lists but they are immutable, meaning that once they are created, you cannot change the items inside them. So try answer the question below where we use augmented assignment.

Question:

What happens when we run the below python code?

>>> t = (0, 1, [30, 40])
>>> t[2] += [50, 60]

Answers?

A: No issues.

>>> print(t)
(1, 2, [30, 40, 50, 60])

B: Type Error

>>> print(t)
TypeError: 'tuple' object does not support item assignment

C:. Neither of the above.

D: Both 1 and 2.

The answer is actually both A and B. This occurs because when we access the object within the tuple it is a list which is mutable, however, when we assign this new list to the immutable tuple object Python realises that this is an immutable object, hence, fails.

Dictionaries

Dictionaries are really interesting. The key value pair object must have keys which are hashable. Implementing __hash__ and __eq__ methods. Python has extremely fast access to millions of keys because it directly computes the hash code of the key alongside an index offset to locate the matching entry in a few number of attempts.

Key ordering is now part of Python 3.7 onwards. Dictionaries have significant memory overhead since a hash table needs to be created with Python keeping at least one-third of the row empty to remain efficient. In case you are wondering what the most compact data structure is, it is a tuple. Which is a container with an array of pointers to the items.

When creating an instance attribute, avoid doing so outside of the __init__ method because another dict needs to be created on top of the existing instance - further increasing the memory requirement since hash tables have empty rows.

Creating an instance attribute, with attributes only being created within __init__, it is stored within a special __dict__ attribute and is attached to each instance. When multiple instances are created a common hash table is shared by the __dict__. PEP 412 says that this optimization reduces memory by 10% to 20% for object orientated programs.

Sets

Sets are amazing.

Sets operations are faster, simpler and offer faster solutions to many tasks. Set classes have an intuitive API design and can be used for many applications. We can use operator overloading to simplify our code to make it shorter and faster.

The methods that sets have prevent unnecessary copies of iterables, and for loops. The set operations and methods either produce a new set or update the target set in place (if it's mutable).

I will continue to add to this blog as more interesting facts come along.

Further Reading: