Effective Python 36 - 40

Click here for the first post, which contains the context of this series.

Item #36: Consider itertools for working with iterators and generators.

These are the most important ones:
  • chain
  • repeat
  • cycle
  • tee
  • zip_longest
  • islice
  • takewhile
  • dropwhile
  • filterfalse
  • accumulate
  • product
  • permutations
  • combinations
  • combinations_with_replacement
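A quick sketch of a few of these in action:

```python
import itertools

# chain: concatenate iterables lazily
print(list(itertools.chain([1, 2], [3, 4])))  # [1, 2, 3, 4]

# islice: take a slice of an iterator without materializing it
print(list(itertools.islice(itertools.cycle('ab'), 5)))  # ['a', 'b', 'a', 'b', 'a']

# takewhile: consume items while a predicate holds
print(list(itertools.takewhile(lambda x: x < 3, [1, 2, 3, 1])))  # [1, 2]

# accumulate: running totals
print(list(itertools.accumulate([1, 2, 3, 4])))  # [1, 3, 6, 10]

# combinations from the combinatorics group
print(list(itertools.combinations('abc', 2)))  # [('a', 'b'), ('a', 'c'), ('b', 'c')]
```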

Item #37: Compose classes instead of nesting many levels of built-in types.

Consider

from collections import defaultdict


class Gradebook:
    def __init__(self):
        self._grades = {}

    def add_student(self, name):
        self._grades[name] = defaultdict(list)
    
    def report_grade(self, name, subject, score, weight, notes):
        self._grades[name][subject].append((score, weight, notes))
    
    def average_grade(self, name):
        return sum(sum(score * weight for score, weight, _ in grades) for grades in self._grades[name].values()) / len(self._grades[name])

Note that it nests dictionaries and long tuples, which quickly becomes confusing. Instead, do this:

from collections import defaultdict, namedtuple

Grade = namedtuple('Grade', 'score weight')


class Subject:
    def __init__(self):
        self._grades = []
    
    def report_grade(self, score, weight):
        self._grades.append(Grade(score, weight))
    
    def average_grade(self):
        return sum(grade.score * grade.weight for grade in self._grades)


class Student:
    def __init__(self):
        self._subjects = defaultdict(Subject)
    
    def get_subject(self, name):
        return self._subjects[name]
    
    def average_grade(self):
        return sum(subject.average_grade() for subject in self._subjects.values()) / len(self._subjects)


class Gradebook:
    def __init__(self):
        self._students = defaultdict(Student)
    
    def get_student(self, name):
        return self._students[name]

Although it is longer, it is easier to read and extend.

Item #38: Accept functions instead of classes for simple interfaces.

Python has first-class functions, which means that "functions and methods can be passed around and referenced like any other value in the language":

def my_key(x):
    return len(x)


my_list = ['Socrates', 'Archimedes', 'Plato', 'Aristotle']
my_list.sort(key=my_key)

But if you want the key function to maintain state across calls, you can use a class with a __call__ method:

class MyKey:
    def __init__(self):
        self.count = 0
    
    def __call__(self, x):
        self.count += 1
        return len(x)


my_key = MyKey()

my_list = ['Socrates', 'Archimedes', 'Plato', 'Aristotle']
my_list.sort(key=my_key)
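Since sort computes the key once per element, the callable's counter records exactly how many items were inspected. A small check, repeating the class above so the snippet runs standalone:

```python
class MyKey:
    def __init__(self):
        self.count = 0

    def __call__(self, x):
        self.count += 1
        return len(x)


my_key = MyKey()
my_list = ['Socrates', 'Archimedes', 'Plato', 'Aristotle']
my_list.sort(key=my_key)

print(my_key.count)  # 4: sort called the key once per element
print(my_list)       # ['Plato', 'Socrates', 'Aristotle', 'Archimedes']
```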

Item #39: Use @classmethod polymorphism to construct objects generically.

The following script shows the pattern: each subclass provides its own create_animals constructor, and @classmethod passes the subclass itself in as cls:

class Animal:
    def __init__(self, name):
        self.name = name
    
    def sound(self):
        raise NotImplementedError

    @classmethod
    def create_animals(cls):
        raise NotImplementedError


class Dog(Animal):
    def sound(self):
        return f'{self.name} says woof'
    
    @classmethod
    def create_animals(cls):
        return [cls(name) for name in ['Max', 'Buddy', 'Charlie']]


class Cat(Animal):
    def sound(self):
        return f'{self.name} says meow'
    
    @classmethod
    def create_animals(cls):
        return [cls(name) for name in ['Simba', 'Milo', 'Tiger']]


for animal in Dog.create_animals() + Cat.create_animals():
    print(animal.sound())

Item #40: Initialize parent classes with super.

Although multiple inheritance is best avoided, consider the following script, which exhibits diamond inheritance:

class MyBaseClass:
    def __init__(self, value):
        self.value = value


class TimesTwo(MyBaseClass):
    def __init__(self, value):
        MyBaseClass.__init__(self, value)
        self.value *= 2


class PlusFive(MyBaseClass):
    def __init__(self, value):
        MyBaseClass.__init__(self, value)
        self.value += 5


class MyClass(TimesTwo, PlusFive):
    def __init__(self, value):
        TimesTwo.__init__(self, value) # !
        PlusFive.__init__(self, value)


print(MyClass(3).value) # 8

This does not work as expected: the call to PlusFive.__init__ runs MyBaseClass.__init__ a second time, which resets self.value and wipes out the doubling from the indicated line, so the script prints 8 instead of combining both behaviors. The correct way to achieve this is to use super:

class MyBaseClass:
    def __init__(self, value):
        self.value = value


class TimesTwo(MyBaseClass):
    def __init__(self, value):
        super().__init__(value)
        self.value *= 2


class PlusFive(MyBaseClass):
    def __init__(self, value):
        super().__init__(value)
        self.value += 5


class MyClass(TimesTwo, PlusFive):
    def __init__(self, value):
        super().__init__(value)


print(MyClass(3).value) # 16: (3 + 5) * 2, each __init__ runs exactly once
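With super, the method resolution order guarantees that MyBaseClass.__init__ runs exactly once; you can inspect that order with mro(). A small check on the classes above:

```python
class MyBaseClass:
    def __init__(self, value):
        self.value = value


class TimesTwo(MyBaseClass):
    def __init__(self, value):
        super().__init__(value)
        self.value *= 2


class PlusFive(MyBaseClass):
    def __init__(self, value):
        super().__init__(value)
        self.value += 5


class MyClass(TimesTwo, PlusFive):
    def __init__(self, value):
        super().__init__(value)


# The MRO determines the order of the super() chain:
print([c.__name__ for c in MyClass.mro()])
# ['MyClass', 'TimesTwo', 'PlusFive', 'MyBaseClass', 'object']

# MyBaseClass sets 3, PlusFive adds 5, TimesTwo doubles: (3 + 5) * 2
print(MyClass(3).value)  # 16
```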

Effective Python 31 - 35


Item #31: Be defensive when iterating over arguments.

Consider

def normalize(X):
    s = sum(X)
    return [x / s for x in X]

normalize works as expected if X is a container but not if X is an iterator (such as a generator): sum(X) exhausts it, so the comprehension that follows sees no items. Address this by detecting iterators with iter(X) is X or isinstance(X, Iterator), where Iterator is imported from collections.abc.
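A minimal sketch of the defensive version of the function above:

```python
from collections.abc import Iterator


def normalize(X):
    if isinstance(X, Iterator):  # equivalently: iter(X) is X
        raise TypeError('Must supply a container, not an iterator')
    s = sum(X)
    return [x / s for x in X]


print(normalize([1, 1, 2]))  # [0.25, 0.25, 0.5]

try:
    normalize(x for x in [1, 1, 2])
except TypeError as e:
    print(e)  # Must supply a container, not an iterator
```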

Item #32: Consider generator expressions for large list comprehensions.

Let X be an extraordinarily large iterable. Then

for y in [f(x) for x in X]: pass

will load an extraordinarily large object into memory. On the other hand,

for y in (f(x) for x in X): pass

does not have this problem.

Item #33: Compose multiple generators with yield from.

def my_gen():
    yield from gen_1()
    yield from gen_2()
    yield from gen_3()

is shorthand for and performs better than

def my_gen():
    for i in gen_1():
        yield i
    for i in gen_2():
        yield i
    for i in gen_3():
        yield i
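A concrete, runnable version of the idea (gen_1 through gen_3 stand in for any generators; here they just wrap ranges):

```python
def gen_1():
    yield from range(2)      # 0, 1


def gen_2():
    yield from range(2, 4)   # 2, 3


def gen_3():
    yield from range(4, 6)   # 4, 5


def my_gen():
    yield from gen_1()
    yield from gen_2()
    yield from gen_3()


print(list(my_gen()))  # [0, 1, 2, 3, 4, 5]
```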

Item #34: Avoid injecting data into generators with send.

Consider

def double_inputs():
    while True:
        x = yield
        yield x * 2


gen = double_inputs()

next(gen)
print(gen.send(10))

next(gen)
print(gen.send(6))

next(gen)
print(gen.send(94.3))

>>>

20
12
188.6

Avoid doing this.

Item #35: Avoid causing state transitions in generators with throw.

Consider

def my_gen():
    i = 0
    while i < 10:
        try:
            i += 1
            yield i
        except GeneratorExit:
            return
        except BaseException:
            i = -1


it = my_gen()

print(next(it))
print(next(it))
print(next(it))
it.throw(BaseException())
print(next(it))

>>>

1
2
3
1

Avoid doing this.

Effective Python 26 - 30


Item #26: Define function decorators with functools.wraps.

Function decorators add functionality before and after the execution of the functions that they decorate. For example,

def fib(n):
    if n < 3:
        return 1
    return fib(n - 1) + fib(n - 2)

fib(500)

will probably never terminate. An idea is to cache:

cache = {}

def fib(n):
    if n in cache:
        return cache[n]
    if n < 3:
        return 1
    cache[n] = fib(n - 1) + fib(n - 2)
    return cache[n]

fib(500)

But what if you want to do this to multiple functions? An idea is to use decorators:

from functools import wraps

def my_cache(func):
    cache = {}
    @wraps(func)
    def wrapper(n):
        if n in cache:
            return cache[n]
        cache[n] = func(n)
        return cache[n]
    return wrapper

@my_cache # decorator
def fib(n):
    if n < 3:
        return 1
    return fib(n - 1) + fib(n - 2)

@my_cache
def new_fib(n):
    if n < 4:
        return 1
    return new_fib(n - 1) + new_fib(n - 2) + new_fib(n - 3)

fib(500)
new_fib(500)

You can apply more than one decorator to a function; note that they are applied bottom-up, so the decorator closest to the function wraps it first.
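A small sketch of stacking two decorators (shout and exclaim are hypothetical names) to show that the one closest to the function wraps first:

```python
from functools import wraps


def shout(func):
    @wraps(func)
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs).upper()
    return wrapper


def exclaim(func):
    @wraps(func)
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs) + '!'
    return wrapper


@shout
@exclaim
def greet(name):
    return f'hello {name}'


# exclaim wraps greet first, then shout wraps the result,
# so the '!' is appended before the upper-casing.
print(greet('world'))   # HELLO WORLD!
print(greet.__name__)   # greet, thanks to functools.wraps
```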

Item #27: Use comprehensions instead of map and filter.

Let

a = [1, 2, 3, 4, 5, 6, 7, 8, 9]

Suppose that you are doing this:

b = []
for c in a:
    if c % 2:
        b.append(c ** 2)

A better way is to use a comprehension:

b = [c ** 2 for c in a if c % 2]

Avoid using map and filter:

b = list(map(lambda c: c ** 2, filter(lambda c: c % 2, a)))

Comprehensions can also be used with dict and set.
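For instance, dict and set comprehensions use the same syntax with braces (continuing with the list a from above):

```python
a = [1, 2, 3, 4, 5, 6, 7, 8, 9]

# dict comprehension: map each odd number to its square
squares = {c: c ** 2 for c in a if c % 2}
print(squares)  # {1: 1, 3: 9, 5: 25, 7: 49, 9: 81}

# set comprehension: the distinct remainders modulo 3
remainders = {c % 3 for c in a}
print(remainders)  # {0, 1, 2}
```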

Item #28: Avoid more than two control sub-expressions in comprehensions.

You can stack comprehensions:

a = [1, 2, 3, 4, 5, 6, 7, 8, 9]
b = [[x, y] for x in a if x % 2 for y in a if not y % 2]

Avoid stacking more than two control sub-expressions, and note that the for clauses run left to right, like nested loops with the first for outermost.
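A small check that a stacked comprehension matches the equivalent nested loops, with the first for clause outermost:

```python
a = [1, 2, 3, 4]

comp = [[x, y] for x in a if x % 2 for y in a if not y % 2]

# Equivalent nested loops:
loops = []
for x in a:
    if x % 2:
        for y in a:
            if not y % 2:
                loops.append([x, y])

print(comp)           # [[1, 2], [1, 4], [3, 2], [3, 4]]
print(comp == loops)  # True
```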

Item #29: Avoid repeated work in comprehensions by using assignment expressions.

Consider

[expensive_function(x) for x in X if meets_condition(expensive_function(x))]

This is repeated work because expensive_function could be called twice for each x. Instead, use an assignment expression:

[result for x in X if meets_condition(result := expensive_function(x))]

Note that result leaks out of the scope of the comprehension.
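A small demonstration of the leak (expensive_function and meets_condition stand in for anything; here they are trivial):

```python
def expensive_function(x):
    return x * 10


def meets_condition(value):
    return value > 15


X = [1, 2, 3]
results = [result for x in X if meets_condition(result := expensive_function(x))]

print(results)  # [20, 30]
print(result)   # 30: the walrus target leaks into the enclosing scope
```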

Item #30: Consider generators instead of returning lists.

Consider a function that returns a list of anagrams of a word:

def get_anagrams(word, anagram='', is_free=None):
    if is_free is None:
        is_free = [True for _ in word]
    if not any(is_free):
        return [anagram]
    else:
        anagrams = []
        for i, _ in enumerate(is_free):
            if is_free[i]:
                is_free[i] = False
                anagrams += get_anagrams(word, anagram + word[i], is_free)
                is_free[i] = True
        return anagrams

Suppose that you want to print the first 10 anagrams of the word "incomprehensible":

for i, anagram in enumerate(get_anagrams('incomprehensible')):
    print(anagram)
    if i == 9:
        break

This will probably never terminate: get_anagrams('incomprehensible') attempts to build a list of all 16! ≈ 2.09e+13 anagrams of this word before the loop can even start. A generator does not have this problem:

def get_anagrams(word, anagram='', is_free=None):
    if is_free is None:
        is_free = [True for _ in word]
    if not any(is_free):
        yield anagram
    else:
        for i, _ in enumerate(is_free):
            if is_free[i]:
                is_free[i] = False
                yield from get_anagrams(word, anagram + word[i], is_free)
                is_free[i] = True

Effective Python 21 - 25


Item #21: Understand how closures interact with variable scope.

Suppose that you have a list L of numbers and a list G of important numbers. You want to sort L, giving priority to the numbers that are in G, and also want to know whether any number in L is in G:

def my_sort(L, G):
    flag = False
    def helper(x):
        if x in G:
            flag = True
            return (0, x)
        return (1, x)
    L.sort(key=helper)
    return flag

This sorts L as expected but returns False: the assignment flag = True inside helper creates a new local variable because of Python's scoping rules, leaving the outer flag untouched. The fix is the keyword nonlocal:

def my_sort(L, G):
    flag = False
    def helper(x):
        nonlocal flag
        if x in G:
            flag = True
            return (0, x)
        return (1, x)
    L.sort(key=helper)
    return flag

However, it is better practice to wrap state in a class:

class MySort:
    def __init__(self, G):
        self.G = G
        self.flag = False

    def __call__(self, x):
        if x in self.G:
            self.flag = True
            return (0, x)
        return (1, x)

def my_sort(L, G):
    sorter = MySort(G)
    L.sort(key=sorter)
    return sorter.flag

Item #22: Reduce visual noise with variable positional arguments.

Suppose that you have the following function:

def my_logger(message, items):
    return f'{message}{", ".join([str(x) for x in items])}'

numbers = [1, 2, 3]
print(my_logger('I like these numbers: ', numbers))
print(my_logger('I like no numbers.', []))

Passing an empty list is noisy. Instead, use variable positional arguments (*args):

def my_logger(message, *items):
    return f'{message}{", ".join([str(x) for x in items])}'

numbers = [1, 2, 3]
print(my_logger('I like these numbers: ', *numbers))
print(my_logger('I like no numbers.'))

Note that *numbers unpacks numbers into a tuple before the call, which means that if numbers were a massive generator, this would consume a large amount of memory.

Moreover, if you were to update the signature of my_logger to something like def my_logger(date, message, *items):, then not updating all of the calls to my_logger would introduce bugs that are hard to detect.

Item #23: Provide optional behavior with keyword arguments.

Note the use of **:

def flow_rate(weight_diff, time_diff, period=1, units_per_kg=1):
    return weight_diff * units_per_kg * period / time_diff

kwargs = {
    'weight_diff': 0.5,
    'time_diff': 3,
    'period': 3600,
    'units_per_kg': 2.2
}

print(flow_rate(**kwargs))

With optional arguments, do not do this: flow_rate(0.5, 3, 3600, 2.2). Instead, do this: flow_rate(0.5, 3, period=3600, units_per_kg=2.2).

Item #24: Use None and docstrings to specify dynamic default arguments.

Suppose that you run the following script:

def append_zero(x=[]):
    x.append(0)
    return x

a = append_zero()
b = append_zero()

It turns out that a and b are the same list, so both look like [0, 0]. This is because a default argument value is evaluated only once, when the function is defined, so every call that omits x shares the same list object.

Initialize keyword arguments that have dynamic values with None, and document this in the docstring:

def append_zero(x=None):
    '''Append a zero to a list.

    Args:
        x: list. Defaults to an empty list.
    '''
    if x is None:
        x = []
    x.append(0)
    return x

Item #25: Enforce clarity with keyword-only and positional-only arguments.

Suppose that you have the following division function:

def safe_division(number, divisor, ignore_overflow, ignore_zero_division):
    try:
        return number / divisor
    except OverflowError:
        if ignore_overflow:
            return 0
        else:
            raise
    except ZeroDivisionError:
        if ignore_zero_division:
            return float('inf')
        else:
            raise

safe_division(12, 3, True, False)

This is noisy. An improvement would be to change the signature to

def safe_division(number, divisor, ignore_overflow=False, ignore_zero_division=False):
    # ...

safe_division(12, 3, ignore_overflow=True)

The problem is that this is still possible:

safe_division(12, 3, True, False)

Keyword-only arguments cannot be passed by position:

def safe_division(number, divisor, *, ignore_overflow=False, ignore_zero_division=False):
    # ...

safe_division(12, 3, True, False) # This raises a TypeError.

Now, suppose that we change the signature to

def safe_division(numerator, denominator, *, ignore_overflow=False, ignore_zero_division=False):
    # ...

Then this name change could break multiple existing calls to the function.

Positional-only arguments cannot be passed by keyword:

def safe_division(numerator, denominator, /, *, ignore_overflow=False, ignore_zero_division=False):
    # ...

safe_division(numerator=10, denominator=2) # This raises a TypeError.