Pythonic Code: Best Practices to Make Your Python More Readable

All veteran Python developers (Pythonistas) preach about writing Pythonic code. If you’ve spent some time writing Python, you have probably come across these best practices. But what exactly is Pythonic code, and how can you remember the major pain points and avoid the obvious bad practices?

Fortunately, the Python community is blessed with a relatively simple and complete set of code style guidelines and “Pythonic” idioms. These are among the key reasons Pythonic code is so readable. Readability and simple syntax are at the heart of Python.

In this post, we are going to talk about a few very important style guidelines and Pythonic idioms, and how to deal with legacy code.

One Statement of Code per Line

If you’re writing disjointed statements on a single line, you’re violating the essence of Python. The only exceptions are list comprehensions and a few other compound statements, which are accepted for their brevity and expressiveness.

Bad practice

print('foo'); print('bar')

if x == 1: print('foo')

if <complex comparison> and <other complex comparison>:
    # do something

Best practice

print('foo')
print('bar')

if x == 1:
    print('foo')

cond1 = <complex comparison>
cond2 = <other complex comparison>
if cond1 and cond2:
    # do something
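
As a runnable illustration of naming each condition, here is a minimal sketch. The function shipping_label, its parameters, and the thresholds are hypothetical and exist only to show the layout.

```python
def shipping_label(weight_kg, distance_km, is_member):
    """Pick a shipping label using made-up rules (illustration only)."""
    # Each complex comparison gets its own line and a descriptive name.
    is_small_parcel = weight_kg < 2.0 and distance_km < 100
    qualifies_for_perk = is_member or weight_kg < 0.5

    label = "standard"
    if is_small_parcel and qualifies_for_perk:
        label = "free"
    return label


print(shipping_label(1, 50, True))
print(shipping_label(5, 50, True))
```

Naming the conditions also makes each one easy to inspect in a debugger or to log on its own.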

Explicit code

The simplest (and easiest to understand) way of writing code is always the best.

Bad practice

def make_complex(*args):
    x, y = args
    return dict(**locals())

Calling make_complex(1, 2) returns:
{'args': (1, 2), 'x': 1, 'y': 2}
This is redundant: we needed just x and y, but the result contains 'args' as well.
Also, if we had:

def make_complex(*args):
    x, y = args
    z = x + y
    return dict(**locals())

The above function will also return 'z': 3 in the locals dict. Best practice is to specify what’s needed and follow the most straightforward approach. Remember to keep things simple and explicit.
Another pitfall of this bad practice is that if you pass more than two arguments when calling the function, as in make_complex(1, 2, 3), the unpacking fails and Python raises a ValueError (too many values to unpack).

Best practice

def make_complex(x, y):
    return {'x': x, 'y': y}
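
To compare the two versions side by side, the sketch below runs both. The name make_complex_implicit is made up here so that both functions can live in one file.

```python
def make_complex_implicit(*args):
    # The "bad practice" version: returns every local name, wanted or not.
    x, y = args
    return dict(**locals())


def make_complex(x, y):
    # The explicit version: returns exactly what the caller asked for.
    return {'x': x, 'y': y}


implicit_result = make_complex_implicit(1, 2)
explicit_result = make_complex(1, 2)
print(implicit_result)
print(explicit_result)

# A third argument breaks the tuple unpacking in the implicit version.
try:
    make_complex_implicit(1, 2, 3)
except ValueError as error:
    unpack_error = str(error)
    print("ValueError:", unpack_error)
```

With the explicit signature, a wrong number of arguments is reported at the call itself as a TypeError, which points more directly at the mistake.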

Passing args to Functions

There are four different ways of passing arguments to a function:

  1. Positional arguments: These are the simplest form of arguments. Positional arguments are fully part of the function’s meaning, and their order will be the order in which they are defined. For instance, in cal_area(length, breadth) or send_msg(message, recipient), the developer doesn’t have to worry about remembering that these two functions require 2 arguments, or their order.

Note: In the above two examples, you can also call the functions with the arguments in a different order by using keywords, like cal_area(breadth=40.0, length=90) or send_msg(recipient='Mak', message='Hello there!').

  2. Keyword arguments: Also known as kwargs, these are often used for optional parameters passed to the function. When a function has more than two or three positional parameters, its signature gets difficult to remember.
    Kwargs come in handy with default values. For example, a better way of writing the send_msg function would be: send_msg(message, recipient, cc=None, bcc=None). Here, cc and bcc are optional and default to None if no value is passed.

  3. Arbitrary argument list: If the business logic of the function requires an extensible number of positional arguments, it can be defined with the *args construct. Inside the function, args will be a tuple of all the remaining positional arguments. For example, send_msg(message, *args) can be called with each recipient as an argument:
    send_msg('Hello there!', 'God', 'Mom', 'Cthulhu'), and the function scope will have args equal to ('God', 'Mom', 'Cthulhu').

  4. Arbitrary keyword argument dictionary: If your function requires an undetermined series of named arguments, it is possible to use the **kwargs construct. In the function body, kwargs will be a dictionary of all the passed named arguments that have not been caught by other keyword arguments in the function signature.
    With *args, Python passes a variable number of non-keyword arguments to the function. With **kwargs, we can pass a variable number of keyword arguments in the same way.
    For example, for the function below:

def introduction(**data):
    print("\nData type of argument:", type(data))
    for key, value in data.items():
        print("{} is {}".format(key, value))

introduction(Firstname="Sita", Lastname="Sharma", Age=22, Phone=1234567890)
introduction(Firstname="John", Lastname="Wood", Email="[email protected]", Country="Wakanda", Age=25, Phone=9876543210)

The output will be:

Data type of argument: <class 'dict'>
Firstname is Sita
Lastname is Sharma
Age is 22
Phone is 1234567890

Data type of argument: <class 'dict'>
Firstname is John
Lastname is Wood
Email is [email protected]
Country is Wakanda
Age is 25
Phone is 9876543210

Note: The same caution as for arbitrary argument lists is necessary. The reasons are similar: these powerful techniques are only to be used when there is a proven necessity, and should not be used if the simpler and clearer construct is sufficient to express the function’s intention.

If the coding style guide is followed wisely, your Python functions will be:

  • easy to read (the name and arguments need no explanations)
  • easy to change (adding a new keyword argument does not break other parts of the code)
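
The four kinds of arguments described above can be combined in a single signature. The following is a minimal sketch: this send_msg only builds a dictionary and sends nothing, and the header name priority is an invented example.

```python
def send_msg(message, *recipients, cc=None, **headers):
    """Describe a message as a dict (hypothetical helper, nothing is sent)."""
    return {
        "message": message,        # positional argument
        "recipients": recipients,  # tuple of the remaining positional arguments
        "cc": cc,                  # keyword argument with a default value
        "headers": headers,        # dict of any other keyword arguments
    }


envelope = send_msg("Hello there!", "Mom", "Cthulhu", cc="God", priority="high")
print(envelope)
```

Because cc is declared after *recipients, it can only be passed by keyword, which keeps call sites self-describing.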

Return Statements

As a function grows in complexity, it becomes susceptible to having multiple return statements inside the function’s body. However, in order to keep a clear intent and a sustained readability level, it is preferable to avoid returning meaningful values at multiple output points in the function body.

For instance, take a look at the example below (explained by the inline comments) on how to avoid adding multiple output points and raise exceptions instead:

Bad practice

def complex_function(a, b, c):
  if not a:
    return None
  if not b:
    return None
  # Some complex code trying to compute x from a, b and c
  if x:
    return x
  if not x:  
    # Some Plan-B computation of x
    return x

Best practice

def complex_function(a, b, c):
  if not a or not b or not c:
    raise ValueError("The args can't be None")  # Raising an exception is better

  # Some complex code trying to compute x from a, b and c
  # Resist temptation to return x if succeeded
  if not x:
    # Some Plan-B computation of x
    pass

  return x  # One single exit point for the returned value x will help when maintaining the code.
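
For a version of this pattern that actually runs, consider the sketch below. The function parse_port and its fallback rule are hypothetical and chosen only to show one exit point plus an exception for invalid input.

```python
def parse_port(text, default=8080):
    """Return a port number parsed from text, or the default (illustration only)."""
    if text is None:
        # Invalid input is signalled with an exception, not a silent None.
        raise ValueError("text can't be None")

    port = None
    stripped = text.strip()
    if stripped.isdigit():
        port = int(stripped)

    # Plan B: fall back to the default when parsing failed or is out of range.
    if port is None or not 0 < port < 65536:
        port = default

    return port  # the single exit point for the computed value


print(parse_port(" 443 "))
print(parse_port("not a port"))
```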

Writing Idiomatic Python

An idiom is a phrase that doesn’t make literal sense, but makes sense once you’re acquainted with the culture in which it arose. Programming idioms are no different. They are the little things you do daily in a particular programming language or paradigm that only make sense to a person familiar with its culture.

Python beginners are often unaware of these idioms, so we’ve listed some common ones:

Unpacking

If you know the length of a list or tuple, you can assign names to its elements with unpacking. For example, since enumerate() provides a tuple of two elements for each item in a list:

for index, item in enumerate(some_list):
    # do something with index and item

You can use unpacking to swap variables too:

a, b = b, a

Nested unpacking works too:

a, (b, c) = 1, (2, 3)

In Python 3, PEP 3132 introduced extended unpacking:

a, *rest = [1, 2, 3]
# a = 1, rest = [2, 3]
a, *middle, c = [1, 2, 3, 4]
# a = 1, middle = [2, 3], c = 4
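
These unpacking forms also combine. Here is a small sketch with made-up records that uses enumerate(), nested unpacking, and extended unpacking together:

```python
# Made-up sample data: (name, (age, city)) pairs.
records = [("alice", (34, "Lisbon")), ("bob", (27, "Oslo"))]

lines = []
# enumerate() yields (index, item); the item is unpacked in the same statement.
for index, (name, (age, city)) in enumerate(records, start=1):
    lines.append(f"{index}. {name}, {age}, {city}")

# Extended unpacking: the first element, then everything else as a list.
first, *others = lines
print(first)
print(others)
```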

Creating throwaway variables

If you need to assign something (for instance, in unpacking), but will not need that variable, use __:

filename = 'foobar.txt'
basename, __, ext = filename.rpartition('.')

Note:
Many Python style guides recommend the use of a single underscore _ for throwaway variables rather than the double underscore __ recommended here. The issue is that _ is commonly used as an alias for the gettext() function, and is also used at the interactive prompt to hold the value of the last operation.

Using a double underscore instead is just as clear and almost as convenient. The benefit of this practice is eliminating the risk of accidentally interfering with either of these other use cases.

Create a length-N list of the same thing

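Use the Python list * operator to create a list that repeats the same immutable item. Be careful with nested lists: * repeats references to the same inner list rather than copying it, so changing one inner list changes them all.

nones = [None]*4
fours_of_fours = [[4]]*5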

Output:
[None, None, None, None]
[[4], [4], [4], [4], [4]]
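
One caveat is easy to verify for yourself: the * operator copies references, not the inner lists. The sketch below (variable names are illustrative) contrasts it with a list comprehension, which builds a fresh inner list on every iteration.

```python
shared_rows = [[4]] * 3          # three references to ONE inner list
shared_rows[0].append(99)        # the change shows up in every "row"

independent_rows = [[4] for __ in range(3)]   # three separate inner lists
independent_rows[0].append(99)   # only the first row changes

print(shared_rows)
print(independent_rows)
```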

Search for an item in a collection

Sometimes we need to search through a collection. Let’s look at two options: lists and sets. Take the following code for example:

from timeit import timeit

def in_test(iterable):
    for i in range(1000):
        if i in iterable:
            pass

timeit(
  "in_test(iterable)",
  setup="from __main__ import in_test; iterable = set(range(1000))",
  number=10000)

Output: 0.5591847896575928

timeit(
  "in_test(iterable)",
  setup="from __main__ import in_test; iterable = list(range(1000))",
  number=10000)

Output: 50.18339991569519

timeit(
  "in_test(iterable)",
  setup="from __main__ import in_test; iterable = tuple(range(1000))",
  number=10000)

Output: 51.597304821014404

All three calls run the same function, yet the timings differ widely, because sets in Python are hash tables. A membership test on a set takes O(1) time on average, whereas a list or tuple takes O(n).

To determine whether an item is in a list, Python will have to go through each item until it finds a matching item. This is time consuming, especially for long lists. In a set, on the other hand, the hash of the item will tell Python where in the set to look for a matching item. As a result, the search can be done quickly, even if the set is large.

Because of these differences in performance, it is often a good idea to use sets or dictionaries instead of lists in cases where:

  • the collection will contain a large number of items
  • you will be repeatedly searching for items in the collection
  • you do not have duplicate items
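
In practice this often means converting a list to a set once, before a run of lookups. A minimal sketch follows; the names and sizes are arbitrary, and the timings will vary by machine, so none are quoted here.

```python
from timeit import timeit

allowed_ids_list = list(range(10_000))
allowed_ids_set = set(allowed_ids_list)   # one-time conversion cost
requested_ids = [5, 9_999, 10_000, -1]

# Both containers give the same answers; only the lookup cost differs.
found_in_list = [item for item in requested_ids if item in allowed_ids_list]
found_in_set = [item for item in requested_ids if item in allowed_ids_set]
print(found_in_list == found_in_set)

# Worst case for the list: the item sits at the very end.
list_seconds = timeit(lambda: 9_999 in allowed_ids_list, number=1_000)
set_seconds = timeit(lambda: 9_999 in allowed_ids_set, number=1_000)
print(f"list: {list_seconds:.4f}s, set: {set_seconds:.4f}s")
```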

Access a Dictionary Element

Don’t use the dict.has_key() method; it was removed in Python 3.x. Instead, use the x in d syntax, or pass a default argument to dict.get(), both of which are more Pythonic.

Note: Python 2 reached its end of life on January 1, 2020. It is advised to use Python 3.x for any sort of development, as most Python packages have stopped releasing updates for Python 2.x.

Bad practice

d = {'foo': 'bar'}
if d.has_key('foo'):
    print d['foo']    # prints 'bar'
else:
    print 'default_value'

Best practice

d = {'foo': 'bar'}

print(d.get('foo', 'default_value'))  # prints 'bar'
print(d.get('thingy', 'default_value'))  # prints 'default_value'

# alternative
if 'foo' in d:
    print(d['foo'])
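
The same two idioms in a slightly more realistic shape, as a sketch. The settings dictionary and its keys are hypothetical.

```python
settings = {"theme": "dark"}   # hypothetical configuration

# dict.get() returns the stored value, or the fallback when the key is missing.
theme = settings.get("theme", "light")
language = settings.get("language", "en")

# Use "in" when the two branches do different things.
if "theme" in settings:
    message = f"theme is {settings['theme']}"
else:
    message = "no theme configured"

print(theme, language, message)
```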

Filtering a List

Never remove items from a list while you are iterating over it. Why? Removing an item shifts every later item one position to the left, so the loop silently skips the element that follows each removal, which can lead to subtle, disastrous bugs.

Bad practice

# Filter elements greater than 2
num_list = [1, 2, 3]
for i in num_list:
    if i > 2:
        num_list.remove(i)

Don’t make multiple passes through the list.

while i in num_list:
    num_list.remove(i)

Best practice

Use a list comprehension or generator expression:

# comprehensions create a new list object
filtered_values = [value for value in sequence if value != x]

# generators don't create another list
filtered_values = (value for value in sequence if value != x)
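
To see why removing during iteration is risky, the sketch below runs both approaches on the same made-up numbers. The in-loop removal leaves behind items that should have been filtered out.

```python
numbers = [1, 2, 3, 4, 5, 6]

# Risky: removing while iterating skips the item after each removal.
mutated = list(numbers)
for value in mutated:
    if value > 2:
        mutated.remove(value)

# Safe: build a new list that keeps only the wanted items.
filtered = [value for value in numbers if value <= 2]

print(mutated)
print(filtered)
```

The loop removes 3 and 5 but never visits 4 and 6, because each removal shifts them into a position the iterator has already passed.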

Updating Values in a List

Remember that assignment never creates a new object. If two or more variables refer to the same list, changing one of them changes them all.

Bad practice

# Add three to all list members.
a = [3, 4, 5]
b = a                     # a and b refer to the same list object

for i in range(len(a)):
    a[i] += 3             # b[i] also changes

Best practice

a = [3, 4, 5]
b = a

# assign the variable "a" to a new list without changing "b"
a = [i + 3 for i in a]
b = a[:]  # to copy a list instead of aliasing it, take a slice: b is now a separate object
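
The difference between an alias and a copy can be checked directly. A short sketch with illustrative names:

```python
original = [3, 4, 5]
alias = original        # a second name for the same list object
snapshot = original[:]  # a shallow copy: a new list with the same items

original[0] = 99        # visible through alias, not through snapshot

# Building a new list leaves every existing list untouched.
shifted = [value + 3 for value in snapshot]

print(alias, snapshot, shifted)
```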

Read From a File

Use the with open syntax to read from files. This will automatically close files for you.

Bad practice

f = open('file.txt')
a = f.read()
print(a)
f.close()

Best practice

with open('file.txt') as f:
    for line in f:
        print(line)

The with statement is better because it ensures you always close the file, even if an exception is raised inside the block.
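
A self-contained sketch of the same idea: it writes a small temporary file first, so it runs anywhere. The file contents are made up.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "file.txt")

    # Write some sample content so there is something to read back.
    with open(path, "w", encoding="utf-8") as handle:
        handle.write("first line\nsecond line\n")

    # Iterate over the file line by line; the file closes when the block ends.
    with open(path, encoding="utf-8") as handle:
        lines = [line.rstrip("\n") for line in handle]

    closed_after_block = handle.closed

print(lines, closed_after_block)
```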

Dealing with Legacy Code

We’ve covered the basics of writing good code in Python. It’s now worth looking at the art of handling big projects in Python. How can you take up new open-source or closed-source projects? What are the steps to refactor legacy code? What are the best practices to get yourself up to speed on a new project?

Often when you join a new organization, you’re given a codebase to comprehend and refactor, or you need to take over legacy code. Sometimes this leaves you in deep distress, unable to figure out where to start.

At this point, it’s important to define “legacy code/project” so that we’re all on the same page. Here’s what you’ll come across:

  • an “older” project that has been around forever
  • a code base without any kind of tests
  • the project that no one wants to work on
  • “Everyone who worked on this left the company years ago…”

All of the above are somewhat right, but sometimes projects are done in haste and put into production before everyone realizes that there is a lot of scope for improvement. So, how shall we tackle a legacy project?

Below is a quick list of steps you should follow in order to make your journey of refactoring simpler and smoother:

  1. First and foremost, make sure the project is in a version control system.
  2. Delete commented out code. (Once the project is in production, always make sure that you remove the commented code.)
  3. Run tests/add tests. Make sure that you have at least 80% test coverage. Use pytest together with a coverage tool such as coverage.py or pytest-cov to track test coverage.
  4. Use Pylint/Vulture. Always consider running some type of linter over the code to see how “healthy” it is. Try to look for:
    • Unused variables
    • Anything that is noted as a potential bug
  5. Use style checkers like Flake8 or pycodestyle (formerly pep8) to find style violations, and a formatter such as autopep8 to reformat Python code to make it more PEP 8 compliant.
  6. Write more idiomatic Python (as described above).
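
For step 3, one common starting point is a test that pins down what the legacy code does today, before anything is refactored. This is a sketch under stated assumptions: legacy_discount is a made-up stand-in for real legacy code, and the test is written as a plain function that pytest would collect by its test_ prefix.

```python
def legacy_discount(price, customer_type):
    # Hypothetical legacy function standing in for code you inherited.
    if customer_type == "gold":
        return price - price * 0.2
    if customer_type == "silver":
        return price - price * 0.1
    return price


def test_legacy_discount_current_behaviour():
    # Record the current behaviour so a refactor cannot change it unnoticed.
    assert legacy_discount(100, "gold") == 80
    assert legacy_discount(100, "silver") == 90
    assert legacy_discount(100, "unknown") == 100


test_legacy_discount_current_behaviour()
```

Once tests like this pass, the function can be restructured with some confidence that its observable behaviour stays the same.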

Conclusion

With the exploding Python community and budding Pythonistas, we have Python in almost all development fields, such as data science, web development, mobile development, and AI. As such, it is increasingly important to make sure we always ship enterprise-grade code following proper guidelines.

Thanks to these basic tools — and the beauty of the Python language itself — producing awesome code and products doesn’t have to be a scary proposition. Now that you’ve gone through these guidelines, go ahead and try these on an open source Python project!

Notes and references:

[1] One Statement of Code per line – Code Style from The Hitchhiker’s Guide to Python

[2] Passing args to function – Code Style from The Hitchhiker’s Guide to Python