Python notes

Ch 4.1. Coding style

Importance of Readability

  • Readability:

    • Python emphasizes code readability, understanding that code is read more often than it is written.
    • The design philosophy behind Python encourages writing clear and understandable code.
  • Pythonic Code:

    • The term "Pythonic" refers to code that follows Python's idioms and style guidelines, expressing intent in the most readable way.
    • Deviations from these conventions are often considered less readable or harder to maintain.

PEP 8: The Python Style Guide

  • Overview:
    • PEP 8 is the de facto code style guide for Python.
    • It covers a range of topics, including naming conventions, code layout, and whitespace usage (e.g., tabs vs. spaces).
    • The Python community widely adheres to PEP 8, promoting consistency across projects.
    • Some projects may modify or extend PEP 8 recommendations to fit specific needs.
    • Conforming to PEP 8 helps ensure code consistency, especially in collaborative projects with multiple developers.

PEP 20: The Zen of Python

  • Guiding Principles:

    • PEP 20, also known as "The Zen of Python," is a set of guiding principles for writing Python code.
    • It is available in any Python shell by executing import this.
    • PEP 20 contains 19 aphorisms that capture the philosophy of Python development. Its abstract speaks of 20 aphorisms, only 19 of which have been written down.
  • Key Aphorisms:

    • Some important aphorisms include:
      • "Simple is better than complex."
      • "Readability counts."
      • "Explicit is better than implicit."
      • "There should be one—and preferably only one—obvious way to do it."

The Zen of Python by Tim Peters

  • Beautiful is better than ugly.

  • Explicit is better than implicit.

  • Simple is better than complex.

  • Complex is better than complicated.

  • Flat is better than nested.

  • Sparse is better than dense.

  • Readability counts.

  • Special cases aren’t special enough to break the rules.

  • Although practicality beats purity.

  • Errors should never pass silently.

  • Unless explicitly silenced.

  • In the face of ambiguity, refuse the temptation to guess.

  • There should be one—and preferably only one—obvious way to do it.

  • Although that way may not be obvious at first unless you’re Dutch.

  • Now is better than never.

  • Although never is often better than right now.

  • If the implementation is hard to explain, it’s a bad idea.

  • If the implementation is easy to explain, it may be a good idea.

  • Namespaces are one honking great idea—let’s do more of those!

General Advice

Explicit is Better Than Implicit

  • Clarity Over Complexity:

    • While Python allows for powerful and complex coding techniques (often referred to as "black magic"), it prioritizes clear and straightforward code.
    • The simplest and most explicit way to express an idea is preferred. Code should be easy to read and understand, avoiding unnecessary complexity.
  • Readability Rule of Thumb:

    • A good practice is that another developer should be able to read the first and last lines of your function and understand what it does.
    • Avoid hidden or implicit logic that could confuse others who read your code.
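
As a rough illustration of the rule of thumb above, the sketch below contrasts an implicit and an explicit version of the same small function. Both function names are hypothetical and exist only for this example.

```python
# Implicit: the signature hides how many arguments are expected,
# and the result depends on whatever local names happen to exist.
def make_complex_implicit(*args):
    x, y = args
    return dict(**locals())


# Explicit: the parameters and the shape of the result are visible at a glance.
def make_complex(x, y):
    return {"x": x, "y": y}


point = make_complex(1, 2)
print(point)  # {'x': 1, 'y': 2}
```

In the explicit version, reading the first line and the last line is enough to know what goes in and what comes out.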

Sparse is Better Than Dense

  • One Statement Per Line:

    • It is generally good practice to make only one statement per line.
    • Compact constructs such as list comprehensions are appreciated for their brevity and expressiveness but should be used judiciously.
  • Separate Lines for Disjoint Statements:

    • Keep disjoint or unrelated statements on separate lines to enhance readability.
    • This practice helps create more understandable diffs when revising code, as changes to one line are less likely to affect others.
    • Readable code is easier to maintain and modify, especially in collaborative environments like open-source projects.
  • Benefits in Collaborative Development:

    • When contributing to open source, having clear and readable code makes the revision history easier to decipher.
    • A change on one line should ideally affect only one aspect of the code, simplifying version control and collaboration.
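
A minimal sketch of the "one statement per line" advice follows. The order dictionary and the variable names are assumptions made up for the example; the dense form is shown only in comments.

```python
order = {"id": 7, "items": ["spam"]}  # hypothetical sample data

# Dense: unrelated statements and a compound condition share single lines.
# print("one"); print("two")
# if order.get("id") and len(order.get("items", [])) > 0: status = "ok"

# Sparse: one statement per line, with named intermediate conditions.
has_id = bool(order.get("id"))
has_items = len(order.get("items", [])) > 0

if has_id and has_items:
    status = "ok"
else:
    status = "rejected"

print(status)  # ok
```

With the sparse layout, changing the rule for has_items touches exactly one line in a diff.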

Errors Should Never Pass Silently / Unless Explicitly Silenced

  • In Python, error handling is performed using the try statement.
  • When an error occurs, it should be handled appropriately, either by logging it, raising a custom exception, or explicitly choosing to ignore it for specific cases.

Best Practices for Error Handling

  • Avoid Silent Failures:

    • Avoid silently passing over errors unless there is a clear justification for doing so. Always consider logging or raising a custom exception to make error handling explicit.
  • Explicit Error Handling:

    • Use explicit error handling to ensure that exceptions are managed in a way that maintains code readability and reliability.
  • Use Specific Exceptions:

    • Catch specific exceptions rather than using a generic except block to prevent unintentional handling of unexpected errors.
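
The practices above can be sketched as follows. ConfigError, parse_port, and parse_port_or_default are hypothetical names invented for this example, and the default port is an arbitrary assumption.

```python
import logging

logger = logging.getLogger(__name__)


class ConfigError(Exception):
    """Hypothetical custom exception for invalid configuration values."""


def parse_port(raw_value):
    """Convert raw_value to an int, raising ConfigError on bad input."""
    try:
        port = int(raw_value)
    except (TypeError, ValueError) as exc:
        # Catch only the failures int() is expected to raise, log them,
        # and re-raise as a domain-specific error instead of hiding them.
        logger.warning("Invalid port value: %r", raw_value)
        raise ConfigError(f"invalid port: {raw_value!r}") from exc
    return port


def parse_port_or_default(raw_value, default=8080):
    """Explicitly silence ConfigError by falling back to a default."""
    try:
        return parse_port(raw_value)
    except ConfigError:
        return default


print(parse_port("8000"))               # 8000
print(parse_port_or_default("eighty"))  # 8080
```

The error is silenced in exactly one place, and only for the one exception type the caller has decided it can recover from.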

Function Arguments Should Be Intuitive to Use

Python supports four different ways to pass arguments to a function:

def func(positional, keyword="default", *args, **kwargs):
    pass
  • Positional Arguments: Mandatory, without default values.
  • Keyword Arguments: Optional, with default values.
  • Arbitrary Argument List: Optional, no default values, captured as a tuple.
  • Arbitrary Keyword Argument Dictionary: Optional, no default values, captured as a dictionary.

Positional Arguments

  • When to Use:

    • Use positional arguments when the function has only a few arguments that are integral to its meaning and have a natural order. Examples include send(message, recipient) or point(x, y), where the order is easy to remember.
  • Usage Antipattern:

    • Avoid naming arguments when calling functions unless necessary, as it can reduce readability.
    • For instance, instead of send(recipient="World", message="The answer is 42."), use the simpler send("The answer is 42.", "World").

Keyword Arguments

  • When to Use:

    • Use keyword arguments when a function has more than two or three positional parameters, making the signature easier to remember.

    • Keyword arguments with default values make the function more flexible.

      def send(message, to, cc=None, bcc=None):
          pass
    • Here, cc and bcc are optional and default to None if not provided.

  • Usage Antipattern:

    • Avoid mixing positional and keyword arguments in ways that reduce clarity.
    • For example, instead of send("42", "Frankie", bcc="Trillian", cc="Benjy"), prefer send("42", "Frankie", cc="Benjy", bcc="Trillian"), which follows the function definition order.
  • Best Practice:

    • Only add optional arguments when necessary, as removing them later can be more challenging than adding them when needed. "Never is often better than right now."

Arbitrary Argument List

  • When to Use:

    • Defined with *args, this allows functions to accept a variable number of positional arguments. The args will be a tuple of additional positional arguments.

    • For example:

      def send(message, *args):
          pass
      • This can be called as send("42", "Frankie", "Benjy", "Trillian"), where args will be ("Frankie", "Benjy", "Trillian").
  • Caveat:

    • If the function receives multiple arguments of the same type, consider using a list or sequence for clarity. Instead of *args, define it explicitly, like send(message, recipients), and call it as send("42", ["Benjy", "Frankie", "Trillian"]).
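
To make the caveat concrete, here is a small sketch comparing the two signatures. The function bodies are placeholders that only format strings; send_varargs is a hypothetical name used to keep both versions side by side.

```python
def send_varargs(message, *recipients):
    # recipients arrives as a tuple of the extra positional arguments.
    return [f"{message} -> {name}" for name in recipients]


def send(message, recipients):
    # recipients is an explicit sequence, so the call site shows the grouping.
    return [f"{message} -> {name}" for name in recipients]


print(send_varargs("42", "Frankie", "Benjy", "Trillian"))
print(send("42", ["Benjy", "Frankie", "Trillian"]))

# An existing list can still be passed to the *args version by unpacking it.
names = ["Benjy", "Frankie"]
print(send_varargs("42", *names))
```

The explicit version makes it obvious at the call site which values are recipients and which one is the message.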

Arbitrary Keyword Argument Dictionary

  • When to Use:

    • Defined with **kwargs, this allows passing an unspecified number of named arguments, captured as a dictionary.

    • This is useful when functions need to accept flexible arguments, such as logging functions.

    • Example:

      def log_message(message, **kwargs):
          pass
  • Caveat:

    • Similar to *args, use **kwargs only when necessary, as simplicity and clarity should take precedence. If the simpler approach suffices, avoid using complex constructs.
  • Custom Naming:

    • The names *args and **kwargs can be replaced with more meaningful names that better represent their purpose in the context of the function.
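
A minimal sketch of a function that collects arbitrary named arguments under a more descriptive name than kwargs. The formatting of the output line is an assumption made for the example, not a real logging API.

```python
def log_message(message, **fields):
    # fields is a dict of whatever named arguments the caller supplied.
    details = ", ".join(
        f"{key}={value!r}" for key, value in sorted(fields.items())
    )
    return f"{message} [{details}]" if details else message


print(log_message("login failed", user="arthur", attempts=3))
# login failed [attempts=3, user='arthur']
print(log_message("started"))
# started
```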

Best Practices for Function Design

  • Clarity and Simplicity:

    • Strive to make functions easy to read and understand. The function name and arguments should convey the function's purpose without requiring additional explanation.
  • Flexibility:

    • Ensure that functions are easy to modify without breaking existing code. Adding a new keyword argument should not cause issues in other parts of the codebase.
  • Obviousness:

    • Align with Python’s philosophy: "There should be one—and preferably only one—obvious way to do it."

If the Implementation is Hard to Explain, It’s a Bad Idea

Advanced Features and Their Drawbacks

Python allows developers to perform a variety of advanced operations, including:

  • Custom Object Creation:

    • Changing how objects are created and instantiated.
  • Custom Import Behavior:

    • Modifying how the Python interpreter imports modules.
  • Embedding C Routines:

    • Integrating C code within Python for performance optimization.

While these capabilities provide flexibility, they often come with significant drawbacks:

  • Reduced Readability:

    • The use of complex or "magical" constructs can make code difficult to understand for other developers.
    • Readability should be prioritized, and any benefits gained from using advanced features must outweigh the loss in clarity.
  • Code Analysis Challenges:

    • Tools like pylint or pyflakes may struggle to parse and analyze "magic" code, leading to potential issues in automated code review and maintenance.

The Balance Between Power and Readability

  • Awareness and Caution:

    • Python developers should be aware of the extensive capabilities available to them, as this knowledge provides confidence in tackling complex problems.
    • However, knowing when not to use these advanced features is crucial.
  • Simplicity and Clarity:

    • Whenever possible, choose the simplest and most straightforward approach to achieve your goals.
    • If a piece of code is difficult to explain, it may be a sign that a simpler solution should be sought.
  • The Philosophy of Restraint:

    • The ability to solve problems with complex techniques should not lead to their indiscriminate use.
  • Striking the Right Balance:

    • Strive to balance the use of Python's powerful features with the need for maintainable and understandable code.
    • Prioritize solutions that make code more accessible to others, ensuring that future developers can easily understand and extend it.

We Are All Responsible Users

  • Python offers flexibility and power, allowing developers to perform a wide range of operations, even potentially dangerous ones.
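  • The "we are all responsible users" principle reflects a culture of trust and accountability within the Python community: the language relies on developers to use that power with care rather than on enforced restrictions.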

Flexibility and Responsibility

  • No "Private" Keyword:

    • Python does not have a private keyword to enforce strict access control over an object's properties and methods.
    • This design choice is different from more defensive languages like Java, which provide mechanisms to prevent misuse by strictly enforcing encapsulation.
  • Encapsulation through Conventions:

    • In Python, encapsulation is achieved through conventions rather than enforced access controls.
    • Convention: Underscore Prefix:
      • The main convention for indicating private properties and methods is to prefix their names with an underscore (e.g., _internal_method or _private_property).
      • This convention signals to developers that these elements are intended for internal use and should not be accessed directly by client code.
  • Example:

    class Example:
        def __init__(self):
            self._internal_data = "This is private"
    
        def _internal_method(self):
            print("This is a private method")
  • Client Code Responsibility:

    • If client code chooses to access or modify these underscored elements, any resulting issues or misbehavior are considered the responsibility of the client code.
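
The sketch below shows what the convention looks like from the client's side. Counter is a hypothetical class; the point is only that the underscore is a signal, not a lock.

```python
class Counter:
    """Hypothetical class with a public interface and one internal attribute."""

    def __init__(self):
        self._count = 0  # internal: may be renamed or removed without notice

    def increment(self):
        self._count += 1
        return self._count

    @property
    def count(self):
        # Public, read-only view of the internal state.
        return self._count


counter = Counter()
counter.increment()
print(counter.count)  # 1

# Nothing prevents this, but the client now depends on an internal detail.
counter._count = 100
print(counter.count)  # 100
```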

Benefits of Using Conventions

  • Clear Separation of Duties:

    • Prefixing internal methods and properties with an underscore helps maintain a clear separation between public and private interfaces.
    • It allows developers to signal which parts of the code are stable and intended for external use versus those that are subject to change.
  • Ease of Modification:

    • Adopting this convention makes it easier to modify existing code.
    • Developers can always decide to make a private property public if needed, but making a public property private is more challenging and can break client code.

Return Values from One Place

  • When writing complex functions, it is common to use multiple return statements.
  • However, to maintain clarity and readability, it's best to return meaningful values from as few points in the function as possible.

Two Ways to Exit a Function

Functions can typically exit in two ways:

  1. Upon Error:

    • If a function encounters an error or cannot perform its intended task, it may return a value like None or False.
    • In such cases, it's often appropriate to return early as soon as the incorrect context is detected. This helps flatten the function structure and makes it clear that subsequent code assumes the condition for successful execution has been met.
  2. With a Return Value:

    • When a function completes its task successfully, it returns a meaningful value that represents the result of its operations.

The Importance of a Single Exit Point

  • Clarity and Debugging:

    • Having a single return point at the end of a function enhances clarity and makes debugging easier.
    • When multiple return statements are scattered throughout a function, it can be difficult to determine which one is responsible for producing the final result.
  • Code Maintenance:

    • A single exit point often indicates that the function's logic is clear and well-organized.
    • It helps factor out common code paths and can signal when refactoring is needed to improve the function's design.
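
One way to combine the two exit styles described above is an early return for the error case and a single return for the result. average_length is a hypothetical function written for this sketch.

```python
def average_length(words):
    # Exit early on the error case: there is nothing to average.
    if not words:
        return None

    # From here on the code can assume a non-empty sequence.
    total = 0
    for word in words:
        total += len(word)
    result = total / len(words)

    # Single exit point for the successful result.
    return result


print(average_length([]))                # None
print(average_length(["spam", "eggs"]))  # 4.0
```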

Conventions

Alternatives to Checking for Equality

  • When evaluating conditions in Python, you don't always need to explicitly compare a value to True, None, or 0.

  • Instead, you can leverage Python's built-in truth value testing.

  • Check the Value:

    if attr:
        print('attr is truthy!')
  • Check for the Opposite:

    if not attr:
        print('attr is falsey!')
  • Check for Explicit True:

    if attr is True:
        print('attr is True')
  • Check for Explicit None:

    if attr is None:
        print('attr is None!')
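
The checks above are not interchangeable: a truth value test treats 0, empty containers, and None alike, while "is None" singles out None. A small sketch, using a hypothetical describe() helper:

```python
def describe(value):
    # Identity check first: None is a distinct "no value" marker.
    if value is None:
        return "missing"
    # Truth value test: 0, "", [], {} and False are all falsy.
    if not value:
        return "empty or zero"
    return "truthy"


print(describe(None))    # missing
print(describe(0))       # empty or zero
print(describe([]))      # empty or zero
print(describe("spam"))  # truthy
```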

Accessing Dictionary Elements

When accessing dictionary elements, use the in keyword or dict.get() instead of the dict.has_key() method, which was deprecated in Python 2 and removed in Python 3:

  • Using dict.get() with a Default:

    d = {'hello': 'world'}
    
    print(d.get('hello', 'default_value'))  # Output: world
    print(d.get('howdy', 'default_value'))  # Output: default_value
  • Using in Keyword:

    if 'hello' in d:
        print(d['hello'])  # Output: world

Manipulating Lists

Python provides several ways to manipulate lists, with list comprehensions being the most concise and readable. Additionally, the map() and filter() functions offer alternative approaches:

  • Using List Comprehension:

    a = [3, 4, 5]
    b = [i for i in a if i > 4]  # List comprehension is clearer
  • Using filter():

    b = list(filter(lambda x: x > 4, a))  # filter() returns an iterator in Python 3
  • List Comprehension for Mapping:

    a = [3, 4, 5]
    a = [i + 3 for i in a]
  • Using map():

    a = list(map(lambda i: i + 3, a))  # map() returns an iterator in Python 3
  • Use enumerate() for Indexing:

    a = ["icky", "icky", "icky", "p-tang"]
    for i, item in enumerate(a):
        print(f"{i}: {item}")

    Output:

    0: icky
    1: icky
    2: icky
    3: p-tang
    

Continuing a Long Line of Code

When a line of code exceeds the accepted length limit, it's important to split it across multiple lines. Instead of using a backslash for line continuation, which can be error-prone, use parentheses:

  • Using Parentheses for Line Continuation:

    french_insult = (
        "Your mother was a hamster, and "
        "your father smelt of elderberries!"
    )
  • Parentheses, Braces, or Brackets:

    • The same behavior applies to curly braces {} and square brackets []. Inside any unclosed parenthesis, bracket, or brace, the interpreter keeps reading across line breaks until it reaches the matching closing delimiter.
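
A short sketch of implicit continuation inside each kind of bracket. The variable names and values are made up for the example.

```python
# A list literal can span lines because the square bracket is still open.
knights = [
    "Arthur",
    "Lancelot",
    "Galahad",
]

# The same holds for dict literals and for function-call parentheses.
quest = {
    "goal": "grail",
    "party_size": len(knights),
}

summary = "{goal} quest with {party_size} knights".format(
    goal=quest["goal"],
    party_size=quest["party_size"],
)

print(summary)  # grail quest with 3 knights
```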

Idioms

Unpacking

Unpacking allows you to assign names to elements of a list or tuple when you know its length. This technique can simplify code and enhance readability.

  • Basic Unpacking:

    filename, ext = "my_photo.orig.png".rsplit(".", 1)
    print(filename, "is a", ext, "file.")  # Output: my_photo.orig is a png file.
  • Swapping Variables:

    a, b = b, a
  • Nested Unpacking:

    a, (b, c) = 1, (2, 3)
  • Extended Unpacking (Python 3):

    a, *rest = [1, 2, 3]
    # a = 1, rest = [2, 3]
    
    a, *middle, c = [1, 2, 3, 4]
    # a = 1, middle = [2, 3], c = 4

Ignoring a Value

When unpacking, if you need to ignore certain values, use a double underscore (__) as a throwaway variable.

  • Ignore a Value:

    filename = 'foobar.txt'
    basename, __, ext = filename.rpartition('.')
  • Why Double Underscore?

    • A single underscore (_) is commonly used as an alias for the gettext.gettext() function, and the interactive interpreter uses it to hold the result of the last evaluated expression. Using a double underscore avoids overwriting either of these uses.

Creating a Length-N List of the Same Thing

Use the Python list * operator to create a list of identical immutable items:

  • Creating a List of Nones:

    four_nones = [None] * 4
    print(four_nones)  # Output: [None, None, None, None]
  • Caution with Mutable Objects:

    • The * operator will create a list of N references to the same mutable object, which can lead to unintended behavior. Instead, use a list comprehension for mutable items:
    four_lists = [[] for __ in range(4)]
    four_lists[0].append("Ni")
    print(four_lists)  # Output: [['Ni'], [], [], []]
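
For comparison, this sketch shows the unintended behavior the caution describes, next to the comprehension-based fix. The variable names are chosen for the example.

```python
# All four slots refer to the same inner list object.
shared = [[]] * 4
shared[0].append("Ni")
print(shared)                  # [['Ni'], ['Ni'], ['Ni'], ['Ni']]
print(shared[0] is shared[1])  # True

# The comprehension evaluates [] once per iteration, giving distinct lists.
independent = [[] for __ in range(4)]
independent[0].append("Ni")
print(independent[0] is independent[1])  # False
```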

Joining Strings

A common idiom for building a string from a list of parts is to call str.join() on a separator string (here, the empty string):

  • Joining Strings:

    letters = ['s', 'p', 'a', 'm']
    word = ''.join(letters)
    print(word)  # Output: spam

Searching in Collections: Lists vs. Sets

When searching for elements, use sets for better performance due to their hash table implementation.

  • Example:

    x = list(('foo', 'foo', 'bar', 'baz'))
    y = set(('foo', 'foo', 'bar', 'baz'))
    
    print(x)  # Output: ['foo', 'foo', 'bar', 'baz']
    print(y)  # Output: {'foo', 'bar', 'baz'} (element order may vary)
    
    'foo' in x  # Linear search
    'foo' in y  # Hash table lookup
  • Performance Difference:

    • Lists perform linear searches, which can be slow for large collections. Sets and dictionaries use hash lookups, so membership tests take roughly constant time on average. A set also discards duplicate elements.
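
A minimal sketch of the usual pattern when the same collection is searched repeatedly: convert it to a set once, then test membership against the set. The data is invented for the example.

```python
allowed = ["foo", "bar", "baz"]
requests = ["foo", "qux", "bar", "foo"]

# Build the set once; each membership test is then a hash lookup
# instead of a scan through the whole list.
allowed_set = set(allowed)
accepted = [name for name in requests if name in allowed_set]

print(accepted)  # ['foo', 'bar', 'foo']
```

Building the set has its own cost, so the conversion tends to pay off when many lookups follow rather than for a single test on a short list.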

Exception-Safe Contexts

Python's with statement and context manager protocol, introduced by PEP 343, provide a cleaner, more readable way to manage resources like files or thread locks compared to try/finally clauses.

  • Using with Statement:

    import threading
    some_lock = threading.Lock()
    
    with some_lock:
        # Execute code safely within the lock
        print(
            "Look at me: I design coastlines.\n"
            "I got an award for Norway."
        )
  • Traditional try/finally:

    some_lock.acquire()
    try:
        # Execute code safely within the lock
        print(
            "Look at me: I design coastlines.\n"
            "I got an award for Norway."
        )
    finally:
        some_lock.release()
  • Contextlib Module:

    • The contextlib module in the standard library offers additional tools for context management, including contextlib.closing() to ensure that an object's close() method is called, and contextlib.suppress() to suppress exceptions in specific cases.
  • Example of contextlib.closing():

    from contextlib import closing
    with closing(open("outfile.txt", "w")) as output:
        output.write("Well, he's...he's, ah...probably pining for the fjords.")
    • However, using the with statement directly with file I/O is often simpler since file objects already implement the context manager protocol:
    with open("outfile.txt", "w") as output:
        output.write(
            "PININ' for the FJORDS?!?!?!? "
            "What kind of talk is that?, look, why did he fall "
            "flat on his back the moment I got 'im home?\n"
        )
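
The contextlib.suppress() helper mentioned above ties back to "errors should never pass silently, unless explicitly silenced". A small sketch, assuming a file name that is purely hypothetical and probably does not exist:

```python
import os
from contextlib import suppress

path = "file_that_may_not_exist.tmp"  # hypothetical file name

# Explicitly silence only FileNotFoundError; any other error still propagates.
with suppress(FileNotFoundError):
    os.remove(path)

# Equivalent try/except form.
try:
    os.remove(path)
except FileNotFoundError:
    pass
```

Naming the exception type keeps the silencing visible and narrow.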

Common Gotchas in Python

Mutable Default Arguments

  • What You Wrote:

    def append_to(element, to=[]):
        to.append(element)
        return to
  • What You Might Have Expected:

    my_list = append_to(12)
    print(my_list)  # Output: [12]
    
    my_other_list = append_to(42)
    print(my_other_list)  # Output: [42]
  • What Actually Happens:

    print(my_list)  # Output: [12]
    print(my_other_list)  # Output: [12, 42]
    • Explanation: A new list is created only once when the function is defined, not each time the function is called. This means the same list is used for each successive call, leading to unexpected results when the list is mutated.
  • What You Should Do Instead:

    Create a new object each time the function is called by using a default argument to signal that no argument was provided (None is often a good choice):

    def append_to(element, to=None):
        if to is None:
            to = []
        to.append(element)
        return to
  • When This Gotcha Isn’t a Gotcha:

    Sometimes, this behavior is used intentionally to maintain state between function calls, such as in a caching function:

    def time_consuming_function(x, y, cache={}):
        args = (x, y)
        if args in cache:
            return cache[args]
        # Perform the time-consuming operation (some_heavy_operation is a placeholder)...
        result = some_heavy_operation(x, y)
        cache[args] = result
        return result

Late Binding Closures

Another common source of confusion is Python’s late binding of variables in closures, which can lead to unexpected results.

  • What You Wrote:

    def create_multipliers():
        return [lambda x: i * x for i in range(5)]
  • What You Might Have Expected:

    for multiplier in create_multipliers():
        print(multiplier(2), end=" ... ")
    # Expected Output: 0 ... 2 ... 4 ... 6 ... 8 ...
  • What Actually Happens:

    # Actual Output: 8 ... 8 ... 8 ... 8 ... 8 ...
    • Explanation: Python's closures are late binding, meaning the values of variables used in closures are looked up at the time the inner function is called, not when it is defined. In this case, by the time any of the returned functions are called, the loop has completed, and i is left with its final value of 4.
  • What You Should Do Instead:

    • Solution Using Default Arguments:

      def create_multipliers():
          return [lambda x, i=i: i * x for i in range(5)]
    • Solution Using functools.partial:

      from functools import partial
      from operator import mul
      
      def create_multipliers():
          return [partial(mul, i) for i in range(5)]
  • When This Gotcha Isn’t a Gotcha:

    Late binding is beneficial in many scenarios where you want your closures to reference the latest values of variables in the enclosing scope. Looping to create unique functions, however, is a scenario where it can lead to confusion.
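
As a sketch of the beneficial case, the closure below reads a variable from its enclosing scope at call time, so it always sees the latest value. make_reporter, record, and report are hypothetical names for this example.

```python
def make_reporter():
    count = 0

    def record():
        # nonlocal lets the inner function rebind the enclosing variable.
        nonlocal count
        count += 1

    def report():
        # count is looked up when report() runs, so it sees the latest value.
        return count

    return record, report


record, report = make_reporter()
record()
record()
print(report())  # 2
```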