How to write testable Python code


Software development has to be one of the most complex endeavors we humans have ever undertaken. As Brian Kernighan astutely noted, the soul of software development lies in controlling complexity. Yet the reality of production software often betrays this principle — we create systems with layers of both intentional and accidental complexity that gradually become resistant to change, testing, and maintenance. The cost? Sporadic failures, mounting technical debt, and increasingly fragile systems.
Python is not immune to this problem. Untestable Python code lurks in even the most sophisticated codebases. You might spot it in that 200-line function mixing business logic with database calls, or that seemingly innocent module slowly accumulating global state. Despite the language’s rich testing ecosystem, many developers still struggle with code that actively resists testing.
Great Python tests emerge from thoughtful architecture, not testing frameworks. When your code respects boundaries and embraces modularity, tests flow naturally. No more mocking entire systems just to test a single function. No more debugging test failures caused by hidden state changes three modules away. Through careful design choices and strategic patterns, even complex Python applications can become a joy to test.
This guide breaks down battle-tested patterns for building testable Python systems, backed by examples from real-world applications.
Use pure functions to avoid side effects
Pure functions form the bedrock of testable Python code because they behave like mathematical functions: given the same input, they always produce the same output, without touching anything else in the system. This makes them predictable, isolated, and free from side effects. When a function depends only on its inputs, testing becomes straightforward and reliable.
Consider this common anti-pattern:
```python
total_items = 0

def add_item(item):
    global total_items
    total_items += 1
    return item
```
This function modifies global state, making it difficult to test in isolation. In this case, each test run could produce different results based on the global variable’s value. The pure alternative elegantly sidesteps these issues:
```python
def add_item(items, item):
    return items + [item]
```
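Testing the pure version requires no setup or teardown. A minimal pytest-style sketch:

```python
def test_add_item_returns_new_list():
    items = ['a']
    assert add_item(items, 'b') == ['a', 'b']
    # The original list is left untouched.
    assert items == ['a']
```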
But we can do more than just avoid global state. Pure functions create new objects instead of mutating existing ones and thus ensure input parameters remain unchanged. This immutability makes tests more reliable and easier to reason about. Here’s an example:
```python
from datetime import datetime

# Impure: modifies its input directly
def process_user_data(user):
    user['status'] = 'active'
    user['last_login'] = datetime.now()
    return user

# Pure: creates new data without side effects
def process_user_data(user, timestamp):
    return {
        **user,
        'status': 'active',
        'last_login': timestamp,
    }
```
While pure functions may demand more memory by creating new objects rather than modifying existing ones, this trade-off pays dividends in system reliability as each test becomes a simple contract verification.
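A test of the pure variant above reads like exactly that kind of contract (a minimal sketch):

```python
from datetime import datetime

def test_process_user_data_is_pure():
    user = {'name': 'Ada'}
    fixed_time = datetime(2024, 1, 1, 12, 0)

    result = process_user_data(user, fixed_time)

    # The output is fully determined by the inputs...
    assert result == {'name': 'Ada', 'status': 'active', 'last_login': fixed_time}
    # ...and the input dictionary is left untouched.
    assert user == {'name': 'Ada'}
```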
Implement deterministic behavior
Testing becomes meaningless when your code behaves differently on each run. Python’s dynamic nature makes this particularly challenging. A function that works perfectly in development might fail mysteriously in production due to hidden time dependencies, random elements, or system state.
```python
from datetime import datetime

def process_order(order_id):
    timestamp = datetime.now()
    return {
        'id': order_id,
        'processed_at': timestamp,
        'status': 'completed',
    }
```
Each run of this function creates a unique timestamp, making its output impossible to verify exactly. But with a bit of deliberate design, we can easily eliminate such uncertainty:
```python
from datetime import datetime

def process_order(order_id, timestamp=None):
    processed_at = timestamp or datetime.now()
    return {
        'id': order_id,
        'processed_at': processed_at,
        'status': 'completed',
    }
```
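A test can now pin the timestamp and assert on the exact output (a minimal sketch):

```python
from datetime import datetime

def test_process_order_with_fixed_timestamp():
    fixed_time = datetime(2024, 1, 1, 12, 0)
    result = process_order(42, fixed_time)
    assert result == {
        'id': 42,
        'processed_at': fixed_time,
        'status': 'completed',
    }
```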
Property-based testing takes this concept further, verifying that deterministic behavior holds true across a wide range of inputs. Rather than testing specific cases, we define properties that must remain consistent regardless of input variations so that testing reveals true bugs rather than environmental noise.
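Here is a minimal sketch using the Hypothesis library against the `process_order` function above; `given` and the `strategies` module are Hypothesis's standard entry points:

```python
from datetime import datetime
from hypothesis import given, strategies as st

@given(order_id=st.integers(min_value=1))
def test_process_order_is_deterministic(order_id):
    fixed_time = datetime(2024, 1, 1)
    # Property: with the timestamp pinned, the result depends only on
    # the inputs, no matter which order_id Hypothesis generates.
    assert process_order(order_id, fixed_time) == process_order(order_id, fixed_time)
```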
Employ the Single Responsibility Principle
Most production codebases suffer from the “God class” syndrome — classes that try to do everything from data manipulation to business logic to persistence. This tight coupling creates a cascade of changes whenever requirements shift, turning simple updates into regression nightmares. Take this Python code for example:
```python
class UserManager:
    def __init__(self, user_data):
        self.user = user_data

    def validate(self):
        self._check_email()
        self._store_in_db()
        self._send_welcome_email()

    def _check_email(self):
        # Email validation logic
        pass
```
See how validation is intertwined with storage and notification logic? A change in email validation rules forces us to touch code that handles database operations. The solution lies in surgical separation:
```python
import re

class UserValidator:
    def validate_email(self, email):
        return bool(re.match(r"[^@]+@[^@]+\.[^@]+", email))

class UserNotifier:
    def __init__(self, messaging_service):
        self.messenger = messaging_service
```
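With validation isolated, its tests need nothing beyond the class itself (a minimal sketch):

```python
def test_validate_email():
    validator = UserValidator()
    assert validator.validate_email('user@example.com')
    assert not validator.validate_email('not-an-email')
```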
True single responsibility transcends simple class separation. It demands careful consideration of boundaries between different domains of your application. When validation rules change, only validation code should need updates. When storage requirements evolve, only persistence logic should shift. It’s that simple.
Master dependency injection
Dependency injection in Python transcends the basic constructor parameter passing that most tutorials focus on. While passing dependencies through `__init__` works for simple cases, real-world applications demand more sophisticated patterns.
Modern Python applications often juggle multiple external services, from databases to message queues to third-party APIs, and each service represents both a potential point of failure and a testing challenge. Dependency injection transforms these challenges into manageable interfaces.
Consider the traditional approach:
```python
class OrderProcessor:
    def __init__(self):
        self.db = Database()
        self.payment = PaymentGateway()
        self.notifier = NotificationService()
```
When code directly creates its own dependencies — like database connections or API clients — it becomes hard to test and maintain. Dependency injection flips this around: instead of letting code create what it needs, we pass those dependencies to it.
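Rewritten with injected dependencies, it might look like this (a sketch; the `process` method and the method names on the collaborators are illustrative):

```python
class OrderProcessor:
    def __init__(self, db, payment, notifier):
        # Dependencies are supplied by the caller, not created here.
        self.db = db
        self.payment = payment
        self.notifier = notifier

    def process(self, order):
        self.payment.charge(order.total)
        self.db.save(order)
        self.notifier.send_confirmation(order)
```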
Python’s flexible nature makes this particularly powerful. We can swap out real databases for test versions, replace API clients with mocks, or even change entire service implementations based on different scenarios. This flexibility, combined with proper abstraction, creates systems that are both robust and adaptable.
This approach makes testing straightforward: need to test how your code handles a database failure? Just pass in a mock database that simulates errors. Want to verify payment processing? Use a test payment service instead of hitting real payment APIs. The code remains the same; only the dependencies change.
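For instance, simulating a database failure takes one line with `unittest.mock` (a sketch building on the injected `OrderProcessor` above; the exception type is illustrative):

```python
from unittest.mock import Mock

import pytest

def test_process_surfaces_database_failure():
    failing_db = Mock()
    failing_db.save.side_effect = ConnectionError("database unavailable")

    processor = OrderProcessor(db=failing_db, payment=Mock(), notifier=Mock())

    with pytest.raises(ConnectionError):
        processor.process(Mock(total=100))
```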
Where to go from here
When code embraces testability through careful design, testing shifts from a burdensome afterthought to a natural part of the development flow. The principles outlined above—pure functions, deterministic behavior, single responsibility, and dependency injection—form the foundation for building systems that welcome change rather than resist it. Yet putting these practices into play across real-world, evolving codebases remains a challenge.
This is where tools like Qodo can make a meaningful impact. Qodo’s multi-agent code integrity platform brings test generation into the IDE through Qodo Gen, which offers a chat-based, semi-agentic workflow designed for deep context awareness. Developers receive structured guidance to create strong initial tests, making it easier to build on and expand test coverage. With tests running directly in the IDE, developers stay in full control of their code environment. The platform also recommends relevant mocks and frameworks, helping improve the quality of the first test. Designed to scale, Qodo continues to evolve, supporting more types of tests with minimal overhead. Its integration with Anthropic’s Model Context Protocol (MCP) adds further context by pulling in data such as database structure, enabling even smarter suggestions.
By complementing good architecture with intelligent tooling, teams can move beyond just writing tests—they can build testable systems by design. Qodo helps bridge the gap between principle and practice, making testability not only achievable but sustainable. The result is software that’s more robust, easier to maintain, and ready to grow.