Sharding PostgreSQL for Django Applications with Citus

Scaling a Django application often leads developers to a crossroads: continue scaling vertically by adding more RAM and CPU, or move toward horizontal scaling. While database sharding is a powerful solution for horizontal growth, it is often avoided due to its perceived complexity. However, you can achieve database sharding without leaving the PostgreSQL ecosystem or abandoning the Django ORM by using Citus.

What is Citus?

Citus is an open-source extension for PostgreSQL that transforms it into a distributed database. Rather than keeping all data on a single instance, Citus distributes your data across multiple machines, known as nodes.

Key features of Citus include:

Sharding: Data is split into smaller chunks called shards.
Distributed Storage: Shards are spread across multiple worker nodes.
Parallel Query Execution: Queries are executed in parallel across nodes for faster performance.
Horizontal Scalability: You can add more nodes to handle increased load.

To your Django application, Citus appears as a standard PostgreSQL database, allowing you to scale without switching to a different database engine.

Why Django and Citus Work Well Together

The primary advantage of using Citus with Django is that it requires no special drivers, SDKs, or heavy ORM modifications. Since Citus is a PostgreSQL extension, Django continues to communicate using standard SQL.

From the application's perspective, it is simply connecting to PostgreSQL. The Citus coordinator node receives the query and determines exactly which shard should handle the request. This separation of concerns allows you to maintain a clean codebase while Citus manages the underlying distributed logic.

The Key Concept: The Distribution Key

Citus does not automatically determine how to shard your data. You must define a distribution key (also known as a shard key), which is a column that dictates where each row is stored.

In a typical multi-tenant SaaS application, the distribution key is usually:

tenant_id
org_id
account_id

Example Django Model

Consider a standard Django model where we want to shard data by a tenant:

from django.db import models

class User(models.Model):
    tenant_id = models.UUIDField()
    email = models.EmailField()
    created_at = models.DateTimeField(auto_now_add=True)

After running your standard Django migrations, you distribute the table using a single SQL command in the Citus coordinator:

SELECT create_distributed_table('users', 'tenant_id');

Once this is configured, Citus handles the complexity of data placement.

How Sharding Works Internally

When your Django application executes a command, the workflow follows a specific path to ensure efficiency:

Django ORM: Executes a standard command, such as User.objects.create().
Citus Coordinator: Receives the SQL query.
Hashing: The coordinator hashes the tenant_id value.
Routing: The query is routed to the specific worker node containing the corresponding shard.

This process ensures that no custom routing logic is required within your Django views or models, and no custom ORM hacks are necessary.

UUIDs and Sharding Strategy

A common concern involves using UUIDs in distributed systems. If you shard by a unique user_id (where every row has a different ID), you lose the benefits of data locality.

The rule of thumb is to shard by a group, not an individual row.

Recommended: Use a UUID for the tenant_id. All users belonging to that tenant will live on the same shard, making joins and lookups efficient.
Avoid: Sharding by a unique row ID (like a primary key id), as this forces queries to scan multiple shards unnecessarily.

Practical Use Cases for Citus

Citus is specifically designed for high-growth scenarios where a single PostgreSQL instance becomes a bottleneck. It is a perfect fit for:

Multi-tenant SaaS Architecture: Keeping all data for one customer on a single shard.
Large Datasets: Tables containing millions or billions of rows.
Write-Heavy Workloads: Distributing write operations across multiple nodes.
Real-time Analytics: Running complex queries over event logs, audit logs, or telemetry data.

Common Mistakes and Limitations

While Citus is powerful, it is not a "magic button" for every application. You should avoid Citus if:

The App is a Simple CRUD App: If your dataset fits comfortably on a single node, the overhead of a distributed system is unnecessary.
Early Stage MVPs: Vertical scaling is usually sufficient and easier to manage in the early stages.
Heavy Random Joins: Performing joins across tables that are not sharded by the same key can be slow, as it requires moving data between nodes.
Vertical Scaling is Still Viable: If you can solve your performance issues by adding more CPU or RAM, that remains the simplest path.

Conclusion

Citus allows Django developers to scale PostgreSQL horizontally while remaining within the familiar Django ecosystem. By carefully choosing a distribution key and designing your schema for multi-tenancy, you can handle massive datasets and high-traffic workloads without the risk of a custom sharding implementation.

Key Takeaways:

Citus extends PostgreSQL to support distributed tables and parallel queries.
Django interacts with Citus as if it were a standard PostgreSQL database.
The choice of a distribution key (like tenant_id) is the most critical design decision.
Use Citus for multi-tenant SaaS or high-volume write workloads, but stick to plain PostgreSQL for smaller, simpler applications.