# PYTHON-DJANGO-SEC-003: Django SQL Injection via QuerySet.extra()

> **Severity:** HIGH | **CWE:** CWE-89 | **OWASP:** A03:2021

- **Language:** Python
- **Category:** Django
- **URL:** https://codepathfinder.dev/registry/python/django/PYTHON-DJANGO-SEC-003
- **Detection:** `pathfinder scan --ruleset python/PYTHON-DJANGO-SEC-003 --project .`

## Description

This rule detects SQL injection vulnerabilities in Django applications where untrusted
user input from HTTP request parameters flows into the SQL string arguments of
QuerySet.extra() without proper parameterization.

Django's QuerySet.extra() is a legacy method that allows injecting raw SQL fragments
into queryset generation. It accepts params arguments for safe parameterization of
the where, select, and tables arguments. When developers embed request parameters
directly into these SQL strings, attackers can manipulate the generated SQL query.
The extra() method is particularly risky because its SQL fragments are injected
into different parts of the generated query (WHERE, SELECT, FROM clauses), giving
attackers multiple injection points.

Django's own documentation marks extra() as a last resort and recommends migrating to
annotate(), filter(), and RawSQL expressions. This rule helps identify both injection
vulnerabilities and opportunities to modernize the codebase.


## Vulnerable Code

```python
from django.db import connection
from django.db.models.expressions import RawSQL
from django.http import HttpRequest

# SEC-003: ORM .extra() with request data
def vulnerable_extra(request):
    where_clause = request.GET.get('filter')
    results = Article.objects.extra(where=[where_clause])
    return results
```

## Secure Code

```python
from django.http import JsonResponse
from django.db.models import F, Q
from django.db.models.expressions import RawSQL
from myapp.models import Product

def search_by_score(request):
    min_score = request.GET.get('min_score', '0')
    try:
        min_score = float(min_score)
    except ValueError:
        return JsonResponse({'error': 'Invalid score'}, status=400)
    # SECURE: Use annotate() with RawSQL and params instead of extra()
    products = Product.objects.annotate(
        computed_score=RawSQL("score * %s", (min_score,))
    ).filter(computed_score__gte=min_score)
    return JsonResponse({'products': list(products.values('id', 'name'))})

def filter_by_category(request):
    category = request.GET.get('category', '')
    # SECURE: Use standard ORM filter instead of extra(where=...)
    products = Product.objects.filter(category=category).values('id', 'name', 'price')
    return JsonResponse({'products': list(products)})

def filter_with_extra_safe(request):
    status = request.GET.get('status', '')
    # SECURE: Use extra() with params argument for parameterized SQL fragments
    products = Product.objects.extra(
        where=["status = %s"],
        params=[status]
    ).values('id', 'name')
    return JsonResponse({'products': list(products)})

```

## Detection Rule (Python SDK)

```python
from rules.python_decorators import python_rule
from codepathfinder import calls, flows, QueryType
from codepathfinder.presets import PropagationPresets

class DjangoORM(QueryType):
    fqns = ["django.db.models.Manager", "django.db.models.QuerySet"]
    patterns = ["*Manager", "*QuerySet"]
    match_subclasses = True

# Common Django request sources
_DJANGO_SOURCES = [
    calls("request.GET.get"),
    calls("request.POST.get"),
    calls("request.GET"),
    calls("request.POST"),
    calls("request.COOKIES.get"),
    calls("request.FILES.get"),
    calls("*.GET.get"),
    calls("*.POST.get"),
]


@python_rule(
    id="PYTHON-DJANGO-SEC-003",
    name="Django SQL Injection via ORM .extra()",
    severity="HIGH",
    category="django",
    cwe="CWE-89",
    tags="python,django,sql-injection,orm-extra,OWASP-A03,CWE-89",
    message="User input flows to .extra() query. Use .annotate() or parameterized queries instead.",
    owasp="A03:2021",
)
def detect_django_extra_sqli():
    """Detects request data flowing to QuerySet.extra()."""
    return flows(
        from_sources=_DJANGO_SOURCES,
        to_sinks=[
            DjangoORM.method("extra").tracks(0),
            calls("*.objects.extra"),
            calls("*.extra"),
        ],
        sanitized_by=[
            calls("escape"),
            calls("escape_string"),
        ],
        propagates_through=PropagationPresets.standard(),
        scope="global",
    )
```

## How to Fix

- Migrate extra() usages to modern ORM equivalents: filter() for WHERE conditions, annotate() for computed columns, and select_related() or prefetch_related() for joins.
- When extra() must be used, always pass user input through the params argument rather than embedding it in the SQL string arguments.
- Treat the where, select, and tables arguments of extra() as SQL templates that must never contain user-controlled content directly.
- Use Django's annotate() with RawSQL expressions as a safer alternative to extra(select=...) since RawSQL explicitly supports parameterization.
- Review all extra() calls during security audits as they represent legacy patterns that warrant modernization and carry heightened injection risk.

## Security Implications

- **Multi-Point SQL Injection:** Unlike cursor.execute() which has a single SQL string, extra() has multiple
injection points: where, select, tables, and order_by. Each argument that accepts
user input without parameterization is a separate injection vector. An attacker
who controls any of these can manipulate the generated query.

- **WHERE Clause Injection:** Injecting into the where argument allows attackers to append OR conditions to
bypass filtering, add subqueries to exfiltrate data from other tables, or use
UNION-based attacks to return arbitrary database content disguised as model data.

- **SELECT Injection and Data Leakage:** The select argument adds extra columns to the queryset. An attacker who controls
this argument can make Django return sensitive columns from other tables or
subquery results embedded in the queryset response.

- **FROM Clause Injection via Tables Parameter:** The tables argument adds extra tables to the FROM clause. An attacker can use
this to perform cross-join attacks, access tables outside the application's normal
data scope, or exploit implicit joins to exfiltrate data.


## FAQ

**Q: Why is extra() considered higher risk than cursor.execute() or raw()?**

extra() has multiple SQL injection surfaces: the where, select, tables, and
order_by arguments can all contain injected SQL. When a developer embeds user
input into any of these, the injection point is in a different part of the
generated query (WHERE, SELECT, or FROM clause). This multiplicity of injection
points makes extra() more dangerous than single-string methods like raw() or
cursor.execute().


**Q: Can I migrate from extra() to standard ORM to fix this finding?**

Yes, and that is the preferred fix. Most extra(where=...) patterns can be
replaced with filter() or Q() objects. extra(select=...) patterns typically
migrate to annotate() with Value(), F() expressions, or RawSQL with params.
The Django documentation includes a migration guide from extra() to modern ORM
alternatives. This migration eliminates the vulnerability and improves code
maintainability.


**Q: Does this rule trigger on extra() calls that use the params argument?**

No. When user input is passed as the params argument (not embedded in the SQL
strings), the taint analysis confirms the SQL strings are static literals and
will not flag the call. For example, extra(where=["col = %s"], params=[user_val])
is safe and will not be flagged.


**Q: Are there cases where extra() is the only option in Django?**

In rare cases involving database-specific SQL features not exposed through the
ORM, extra() may still be used. In these cases, ensure all user-controlled values
go through the params argument. Consider whether the database-specific feature
can be moved to a stored procedure or handled with RawSQL expressions instead.


**Q: What Django version introduced better alternatives to extra()?**

Django has been providing alternatives since version 1.8 with conditional
expressions and 2.0 with window functions. By Django 3.x, the ORM can handle
virtually all use cases that previously required extra(). The Django documentation
explicitly states that extra() may be deprecated in a future release, making
migration a maintenance priority in addition to a security priority.


**Q: How do I handle dynamic ORDER BY direction (ASC/DESC) safely without extra()?**

Use an allowlist: validate that the direction string equals 'asc' or 'desc' (case
insensitive), then construct a string like '-field_name' for descending or 'field_name'
for ascending and pass it to order_by(). Never pass user-controlled column names or
directions directly to order_by() or extra() without allowlist validation.


## References

- [CWE-89: SQL Injection](https://cwe.mitre.org/data/definitions/89.html)
- [OWASP SQL Injection Prevention Cheat Sheet](https://cheatsheetseries.owasp.org/cheatsheets/SQL_Injection_Prevention_Cheat_Sheet.html)
- [Django QuerySet.extra() documentation](https://docs.djangoproject.com/en/stable/ref/models/querysets/#extra)
- [Django Security - SQL Injection Protection](https://docs.djangoproject.com/en/stable/topics/security/#sql-injection-protection)
- [Django RawSQL expression](https://docs.djangoproject.com/en/stable/ref/models/expressions/#django.db.models.expressions.RawSQL)
- [OWASP SQL Injection](https://owasp.org/www-community/attacks/SQL_Injection)

---

Source: https://codepathfinder.dev/registry/python/django/PYTHON-DJANGO-SEC-003
Code Pathfinder — Open source, type-aware SAST with cross-file dataflow analysis
