# PYTHON-DJANGO-SEC-004: Django SQL Injection via RawSQL Expression

> **Severity:** CRITICAL | **CWE:** CWE-89 | **OWASP:** A03:2021

- **Language:** Python
- **Category:** Django
- **URL:** https://codepathfinder.dev/registry/python/django/PYTHON-DJANGO-SEC-004
- **Detection:** `pathfinder scan --ruleset python/PYTHON-DJANGO-SEC-004 --project .`

## Description

This rule detects SQL injection vulnerabilities in Django applications where untrusted
user input from HTTP request parameters flows into the SQL string argument of
RawSQL() expressions without proper parameterization.

Django's RawSQL() expression allows embedding raw SQL fragments within ORM querysets
using annotate(), filter(), and order_by(). It is designed as a safe replacement for
the deprecated extra() method and explicitly supports a params argument for
parameterized values. When developers embed request parameters directly into the SQL
string of a RawSQL() expression instead of using the params argument, they introduce
SQL injection at the ORM annotation level.

RawSQL injection is particularly insidious because the expression appears in a modern,
ORM-integrated pattern, leading developers to assume ORM-level protection exists.
The SQL string is passed verbatim to the database query builder without escaping.


## Vulnerable Code

```python
from django.db import connection
from django.db.models.expressions import RawSQL
from django.http import HttpRequest

# SEC-004: RawSQL expression with request data
def vulnerable_rawsql(request):
    order = request.GET.get('order')
    expr = RawSQL(f"SELECT * FROM products ORDER BY {order}", [])
    return expr
```

## Secure Code

```python
from django.http import JsonResponse
from django.db.models.expressions import RawSQL
from myapp.models import Product, Order

def get_products_with_discount(request):
    discount_rate = request.GET.get('discount', '0.1')
    try:
        discount_rate = float(discount_rate)
        if not 0.0 <= discount_rate <= 1.0:
            raise ValueError("Out of range")
    except ValueError:
        return JsonResponse({'error': 'Invalid discount rate'}, status=400)
    # SECURE: Pass user input via the params tuple, not in the SQL string
    products = Product.objects.annotate(
        discounted_price=RawSQL("price * (1 - %s)", (discount_rate,))
    ).values('id', 'name', 'discounted_price')
    return JsonResponse({'products': list(products)})

def get_orders_summary(request):
    currency = request.GET.get('currency', 'USD')
    # SECURE ALTERNATIVE: Use Django ORM expressions instead of RawSQL
    from django.db.models import F, ExpressionWrapper, DecimalField
    orders = Order.objects.annotate(
        converted=ExpressionWrapper(F('total'), output_field=DecimalField())
    ).values('id', 'converted')
    return JsonResponse({'orders': list(orders)})

```

## Detection Rule (Python SDK)

```python
from rules.python_decorators import python_rule
from codepathfinder import calls, flows, QueryType
from codepathfinder.presets import PropagationPresets

class DjangoExpressions(QueryType):
    fqns = ["django.db.models.expressions"]

# Common Django request sources
_DJANGO_SOURCES = [
    calls("request.GET.get"),
    calls("request.POST.get"),
    calls("request.GET"),
    calls("request.POST"),
    calls("request.COOKIES.get"),
    calls("request.FILES.get"),
    calls("*.GET.get"),
    calls("*.POST.get"),
]


@python_rule(
    id="PYTHON-DJANGO-SEC-004",
    name="Django SQL Injection via RawSQL Expression",
    severity="CRITICAL",
    category="django",
    cwe="CWE-89",
    tags="python,django,sql-injection,rawsql,OWASP-A03,CWE-89",
    message="User input flows to RawSQL() expression. Use parameterized queries.",
    owasp="A03:2021",
)
def detect_django_rawsql_sqli():
    """Detects request data flowing to RawSQL()."""
    return flows(
        from_sources=_DJANGO_SOURCES,
        to_sinks=[
            DjangoExpressions.method("RawSQL"),
            calls("RawSQL"),
            calls("*.RawSQL"),
        ],
        sanitized_by=[
            calls("escape"),
        ],
        propagates_through=PropagationPresets.standard(),
        scope="global",
    )
```

## How to Fix

- Always pass user-controlled values via the params tuple (second argument) of RawSQL(), never by embedding them in the SQL string.
- Consider using Django ORM expressions (F(), Value(), Case(), When(), ExpressionWrapper()) as type-safe alternatives to RawSQL() that require no parameterization.
- Validate and cast numeric parameters with int() or float() before using them in queries, even when they will be parameterized.
- Treat the SQL string argument of RawSQL() as a static template constant -- if it contains runtime values, those must go in params.
- Review all RawSQL() usages during code review to verify the params argument is used for any values that could be user-influenced.

## Security Implications

- **Annotation-Level Data Exfiltration:** An attacker who controls the SQL string of a RawSQL() annotation can use
subqueries to exfiltrate data from arbitrary tables, embedding the results as
annotation values in the queryset response. The application may serialize these
attacker-controlled annotation values back to the client.

- **WHERE Clause Manipulation via Filtered Annotations:** When RawSQL() is used in filter() conditions, injection into the SQL string allows
attackers to append OR conditions, bypass access controls, or read records belonging
to other users by escaping the intended filter logic.

- **ORDER BY Injection:** RawSQL() used in order_by() with user-controlled SQL strings enables ORDER BY
injection attacks. On some databases, injection here can trigger time-based blind
SQL injection using CASE WHEN and sleep-like functions, allowing data extraction
character by character.

- **Stacked Query Execution:** On database backends that support multiple statements (e.g., PostgreSQL with
certain drivers), injection into RawSQL() may enable stacked queries, allowing
attackers to execute INSERT, UPDATE, DELETE, or DDL statements alongside the
intended query.


## FAQ

**Q: RawSQL() was introduced to replace extra() for safety. Can it still be unsafe?**

Yes. RawSQL() is designed to be used safely by putting user values in the params
tuple. The SQL string itself is still passed as a raw fragment to the database query
builder. If you embed user input into that string via f-strings or concatenation
instead of using params, you have the same SQL injection vulnerability as extra().
RawSQL() is only safe when the SQL string is a static literal.


**Q: How does RawSQL() injection differ from other Django SQL injection vectors?**

RawSQL() operates within the ORM annotation system, meaning injection here
affects annotations, filters, and ordering computed within queryset chain calls.
Unlike cursor.execute(), the injected SQL is a fragment embedded inside a
larger ORM-generated query, which can make extraction more complex for attackers
but does not reduce the fundamental risk.


**Q: Does using RawSQL() in annotate() give any additional protection?**

No. annotate() processes the RawSQL() expression object, which includes the
SQL string verbatim in the generated query. The ORM does not inspect or escape
the string. All safety guarantees come from proper use of the params argument.


**Q: What is the performance impact of replacing RawSQL() with ORM expressions?**

Django's built-in ORM expressions (F(), Value(), Case(), When(), Coalesce()) are
compiled to the same SQL as equivalent RawSQL() calls, often with identical
execution plans. For complex database-specific functions not in Django's built-in
set, use database functions (Func subclasses) which are also parameterizable.
Profile both approaches in your specific environment if performance is a concern.


**Q: Can I use RawSQL() safely with ORDER BY for user-specified sort columns?**

For sort direction (ASC/DESC), use Django's standard order_by() with F() and the
descending() method. For user-specified column names, validate against an allowlist
of permitted field names before including them in the RawSQL() SQL string -- column
names cannot be parameterized in SQL. Never pass raw user input as a column name.


**Q: How do I detect all RawSQL() usages in my codebase beyond this rule?**

Run: pathfinder scan --ruleset python/django/PYTHON-DJANGO-SEC-004 for injection
flows. For a full audit of all RawSQL() calls regardless of taint, use
PYTHON-DJANGO-SEC-005 (raw SQL audit rule) which performs a pattern match on
DjangoExpressions.method("RawSQL") without requiring taint analysis.


## References

- [CWE-89: SQL Injection](https://cwe.mitre.org/data/definitions/89.html)
- [OWASP SQL Injection Prevention Cheat Sheet](https://cheatsheetseries.owasp.org/cheatsheets/SQL_Injection_Prevention_Cheat_Sheet.html)
- [Django RawSQL Expression](https://docs.djangoproject.com/en/stable/ref/models/expressions/#django.db.models.expressions.RawSQL)
- [Django Security - SQL Injection Protection](https://docs.djangoproject.com/en/stable/topics/security/#sql-injection-protection)
- [OWASP SQL Injection](https://owasp.org/www-community/attacks/SQL_Injection)
- [Django Database Functions](https://docs.djangoproject.com/en/stable/ref/models/database-functions/)

---

Source: https://codepathfinder.dev/registry/python/django/PYTHON-DJANGO-SEC-004
Code Pathfinder — Open source, type-aware SAST with cross-file dataflow analysis
