# PYTHON-DJANGO-SEC-051: Django mark_safe() Usage Audit

> **Severity:** MEDIUM | **CWE:** CWE-79 | **OWASP:** A03:2021

- **Language:** Python
- **Category:** Django
- **URL:** https://codepathfinder.dev/registry/python/django/PYTHON-DJANGO-SEC-051
- **Detection:** `pathfinder scan --ruleset python/PYTHON-DJANGO-SEC-051 --project .`

## Description

This audit rule flags every call to django.utils.safestring.mark_safe() in a Django
application, whether or not user-controlled data is detected flowing into it.
It is a visibility rule that surfaces all mark_safe() call sites for manual
security review.

Django's template engine automatically escapes variables ({{ variable }}) to prevent
XSS. mark_safe() is a signal to the template engine that a string is already safe
HTML and should not be escaped. When mark_safe() is called on a string that contains
unescaped user input, the protection is bypassed and XSS becomes possible.

mark_safe() is legitimately used in custom template tags and filters that generate
controlled HTML, but it is frequently misused by developers who apply it to strings
containing user data without prior escaping. This audit rule identifies all call sites
so security reviewers can verify each one is used correctly.


## Vulnerable Code

```python
from django.utils.safestring import mark_safe

# SEC-051: mark_safe (audit)
def risky_mark_safe():
    # Flagged: the rule reports every mark_safe() call, even on a literal.
    content = "<script>alert(1)</script>"
    # mark_safe() performs no escaping, so the markup is rendered as-is.
    return mark_safe(content)
```

## Secure Code

```python
from django.utils.html import escape, format_html
from django.utils.safestring import mark_safe

# SECURE: mark_safe() on a fully static string with no user data
LOADING_SPINNER = mark_safe('<div class="spinner" aria-label="Loading"></div>')

# SECURE: Use format_html() which escapes interpolated values automatically
def render_user_badge(username, role):
    # format_html() escapes all {} arguments, returns a SafeString
    return format_html('<span class="badge badge-{}">{}</span>', role, username)

# SECURE: Escape first, then mark safe
def render_highlighted(user_text):
    escaped = escape(user_text)
    return mark_safe(f'<mark>{escaped}</mark>')

# SECURE: Use bleach to allow a safe subset of HTML tags
import bleach
ALLOWED_TAGS = ['b', 'i', 'em', 'strong', 'a']
ALLOWED_ATTRS = {'a': ['href', 'title']}

def render_comment(raw_html):
    cleaned = bleach.clean(raw_html, tags=ALLOWED_TAGS, attributes=ALLOWED_ATTRS)
    return mark_safe(cleaned)

```

## Detection Rule (Python SDK)

```python
from rules.python_decorators import python_rule
from codepathfinder import calls, flows, QueryType
from codepathfinder.presets import PropagationPresets

class DjangoSafeString(QueryType):
    fqns = ["django.utils.safestring", "django.utils.html"]

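# Request-derived taint sources. Not referenced by this audit rule,
# which matches every mark_safe() call.
_DJANGO_SOURCES = [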
    calls("request.GET.get"),
    calls("request.POST.get"),
    calls("request.GET"),
    calls("request.POST"),
    calls("request.COOKIES.get"),
    calls("request.FILES.get"),
    calls("*.GET.get"),
    calls("*.POST.get"),
]


@python_rule(
    id="PYTHON-DJANGO-SEC-051",
    name="Django mark_safe() Usage (Audit)",
    severity="MEDIUM",
    category="django",
    cwe="CWE-79",
    tags="python,django,xss,mark-safe,audit,CWE-79",
    message="mark_safe() bypasses Django auto-escaping. Ensure input is properly sanitized.",
    owasp="A03:2021",
)
def detect_django_mark_safe():
    """Audit: detects mark_safe() usage that bypasses auto-escaping."""
    return DjangoSafeString.method("mark_safe")
```

## How to Fix

- Use format_html() instead of mark_safe() with f-strings; format_html() automatically escapes all interpolated arguments.
- When mark_safe() must be used, ensure all user-controlled values are passed through escape() first, or use bleach.clean() for rich text that allows a controlled subset of HTML.
- Never apply mark_safe() directly to request.GET or request.POST values without prior escaping.
- Review all custom template tags and filters that return mark_safe() values to verify they escape any context-provided data.
- Prefer Django's |escape template filter for in-template escaping and format_html() for Python-side HTML construction over manual mark_safe() patterns.

## Security Implications

- **Auto-escaping Bypass Leading to XSS:** Django templates auto-escape {{ variable }} to prevent XSS. mark_safe() tells
the template engine to skip escaping. If a mark_safe() call is applied to a
string that contains unescaped user input, that input is rendered raw in the
browser and can contain malicious script tags or event handler attributes.

- **Latent Risk from Refactoring:** A mark_safe() call that is currently safe (applied to a hardcoded HTML string)
becomes unsafe if a developer later adds user input to the string before the
mark_safe() call. This rule flags all mark_safe() usages so they are reviewed
whenever the surrounding code changes.

- **Template Tag and Filter Vulnerabilities:** Custom template tags and filters that use mark_safe() to return HTML are a
common location for XSS vulnerabilities. If the tag or filter incorporates
arguments passed from template context (which may originate from user input)
without escaping them, the result is XSS.

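- **Chained mark_safe() and String Concatenation:** Concatenating a SafeString with a plain str returns a plain str, so the safety
flag is dropped and the whole result, including the trusted markup, is escaped in
templates. The risk appears when a developer "fixes" the escaped output by wrapping
the concatenation in another mark_safe() call, or concatenates before the first
mark_safe() call: the appended string is then marked safe without ever being escaped.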


## FAQ

**Q: Why does this rule flag mark_safe() calls that are clearly safe (applied to static strings)?**

This is an audit rule that provides visibility into all mark_safe() usages,
not just unsafe ones. The purpose is to create a reviewable inventory. A
mark_safe() call on a static string is safe today, but if a developer later
modifies the code to include a user-controlled value in that string, the
mark_safe() call makes it unsafe. Auditing all call sites catches such
regressions before they reach production.
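
To give a feel for what such an inventory looks like, here is a rough standalone sketch built on Python's ast module. It is not how Code Pathfinder implements the rule: the helper name find_mark_safe_calls and the literal/non-literal triage are assumptions made for illustration, and the matching is purely syntactic (an aliased import such as `import mark_safe as ms` would be missed).

```python
import ast


def find_mark_safe_calls(source, filename="<string>"):
    """Return sorted (line, is_literal) pairs for each mark_safe(...) call."""
    findings = []
    for node in ast.walk(ast.parse(source, filename=filename)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        # Match both `mark_safe(...)` and `safestring.mark_safe(...)`.
        name = func.id if isinstance(func, ast.Name) else getattr(func, "attr", None)
        if name != "mark_safe":
            continue
        first = node.args[0] if node.args else None
        # A plain string literal has no interpolation; f-strings do not count.
        is_literal = isinstance(first, ast.Constant) and isinstance(first.value, str)
        findings.append((node.lineno, is_literal))
    return sorted(findings)


sample = '''
from django.utils.safestring import mark_safe

SPINNER = mark_safe('<div class="spinner"></div>')

def render(name):
    return mark_safe(f"<b>{name}</b>")
'''

for line, is_literal in find_mark_safe_calls(sample):
    label = "static literal" if is_literal else "needs review"
    print(f"line {line}: {label}")
```

Both call sites are listed; the static one is simply labelled differently, which mirrors the intent of an audit rule: nothing is hidden from the reviewer.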


**Q: What is the difference between mark_safe() and format_html()?**

format_html() is like Python's str.format(), but it passes every interpolated
argument through conditional_escape() before substituting it and returns a
SafeString. Plain strings are escaped; arguments that are already SafeString
are left as they are. It is the recommended way to construct HTML strings from
user data in Django. mark_safe() simply tags an existing string as safe without
performing any escaping. It is appropriate only when the string has already
been escaped or when it is a static HTML literal.


**Q: Can I use mark_safe() with bleach.clean() output?**

Yes. bleach.clean() sanitizes HTML by stripping or escaping disallowed tags
and attributes, and its output can be passed to mark_safe(). Keep the
allowlist as restrictive as your use case permits, and leave strip_comments
at its default of True so that HTML comments are removed and cannot be used
for comment-based injection.


**Q: Are there performance implications to replacing mark_safe() with format_html()?**

No meaningful performance difference. format_html() calls the same escape()
function that you would call manually before mark_safe(). The escaping is a
simple string replacement operation with negligible performance impact.
Use format_html() as the default approach for all HTML construction.


**Q: How do I find all mark_safe() usages that are actually unsafe in my codebase?**

Run PYTHON-DJANGO-SEC-050 (taint-based XSS detection) to find confirmed flows
from user input to HttpResponse. Run this audit rule (SEC-051) to find all
mark_safe() usages for manual review. Cross-reference: any mark_safe() call
where the argument contains user-controlled data without prior escape() or
bleach.clean() processing is an XSS vulnerability.


## References

- [CWE-79: Cross-site Scripting](https://cwe.mitre.org/data/definitions/79.html)
- [OWASP XSS Prevention Cheat Sheet](https://cheatsheetseries.owasp.org/cheatsheets/Cross_Site_Scripting_Prevention_Cheat_Sheet.html)
- [Django mark_safe() documentation](https://docs.djangoproject.com/en/stable/ref/utils/#django.utils.safestring.mark_safe)
- [Django format_html() documentation](https://docs.djangoproject.com/en/stable/ref/utils/#django.utils.html.format_html)
- [Django Template Auto-escaping](https://docs.djangoproject.com/en/stable/ref/templates/language/#automatic-html-escaping)
- [bleach HTML sanitization library](https://bleach.readthedocs.io/)

