# PYTHON-FLASK-XSS-002: Flask Explicit Unescape with Markup

> **Severity:** MEDIUM | **CWE:** CWE-79 | **OWASP:** A03:2021

- **Language:** Python
- **Category:** Flask
- **URL:** https://codepathfinder.dev/registry/python/flask/PYTHON-FLASK-XSS-002
- **Detection:** `pathfinder scan --ruleset python/flask/PYTHON-FLASK-XSS-002 --project .`

## Description

This rule detects calls to Markup() or markupsafe.Markup() in Flask applications. Markup
is a string subclass from the markupsafe library (which Flask and Jinja2 depend on) that
marks its content as safe HTML. When Jinja2 renders a template variable, it checks whether
the value is a Markup instance. If it is, autoescaping is skipped and the raw string is
inserted into the HTML output. If it is a plain string, HTML metacharacters are escaped.
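
The difference is easy to observe outside Flask. The sketch below assumes a bare jinja2.Environment with autoescape=True as a stand-in for the environment Flask configures for .html templates; the template string and payload are illustrative.

```python
from jinja2 import Environment
from markupsafe import Markup

# Stand-in for Flask's template environment (assumption: autoescaping is on,
# as Flask enables it for .html templates).
env = Environment(autoescape=True)
template = env.from_string("<p>{{ value }}</p>")

payload = "<script>alert(1)</script>"

# Plain str: Jinja2 escapes the HTML metacharacters.
escaped_output = template.render(value=payload)

# Markup instance: Jinja2 trusts the value and inserts it unchanged.
raw_output = template.render(value=Markup(payload))

print(escaped_output)
print(raw_output)
```

The first line printed contains &lt;script&gt; entities; the second contains a live script element.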

Markup() is a legitimate API for generating trusted HTML programmatically (e.g., building
HTML tags in Python helpers). It becomes a vulnerability when applied to user-controlled
content. Markup(user_input) tells Jinja2 "this string is safe HTML" -- but if user_input
contains <script> tags or event handlers, those will be inserted verbatim into the page
and executed in the browser.

This is an audit-grade rule. Not every Markup() call is vulnerable -- wrapping a hardcoded
HTML string like Markup("<br>") is safe. The vulnerability arises when user-controlled data
flows into Markup() without prior sanitization. Every use of Markup() warrants review to
confirm the string being wrapped is developer-controlled, sanitized, or already escaped.

The detection uses Or(calls("Markup"), calls("markupsafe.Markup")) to catch both the
directly imported form (from markupsafe import Markup; Markup(...)) and the module-qualified
form (markupsafe.Markup(...)). Older Flask releases re-exported Markup as flask.Markup, but
that import is deprecated and has been removed from current Flask; this rule covers the two
primary import paths.
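
For reference, the two call shapes named above look like this in application code. Both calls wrap hardcoded literals, so they would be flagged for review without being exploitable; the variable names are illustrative.

```python
import markupsafe
from markupsafe import Markup

# Directly imported form: the shape targeted by calls("Markup").
line_break = Markup("<br>")

# Module-qualified form: the shape targeted by calls("markupsafe.Markup").
divider = markupsafe.Markup("<hr>")
```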


## Vulnerable Code

```python
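from markupsafe import Markup

# Matched by the rule. A hardcoded literal like this one is harmless;
# the call becomes exploitable only when the argument carries user input.
html = Markup("<b>hello</b>")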
```
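
A call site that is actually exploitable needs user-controlled data in the argument. The following is a minimal sketch, assuming a hypothetical /greet route and an inline template (used here only to keep the example self-contained):

```python
from flask import Flask, render_template_string, request
from markupsafe import Markup

app = Flask(__name__)


@app.route("/greet")
def greet():
    name = request.args.get("name", "")
    # UNSAFE: the query parameter is attacker-controlled, and wrapping it in
    # Markup() tells Jinja2 to skip escaping when the template is rendered.
    return render_template_string("<h1>Hello, {{ name }}!</h1>", name=Markup(name))
```

Requesting /greet with name set to <img src=x onerror=alert(1)> returns the payload verbatim in the response body. Passing name as a plain string instead lets autoescaping neutralize it.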

## Secure Code

```python
from flask import Flask, render_template, request
from markupsafe import Markup, escape

app = Flask(__name__)

# SAFE approach 1: Pass user input as a plain string to the template.
# Jinja2 autoescaping handles HTML encoding of 'name'.
@app.route('/greet')
def greet():
    name = request.args.get('name', '')
    return render_template('greet.html', name=name)
    # In greet.html: <h1>Hello, {{ name }}!</h1>  -- autoescaped

# SAFE approach 2: If you must build HTML in Python, escape user input first,
# then wrap in Markup only after escaping.
def build_greeting(name: str) -> Markup:
    # escape() returns a Markup instance with HTML metacharacters escaped
    safe_name = escape(name)
    return Markup(f"<strong>{safe_name}</strong>")

# UNSAFE (do not do this):
# def build_greeting_unsafe(name: str) -> Markup:
#     return Markup(f"<strong>{name}</strong>")  # name not escaped -- XSS risk

```

## Detection Rule (Python SDK)

```python
from rules.python_decorators import python_rule
from codepathfinder import calls, Or, QueryType


@python_rule(
    id="PYTHON-FLASK-XSS-002",
    name="Flask Explicit Unescape with Markup",
    severity="MEDIUM",
    category="flask",
    cwe="CWE-79",
    tags="python,flask,markup,xss,audit,CWE-79",
    message="Markup() bypasses auto-escaping. Ensure input is trusted before wrapping in Markup().",
    owasp="A03:2021",
)
def detect_flask_markup_usage():
    """Detects Markup() usage which bypasses escaping."""
    return Or(
        calls("Markup"),
        calls("markupsafe.Markup"),
    )
```

## How to Fix

- Prefer passing user input as plain context variables to render_template() rather than wrapping in Markup(). Jinja2 autoescaping handles HTML encoding automatically.
- When building HTML in Python code (e.g., in template helper functions), always escape user-supplied strings with markupsafe.escape() before concatenating into a Markup instance.
- Use Markup() only for hardcoded HTML strings that are entirely developer-controlled and contain no user input, even indirectly.
- Review every Markup() call to trace the source of its argument. If any part of the argument can be influenced by user input, apply escape() first.
- Consider using a dedicated HTML sanitization library (bleach) for user-provided rich text content rather than wrapping unsanitized HTML in Markup().
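
One way to apply the second bullet without calling escape() by hand is to keep the HTML skeleton in a Markup object and let its format() method fill in the values, since markupsafe escapes format arguments that are not already Markup. A minimal sketch, with build_greeting and the sample input as illustrative names:

```python
from markupsafe import Markup


def build_greeting(name: str) -> Markup:
    # The template literal is developer-controlled; format() escapes `name`
    # because it is a plain str, not a Markup instance.
    return Markup("<strong>{}</strong>").format(name)


greeting = build_greeting("<script>alert(1)</script>")
print(greeting)
```

The design point is that the trusted skeleton and the untrusted value never meet as plain strings, so there is no moment at which an unescaped value can be wrapped by mistake.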

## Security Implications

- **Reflected XSS via Markup-Wrapped User Input:** If user input is passed to Markup() and the result is rendered in a Jinja2 template,
the user's HTML/JavaScript is inserted verbatim into the page. An attacker can craft
a request with a payload like <img src=x onerror=alert(1)> that executes in the
victim's browser immediately on page load.

- **Stored XSS via Markup-Wrapped Database Content:** Applications that retrieve content from a database, wrap it in Markup(), and render
it in templates are vulnerable to stored XSS if an attacker can write HTML content
to the database through any input path. The Markup() call silently suppresses the
escaping that would otherwise protect against stored XSS.

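- **Confused Developer Intent Propagation:** Markup instances propagate through string operations, and markupsafe escapes plain-string
operands along the way: Markup("safe") + user_input returns a Markup instance in which
user_input has been escaped. That protection disappears when plain strings are combined
first, as in Markup("<b>" + user_input + "</b>") or Markup(f"<b>{user_input}</b>"), or
when a helper wraps a string that was assembled elsewhere. Developers may therefore mark
unsafe content as safe without noticing, especially across function boundaries or after
code refactoring.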

- **Bypass of Defense-in-Depth Escaping:** Even if other layers (input validation, CSP) partially mitigate XSS, Markup() removes
the last line of defense at the template rendering layer. An attacker who finds any
way to get malicious content into a Markup()-wrapped variable can bypass all other
controls at the output stage.
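
The stored variant can be reproduced with nothing more than an in-memory SQLite table. The sketch below uses a hypothetical comments table and payload; it shows that wrapping a stored value in Markup() preserves the payload, while escape() neutralizes it.

```python
import sqlite3

from markupsafe import Markup, escape

# Hypothetical storage layer: an attacker has already saved a payload.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE comments (body TEXT)")
conn.execute(
    "INSERT INTO comments (body) VALUES (?)",
    ("<script>steal()</script>",),
)
(body,) = conn.execute("SELECT body FROM comments").fetchone()

# UNSAFE: database content is not developer-controlled HTML.
unsafe_fragment = Markup(body)

# Safe: escape() encodes the metacharacters before marking the value as safe.
safe_fragment = escape(body)

print(unsafe_fragment)
print(safe_fragment)
```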


## FAQ

**Q: Is every Markup() call a vulnerability?**

No. Markup("<br>") or Markup("<strong>Bold text</strong>") with hardcoded strings are
perfectly safe. The vulnerability only arises when user-controlled content is wrapped
in Markup() without prior escaping. This rule flags all uses for review because the
call site alone does not reveal whether the argument is safe -- the data origin must
be traced.


**Q: What is the difference between Markup() and Markup.escape()?**

Markup(s) marks the string s as safe without modification -- whatever HTML is in s
will be rendered as HTML. Markup.escape(s) (equivalent to markupsafe.escape(s))
first HTML-encodes the string and then wraps the result in Markup, so the output is
safe to render. Always use escape() for user-supplied content and Markup() only for
developer-controlled HTML.
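
A short side-by-side sketch of the two calls, using an illustrative input string:

```python
from markupsafe import Markup, escape

user_text = "<b>bold</b> & more"

trusted = Markup(user_text)      # unchanged: the string is treated as HTML
encoded = escape(user_text)      # metacharacters are HTML-encoded
encoded_again = escape(encoded)  # already Markup, so it is not escaped twice

print(trusted)
print(encoded)
```

Because escape() leaves existing Markup instances alone, it is safe to call on a value whose escaping status is uncertain.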


**Q: How does Markup propagate through string operations?**

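markupsafe is designed so that string operations on a Markup instance return a Markup
instance, and plain-string operands are passed through escape() first. For example,
Markup("Hello, ") + name returns a Markup in which name has been escaped; the format()
method and the % operator behave the same way. The protection applies only to operations
performed on a Markup object: Markup("Hello, " + name) and Markup(f"Hello, {name}") combine
plain strings first, so name reaches Markup() unescaped. XSS is therefore introduced by
building the string before wrapping it, not by the Markup operators themselves.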
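
The behaviour is worth checking in a REPL. The sketch below, assuming a recent markupsafe release, shows what each operation does with a plain-string operand; the name value is an illustrative payload.

```python
from markupsafe import Markup

name = "<script>alert(1)</script>"

# Operations on a Markup object escape plain-string operands.
concatenated = Markup("Hello, ") + name
interpolated = Markup("<b>%s</b>") % name

# Building the string first and wrapping it afterwards escapes nothing.
prebuilt = Markup("Hello, " + name)

for value in (concatenated, interpolated, prebuilt):
    print(value)
```

Only the last value still contains a live script element, because the concatenation happened between two plain strings before Markup() was called.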


**Q: What is the right way to render user-provided rich text (HTML)?**

Use a dedicated HTML sanitization library such as bleach to restrict the markup to an
allow-list of safe HTML tags and attributes, then wrap the sanitized output in Markup().
For example, call `safe_html = bleach.clean(user_html, tags=['p', 'b', 'i', 'a'], strip=True)`
and then `return Markup(safe_html)`. Note that bleach has been marked deprecated by its
maintainers, so check the maintenance status of whichever sanitizer you adopt. Never render
raw user HTML in a template without sanitization.


**Q: How do I run this rule in CI/CD?**

Run `pathfinder ci --ruleset python/flask/PYTHON-FLASK-XSS-002 --project .` in the pipeline.
The scan can emit SARIF, JSON, or CSV output and can post inline pull request comments on GitHub.


**Q: Does this rule catch flask.Markup()?**

No. Importing Markup from flask was deprecated in Flask 2.3 and removed in Flask 3.0. The
current canonical import path is markupsafe.Markup. This rule matches calls("Markup") and
calls("markupsafe.Markup"). If your codebase still uses the deprecated flask.Markup import,
add calls("flask.Markup") to the Or pattern in the rule.


**Q: What is the difference between this rule and PYTHON-FLASK-XSS-001?**

PYTHON-FLASK-XSS-001 detects direct use of jinja2.Environment, which bypasses
Flask's autoescaping at the environment level -- affecting all variables. This rule
(XSS-002) detects Markup(), which bypasses autoescaping at the individual value level --
only the specific Markup-wrapped value is unescaped. Both are XSS risks, but through
different mechanisms. Together they cover the environment-level and the value-level
ways of bypassing Flask's autoescaping.
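
The two scopes can be contrasted directly. This sketch assumes plain jinja2 environments rather than a Flask app: a default jinja2.Environment has autoescaping off, so every variable is rendered raw, while an autoescaping environment renders raw only the value wrapped in Markup.

```python
from jinja2 import Environment
from markupsafe import Markup

payload = "<script>alert(1)</script>"
source = "{{ a }}|{{ b }}"

# Environment-level bypass: autoescape defaults to False, so both
# variables are inserted without escaping.
bare_env = Environment()
all_raw = bare_env.from_string(source).render(a=payload, b=payload)

# Value-level bypass: only the Markup-wrapped value `a` skips escaping;
# the plain string `b` is still escaped.
escaping_env = Environment(autoescape=True)
one_raw = escaping_env.from_string(source).render(a=Markup(payload), b=payload)

print(all_raw)
print(one_raw)
```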


## References

- [CWE-79: Improper Neutralization of Input During Web Page Generation (XSS)](https://cwe.mitre.org/data/definitions/79.html)
- [markupsafe Markup Documentation](https://markupsafe.palletsprojects.com/en/stable/)
- [Jinja2 Autoescaping Documentation](https://jinja.palletsprojects.com/en/stable/api/#autoescaping)
- [OWASP XSS Prevention Cheat Sheet](https://cheatsheetseries.owasp.org/cheatsheets/Cross_Site_Scripting_Prevention_Cheat_Sheet.html)
- [OWASP A03:2021 Injection](https://owasp.org/Top10/A03_2021-Injection/)
- [bleach HTML Sanitization Library](https://bleach.readthedocs.io/en/latest/)

